Transcription is a step backward

complete

Tanay M

Transcription is broken for me as well.

Can we please use the more sophisticated transcription models now? Whisper v1 isn't cutting it.
Sometimes I don't want to chain AI actions; I just need the raw transcription. How can I disable this feature?
As you know, transcription isn't reliable (fails 1. I don't want to record a 5 min voice note, then have the app crash and lose all that effort. Is there an easy way to locate the recorded voice note without fiddling around inside Libraries Containers? Maybe an easy button to locate the recording, or a button to "Try Again".

May 22, 2024

Naveen marked this post as complete

Voice-to-text functionality has been improved in version 0.30. Please submit a new request if you encounter any issues. Thank you!

Naveen marked this post as in progress

Improved voice to text feature in v0.21

You can now select either the Local or the Deepgram Whisper provider.
Added a Recording History view where you can see all your previous recordings and listen to the recorded audio (hoping to improve this further).

Tanay M

Naveen this is awesome!!!

Naveen marked this post as planned

Naveen

Hey Tanay,

Unfortunately, the OpenAI Whisper API only supports Whisper v2. I'm hoping they will add support for v3. I'm thinking of integrating Deepgram (https://deepgram.com) and AssemblyAI (https://www.assemblyai.com) in the meantime. Are you interested in any other Whisper providers?
I added a "None" option in v0.20 (released today). I hope this resolves the issue.
Sorry about this. I will handle the failure case properly and make sure that all old transcriptions are easily viewable in a future release. At the moment, I suggest keeping recordings to a maximum of 3-4 minutes (this limit will be raised to support up to 1 hour of audio in the future).
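
For the failure case, here is a minimal sketch of the general idea, not FridayGPT's actual code: copy the recording into a history folder before transcribing, then retry a few times so that a failed transcription never loses the audio. The names `transcribe_with_retry` and `history_dir`, and the `transcribe` callback, are hypothetical placeholders for whichever provider is selected.

```python
import shutil
from pathlib import Path
from typing import Callable


def transcribe_with_retry(
    recording: Path,
    history_dir: Path,
    transcribe: Callable[[Path], str],  # hypothetical provider call
    max_attempts: int = 3,
) -> str:
    """Save the recording first, then attempt transcription up to max_attempts times."""
    # Persist the audio before any step that might fail or crash.
    history_dir.mkdir(parents=True, exist_ok=True)
    saved = history_dir / recording.name
    shutil.copy2(recording, saved)

    last_error = None
    for _ in range(max_attempts):
        try:
            return transcribe(saved)
        except Exception as exc:  # a real app would catch narrower errors
            last_error = exc

    # The saved copy is still on disk, so the user can "Try Again" later.
    raise RuntimeError(
        f"transcription failed after {max_attempts} attempts; audio kept at {saved}"
    ) from last_error
```

Because the copy happens before the first attempt, the error message can always point at a file that still exists, which is what a "Try Again" button would need.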

Tanay M

Naveen, thanks for fixing this so quickly! I can resume work now. I'm using Friday literally all the time to transcribe my rough voice notes into beautiful, well-structured, client-facing messages, so this really helps. The app shows that it's using Whisper v1, which is why I was confused (screenshot attached).

Here's an idea for Whisper v3. If you want, you can contact Groq; they have a speech-to-text API in private beta using Whisper 3 large, and I'm so excited about the speed we'll get when we finally get access. Here is the link: https://console.groq.com/docs/speech-text. I truly believe this will be the best thing on the market for speech to text real soon.

Naveen

Tanay M: thank you
OpenAI's whisper-1 points to the Whisper large-v2 model.
I'm hoping to add Groq support as soon as they make the API public so that all FridayGPT users can use it.