wav file "invalid data received" #378
-
Are there any specific settings that Deepgram expects for a WAV file? Frequency? Mono/stereo? I'm getting an "invalid data received" (bad request) error using the boilerplate code seen here, shared June 6 by Jason Maldonis.
-
Hi @dvorhes, you shouldn't need to match any particular specifications for audio frequency or number of channels. Deepgram accepts a wide variety of audio. There can be several reasons why this "invalid data received" error occurs. One quick thing to check: are you using the code exactly as-is? That code sample is for an mp4 file type. Make sure to specify the audio mimetype that matches your file. If you can share a request ID from one of those "bad requests", we can look into it further as well. The same goes for sharing the exact code you're using, even if it's almost identical to the code sample in the other post you linked. This is likely to need only some small tweak so that Deepgram receives the type of audio it's expecting.
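As a rough illustration of the mimetype point, here is a minimal sketch that builds (but does not send) a request to Deepgram's pre-recorded endpoint with only the Python standard library. The `audio/wav` value, the `build_request` helper, and the key passed to it are assumptions for illustration; the code sample from the linked post is not reproduced in this thread.

```python
import urllib.request

DEEPGRAM_URL = "https://api.deepgram.com/v1/listen"


def build_request(audio_bytes, api_key, mimetype="audio/wav"):
    """Build a POST request whose Content-Type matches the audio being sent.

    `mimetype` defaults to audio/wav here as an assumption for WAV input;
    a sample written for mp4 files would carry a different value.
    """
    return urllib.request.Request(
        DEEPGRAM_URL,
        data=audio_bytes,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": mimetype,
        },
        method="POST",
    )
```

The point of the sketch is only that the `Content-Type` header should describe the bytes in the body; if the header still says mp4 while the body is a WAV file, the two disagree.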
-
Hi Julia,
Thanks for your reply. Finally back on this problem and I think I got it to
work. The solution did indeed involve re-encoding via ffmpeg. My hunch is
that the ffmpeg settings, mono/stereo or otherwise, don't matter that much;
rather, it's the WAV file's encoder metadata that is causing the problem.
Does the Deepgram API look for a specific encoder?
For reference, the problem-file was encoded directly from Adobe Premiere.
The working file was a re-encoded version of that file via ffmpeg.
ffprobe data from both files here:
NOT WORKING:
Input #0, wav, from 'vo_transcription.wav':
  Metadata:
    encoded_by      : Adobe Premiere Pro 2024.0 (Macin
    encoder         : Adobe Photoshop 23.2 (20220128.orig.527 28d5e1a) (Macintosh)
    date            : 2023-11-02
    creation_time   : 18:07:27
    time_reference  : 0
  Duration: 00:02:38.20, bitrate: 836 kb/s
  Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, 1 channels, s16, 768 kb/s
DOES WORK:
Input #0, wav, from 'vo_transcription_mono_test2.wav':
  Metadata:
    date            : 2023-11-02
    encoder         : Lavf60.3.100
    encoded_by      : Adobe Premiere Pro 2024.0 (Macin
  Duration: 00:02:38.20, bitrate: 768 kb/s
  Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, 1 channels, s16, 768 kb/s
Any ideas?
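One way to compare the two files beyond the ffprobe summary is to list the top-level RIFF chunks each one contains. The sketch below is a hypothetical helper (`list_riff_chunks` is not part of any Deepgram or ffmpeg tooling) that walks a WAV file's chunk headers with the standard library. Whether any particular chunk is what triggers the "invalid data received" error is not established in this thread, so treat the output as a diagnostic aid only.

```python
import struct


def list_riff_chunks(path):
    """Return (chunk_id, size_in_bytes) pairs for a WAV file's top-level chunks."""
    chunks = []
    with open(path, "rb") as f:
        preamble = f.read(12)
        if len(preamble) < 12:
            raise ValueError("file too short to be a RIFF/WAVE file")
        riff, _total_size, wave_id = struct.unpack("<4sI4s", preamble)
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                break  # end of file (or a truncated trailing header)
            chunk_id, size = struct.unpack("<4sI", header)
            chunks.append((chunk_id.decode("ascii", "replace"), size))
            # Chunk bodies are padded to an even number of bytes.
            f.seek(size + (size & 1), 1)
    return chunks
```

A plain PCM file normally shows little more than a `fmt ` chunk and a `data` chunk. Running the helper on both the Premiere export and the ffmpeg re-encode would show which additional chunks, if any, the export carries.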
On Thu, Oct 19, 2023 at 6:45 PM Julia Kroll wrote:
Unfortunately I'm not able to replicate your error off the bat. One test I
did was to download one of our sample files (
https://static.deepgram.com/examples/interview_speech-analytics.wav) and
isolate just one channel for testing (ffmpeg -i
interview_speech-analytics.wav -af "pan=mono|FC=FR" right_mono.wav).
That's also a 48 kHz, 16-bit mono WAV file, and it transcribed fine with your
code as-is.
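To confirm that a test file really has the same basic properties (sample rate, bit depth, channel count), a small check with Python's standard `wave` module can help. This is a sketch with a hypothetical helper name, `describe_wav`; note that `wave` only reads PCM WAV files, so it may reject files that ffprobe can still open.

```python
import wave


def describe_wav(path):
    """Summarize the basic PCM properties of a WAV file."""
    with wave.open(path, "rb") as w:
        return {
            "channels": w.getnchannels(),
            "sample_rate_hz": w.getframerate(),
            "bits_per_sample": w.getsampwidth() * 8,
            "duration_s": w.getnframes() / w.getframerate(),
        }
```

For a file like the one described above, the expectation would be 1 channel, 48000 Hz, and 16 bits per sample.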
With a longer file, I ran into Error: ('Connection aborted.',
TimeoutError('The write operation timed out')) due to that timeout=10
setting, but when I bumped it to timeout=60, then longer files worked
fine as well.
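For reference, a longer timeout can be passed along these lines. This is a standard-library sketch, not the code from the linked post: the `send_audio` helper, its `opener` parameter (there only so the call can be exercised without a network), and the `audio/wav` default are all assumptions. With `urllib`, `timeout` is the socket timeout in seconds for blocking operations.

```python
import urllib.request


def send_audio(path, api_key, mimetype="audio/wav", timeout=60,
               opener=urllib.request.urlopen):
    """POST a local audio file and return the raw response body.

    timeout=60 mirrors the value that worked for longer files above;
    `opener` is a hypothetical hook for substituting a fake in tests.
    """
    with open(path, "rb") as f:
        audio = f.read()
    req = urllib.request.Request(
        "https://api.deepgram.com/v1/listen",
        data=audio,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": mimetype,
        },
        method="POST",
    )
    with opener(req, timeout=timeout) as resp:
        return resp.read()
```

Larger uploads take longer to write to the socket, so a timeout that is fine for a short clip can be too tight for a multi-minute file.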
Looking at your request ID, I do see the "invalid data received" in our
logs, but I'm still not able to reproduce the issue.
Have you tried running this code with only one file, or are you seeing it
on multiple files?
Are you still able to reproduce it now? Or was it only happening
yesterday, and perhaps we had some transient issue on our side that's since
been resolved, and is no longer occurring?
-
Hi @dvorhes, thanks for sharing this update, and I'm glad to hear you got this working by re-encoding. It does look like an issue originating from Adobe itself. In the ffprobe output, the original file has an overall bitrate of 836 kb/s, while the re-encoded one has 768 kb/s, which matches the 768 kb/s audio stream reported for both files. That indicates the original file's bitrate is incorrect.
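The 768 kb/s figure can be checked by hand: uncompressed PCM bitrate is sample rate times bit depth times channel count. The sketch below does that arithmetic with hypothetical helper names. The second function assumes the overall bitrate ffprobe prints is file size divided by duration, in which case the 68 kb/s gap over 2:38.20 would correspond to roughly 1.3 MB of non-audio data; that interpretation is an assumption, not something confirmed in this thread.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed PCM audio in kb/s."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def implied_extra_bytes(overall_kbps, audio_kbps, duration_s):
    """Bytes implied by the gap between overall and audio-stream bitrate.

    Assumes overall bitrate = file size / duration (an assumption here).
    """
    return (overall_kbps - audio_kbps) * 1000 / 8 * duration_s


# 48000 Hz, 16-bit, mono, as reported for both files.
print(pcm_bitrate_kbps(48000, 16, 1))  # 768.0
```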
If you're able to specify your preferred output encoding, we recommend MP3 at 192 kb/s with constant bitrate (CBR), as opposed to variable bitrate (VBR).
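If the export tool cannot produce that directly, one possible route is to convert with ffmpeg. The sketch below wraps an ffmpeg invocation in Python; it assumes an ffmpeg build that includes the `libmp3lame` encoder, where setting `-b:a` without a quality option produces constant-bitrate output. The helper names and file names are hypothetical.

```python
import subprocess


def mp3_cbr_command(src, dst, bitrate="192k"):
    """Build an ffmpeg command for constant-bitrate MP3 output."""
    return ["ffmpeg", "-i", src, "-c:a", "libmp3lame", "-b:a", bitrate, dst]


def encode_mp3(src, dst, bitrate="192k"):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(mp3_cbr_command(src, dst, bitrate), check=True)


# Example (requires ffmpeg on PATH):
# encode_mp3("vo_transcription.wav", "vo_transcription.mp3")
```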