audio mp4 support? #188
Which Deepgram product are you using?
Deepgram API

Details
Hi all, does Deepgram support audio/mp4 files? I'm recording from the mic in Safari, which only saves as audio/mp4. I'm getting an error when I submit the mp4 audio file to Deepgram's API:

Error: {'err_code': 'Bad Request', 'err_msg': 'Bad Request: failed to process audio: corrupt or unsupported data', 'request_id': '60cbf5f5-125b-4c7e-8009-5ea18c8a682a'}

If you are making a request to the Deepgram API, what is the full Deepgram URL you are making a request to?
No response

If you are making a request to the Deepgram API and have a request ID, please paste it below:
No response

If possible, please attach your code or paste it into the text box.

```python
def send_audio_to_whisper(self, blob: Any, audio_type) -> str:
```

If possible, please attach an example audio file to reproduce the issue.
No response
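Since the error reports "corrupt or unsupported data", one quick local check is whether the recorded bytes look like an MP4 container at all. The sketch below is an illustration, not part of the Deepgram API: `looks_like_mp4` is a hypothetical helper, and it assumes the common layout in which an MP4/M4A file opens with an `ftyp` box (a 4-byte size followed by the 4-byte type `ftyp`). A file that fails this check is not necessarily invalid, so treat the result as a hint only.

```python
def looks_like_mp4(blob: bytes) -> bool:
    """Heuristic: True if the data starts with an ISO BMFF 'ftyp' box.

    Hypothetical helper for local debugging; MP4/M4A files usually, but not
    always, begin with this box.
    """
    # Each box starts with a 4-byte big-endian size, then a 4-byte type code.
    return len(blob) >= 8 and blob[4:8] == b"ftyp"


# Synthetic example data (not a playable file): a 24-byte 'ftyp' box.
mp4_like = (24).to_bytes(4, "big") + b"ftyp" + b"M4A " + bytes(12)
wav_like = b"RIFF\x00\x00\x00\x00WAVE"

print(looks_like_mp4(mp4_like))
print(looks_like_mp4(wav_like))
```

If the browser blob fails this check, the bytes may have been altered (for example base64-encoded or wrapped in a form upload) somewhere between the recorder and the request.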
Replies: 1 comment
Hey @vivekr93, rather than use the `files=...` parameter in `requests.post`, you can use the `data=<binary audio data>` parameter. Below is a complete Python script that loads an audio file from disk and sends it to Deepgram for processing (you will need to replace the API key with your own).

```python
import os
from typing import Any

import requests

DEEPGRAM_API_KEY = os.environ["DEEPGRAM_API_KEY"]  # Your Deepgram API Key


def send_audio_to_nova(blob: Any, audio_type) -> str:
    url = "https://api.deepgram.com/v1/listen?model=nova"
    headers = {}
    headers["Authorization"] = f"Token {DEEPGRAM_API_KEY}"
    headers["Content-Type"] = audio_type
    print("audio_type: ", audio_type)
    try:
        # data= sends the audio bytes as the raw request body
        response = requests.post(url, headers=headers, data=blob, timeout=10)
        if response.status_code == 200:
            data = response.json()
            print("Transcript:", data["results"]["channels"][0]["alternatives"][0]["transcript"])
            return data["results"]["channels"][0]["alternatives"][0]["transcript"]
        else:
            print("Error:", response.json())
            return ""
    except Exception as error:
        print("Error:", error)
        return ""


def main():
    # Download the audio from: https://static.deepgram.com/examples/en_NatGen_Medical_DocDictation.m4a
    with open("./test-audio-files/en_NatGen_Medical_DocDictation.m4a", "rb") as f:
        blob = f.read()
    audio_type = "audio/mp4"
    send_audio_to_nova(blob=blob, audio_type=audio_type)


if __name__ == "__main__":
    main()
```

Alternatively, the Deepgram Python SDK makes it much easier to send requests to Deepgram. Below are two code snippets (one async and one not) that might be helpful:

```python
import os

from deepgram import Deepgram

DEEPGRAM_API_KEY = os.environ["DEEPGRAM_API_KEY"]


def upload_audio_file() -> str:
    deepgram = Deepgram(DEEPGRAM_API_KEY)
    options = {"smart_format": True, "model": "nova"}
    # Download the audio from: https://static.deepgram.com/examples/en_NatGen_Medical_DocDictation.m4a
    with open("./test-audio-files/en_NatGen_Medical_DocDictation.m4a", "rb") as f:
        source = {"buffer": f, "mimetype": "audio/mp4"}
        data = deepgram.transcription.sync_prerecorded(source, options)
    transcript = data["results"]["channels"][0]["alternatives"][0]["transcript"]
    print(transcript)
    return transcript


if __name__ == "__main__":
    upload_audio_file()
```

```python
import asyncio
import os

from deepgram import Deepgram

DEEPGRAM_API_KEY = os.environ["DEEPGRAM_API_KEY"]


async def upload_audio_file() -> str:
    deepgram = Deepgram(DEEPGRAM_API_KEY)
    options = {"smart_format": True, "model": "nova"}
    # Download the audio from: https://static.deepgram.com/examples/en_NatGen_Medical_DocDictation.m4a
    with open("./test-audio-files/en_NatGen_Medical_DocDictation.m4a", "rb") as f:
        source = {"buffer": f, "mimetype": "audio/mp4"}
        data = await deepgram.transcription.prerecorded(source, options)
    transcript = data["results"]["channels"][0]["alternatives"][0]["transcript"]
    print(transcript)
    return transcript


if __name__ == "__main__":
    asyncio.run(upload_audio_file())
```
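All of the snippets in this reply read the transcript through the same nested path, `data["results"]["channels"][0]["alternatives"][0]["transcript"]`, which raises if the response has a different shape (for instance an error body like the one in the question). A minimal sketch of a defensive accessor follows; `extract_transcript` is a hypothetical helper name, not part of the Deepgram SDK, and the sample dictionary only mimics the fields used above.

```python
from typing import Any, Mapping


def extract_transcript(data: Mapping[str, Any], channel: int = 0) -> str:
    """Return the first alternative's transcript for a channel, or "" if absent."""
    try:
        return data["results"]["channels"][channel]["alternatives"][0]["transcript"]
    except (KeyError, IndexError, TypeError):
        # Unexpected shape, e.g. an error payload with no "results" key.
        return ""


# Hand-written sample containing only the fields the snippets above rely on.
sample = {"results": {"channels": [{"alternatives": [{"transcript": "hello world"}]}]}}
error_body = {"err_code": "Bad Request", "err_msg": "failed to process audio"}

print(extract_transcript(sample))
print(repr(extract_transcript(error_body)))
```

Returning an empty string matches how `send_audio_to_nova` signals failure, so callers can treat both cases the same way.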
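The advice in this reply comes down to sending the audio bytes as the raw request body, with `Content-Type` set to the audio MIME type, rather than wrapping them in a form upload. As a rough, dependency-free illustration, the sketch below builds the same kind of request with the standard library's `urllib.request` and inspects it without sending anything. `build_listen_request` is a hypothetical helper, and the key and audio bytes are placeholders.

```python
import urllib.request


def build_listen_request(blob: bytes, audio_type: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST whose body is exactly the audio bytes."""
    return urllib.request.Request(
        "https://api.deepgram.com/v1/listen?model=nova",
        data=blob,  # raw bytes, no multipart/form-data wrapper
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": audio_type,
        },
        method="POST",
    )


# Placeholder values for illustration only.
req = build_listen_request(b"\x00\x01fake-audio", "audio/mp4", "PLACEHOLDER_KEY")

# urllib normalizes header names, so "Content-Type" is stored as "Content-type".
print(req.get_method(), req.get_header("Content-type"), len(req.data))
```

Passing the request to `urllib.request.urlopen(req, timeout=10)` would actually send it; that step is left out here so the example runs without a network connection or a real key.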