toSrt(): max amount of words per SRT item instead of using the utterances array #308

carstenschaefer · 2023-08-04T14:06:36Z


Good Day!

Is there a way to get the SRT items based on a maximum number of characters or words, instead of using the utterances?

Thanks for any hint!

Best, Carsten


jkroll-deepgram · 2023-08-04T15:34:12Z

Collaborator

Hi @carstenschaefer, what programming language / SDK are you working with? We have some support for this in the Python and Node SDKs. For instance, the Python SDK has a line_length parameter in the to_SRT() and to_WebVTT() methods. It still uses utterances if they are available, but then splits each utterance into lines of at most line_length words. For instance:

from deepgram import Deepgram

deepgram = Deepgram("<API key>")
source = {"url": "https://static.deepgram.com/examples/interview_speech-analytics.wav"}
options = {"punctuate": True, "smart_format": True, "model": "nova", "utterances": True}
response = deepgram.transcription.sync_prerecorded(source, options)

Setting a maximum of 15 words per line then returns results like:

deepgram.extra.to_WebVTT(response, line_length=15)

WEBVTT

1
00:00:00.160 --> 00:00:03.859
- Another big problem in the speech analytics space,

2
00:00:04.480 --> 00:00:08.099
- when customers first bring the software on is that they

3
00:00:08.724 --> 00:00:14.664
- they are blown away by the fact that an engine can monitor 100 of KPIs.

Versus 4 words:

deepgram.extra.to_WebVTT(response, line_length=4)

WEBVTT

1
00:00:00.160 --> 00:00:02.319
- Another big problem in

2
00:00:02.319 --> 00:00:03.859
- the speech analytics space,

3
00:00:04.480 --> 00:00:05.839
- when customers first bring

4
00:00:05.839 --> 00:00:07.200
- the software on is

5
00:00:07.200 --> 00:00:08.099
- that they

6
00:00:08.724 --> 00:00:10.564
- they are blown away

7
00:00:10.564 --> 00:00:11.445
- by the fact that

8
00:00:11.445 --> 00:00:13.144
- an engine can monitor

9
00:00:13.365 --> 00:00:14.664
- 100 of KPIs.
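
For readers who want to see the idea behind the two outputs above, here is a minimal sketch of splitting one utterance into cues of at most line_length words. It is not the SDK's actual implementation: split_utterance and vtt_timestamp are hypothetical helper names, and the sketch assumes each word is a dict with "word", "start" and "end" keys (plus an optional "punctuated_word"). The SDK may also choose cue boundaries differently, so timestamps are not guaranteed to match the output above.

```python
from typing import Dict, List, Tuple

# Assumed word shape (not taken from the SDK source):
# {"word": str, "start": float, "end": float, "punctuated_word": str (optional)}
Word = Dict[str, object]


def split_utterance(words: List[Word], line_length: int) -> List[Tuple[float, float, str]]:
    """Chunk one utterance's words into (start, end, text) cues of at most line_length words."""
    if line_length < 1:
        raise ValueError("line_length must be at least 1")
    cues = []
    for i in range(0, len(words), line_length):
        chunk = words[i:i + line_length]
        # Prefer the punctuated form when the response provides one.
        text = " ".join(str(w.get("punctuated_word", w["word"])) for w in chunk)
        cues.append((float(chunk[0]["start"]), float(chunk[-1]["end"]), text))
    return cues


def vtt_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"
```

Because the chunking runs per utterance, a cue never crosses an utterance boundary, which is why the last cue of an utterance can be shorter than line_length (as with "that they" above).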

Note that if your response does not include utterances (i.e. you're using 'utterances': False), we still return results, and the lines are split closer to exactly your maximum line_length. However, a warning is also raised, because there is a regulation that captions are supposed to be split by speaker, and the utterances feature ensures that.

We don't have a way to set a character-length limit, but you could achieve a similar result fairly easily by hacking/extending the SDK.
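
As one possible starting point for such an extension, the sketch below groups words into SRT items by a character limit instead of a word count. It is an illustration only, not part of the SDK: words_to_srt and srt_timestamp are hypothetical names, and the code assumes a flat list of word dicts with "word", "start" and "end" keys (plus an optional "punctuated_word"). It also ignores utterance and speaker boundaries, so it would need extra logic to split captions by speaker.

```python
from typing import Dict, List


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: List[Dict], max_chars: int) -> str:
    """Group words into SRT items whose text is at most max_chars characters.

    A single word longer than max_chars still gets its own item.
    """
    groups: List[List[Dict]] = []
    current: List[Dict] = []
    current_len = 0
    for w in words:
        token = w.get("punctuated_word", w["word"])
        # Count the joining space only when the item already has text.
        extra = len(token) + (1 if current else 0)
        if current and current_len + extra > max_chars:
            groups.append(current)
            current, current_len = [], 0
            extra = len(token)
        current.append(w)
        current_len += extra
    if current:
        groups.append(current)

    blocks = []
    for index, group in enumerate(groups, start=1):
        text = " ".join(g.get("punctuated_word", g["word"]) for g in group)
        start = srt_timestamp(group[0]["start"])
        end = srt_timestamp(group[-1]["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)
```

The word list would typically come from the word-level timings in the transcription response (assumed here to sit under results.channels[0].alternatives[0].words); check the response you actually receive before relying on that path.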

1 reply

carstenschaefer Aug 4, 2023
Author

Hey Julia!

Thanks for your response! It seems this is what I am looking for.
I am using the C# SDK, but this enhancement is only available in the SDKs you mentioned.
We will try to enhance the C# SDK and send a PR.

Best, Carsten
