Should we summarise results as time or mentions? #23

Open
dcabo opened this issue Oct 3, 2019 · 0 comments


When searching for a term, like "Rajoy", we initially thought we'd summarise the results as "Rajoy was talked about for 32 minutes". But if we're not able to segment the full transcription into meaningful blocks (#7), what we have left are sentences. We currently add up the duration of each sentence, taken from the start/end times of the captions, but I'm not sure this is very meaningful, because not every sentence in a given block is going to mention "Rajoy" again and again.
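For illustration, the duration-based summary could look roughly like the sketch below. It assumes that only sentences containing the term are summed; the `Caption` class, the `total_duration` function and the sample captions are hypothetical and not taken from the project's code.

```python
from dataclasses import dataclass


@dataclass
class Caption:
    """One caption sentence with its start/end times in seconds (hypothetical shape)."""
    start: float
    end: float
    text: str


def total_duration(captions, term):
    """Sum the duration of every caption whose text contains the term (case-insensitive)."""
    needle = term.lower()
    return sum(c.end - c.start for c in captions if needle in c.text.lower())


# Made-up sample data, only to show the calculation.
captions = [
    Caption(0.0, 4.0, "Rajoy spoke in Congress today."),
    Caption(4.0, 9.5, "The opposition responded shortly after."),
    Caption(9.5, 12.0, "Later, Rajoy declined to comment."),
]

print(total_duration(captions, "Rajoy"))
```

Note that the middle sentence is arguably still "about" Rajoy but adds nothing to the total, which is the weakness of summing sentence durations without meaningful blocks.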

When Vox analysed the media coverage of Democratic candidates, it didn't overcomplicate things: it used the 15-second clips that the Internet Archive offers as its unit, and instead of talking about "minutes and seconds" it talks about "mentions".
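A mention-based summary along those lines could be sketched as follows. This is one possible reading, not Vox's or the Internet Archive's actual method: captions are bucketed into fixed 15-second windows by start time, and each window containing the term counts as one mention. The `count_mentions` function and the `(start, text)` tuples are hypothetical.

```python
def count_mentions(captions, term, clip_seconds=15.0):
    """Count the fixed-length clips in which the term appears at least once.

    `captions` is an iterable of (start_seconds, text) pairs. A caption is
    assigned to the clip in which it starts (an assumption of this sketch).
    """
    needle = term.lower()
    clips = {
        int(start // clip_seconds)
        for start, text in captions
        if needle in text.lower()
    }
    return len(clips)


# Made-up sample data: two matches in the first clip, one in the third.
sample = [
    (2.0, "Rajoy spoke in Congress today."),
    (8.0, "Rajoy was then asked about the budget."),
    (20.0, "The opposition responded shortly after."),
    (31.0, "Later, Rajoy declined to comment."),
]

print(count_mentions(sample, "Rajoy"))
```

Counting clips rather than sentences means repeated mentions within the same 15 seconds collapse into one, so the number stays comparable across terms without claiming anything about how long a topic was discussed.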
