Detect hot topics from news tweets using k-means on Word2Vec representations

dzubo/twitter-hot-topics

Twitter Hot Topics Detection

A small project to demonstrate the use of the Twitter API and NLP techniques. The idea is to download tweets from specified accounts (news companies), cluster the tweets into topics, detect the hottest topic, and output the most relevant news tweet from that topic.

This is not production-ready code; it is a proof of concept.
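
As a rough illustration of the cluster-then-pick idea, the sketch below runs a toy k-means over tweet vectors and returns a single tweet. It is not the code from detect_hot.py. The function names `kmeans` and `hottest_tweet`, the rule that the largest cluster is the hottest topic, and the rule that the tweet nearest the centroid is the most relevant are all assumptions made for this example. A real run would more likely use a library implementation such as scikit-learn's `KMeans`.

```python
import math
from collections import Counter


def kmeans(vectors, k, n_iter=20):
    """Toy Lloyd's k-means; the first k vectors seed the centroids (illustrative choice)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


def hottest_tweet(tweets, vectors, k):
    """Return one tweet from the largest cluster (hypothetical helper)."""
    labels, centroids = kmeans(vectors, k)
    # Assumption: the "hottest" topic is the cluster containing the most tweets.
    hot, _ = Counter(labels).most_common(1)[0]
    # Assumption: the most relevant tweet is the one nearest the hot centroid.
    members = [i for i, lab in enumerate(labels) if lab == hot]
    best = min(members, key=lambda i: math.dist(vectors[i], centroids[hot]))
    return tweets[best]


if __name__ == "__main__":
    # Made-up tweets with made-up 2-dimensional vectors, for illustration only.
    tweets = ["Storm hits coast", "Team wins final",
              "Storm warning issued", "Coastal storm floods town"]
    vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [1.1, 0.0]]
    print(hottest_tweet(tweets, vectors, k=2))
```

Seeding the centroids with the first k vectors keeps the example deterministic; a random or k-means++ initialization is the more usual choice on real data.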

Concepts

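The project was done using the following tools and techniques: the Twitter API for downloading tweets, Word2Vec for representing tweets as vectors, and k-means for clustering them into topics.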

Files

The code is split into separate files to make debugging and testing easier.

get_tweets.py - downloads news tweets.
create_hist_dataset.py - cleans and saves the dataset.
save_vectors.py - converts sentences to vectors and saves the result for further modelling.
detect_hot.py - prints out 'hot' tweets.
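
The cleaning and sentence-to-vector steps could look roughly like the sketch below. It does not reproduce create_hist_dataset.py or save_vectors.py. The cleaning rules (lowercasing, dropping URLs and @mentions), the averaging of word vectors into one tweet vector, and the names `clean_tweet`, `tweet_vector`, and `TOY_VECTORS` are assumptions chosen for illustration. The tiny hand-written dictionary stands in for a trained Word2Vec model.

```python
import re

# Toy 2-dimensional "embeddings"; a real run would load trained Word2Vec vectors.
TOY_VECTORS = {
    "storm": [1.0, 0.0],
    "coast": [0.8, 0.2],
    "final": [0.0, 1.0],
}


def clean_tweet(text):
    """Lowercase a tweet, drop URLs and @mentions, and split it into tokens."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    text = re.sub(r"@\w+", " ", text)
    return re.findall(r"[a-z0-9']+", text)


def tweet_vector(tokens, word_vectors):
    """Average the vectors of known tokens; return None if no token is known."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return None
    return [sum(dim) / len(known) for dim in zip(*known)]


if __name__ == "__main__":
    tokens = clean_tweet("Storm hits the coast! https://t.co/abc @BBCNews")
    print(tokens)
    print(tweet_vector(tokens, TOY_VECTORS))
```

Returning `None` for tweets with no known words lets the caller decide whether to skip them before clustering, rather than silently feeding a zero vector into k-means.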

Some (dirty) exploration

Some of the thought process is recorded in Jupyter notebooks.

NLP explore.ipynb - some exploration on clustering.
Tune parameters.ipynb - some exploration on tuning heuristic parameters.
