TwitterSentiment

TwitterSentiment is a keyword-based Twitter sentiment analyzer. It uses tweets matching the keyword provided by the user to generate a rudimentary sentiment report.

https://twittersa-hyo.oa.r.appspot.com/

Dataset

The sentiment model is trained on the sentiment140 dataset: https://www.kaggle.com/kazanova/sentiment140

The dataset contains:

  • 800k Positive Tweets
  • 800k Negative Tweets
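
As a rough sketch of how the labels could be read, the snippet below parses rows in the sentiment140 CSV layout (no header row; six fields: target, id, date, flag, user, text; target 0 is negative and 4 is positive). The function name `read_sentiment140` and the two sample rows are made up for illustration and are not taken from this repository.

```python
import csv
import io
from collections import Counter

# sentiment140 encodes polarity in the first field: 0 = negative, 4 = positive.
LABELS = {"0": "negative", "4": "positive"}


def read_sentiment140(lines):
    """Yield (label, text) pairs from an iterable of sentiment140 CSV lines.

    Hypothetical helper for illustration; not part of this repository.
    """
    for row in csv.reader(lines):
        target, text = row[0], row[5]
        if target in LABELS:
            yield LABELS[target], text


# Two made-up rows in the dataset's layout, standing in for the real file.
sample = io.StringIO(
    '"0","1","Mon Apr 06 22:19:45 PDT 2009","NO_QUERY","user_a","so sad today"\n'
    '"4","2","Mon Apr 06 22:19:49 PDT 2009","NO_QUERY","user_b","loving this, truly"\n'
)
counts = Counter(label for label, _ in read_sentiment140(sample))
print(counts)
```

For the real file, pass an open file handle instead of `sample`; the Kaggle CSV is not UTF-8, so opening it with `encoding="latin-1"` is usually needed.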

Methodology

The 1.6 million tweets are preprocessed using the following steps:

  • Lowercasing the text
  • Replacing URLs with "URL"
  • Removing usernames (e.g. @donaldtrump)
  • Removing special characters, i.e. non-alphanumeric characters
  • Removing stopwords
  • Lemmatizing words
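
The steps above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the project's actual code: the `preprocess` function, the regular expressions, and the tiny stopword set are all assumptions. Lemmatization is left as a pluggable callable that defaults to a no-op, so the sketch runs without downloading any NLTK corpora.

```python
import re

# Tiny illustrative stopword set (assumption). The project lists NLTK as a
# dependency, whose English stopword corpus would be the natural replacement.
STOPWORDS = {
    "a", "an", "the", "is", "are", "was", "to", "of", "and",
    "in", "on", "at", "for", "i", "it", "this", "that",
}

URL_RE = re.compile(r"(?:https?://|www\.)\S+")
USER_RE = re.compile(r"@\w+")
NON_ALNUM_RE = re.compile(r"[^A-Za-z0-9\s]")


def preprocess(tweet, lemmatize=lambda word: word, stopwords=STOPWORDS):
    """Apply the listed steps in order and return the cleaned tweet as a string."""
    text = tweet.lower()                # 1. lowercase
    text = URL_RE.sub(" URL ", text)    # 2. replace URLs with the token "URL"
    text = USER_RE.sub(" ", text)       # 3. drop @usernames
    text = NON_ALNUM_RE.sub(" ", text)  # 4. drop non-alphanumeric characters
    tokens = [t for t in text.split() if t not in stopwords]  # 5. stopwords
    return " ".join(lemmatize(t) for t in tokens)             # 6. lemmatize


cleaned = preprocess("@donaldtrump Check THIS out: https://example.com!!! The cats are running")
print(cleaned)
```

To plug in a real lemmatizer, something like `preprocess(tweet, lemmatize=WordNetLemmatizer().lemmatize)` from `nltk.stem` should work once the WordNet data has been downloaded.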

The preprocessed tweets are then vectorized using TF-IDF. The vectorized tweets are used as input for the Support Vector Machine classifier.
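
A minimal sketch of this vectorize-then-classify step using scikit-learn is shown below. The README does not say which SVM variant or hyperparameters the project uses, so `LinearSVC` with default settings is an assumption, and the six training sentences are toy data invented for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy, already-preprocessed tweets (made up); 1 = positive, 0 = negative.
train_texts = [
    "love this phone great battery",
    "what a wonderful happy day",
    "great service love it",
    "hate this phone terrible battery",
    "what an awful sad day",
    "terrible service hate it",
]
train_labels = [1, 1, 1, 0, 0, 0]

# TF-IDF turns each tweet into a sparse weighted term vector;
# the linear SVM then learns a separating hyperplane over those vectors.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(train_texts, train_labels)

predictions = model.predict(["love this wonderful day", "awful terrible phone"])
print(predictions)
```

Wrapping both stages in one pipeline keeps the vocabulary learned at training time attached to the classifier, so new tweets are vectorized the same way at prediction time.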

Docker

⚠️ Docker Linux is needed for this!

docker pull realdexter/twitter_sentiment:v1

docker run -p local_port:8501 realdexter/twitter_sentiment:v1

local_port is the port on your machine that you want to map to the container's exposed port (8501).

Visit http://localhost:local_port (for example, http://localhost:8501 if you mapped port 8501).

Usage (Locally)

⚠️ Model files are not included in the git repo.

git clone https://github.com/AsadAliDD/TwitterSentiment

cd TwitterSentiment

pip3 install -r requirements.txt

streamlit run SentimentAnalysis.py

Dependencies

  • Python3
  • Streamlit
  • NLTK
  • scikit-learn
  • pandas
  • NumPy
  • GetOldTweets3
  • Plotly