Hindi Text Analysis: NLP-Powered Insights & Sentiment Detection

This project preprocesses Hindi text using the IndicNLP library for normalization and tokenization. A custom tokenizer enhances this process by cleaning text, removing stop words, and handling language-specific nuances.
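The preprocessing described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the `preprocess` function, the `STOP_WORDS` set, and the regex fallback are assumptions introduced here. The IndicNLP calls (`IndicNormalizerFactory` and `indic_tokenize.trivial_tokenize`) are used when the library is installed; otherwise a simple regex tokenizer stands in so the example still runs.

```python
import re
import string

try:
    from indicnlp.normalize.indic_normalize import IndicNormalizerFactory
    from indicnlp.tokenize import indic_tokenize
    HAVE_INDICNLP = True
except ImportError:
    HAVE_INDICNLP = False

# Hypothetical stop-word list; the project's real list is not shown in the README.
STOP_WORDS = {"यह", "है", "और", "का", "की", "के", "में", "से", "को"}

# ASCII punctuation plus the Devanagari danda and double danda.
PUNCTUATION = set(string.punctuation) | {"।", "॥"}

# Fallback tokenizer (assumption): runs of Devanagari characters excluding
# the dandas, runs of ASCII letters/digits, or any other single symbol.
FALLBACK_TOKEN = re.compile(r"[\u0900-\u0963\u0966-\u097F]+|[A-Za-z0-9]+|\S")


def preprocess(text):
    """Normalize, tokenize, and clean one Hindi utterance."""
    if HAVE_INDICNLP:
        normalizer = IndicNormalizerFactory().get_normalizer("hi")
        tokens = indic_tokenize.trivial_tokenize(normalizer.normalize(text), lang="hi")
    else:
        tokens = FALLBACK_TOKEN.findall(text)
    # Drop stop words and tokens made up entirely of punctuation.
    return [
        t for t in tokens
        if t not in STOP_WORDS and not all(ch in PUNCTUATION for ch in t)
    ]


tokens = preprocess("यह फिल्म बहुत अच्छी है।")
```

For the sample sentence, the stop words and the trailing danda are removed, leaving the content words for later stages.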

Key Steps:

Text Preprocessing:
- Normalize and tokenize Hindi text with IndicNLP.
- Clean the text and remove stop words using a custom tokenizer.
Feature Extraction:
- Apply TF-IDF vectorization with bigrams to extract key terms and phrases.
- Use the weighted terms and phrases to approximate the topical structure of each dialogue.
Sentiment Analysis:
- Use a sentiment-labeled Hindi word list to compute sentiment scores.
- Analyze emotional tones for individual speakers and the overall conversation.
Conversation Insights:
- Summarize key themes and interaction dynamics using extracted terms and sentiment analysis.
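The feature-extraction and sentiment steps above can be illustrated with a small standard-library sketch. It is not the notebook's implementation: the TF-IDF here is a hand-written smoothed variant over unigrams and bigrams (in practice a library vectorizer such as scikit-learn's `TfidfVectorizer` with `ngram_range=(1, 2)` would be the usual choice), and `SENTIMENT_LEXICON`, `ngrams`, `tfidf`, `sentiment_score`, and `speaker_sentiment` are hypothetical names with made-up example data.

```python
import math
from collections import Counter

# Hypothetical sentiment lexicon: +1 for positive words, -1 for negative.
SENTIMENT_LEXICON = {"अच्छी": 1, "शानदार": 1, "खराब": -1, "बेकार": -1}


def ngrams(tokens, n_max=2):
    """Return unigrams plus n-grams up to n_max, joined with spaces."""
    grams = []
    for n in range(1, n_max + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams


def tfidf(docs):
    """Compute smoothed TF-IDF weights for a list of token lists."""
    term_lists = [ngrams(doc) for doc in docs]
    # Document frequency: in how many documents each term appears.
    df = Counter(term for terms in term_lists for term in set(terms))
    n_docs = len(docs)
    weights = []
    for terms in term_lists:
        tf = Counter(terms)
        weights.append({
            term: (count / len(terms)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return weights


def sentiment_score(tokens, lexicon=SENTIMENT_LEXICON):
    """Average polarity of the tokens found in the lexicon (0.0 if none)."""
    hits = [lexicon[t] for t in tokens if t in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


def speaker_sentiment(turns):
    """Score each speaker and the whole conversation from (speaker, tokens) turns."""
    by_speaker = {}
    for speaker, tokens in turns:
        by_speaker.setdefault(speaker, []).extend(tokens)
    scores = {s: sentiment_score(toks) for s, toks in by_speaker.items()}
    overall = sentiment_score([t for _, toks in turns for t in toks])
    return scores, overall


# Made-up, already-preprocessed conversation turns.
turns = [
    ("A", ["फिल्म", "बहुत", "अच्छी"]),
    ("B", ["कहानी", "बेकार"]),
    ("A", ["संगीत", "शानदार"]),
]
weights = tfidf([tokens for _, tokens in turns])
scores, overall = speaker_sentiment(turns)
```

Sorting each dictionary in `weights` by value gives the top terms and phrases per turn, which, together with `scores` and `overall`, is the kind of input a conversation summary could draw on.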

This pipeline provides a structured approach to analyzing Hindi conversations, making it useful for linguistic research, sentiment analysis, and dialogue summarization. 🚀

