shantanucoder/Text-Analytics-and-Data-Extraction-Automation

Assignment Objective: Extract the text of the articles at the URLs listed in the given Excel sheet, perform sentiment analysis on that text, and compute the required variables using Python.

Folder view:

Project/
├── data/
│   ├── Input.xlsx
│   ├── extracted_articles/
├── StopWords/
│   ├── StopWords_Auditor.txt
│   ├── StopWords_Currencies.txt
│   ├── StopWords_DatesandNumbers.txt
│   ├── StopWords_Generic.txt
│   ├── StopWords_GenericLong.txt
│   ├── StopWords_Geographic.txt
│   ├── StopWords_Names.txt
├── MasterDictionary/
│   ├── positive-words.txt
│   ├── negative-words.txt
├── main_script.py
-----readme-----
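
The StopWords/ and MasterDictionary/ folders in the layout above suggest a dictionary-based sentiment pass: drop stop words, then count the remaining tokens that appear in the positive and negative word lists. The following is a minimal sketch of that idea, not the code in main_script.py. The function names, the regex tokenizer, the latin-1 file encoding, and the polarity formula are all assumptions made for illustration; the actual variable definitions are in the separate analysis document.

```python
import re
from pathlib import Path


def load_word_list(path, encoding="latin-1"):
    """Read one word per line into a lowercase set, skipping blank lines.

    The latin-1 encoding is an assumption; adjust it to match the files.
    """
    words = set()
    for line in Path(path).read_text(encoding=encoding).splitlines():
        word = line.strip().lower()
        if word:
            words.add(word)
    return words


def load_stop_words(folder):
    """Union of every StopWords_*.txt file found in the given folder."""
    stop_words = set()
    for path in sorted(Path(folder).glob("StopWords_*.txt")):
        stop_words |= load_word_list(path)
    return stop_words


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_counts(text, stop_words, positive, negative):
    """Count positive and negative tokens after removing stop words."""
    tokens = [t for t in tokenize(text) if t not in stop_words]
    pos = sum(1 for t in tokens if t in positive)
    neg = sum(1 for t in tokens if t in negative)
    # Assumed polarity definition; the small constant avoids division by zero.
    polarity = (pos - neg) / (pos + neg + 1e-6)
    return {
        "positive": pos,
        "negative": neg,
        "polarity": polarity,
        "word_count": len(tokens),
    }


# Small in-memory demo so the sketch runs without the project files.
demo = sentiment_counts(
    "The results were good, not bad at all",
    stop_words={"the", "were", "at", "all", "not"},
    positive={"good"},
    negative={"bad"},
)
print(demo)
```

With the project files in place, the word sets would come from load_stop_words("StopWords"), load_word_list("MasterDictionary/positive-words.txt"), and load_word_list("MasterDictionary/negative-words.txt").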


Steps to Set Up and Run
1. Install Necessary Dependencies

pip install pandas openpyxl nltk requests beautifulsoup4

2. Run the Script
To execute the script, follow these steps:



Dependencies:
The following libraries are required:

pandas: For reading/writing Excel files and managing DataFrames.
openpyxl: For Excel file operations.
nltk: For text processing (requires additional data downloads via nltk.download()).
requests: For fetching data from URLs.
beautifulsoup4: For parsing HTML content.
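
To show how these libraries could fit together for the extraction step, here is a hedged sketch: pandas reads the input sheet, requests fetches each page, and BeautifulSoup pulls out the title and paragraph text. It is not the code from main_script.py. The column names URL_ID and URL, the function names, and the choice to keep only <p> elements are assumptions made for illustration.

```python
from pathlib import Path

import pandas as pd
import requests
from bs4 import BeautifulSoup


def extract_article_text(html):
    """Return (title, body) from raw HTML, keeping only paragraph text."""
    soup = BeautifulSoup(html, "html.parser")
    # Drop script and style elements so their contents never leak into the text.
    for tag in soup(["script", "style"]):
        tag.decompose()
    title = soup.title.get_text(strip=True) if soup.title else ""
    paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
    body = "\n".join(p for p in paragraphs if p)
    return title, body


def fetch_article(url, timeout=10):
    """Download one page and extract its title and body text."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return extract_article_text(response.text)


def extract_all(input_path="data/Input.xlsx", out_dir="data/extracted_articles"):
    """Save each article listed in the sheet as <URL_ID>.txt."""
    # Assumption: the sheet has columns named URL_ID and URL.
    frame = pd.read_excel(input_path)  # pandas uses openpyxl for .xlsx files
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for row in frame.itertuples(index=False):
        try:
            title, body = fetch_article(row.URL)
        except requests.RequestException as exc:
            # Skip pages that fail to download instead of stopping the run.
            print(f"Skipping {row.URL_ID}: {exc}")
            continue
        (out / f"{row.URL_ID}.txt").write_text(f"{title}\n{body}", encoding="utf-8")
```

Calling extract_all() from the Project/ directory would write one text file per row into data/extracted_articles/. Keeping the parsing in extract_article_text makes it possible to try that step on a saved HTML string without any network access.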



Test Assignment Google Colab link: https://colab.research.google.com/drive/1Vz5oBqt4PnSeKEwtZH9MxrtPE4Jeina3?usp=sharing

Read the assignment objective carefully before running the script. The text analysis process is described in detail in a separate document.
