A Python program that scrapes the BDFugue website with Selenium and uses the Firebase Admin API to automatically deliver the collected information to a Firebase Realtime Database.
For this project I use:
- Python 3 (personally, I use Python 3.11.5).
- Selenium, which you can install with `pip install selenium`.
- A Firebase Realtime Database, which you can create here.
- Firebase Admin, which you can install with `pip install firebase-admin`.
- json, which is part of the Python standard library and needs no installation.
- Tkinter, for the interface. It ships with most Python installers and cannot be installed with pip; on some Linux distributions it is a separate system package (for example `python3-tk` on Debian/Ubuntu).
- [Optional] ChromeDriver, which you can download here (Chrome and ChromeDriver must be the same version). This is only needed if you don't use the latest Selenium version.
Go to Firebase, log in to your Google Account and create a new project. Then open the project, click on "Realtime Database" and create a new one.
In the settings of your Firebase project, go to "Service accounts", click on "Firebase Admin SDK", select Python, then click on "Generate new private key". Save the file in PROJECT_FOLDER/path/to/ under the name `serviceAcoountKey.json`. The project can then use this file to send data to your Firebase Realtime Database.
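As a rough sketch of how a saved key like this is typically loaded with the Firebase Admin SDK: the helper names `find_key` and `connect` are hypothetical, the database URL is a placeholder you would copy from your own Firebase console, and none of this is taken from the project's scripts.

```python
from pathlib import Path

# Relative location described above, inside the project folder.
KEY_RELATIVE_PATH = Path("path") / "to" / "serviceAcoountKey.json"


def find_key(project_folder):
    """Return the path of the service account key, or raise if it is missing."""
    key_path = Path(project_folder) / KEY_RELATIVE_PATH
    if not key_path.is_file():
        raise FileNotFoundError(f"Service account key not found: {key_path}")
    return key_path


def connect(project_folder, database_url):
    """Initialize the Firebase Admin SDK with the saved private key."""
    # Imported here so find_key() works even without firebase-admin installed.
    import firebase_admin
    from firebase_admin import credentials

    cred = credentials.Certificate(str(find_key(project_folder)))
    # database_url is an assumption: use the URL shown for your own database.
    return firebase_admin.initialize_app(cred, {"databaseURL": database_url})
```

Checking for the file first gives a clear error message when the key was saved under a different name or in a different folder.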
Install Python 3 on your computer (you can download it here), then open a terminal in the project folder and type:

    pip install selenium
    pip install firebase-admin

json and tkinter are not installed with pip: json is part of the Python standard library, and tkinter ships with most Python installers.
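To check that everything the project imports is available after these steps, a small sketch like the following may help. The helper name `missing_modules` is hypothetical, and `firebase_admin` is the import name of the `firebase-admin` package.

```python
import importlib.util

# Two packages installed with pip, plus two modules that normally ship with Python.
REQUIRED_MODULES = ["selenium", "firebase_admin", "json", "tkinter"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the modules from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing modules:", ", ".join(missing) if missing else "none")
```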
In your project folder, create four files, `url.json`, `url_ok.json`, `data.json` and `data_scrap.json`, each containing only an empty JSON array: `[]`
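If you prefer not to create these files by hand, a minimal sketch along these lines could do it. The function name `create_data_files` is hypothetical and not part of the project.

```python
import json
from pathlib import Path

DATA_FILES = ["url.json", "url_ok.json", "data.json", "data_scrap.json"]


def create_data_files(project_folder="."):
    """Create each data file with an empty JSON array, skipping existing files."""
    created = []
    for name in DATA_FILES:
        path = Path(project_folder) / name
        if not path.exists():  # never overwrite data that is already there
            path.write_text(json.dumps([]), encoding="utf-8")
            created.append(name)
    return created
```

Calling `create_data_files()` from the project folder returns the names of the files it actually created, so running it a second time changes nothing.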
To use the project, there are several ways to start:
- You can download the data already stored in your Realtime Database with `Firebase-to-Json.py` (if the collection in your database is not named "manga", see "How to change the Firebase Realtime Database collection name inside this project").
- You can use `add_url.py`, which opens a small interface for adding URLs to `url.json`.
- [Warning: you must use `add_url.py` before running this] You can use `url_verifier.py` to launch the verifier. It starts the webdriver and checks whether each URL in `url.json` can be used to get data. Usable URLs are added to `url_ok.json`, and the ones that don't work are the only URLs left in `url.json`. For those, you have to find the working URL for the manga yourself: open `url.json`, which should contain entries like the ones below, replace the URL between the quotes with the one that works, then run `url_verifier.py` again to check whether the bot can use it (`%manga_name%` is the manga name you set and `%tome_number%` is the tome number you set).
    [
      {
        "url": "https://www.bdfugue.com/%manga_name%-tome-%tome_number%"
      },
      {
        "url": "https://www.bdfugue.com/%manga_name%-tome-%tome_number%"
      },
      "..."
    ]
- [Warning: you must use `url_verifier.py` before running this] You can use `FromWeb-to-Json.py` to launch the scraper. It starts the webdriver, gets the data from the URLs in `url_ok.json` and adds the scraped data to `data_scrap.json`.
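The way entries move between `url.json` and `url_ok.json` can be sketched as follows. This is not the project's code: the function names are hypothetical, the slug rule in `make_entry` (lowercase words joined by hyphens) is an assumption about BDFugue URLs, and `is_usable` stands in for the check that the real verifier performs with the webdriver.

```python
import json
from pathlib import Path

URL_TEMPLATE = "https://www.bdfugue.com/{manga_name}-tome-{tome_number}"


def make_entry(manga_name, tome_number):
    """Build one url.json entry; the slug rule used here is an assumption."""
    slug = "-".join(manga_name.lower().split())
    return {"url": URL_TEMPLATE.format(manga_name=slug, tome_number=tome_number)}


def partition_entries(entries, is_usable):
    """Split entries into (usable, remaining) using the is_usable(url) callback."""
    usable, remaining = [], []
    for entry in entries:
        (usable if is_usable(entry["url"]) else remaining).append(entry)
    return usable, remaining


def verify_files(folder, is_usable):
    """Move usable entries to url_ok.json and keep the others in url.json."""
    url_file = Path(folder) / "url.json"
    ok_file = Path(folder) / "url_ok.json"
    entries = json.loads(url_file.read_text(encoding="utf-8"))
    usable, remaining = partition_entries(entries, is_usable)
    ok_entries = json.loads(ok_file.read_text(encoding="utf-8")) + usable
    ok_file.write_text(json.dumps(ok_entries, indent=2), encoding="utf-8")
    url_file.write_text(json.dumps(remaining, indent=2), encoding="utf-8")
    return len(usable), len(remaining)
```

Passing the check in as a callback keeps the file handling separate from the browser, so the bookkeeping can be tried out without launching a webdriver.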
To change the web driver:
- In `url_verifier.py`, go to line 18 and change `webdriver.Firefox()` to the one you want to use.
- In `FromWeb-to-Json.py`, go to line 58 and change `webdriver.Firefox()` to the one you want to use.
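For illustration, the browsers most often swapped in are `webdriver.Chrome()` and `webdriver.Edge()`. A hedged sketch of choosing the driver by name instead of editing each line could look like this. The `make_driver` helper and the `DRIVER_CLASSES` table are hypothetical and not part of the project.

```python
# Browser name -> class name in selenium.webdriver.
DRIVER_CLASSES = {"firefox": "Firefox", "chrome": "Chrome", "edge": "Edge"}


def make_driver(browser="firefox"):
    """Return a new WebDriver for the given browser name."""
    key = browser.lower()
    if key not in DRIVER_CLASSES:
        raise ValueError(f"Unsupported browser: {browser!r}")
    # Imported lazily so the name check above works without Selenium installed.
    from selenium import webdriver

    return getattr(webdriver, DRIVER_CLASSES[key])()
```

With such a helper, `make_driver("chrome")` would be equivalent to replacing `webdriver.Firefox()` with `webdriver.Chrome()` in the two scripts.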
To change the collection name of your database inside this project, replace `manga` with the name of your collection on Firebase in these files:
- `Firebase-to-Json.py`, on line 17: in `collection_name = "manga"`, change the name between the quotes.
- `Json-to-Firebase.py`, on line 14: in `collection_name = "manga"`, change the name between the quotes.
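To show why this single variable is enough, here is a minimal sketch of how a collection name is commonly used with the Firebase Admin SDK. It is an assumption about the general pattern, not a copy of the project's scripts, and `fetch_collection` and `save_json` are hypothetical names.

```python
import json

collection_name = "manga"  # change this to the name used in your database


def fetch_collection(name=collection_name):
    """Read everything stored under `name` (needs an initialized Firebase app)."""
    from firebase_admin import db  # lazy import: only needed when fetching

    return db.reference(name).get()


def save_json(data, path):
    """Write the fetched data to a JSON file such as data.json."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(data, handle, ensure_ascii=False, indent=2)
```

Because the name is only used to build the database reference, changing the value between the quotes redirects every read to the other collection.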