Steps to follow:
Using general methodology:
-
Clone the repo into formula_ocr_main directory:
git clone https://github.com/Trojan0101/formula_ocr.git
-
Create a directory for file downloading:
cd formula_ocr
mkdir downloaded_images
-
Move the Tesseract trained data files for English, Japanese, Korean, Traditional Chinese, and Simplified Chinese to the tessdata path on the server:
- Use WinSCP to move files to the server from https://github.com/tesseract-ocr/tessdata/tree/main
- Use WinScp to move files to the server from
-
Install dependencies:
pip install virtualenv
virtualenv formula_ocr_env
source formula_ocr_env/bin/activate
pip install -r ./requirements.txt
pip install uwsgi
-
Move modified utils.py to rapid_latex_ocr utils.py file:
mv modified_site_packages/rapid_latex_ocr/utils.py formula_ocr_env/lib/<python_version>/site-packages/rapid_latex_ocr/utils.py
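The exact location of the installed rapid_latex_ocr package depends on the Python version inside the virtualenv. Rather than typing the path by hand, it can be looked up. The following is a minimal sketch, not part of the repository: find_rapid_latex_ocr is a hypothetical helper name, and it assumes the package was installed into the virtualenv by the pip install step above.

```shell
# Hypothetical helper: print the rapid_latex_ocr package directory found
# inside the virtualenv directory given as $1. Returns non-zero if none is found.
find_rapid_latex_ocr() {
    pkg_dir=$(find "$1" -type d -path '*/site-packages/rapid_latex_ocr' | head -n 1)
    [ -n "$pkg_dir" ] && printf '%s\n' "$pkg_dir"
}

# Usage (run from the repository root):
#   target=$(find_rapid_latex_ocr formula_ocr_env) &&
#       mv modified_site_packages/rapid_latex_ocr/utils.py "$target/utils.py"
```

If the helper prints nothing, the package is probably not installed in that virtualenv yet, and the file should not be moved.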
-
Install missing libraries:
sudo apt-get install libgl1-mesa-glx
-
Run:
nohup uwsgi --http :8080 --module app:app > formula_ocr_main.log 2>&1 &
disown
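Because the server runs in the background with its output redirected, startup problems only show up in formula_ocr_main.log. A rough way to scan that log is sketched below. log_has_errors is a hypothetical helper, and the search patterns are an assumption about what a failure looks like, not something the application is known to print.

```shell
# Hypothetical helper: succeed if the log file $1 contains a line that looks
# like a failure. The patterns (case-insensitive "traceback" or "error") are
# an assumption; adjust them to what the application actually logs.
log_has_errors() {
    grep -qiE 'traceback|error' "$1"
}

# Usage:
#   tail -n 20 formula_ocr_main.log
#   if log_has_errors formula_ocr_main.log; then
#       echo "possible startup problem, check formula_ocr_main.log"
#   fi
```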
Using docker:
-
Load the Docker Image from the Tar File:
docker load -i docker_images/formula_ocr_docker_linux.tar
-
Verify the Image is Loaded:
docker images
-
Run a Container from the Image [Create formula_ocr_log directory]:
docker run -p 8080:8080 -v /formula_ocr_log:/formula_ocr_docker --name ocr formula_ocr_docker
Notes:
- Path to tessdata can be /usr/local/share/tessdata or /usr/share/tessdata
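Since the tessdata path differs between servers, it may help to detect which directory exists and confirm that the five language files were copied. The sketch below is one way to do that. find_tessdata and missing_traineddata are hypothetical helper names, and the file names assume the naming used in the tesseract-ocr/tessdata repository (eng, jpn, kor, chi_tra, chi_sim).

```shell
# Hypothetical helper: print the first existing directory among the arguments.
# With no arguments, it checks the two paths mentioned in the note above.
find_tessdata() {
    if [ "$#" -eq 0 ]; then
        set -- /usr/local/share/tessdata /usr/share/tessdata
    fi
    for candidate in "$@"; do
        if [ -d "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Hypothetical helper: print the name of every expected .traineddata file
# that is missing from the directory given as $1 (prints nothing if all exist).
missing_traineddata() {
    for lang in eng jpn kor chi_tra chi_sim; do
        if [ ! -f "$1/$lang.traineddata" ]; then
            printf '%s\n' "$lang.traineddata"
        fi
    done
}

# Usage:
#   tessdata_dir=$(find_tessdata) && missing_traineddata "$tessdata_dir"
```

Empty output from missing_traineddata means all five files are in place. Any file name it prints still needs to be copied to the server.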