Formula OCR

Steps to follow:

Using general methodology:

Clone the repo (this creates the formula_ocr directory):

 git clone https://github.com/Trojan0101/formula_ocr.git

Create a directory for file downloading:

 cd formula_ocr
 mkdir downloaded_images

Move the Tesseract trained data files for English, Japanese, Korean, Chinese Traditional, and Chinese Simplified to the tessdata path on the server:
- Use WinSCP to copy the files to the server from https://github.com/tesseract-ocr/tessdata/tree/main
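If the server has direct internet access, the files can also be fetched there instead of copied over with WinSCP. The sketch below only builds the download URLs for the five languages listed above. It assumes the standard file names in the tessdata repository (eng, jpn, kor, chi_tra, chi_sim); the helper name traineddata_url is made up for this example.

```shell
# Language codes for the five traineddata files named above
# (assumed to match the file names in the tessdata repository).
LANGS="eng jpn kor chi_tra chi_sim"

# Hypothetical helper: build the raw-download URL for one language code.
traineddata_url() {
    printf 'https://github.com/tesseract-ocr/tessdata/raw/main/%s.traineddata\n' "$1"
}

# Print one URL per language so the list can be piped into a downloader.
for lang in $LANGS; do
    traineddata_url "$lang"
done
```

The printed list can be piped into a downloader, for example `| xargs -n 1 wget -P "$TESSDATA_DIR"`, where TESSDATA_DIR is a placeholder for whatever tessdata path the server's Tesseract installation uses.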

Install dependencies:

pip install virtualenv

virtualenv formula_ocr_env

source formula_ocr_env/bin/activate

pip install -r ./requirements.txt

pip install uwsgi

Move modified utils.py to rapid_latex_ocr utils.py file:

 mv modified_site_packages/rapid_latex_ocr/utils.py formula_ocr_env/lib/<python_version>/site-packages/rapid_latex_ocr/utils.py
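Rather than filling in the `<python_version>` placeholder by hand, the target path can be looked up from the interpreter. This is a minimal sketch, assuming it runs with the virtual environment activated so that python3 is the environment's interpreter. The variable names are illustrative.

```shell
# Ask the active interpreter where pure-Python packages are installed.
SITE_PACKAGES="$(python3 -c "import sysconfig; print(sysconfig.get_paths()['purelib'])")"

# Path of the utils.py file that the modified copy replaces.
TARGET="$SITE_PACKAGES/rapid_latex_ocr/utils.py"
echo "$TARGET"
```

The printed path can then be used as the destination of the mv command above.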

Run:

nohup uwsgi --http :8080 --module app:app > formula_ocr_main.log 2>&1 &

disown

Using docker:

Load the Docker image from the tar file:

docker load -i docker_images/formula_ocr_docker_linux.tar

Run a container from the image (create the /formula_ocr_log directory on the host first):

docker run -p 8080:8080 -v /formula_ocr_log:/formula_ocr_docker --name ocr formula_ocr_docker

Notes:
