Leverage the robust capabilities of OpenAI's GPT-3.5 Turbo for advanced text translation from English to Arabic. This Python-based pipeline offers a comprehensive suite of features including text wrapping and tokenization, powered by TikToken.
- Python 3.x
- OpenAI API Key
- Pip package manager
- Clone the repository to your local environment:
git clone https://github.com/AlghamdiMuath/OpenAI_Translator.git
- Change into the project directory:
cd OpenAI_Translator
- Install necessary Python packages:
pip install -r requirements.txt
- Populate the
.env
file with your OpenAI API Key:echo "OPENAI_API_KEY=your_key_here" > .env
- Place the text file to be translated at the specified
INPUT_FILE
location. - Execute the Python script:
python main.py
- Retrieve the translated Arabic text from the designated
OUTPUT_FILE
location.
- Adjust the
translation_template
for specialized translation criteria. - Modify
max_chunk_size
to handle large text files in segmented portions.
wrap_text_to_fixed_width()
: Text wrapping according to specified width.tokenize_text_from_file()
: Tokenization of text files, supporting diverse language models.partition_tokens_into_chunks()
: Token segmentation to control overflow.convert_chunks_to_text()
: Conversion of token segments back into textual format.get_translated_text()
: Utilizes OpenAI's GPT-3.5 Turbo for high-quality translation.execute_translation()
: Coordinates the complete translation and tokenization process.
Contributions are welcome. Please feel free to fork this repository, submit pull requests, or report issues to improve the project.