Ensure pipenv installs packages in the virtual environment of the current project. This forces the virtual environment to be created inside the project directory (in .venv
).
export PIPENV_VENV_IN_PROJECT=1
Modify the value for HF_TOKEN to reflect your Hugging Face token and replace value for PROJECT_PATH to reflect where you have cloned this project
HF_TOKEN=hf_XXXX
PROJECT_PATH=D:\gitprojects
Create a Conda environment with Python 3.11.11
conda create --prefix ./.venv python=3.11.11
Install the required packages using pipenv.
pipenv install
llm_apps\oh_sheet_its_spark\app.ipynb
Gradio app will then open in the browser . Use upload button to select spreadsheet you want to convert to pyspark & the app immediately generates the pyspark equivalent code.If you want some more ad-hoc changes , use the message textarea to describe your specific changes
Simply change value of the variable inf_mdl_nm currently to "Qwen/Qwen2.5-Coder-32B-Instruct"
inf_mdl_nm = "Qwen/Qwen2.5-Coder-32B-Instruct"
If you want to switch to your own privately deployed or locally quantised model simply replace the below 2 method invocations with your own method which invokes the LLM passing the message variables initial_msg and msg_history to the 2 different calls
cd_resp = generate_responses_from_inf(hf_token, initial_msg, inf_mdl_nm, 5000)
bot_response = generate_responses_from_inf(hf_token, msg_history, inf_mdl_nm, 5000)
You can use the below file path in the project to test uploads:
datasets\test_datasets\SalesData1.xlsx