Small notes on next steps #746

Merged · 1 commit · Oct 27, 2024
2 changes: 2 additions & 0 deletions recipes/quickstart/NotebookLlama/README.md
@@ -23,6 +23,8 @@ Note 1: In Step 1, we prompt the 1B model to not modify the text or summarize it

Note 2: For Step 2, you can also use the `Llama-3.1-8B-Instruct` model; we recommend experimenting to see if you notice any differences. The 70B model was used here because it gave slightly more creative podcast transcripts for the tested examples.

Note 3: For Step 4, please try to extend the approach with other models. These models were chosen based on a sample prompt and worked best; newer models might sound better. Please see [Notes](./TTS_Notes.md) for some of the sample tests.

### Detailed steps on running the notebook:

Requirements: GPU server or an API provider for using 70B, 8B and 1B Llama models.
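The README's four-step flow (clean the text, draft a transcript, rewrite it, then generate audio) can be sketched as plain functions. This is an illustrative skeleton, not code from the PR: the function names are hypothetical, and each stub stands in for a model call made in the real notebooks.

```python
# Illustrative skeleton of the four-step NotebookLlama flow; function names
# are hypothetical, and each stub stands in for a model call.
def preprocess(raw: str) -> str:
    # Step 1: a 1B Llama model cleans the PDF text without summarizing it
    return " ".join(raw.split())

def write_transcript(text: str) -> str:
    # Step 2: a 70B (or 8B) Llama model drafts the podcast transcript
    return f"Speaker 1: Today we are covering: {text}"

def rewrite(transcript: str) -> str:
    # Step 3: Llama-3.1-8B-Instruct makes the transcript more dramatic
    return transcript.replace("covering", "diving deep into")

def to_audio(transcript: str) -> bytes:
    # Step 4: bark / parler-tts turn the transcript lines into audio
    return transcript.encode("utf-8")

audio = to_audio(rewrite(write_transcript(preprocess("  LLM   agents  "))))
print(len(audio) > 0)  # → True
```

Each notebook in the recipe implements one of these stages and passes its output to the next.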
@@ -2696,6 +2696,16 @@
"print(processed_text[-1000:])"
]
},
{
"cell_type": "markdown",
"id": "3d996ac5",
"metadata": {},
"source": [
"### Next Notebook: Transcript Writer\n",
"\n",
"Now that we have the pre-processed text ready, we can move on to converting it into a transcript in the next notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
10 changes: 10 additions & 0 deletions recipes/quickstart/NotebookLlama/Step-2-Transcript-Writer.ipynb
@@ -302,6 +302,16 @@
" pickle.dump(save_string_pkl, file)"
]
},
{
"cell_type": "markdown",
"id": "dbae9411",
"metadata": {},
"source": [
"### Next Notebook: Transcript Re-writer\n",
"\n",
"We now have a working transcript, but we can try making it more dramatic and natural. In the next notebook, we will use the `Llama-3.1-8B-Instruct` model to do so."
]
},
{
"cell_type": "code",
"execution_count": null,
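The `pickle.dump(save_string_pkl, file)` call shown above is how each notebook hands its output to the next one. A minimal sketch of that round trip, with a made-up file name and transcript text for illustration:

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the transcript string produced in Step 2
transcript = "Speaker 1: Welcome to the show!\nSpeaker 2: Glad to be here."

# Dump in one notebook...
path = os.path.join(tempfile.gettempdir(), "transcript_demo.pkl")
with open(path, "wb") as file:
    pickle.dump(transcript, file)

# ...and load it back at the start of the next notebook
with open(path, "rb") as file:
    loaded = pickle.load(file)

print(loaded == transcript)  # → True: the round trip preserves the text exactly
```

Pickling keeps the hand-off simple, though it only works when the notebooks run against the same filesystem.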
10 changes: 10 additions & 0 deletions recipes/quickstart/NotebookLlama/Step-3-Re-Writer.ipynb
@@ -253,6 +253,16 @@
" pickle.dump(save_string_pkl, file)"
]
},
{
"cell_type": "markdown",
"id": "2dccf336",
"metadata": {},
"source": [
"### Next Notebook: TTS Workflow\n",
"\n",
"Now that we have our transcript ready, we are ready to generate the audio in the next notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
25 changes: 18 additions & 7 deletions recipes/quickstart/NotebookLlama/Step-4-TTS-Workflow.ipynb
@@ -11,7 +11,9 @@
"\n",
"In this notebook, we will first learn how to generate audio using both the `suno/bark` and `parler-tts/parler-tts-mini-v1` models. \n",
"\n",
"After that, we will use the output from Notebook 3 to generate our complete podcast"
"After that, we will use the output from Notebook 3 to generate our complete podcast.\n",
"\n",
"Note: Please feel free to extend this notebook with newer models. The above two were chosen after some tests using a sample prompt."
]
},
{
@@ -117,11 +119,7 @@
"id": "50b62df5-5ea3-4913-832a-da59f7cf8de2",
"metadata": {},
"source": [
"Generally in life, you set your device to \"cuda\" and are happy. \n",
"\n",
"However, sometimes you want to compensate for things and set it to `cuda:7` to tell the system but even more-so the world that you have 8 GPUS.\n",
"\n",
"Jokes aside please set `device = \"cuda\"` below if you're using a single GPU node."
"Please set `device = \"cuda\"` below if you're using a single GPU node."
]
},
{
@@ -161,7 +159,7 @@
],
"source": [
"# Set up device\n",
"device = \"cuda:7\" if torch.cuda.is_available() else \"cpu\"\n",
"device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
"\n",
"# Load model and tokenizer\n",
"model = ParlerTTSForConditionalGeneration.from_pretrained(\"parler-tts/parler-tts-mini-v1\").to(device)\n",
@@ -639,6 +637,19 @@
" parameters=[\"-q:a\", \"0\"])"
]
},
{
"cell_type": "markdown",
"id": "c7ce5836",
"metadata": {},
"source": [
"### Suggested Next Steps:\n",
"\n",
"- Experiment with the prompts: please feel free to adjust the SYSTEM_PROMPT in the notebooks\n",
"- Extend the workflow beyond two speakers\n",
"- Test other TTS models\n",
"- Experiment with speech-enhancer models as a Step 5."
]
},
{
"cell_type": "code",
"execution_count": null,
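As a rough sketch of the final stitching step in the TTS workflow: each TTS call yields a 1-D audio array at a fixed sampling rate, and the per-line segments are concatenated with a short silence between speakers. The arrays below are zero-filled stand-ins for real generated speech, and the 24 kHz rate is an assumption for illustration only.

```python
import numpy as np

rate = 24_000  # assumed sampling rate, for illustration
# Stand-ins for two generated speech segments (real ones come from the TTS models)
segments = [np.zeros(rate // 2), np.zeros(rate // 4)]
pause = np.zeros(rate // 10)  # 100 ms of silence between speakers

pieces = []
for seg in segments:
    pieces.append(seg)
    pieces.append(pause)
podcast = np.concatenate(pieces[:-1])  # drop the trailing pause

print(len(podcast))  # → 20400 (12000 + 2400 + 6000 samples)
```

The notebook itself uses pydub segments rather than raw arrays, but the idea is the same: generate per-line audio, then join the pieces into one track before exporting.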