Read this in other languages: English, 简体中文
Petto is an intelligent desktop assistant based on Live2DViewerEX. It supports streaming speech recognition, natural language, and voice conversations.
Use it to decorate your desktop pet!
At regular intervals, the desktop pet will:
- Output famous quotes
- Output greetings based on current weather, season, and the window you are visiting using a large language model
- Output simple greetings based on the current time
Additionally, Petto supports streaming speech recognition, TTS speech-to-text, background wake-up, and more, allowing you to interact with your desktop pet via voice!
- English
- Simplified Chinese
- Download the Petto release package
- Extract it to any location on your computer
- Ensure your Live2DViewerEX is running, then open the extracted
petto.exe
- Done! Try talking to your desktop pet!
You can configure Petto's features in detail in the settings in the top right corner of the main interface.
Below are explanations for some settings:
Petto supports language model APIs compatible with OpenAI usage.
Petto is pre-configured with a public language model: https://api.cups.moe/api/chat, which is deployed based on the Duck2api project.
However, due to restrictions from Duckduckgo, the model may temporarily be unresponsive if there are too many requests in a short period. Consider using your own API or deploying the language model locally.
You can set the character's name, background, calling, and other information.
Write character settings that match your expectations based on the desktop pet character you are using!
The LLM responses will be influenced by the example messages.
This setting is used to provide references to the model. If you find the LLM responses do not meet your expectations, you can try writing a simple example message and pass it to the LLM.
Keep the default value unless there are special circumstances.
Required field, used for communication with Live2DViewerEX.
Required field, used to specify the Live2D model number Petto uses. The value is (the model number displayed in Live2DViewerEX - 1). For example, in the case shown below, the selected model's number is 0.
These commands can be used to start the language model and speech recognition model locally.
Petto provides reference scripts startmodel.ps1
and startserver.ps1
to start the RWKV model and MASR-based speech recognition model locally. More detailed information will be provided later.
Petto allows these two scripts to output a PID for managing process. When Petto exits, it will automatically kill the process to avoid resource occupation.
If the recognition address is left blank, the Whisper recognition mode will be used.
The background recognition service will always keep recording and send the speech content back to the set recognition address. Please be aware of potential privacy and security issues.
Currently, streaming recognition must use an interface compatible with the MASR server-side recognition project.
The project is pre-configured with a public streaming recognition: wss://api.cups.moe/api/asr/
The server performance is average, so please use it gently :) If used too heavily, my server might crash.
It is better to refer to the tutorial later in the document to deploy the MASR service yourself or use the Whisper mode.
After enabling background streaming recognition, Petto will always run the streaming recognition function in the background. When any speech containing the background wake-up keyword is detected, the desktop pet will send you a message and prompt you to talk to it:
User: Help me
Desktop Pet: Master, what do you need help with? Please tell me~
Then, Petto will automatically start a ten-second recording recognition, allowing you to interact with the desktop pet. After the recording recognition ends, the desktop pet will respond.
Unlike streaming recognition, Whisper recognition mode must complete the recording before obtaining the final text, which mainly affects the speed of background recognition.
Petto supports Whisper APIs compatible with OpenAI usage.
Keep the default value unless there are special circumstances.
Fill in the API address for requesting Hitokoto.
Petto supports TTS APIs compatible with OpenAI usage.
Although TTS information is not filled in by default, we actually provide a ready-to-use TTS service:
TTS Address: https://api.cups.moe/api/tts/
TTS Key: ecWdn$TJ&ktP#89
This service is deployed based on openai-edge-tts
Set character action groups. The character will automatically trigger actions each time a task is triggered.
In Live2DViewerEX, select the model you are using, click the custom button at the top right, and you will see a series of action groups.
- (Optional) In the settings, check "Hide window on startup"
- Press Win+R, type
shell:startup
. Create a shortcut forpetto.exe
and place it in this folder
Please check the following two files:
data\flutter_assets\scripts\startmodel.ps1
data\flutter_assets\scripts\startserver.ps1
They correspond to the scripts for starting the RWKV model and the local streaming speech recognition service, respectively.
You can also modify the contents of
startmodel.ps1
to start your own model.
To start the RWKV model, you need to:
- Download RWKV Runner and configure the environment as instructed.
- Then, download the RWKV model and place it in the
models/
directory under the RWKV Runner directory. - Adjust the contents of
startmodel.ps1
: follow thecd
command with the path to the RWKV Runner directory, and changeRWKV-x060-World-3B-v2.1-20240417-ctx4096.pth
to the actual downloaded model name. - In Petto settings, remove the
#
at the beginning of the "Pre-execution LLM Command" and then restart Petto.
The model provided below is trained only on Chinese corpus. You can manually train models that support more languages and have higher accuracy, or use pre-trained models or Whisper interfaces for speech recognition.
To start the local streaming speech recognition service, go to the data\flutter_assets\speech\models
directory and:
- Create a directory named
conformer_streaming_fbank
and download inference.pt into it. - Go to the
pun_models
directory and download model.pdiparams into it. - In Petto settings, remove the
#
at the beginning of the "Pre-execution ASR Command" and then restart Petto.
- Support more languages
- MacOS and Linux support
- Add voice authentication
- Optimize UI