import os

# Set environment variables before importing torch so they take effect.
os.environ["SUNO_USE_SMALL_MODELS"] = "True"
os.environ["SUNO_OFFLOAD_CPU"] = "True"
# Restrict this process to GPUs 0 and 1 (this has to go through os.environ).
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

import torch
from TTS.api import TTS

# Load the model to GPU
# Bark is really slow on CPU, so we recommend using GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
tts = TTS("tts_models/multilingual/multi-dataset/bark").to(device)
# Cloning a new speaker
# This expects to find an mp3 or wav file like `bark_voices/new_speaker/speaker.wav`
# It computes the cloning values and stores them in `bark_voices/new_speaker/speaker.npz`
tts.tts_to_file(text="我家的后面有一个很大的园,相传叫作百草园。现在是早已并屋子一起卖给朱文公的子孙了,连那最末次的相见也已经隔了七八年,其中似乎确凿只有一些野草;但那时却是我的乐园。",
                file_path="output.wav",
                voice_dir="videos/bark_voices",
                speaker="new_speaker")
result:
> tts_models/multilingual/multi-dataset/bark is already downloaded.
TTS.tts.configs bark_config
TTS.vocoder.configs bark_config
TTS.encoder.configs bark_config
TTS.vc.configs bark_config
> Using model: bark
/home/chatglm/miniconda3/envs/VideoReTalking/lib/python3.8/site-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
> Text splitted to sentences.
['我家的后面有一个很大的园,相传叫作百草园。', '现在是早已并屋子一起卖给朱文公的子孙了,连那最末次的相见也已经隔了七八年,其中似乎确凿只有一些野草;但那时却是我的乐园。']
Some weights of the model checkpoint at facebook/hubert-base-ls960 were not used when initializing HubertModel: ['encoder.pos_conv_embed.conv.weight_g', 'encoder.pos_conv_embed.conv.weight_v']
- This IS expected if you are initializing HubertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing HubertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of HubertModel were not initialized from the model checkpoint at facebook/hubert-base-ls960 and are newly initialized: ['encoder.pos_conv_embed.conv.parametrizations.weight.original0', 'encoder.pos_conv_embed.conv.parametrizations.weight.original1']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
File "bark_test.py", line 34, in <module>
tts.tts_to_file(text="我家的后面有一个很大的园,相传叫作百草园。现在是早已并屋子一起卖给朱文公的子孙了,连那最末次的相见也已经隔了七八年,其中似乎确凿只有一些野草;但那时却是我的乐园。",
File "/home/chatglm/ppt_and_human/TTS/api.py", line 403, in tts_to_file
wav = self.tts(text=text, speaker=speaker, language=language, speaker_wav=speaker_wav, **kwargs)
File "/home/chatglm/ppt_and_human/TTS/api.py", line 341, in tts
wav = self.synthesizer.tts(
File "/home/chatglm/ppt_and_human/TTS/utils/synthesizer.py", line 374, in tts
outputs = self.tts_model.synthesize(
File "/home/chatglm/ppt_and_human/TTS/tts/models/bark.py", line 219, in synthesize
history_prompt = load_voice(self, speaker_id, voice_dirs)
File "/home/chatglm/ppt_and_human/TTS/tts/layers/bark/inference_funcs.py", line 81, in load_voice
generate_voice(audio=audio_path, model=model, output_path=output_path)
File "/home/chatglm/ppt_and_human/TTS/tts/layers/bark/inference_funcs.py", line 145, in generate_voice
semantic_vectors = hubert_model.forward(audio[0], input_sample_hz=model.config.sample_rate)
File "/home/chatglm/miniconda3/envs/VideoReTalking/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/chatglm/ppt_and_human/TTS/tts/layers/bark/hubert/kmeans_hubert.py", line 71, in forward
outputs = self.model.forward(
File "/home/chatglm/miniconda3/envs/VideoReTalking/lib/python3.8/site-packages/transformers/models/hubert/modeling_hubert.py", line 1091, in forward
encoder_outputs = self.encoder(
File "/home/chatglm/miniconda3/envs/VideoReTalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/chatglm/miniconda3/envs/VideoReTalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/chatglm/miniconda3/envs/VideoReTalking/lib/python3.8/site-packages/transformers/models/hubert/modeling_hubert.py", line 738, in forward
layer_outputs = layer(
File "/home/chatglm/miniconda3/envs/VideoReTalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/chatglm/miniconda3/envs/VideoReTalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/chatglm/miniconda3/envs/VideoReTalking/lib/python3.8/site-packages/transformers/models/hubert/modeling_hubert.py", line 589, in forward
hidden_states, attn_weights, _ = self.attention(
File "/home/chatglm/miniconda3/envs/VideoReTalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/chatglm/miniconda3/envs/VideoReTalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/chatglm/miniconda3/envs/VideoReTalking/lib/python3.8/site-packages/transformers/models/hubert/modeling_hubert.py", line 488, in forward
attn_weights = torch.bmm(query_states, key_states.transpose(1, 2))
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 64.72 GiB. GPU 0 has a total capacty of 23.48 GiB of which 9.36 GiB is free. Process 221492 has 2.96 GiB memory in use. Process 271045 has 720.00 MiB memory in use. Including non-PyTorch memory, this process has 10.42 GiB memory in use. Of the allocated memory 5.45 GiB is allocated by PyTorch, and 4.69 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
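For scale, the failing line is the attention-score `torch.bmm` inside HuBERT, whose output grows with the square of the number of audio frames. The sketch below is a rough back-of-the-envelope estimate, not a measurement. It assumes a batch of one clip, float32 scores, 12 attention heads, and about 50 HuBERT frames per second of 16 kHz audio; these constants are assumptions about `facebook/hubert-base-ls960`, not values taken from the log. The function names are made up for illustration.

```python
import math

# Assumptions (not read from the log above):
HUBERT_FRAMES_PER_SECOND = 50  # 16 kHz input with a 320-sample stride
HUBERT_BASE_HEADS = 12         # attention heads in the base HuBERT model
BYTES_PER_FLOAT32 = 4
GIB = 1024 ** 3


def attention_bytes(duration_s, heads=HUBERT_BASE_HEADS,
                    fps=HUBERT_FRAMES_PER_SECOND):
    """Bytes needed for one (heads, T, T) float32 attention-score tensor."""
    frames = int(duration_s * fps)
    return heads * frames * frames * BYTES_PER_FLOAT32


def duration_for_bytes(num_bytes, heads=HUBERT_BASE_HEADS,
                       fps=HUBERT_FRAMES_PER_SECOND):
    """Invert attention_bytes: clip length (seconds) that needs num_bytes."""
    frames = math.sqrt(num_bytes / (heads * BYTES_PER_FLOAT32))
    return frames / fps


if __name__ == "__main__":
    seconds = duration_for_bytes(64.72 * GIB)
    print(f"64.72 GiB corresponds to roughly {seconds / 60:.1f} minutes of audio")
    print(f"a 10 s clip needs about {attention_bytes(10) / 1024 ** 2:.1f} MiB")
```

Under these assumptions, a 64.72 GiB request corresponds to a reference clip on the order of twelve minutes, while a clip of a few seconds needs only a few megabytes for the same tensor. If the speaker file really is that long, the size of the allocation follows from the clip length rather than from the GPU settings.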
The model was downloaded automatically the first time I ran the Coqui TTS program. The amount of GPU memory it tries to allocate (64.72 GiB) seems unreasonable. Is there a problem with my settings? Can you tell me what's wrong, please?
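One thing that may be worth checking is the length of the reference recording, since self-attention memory grows with the square of the clip length. Below is a minimal sketch that cuts a WAV file down to its first few seconds using only the standard-library `wave` module. The helper name `trim_wav`, the 10-second default, and the output file name are hypothetical choices for illustration. It only handles uncompressed PCM WAV files, so an mp3 would need converting first.

```python
import wave


def trim_wav(src_path, dst_path, max_seconds=10.0):
    """Copy at most max_seconds of audio from src_path into dst_path.

    Returns the number of frames written. Works on uncompressed PCM WAV only.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames_to_keep = min(src.getnframes(),
                             int(max_seconds * src.getframerate()))
        audio = src.readframes(frames_to_keep)

    with wave.open(dst_path, "wb") as dst:
        # The frame count copied from the source header is corrected
        # automatically when the file is closed.
        dst.setparams(params)
        dst.writeframes(audio)

    return frames_to_keep
```

A possible use would be `trim_wav("videos/bark_voices/new_speaker/speaker.wav", "speaker_short.wav")`, then pointing the speaker directory at the shortened file. Writing to a separate path keeps the original recording intact.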