Can't get any result
It errors out whenever I try to create anything, and when I duplicate the Space it just gets stuck in a loop. Here's the log:
===== Application Startup at 2024-07-06 02:29:32 =====
The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
0it [00:00, ?it/s]
0it [00:00, ?it/s]
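(That first block is just the one-time Transformers cache migration. If it ever gets interrupted, it can be resumed by hand, exactly as the message says:)

```python
# Resume the one-time Transformers cache migration manually (safe to re-run).
from transformers.utils import move_cache

move_cache()
```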
INFO:albumentations.check_version:A new version of Albumentations is available: 1.4.11 (you have 1.4.10). Upgrade using: pip install --upgrade albumentations
INFO:httpx:HTTP Request: GET https://checkip.amazonaws.com/ "HTTP/1.1 200 "
Running on local URL: http://0.0.0.0:7860
INFO:httpx:HTTP Request: GET http://localhost:7860/startup-events "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: HEAD http://localhost:7860/ "HTTP/1.1 200 OK"
To create a public link, set `share=True` in `launch()`.
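(The `share=True` hint is Gradio's standard notice and isn't needed inside a Space. For reference, a minimal sketch of what it refers to, assuming a plain `Interface` app rather than this Space's actual code:)

```python
# Hypothetical minimal Gradio app; share=True only matters when running locally.
import gradio as gr

demo = gr.Interface(fn=lambda text: text, inputs="text", outputs="text")
demo.launch(share=True)  # creates a temporary public gradio.live link
```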
INFO:httpx:HTTP Request: GET https://api.gradio.app/pkg-version "HTTP/1.1 200 OK"
The file /tmp/gradio/6606c4228a8df69159eb70eb91f4f37c83138ae9/DebK3_20201008-232244.jpeg is not in WebP format.
PORTRAIT PNG FILE: /tmp/gradio/6606c4228a8df69159eb70eb91f4f37c83138ae9/DebK3_20201008-232244.jpeg
Loaded as API: https://parler-tts-parler-tts-mini.hf.space ✔
INFO:httpx:HTTP Request: GET https://parler-tts-parler-tts-mini.hf.space/config "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://parler-tts-parler-tts-mini.hf.space/info?serialize=False "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://parler-tts-parler-tts-mini.hf.space/queue/join "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://parler-tts-parler-tts-mini.hf.space/heartbeat/6c8eca51-0530-407a-be29-08ae2d3969dc "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://parler-tts-parler-tts-mini.hf.space/queue/data?session_hash=6c8eca51-0530-407a-be29-08ae2d3969dc "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: GET https://parler-tts-parler-tts-mini.hf.space/file=/tmp/gradio/c3fc527ba6dfe20c1087dae9f8dffd16d1cdc673/audio.wav "HTTP/1.1 200 OK"
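(Up to here the TTS step seems fine: the app connects to the parler-tts Space as an API client and gets an audio.wav back. Roughly what that looks like with gradio_client; the endpoint name and arguments below are guesses, not taken from this Space's code:)

```python
# Sketch of the parler-tts call the log shows (endpoint/args are assumptions).
from gradio_client import Client

client = Client("https://parler-tts-parler-tts-mini.hf.space")
client.view_api()  # prints the real endpoint names and their parameters
# audio_path = client.predict("Hello world", "A calm female voice", api_name="/gen_tts")  # hypothetical endpoint
```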
/tmp/gradio/da8f35257e7c6d535eacdf48ae8239929a56417f/audio.wav
1 second of silence added to /tmp/gradio/da8f35257e7c6d535eacdf48ae8239929a56417f/audio.wav
WARNING:py.warnings:/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:69: UserWarning: Specified provider 'CUDAExecutionProvider' is not in available provider names.Available providers: 'AzureExecutionProvider, CPUExecutionProvider'
warnings.warn(
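(This warning is why none of the ONNX models run on GPU: only the plain onnxruntime wheel is installed, so just the CPU provider is available. Quick way to check; the usual fix is installing onnxruntime-gpu instead, which is a general note rather than something read from this Space's requirements:)

```python
# Check which execution providers ONNX Runtime actually exposes.
import onnxruntime as ort

print(ort.get_available_providers())
# ['AzureExecutionProvider', 'CPUExecutionProvider'] -> no CUDA provider here;
# onnxruntime-gpu would add CUDAExecutionProvider
```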
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis/models/1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis/models/2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis/models/genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis/models/glintr100.onnx recognition ['None', 3, 112, 112] 127.5 127.5
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./pretrained_models/face_analysis/models/scrfd_10g_bnkps.onnx detection [1, 3, '?', '?'] 127.5 128.0
set det-size: (640, 640)
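(The "find model" lines are insightface loading its face-analysis models, falling back to CPU. A minimal sketch of the equivalent setup, assuming the standard insightface API; the name/root arguments are guesses to roughly match the path in the log:)

```python
# Sketch of the insightface face-analysis setup the log implies
# (model location is a guess; det_size matches the log).
from insightface.app import FaceAnalysis

app = FaceAnalysis(root="./pretrained_models/face_analysis",
                   providers=["CPUExecutionProvider"])
app.prepare(ctx_id=0, det_size=(640, 640))
```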
WARNING:py.warnings:/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/insightface/utils/transform.py:68: FutureWarning: `rcond` parameter will change to the default of machine precision times ``max(M, N)`` where M and N are the input matrix dimensions.
To use the future default and silence this warning we advise to pass `rcond=None`, to keep using the old, explicitly pass `rcond=-1`.
  P = np.linalg.lstsq(X_homo, Y)[0].T  # Affine matrix. 3 x 4
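(Harmless FutureWarning from insightface; NumPy just wants `rcond` passed explicitly. For reference, with stand-in arrays in place of insightface's landmark matrices:)

```python
# Passing rcond explicitly silences the FutureWarning.
import numpy as np

X_homo = np.random.rand(5, 4)  # stand-ins for insightface's homogeneous landmark matrices
Y = np.random.rand(5, 3)
P = np.linalg.lstsq(X_homo, Y, rcond=None)[0].T  # affine matrix, 3 x 4
```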
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1720233245.849194 1026 task_runner.cc:85] GPU suport is not available: INTERNAL: ; RET_CHECK failure (mediapipe/gpu/gl_context_egl.cc:77) display != EGL_NO_DISPLAYeglGetDisplay() returned error 0x300c
W0000 00:00:1720233245.849921 1026 face_landmarker_graph.cc:174] Sets FaceBlendshapesGraph acceleration to xnnpack by default.
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
W0000 00:00:1720233245.857058 1357 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
W0000 00:00:1720233245.867301 1357 inference_feedback_manager.cc:114] Feedback manager requires a model with a single signature inference. Disabling support for feedback tensors.
WARNING:py.warnings:/home/user/.pyenv/versions/3.10.14/lib/python3.10/site-packages/google/protobuf/symbol_database.py:55: UserWarning: SymbolDatabase.GetPrototype() is deprecated. Please use message_factory.GetMessageClass() instead. SymbolDatabase.GetPrototype() will be removed soon.
warnings.warn('SymbolDatabase.GetPrototype() is deprecated. Please '
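(Another harmless deprecation, this time from protobuf via mediapipe; nothing in this Space needs to change for it. For anyone hitting it in their own code, the replacement API looks roughly like this sketch, using a well-known message type as an example:)

```python
# protobuf >= 4.21: build message classes via message_factory
# instead of SymbolDatabase.GetPrototype.
from google.protobuf import message_factory
from google.protobuf.struct_pb2 import Struct

cls = message_factory.GetMessageClass(Struct.DESCRIPTOR)
print(cls)  # the generated Struct message class
```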
Processed and saved: ./.cache/DebK3_20201008-232244_sep_background.png
Processed and saved: ./.cache/DebK3_20201008-232244_sep_face.png
Some weights of Wav2VecModel were not initialized from the model checkpoint at ./pretrained_models/wav2vec/wav2vec2-base-960h and are newly initialized: ['wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original0', 'wav2vec2.encoder.pos_conv_embed.conv.parametrizations.weight.original1', 'wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
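(The "newly initialized" Wav2Vec weights are mostly the weight-norm parametrization keys, which newer torch/transformers versions store under different names, plus masked_spec_embed, which isn't needed for feature extraction, so this looks expected rather than a real error. Loading is roughly the following, assuming the stock transformers class rather than hallo's own wrapper:)

```python
# Sketch: loading the wav2vec checkpoint the log refers to (path taken from the log).
from transformers import Wav2Vec2Model

wav2vec = Wav2Vec2Model.from_pretrained("./pretrained_models/wav2vec/wav2vec2-base-960h")
wav2vec.eval()  # used only as a frozen audio feature extractor
```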
INFO:audio_separator.separator.separator:Separator version 0.17.2 instantiating with output_dir: ./.cache/audio_preprocess, output_format: WAV
INFO:audio_separator.separator.separator:Operating System: Linux #1 SMP Wed Sep 6 21:15:41 UTC 2023
INFO:audio_separator.separator.separator:System: Linux Node: r-johnblues-tts-hallo-talking-portrait-9dly9t49-f4bd6-uyncj Release: 5.10.192-183.736.amzn2.x86_64 Machine: x86_64 Proc: x86_64
INFO:audio_separator.separator.separator:Python Version: 3.10.14
INFO:audio_separator.separator.separator:PyTorch Version: 2.2.2+cu121
INFO:audio_separator.separator.separator:FFmpeg installed: ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
INFO:audio_separator.separator.separator:ONNX Runtime CPU package installed with version: 1.18.0
INFO:audio_separator.separator.separator:CUDA is available in Torch, setting Torch device to CUDA
WARNING:audio_separator.separator.separator:CUDAExecutionProvider not available in ONNXruntime, so acceleration will NOT be enabled
INFO:audio_separator.separator.separator:Loading model Kim_Vocal_2.onnx...
INFO:audio_separator.separator.separator:Load model duration: 00:00:00
INFO:audio_separator.separator.separator:Starting separation process for audio_file_path: /tmp/gradio/da8f35257e7c6d535eacdf48ae8239929a56417f/audio.wav
INFO:audio_separator.separator.separator:Saving Vocals stem to audio_(Vocals)_Kim_Vocal_2.wav...
INFO:audio_separator.separator.separator:Clearing input audio file paths, sources and stems...
INFO:audio_separator.separator.separator:Separation duration: 00:00:37
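(The vocal-separation step also completes, on CPU, in about 37 s. Roughly what it does with python-audio-separator; the method names below are assumptions based on that package's documented API, not code from this Space:)

```python
# Sketch of the separation step the log shows (model/output settings from the log).
from audio_separator.separator import Separator

separator = Separator(output_dir="./.cache/audio_preprocess", output_format="WAV")
separator.load_model("Kim_Vocal_2.onnx")
output_files = separator.separate("/tmp/gradio/da8f35257e7c6d535eacdf48ae8239929a56417f/audio.wav")
print(output_files)  # e.g. ['audio_(Vocals)_Kim_Vocal_2.wav', ...]
```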
The config attributes {'center_input_sample': False, 'out_channels': 4} were passed to UNet2DConditionModel, but are not expected and will be ignored. Please verify your config.json configuration file.
Some weights of the model checkpoint were not used when initializing UNet2DConditionModel:
['conv_norm_out.bias, conv_norm_out.weight, conv_out.bias, conv_out.weight']
INFO:hallo.models.unet_3d:loaded temporal unet's pretrained weights from pretrained_models/stable-diffusion-v1-5/unet ...
The config attributes {'center_input_sample': False} were passed to UNet3DConditionModel, but are not expected and will be ignored. Please verify your config.json configuration file.
Load motion module params from pretrained_models/motion_module/mm_sd_v15_v2.ckpt
INFO:hallo.models.unet_3d:Loaded 453.20928M-parameter motion module
loaded weight from ./pretrained_models/hallo/net.pth