---
license: apache-2.0
language:
- en
base_model:
- parler-tts/parler-tts-mini-v1
pipeline_tag: text-to-speech
---

# Parler TTS Mini v1 GGUF

This repository contains the GGUF files for Parler TTS Mini v1.
You can run this model by using [TTS.cpp](https://github.com/mmwillet/TTS.cpp).
**Note**: TTS.cpp currently compiles only on macOS with Metal support.

```shell
./cli --model-path /model/path/to/gguf_file.gguf --prompt "I am saying some words" --save-path /tmp/test.wav
```

## Model Information

Model Checkpoint: [parler-tts/parler-tts-mini-v1](https://huggingface.co/parler-tts/parler-tts-mini-v1)

Voice Prompt: "female voice"

## Sample

Here is sample audio generated using the prompt: "This is speech generated by the TTS CPP library."

### FP32

<audio controls src="https://huggingface.co/ecyht2/parler-tts-mini-v1-GGUF/resolve/main/sample/fp32.wav"></audio>

### FP16

<audio controls src="https://huggingface.co/ecyht2/parler-tts-mini-v1-GGUF/resolve/main/sample/fp16.wav"></audio>

### Q4_0

<audio controls src="https://huggingface.co/ecyht2/parler-tts-mini-v1-GGUF/resolve/main/sample/Q4_0.wav"></audio>

### Q5_0

<audio controls src="https://huggingface.co/ecyht2/parler-tts-mini-v1-GGUF/resolve/main/sample/Q5_0.wav"></audio>

### Q8_0

<audio controls src="https://huggingface.co/ecyht2/parler-tts-mini-v1-GGUF/resolve/main/sample/Q8_0.wav"></audio>

### Generation Times

The samples were generated on CPU with an Intel i5-14400 (16 threads) @ 4.700 GHz.
The generation time for each file, as reported by `time`, is shown below.

```
Testing fp32
real    0m39.792s
user    5m52.071s
sys     0m7.977s

Testing fp16
real    0m53.419s
user    8m13.402s
sys     0m7.623s

Testing Q4_0
real    5m24.418s
user    50m57.182s
sys     0m27.437s

Testing Q5_0
real    0m44.292s
user    6m40.970s
sys     0m7.971s

Testing Q8_0
real    0m40.479s
user    5m58.898s
sys     0m9.100s
```
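
Timings like the ones above could be collected with a loop along the following lines. This is only a sketch: the GGUF file names (`fp32.gguf`, `Q4_0.gguf`, and so on), the `MODEL_DIR` and `OUT_DIR` locations, and the `bench_all` helper are assumptions made for illustration, not the names used in this repository. Only the `--model-path`, `--prompt`, and `--save-path` flags are taken from the command shown earlier.

```shell
#!/usr/bin/env bash
# Hypothetical benchmark loop; adjust paths and file names to your setup.
MODEL_DIR="${MODEL_DIR:-./models}"   # assumed directory holding the GGUF files
OUT_DIR="${OUT_DIR:-/tmp}"           # assumed directory for the generated WAV files
CLI="${CLI:-./cli}"                  # path to the TTS.cpp CLI binary
PROMPT="This is speech generated by the TTS CPP library."

bench_all() {
  for quant in fp32 fp16 Q4_0 Q5_0 Q8_0; do
    echo "Testing ${quant}"
    # `time` prints the real/user/sys durations to stderr.
    time "$CLI" --model-path "${MODEL_DIR}/${quant}.gguf" \
      --prompt "$PROMPT" \
      --save-path "${OUT_DIR}/${quant}.wav"
    echo
  done
}
```

After sourcing the script, calling `bench_all 2>&1 | tee timings.txt` would run every variant and keep both the labels and the `time` output in one file.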