MUSTAR / Rigel-rvc-base-pretrained-model



Rigel Pretrained Model

Base and fine-tuned models

Dataset

Size: 1,921 hours of speech and vocals in total.
Languages:
- Arabic: ~70 hours
- Chinese (Mandarin): ~70 hours
- English: ~800 hours
- French: ~42 hours
- German: ~35 hours
- Hindi: ~30 hours
- Indonesian: ~53 hours
- Japanese: ~140 hours
- Korean: ~80 hours
- Portuguese: ~40 hours
- Russian: ~188 hours
- Singing (all languages): ~190 hours
- Spanish: ~200 hours
- Tagalog: ~30 hours
- Common language: Unknown amount
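
As a rough check on the language mix, the approximate figures above can be summed and compared with the stated total. The following is a minimal sketch, not part of the original card: the dictionary simply copies the listed values, the "Common language" entry is left out because its amount is unknown, and the names `hours`, `STATED_TOTAL`, and `summarize` are illustrative.

```python
# Approximate hours per category, copied from the list above.
# "Common language" is omitted because its amount is unknown.
hours = {
    "Arabic": 70,
    "Chinese (Mandarin)": 70,
    "English": 800,
    "French": 42,
    "German": 35,
    "Hindi": 30,
    "Indonesian": 53,
    "Japanese": 140,
    "Korean": 80,
    "Portuguese": 40,
    "Russian": 188,
    "Singing (all languages)": 190,
    "Spanish": 200,
    "Tagalog": 30,
}

STATED_TOTAL = 1921  # total hours stated under "Size"


def summarize(hours_by_category, stated_total):
    """Return the listed total, its gap to the stated total, and per-category shares."""
    listed = sum(hours_by_category.values())
    shares = {name: h / listed for name, h in hours_by_category.items()}
    return listed, listed - stated_total, shares


listed, gap, shares = summarize(hours, STATED_TOTAL)
print(f"listed total: ~{listed} h (stated: {STATED_TOTAL} h, gap: {gap:+d} h)")

# Show the three largest categories by share of the listed hours.
for name, share in sorted(shares.items(), key=lambda kv: -kv[1])[:3]:
    print(f"{name}: {share:.1%}")
```

The listed approximations sum to about 1,968 hours, somewhat above the stated 1,921. This may reflect rounding in the per-language estimates or overlap between categories such as singing and the individual languages. English accounts for roughly 41% of the listed hours.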

Sampling Frequency

32 kHz (Done)
40 kHz (Retraining)

Models

Base Model

Data: 1,921 hours in total, low to mid quality.
Steps: 3,890,220
Batch: 40
Precision: FP32
Sampling Rate: 32 kHz

Fine-Tuned Model

Data: 102 hours of high-quality data.
Steps: 2,854,856
Batch: 20
Precision: FP32
Sampling Rate: 32 kHz
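
To put the step counts in perspective, the number of training samples processed can be estimated as steps times batch size. This is a minimal sketch under two assumptions that the card does not confirm: that "Steps" counts optimizer steps and that "Batch" is the number of samples per step (not a per-GPU figure). The function name `samples_processed` is illustrative.

```python
def samples_processed(steps: int, batch_size: int) -> int:
    """Estimate samples seen in training, assuming one batch per step."""
    return steps * batch_size


# (steps, batch size) as listed for each model above.
runs = {
    "base": (3_890_220, 40),
    "fine-tuned": (2_854_856, 20),
}

totals = {name: samples_processed(*cfg) for name, cfg in runs.items()}

for name, total in totals.items():
    print(f"{name}: {total:,} samples processed")
```

Under these assumptions the base run processed about 155.6 million samples and the fine-tuning run about 57.1 million. Converting these counts to hours of audio or to epochs would require the training segment length, which the card does not state.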

Hardware Used

CPU: AMD EPYC 9754
RAM: 256GB
GPUs:
- 1 x H100
- 4 x L40s
- 1 x RTX 4080
- 1 x RTX 4070 Ti

Expected Release Date

July 22nd
