+---
+license: mit
+---
+# ClassicVC
+ClassicVC is an any-to-any voice conversion model that lets users design their own speaker styles
+by selecting coordinates in continuous latent spaces.
+The model components are implemented in PyTorch and are fully compatible with ONNX.
+[MMCXLI](https://github.com/lyodos/mmcxli) provides a dedicated graphical user interface (GUI) for ClassicVC.
+It runs on wxPython and ONNX Runtime.
+Users can download the ONNX files and try out speech conversion
+without having to install PyTorch or train a model with their own voice data.
+## Model Details
+### Model Description
+- **Developed by:** Lyodos (Lyodos the City of the Museum)
+### Model Sources
+- **Repository:** [GitHub](https://github.com/lyodos/classic-vc)
+----
+## Uses
+Under the MIT License, users may use the model code and checkpoints for research purposes.
+They are provided with no guarantees.
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Out-of-Scope Use
+This model was prototyped as a hobbyist's research project on any-to-any voice conversion,
+and we make no guarantees, particularly regarding its reliability or real-time operation.
+We do not encourage its use in situations involving an unspecified number of people, such as web broadcasting,
+or in mission-critical applications, including medical, transportation, infrastructure, and weapon systems.
+As the developer, we do not prohibit such use, since the MIT License is the only stated license.
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+[Notebook 01 of the ClassicVC repository](https://github.com/lyodos/classic-vc) provides the procedure for offline (non-real-time) voice conversion.
+[The MMCXLI repository](https://github.com/lyodos/mmcxli) provides the GUI, which requires a local Python environment.
+----
+## Training Details
+### Training Data
+The model checkpoints provided here were trained on the following three datasets.
+1. LibriSpeech ASR corpus
+   * V. Panayotov, G. Chen, D. Povey and S. Khudanpur, "Librispeech: An ASR corpus based on public domain audio books," 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), South Brisbane, QLD, Australia, 2015, pp. 5206-5210, doi: 10.1109/ICASSP.2015.7178964.
+   * https://ieeexplore.ieee.org/document/7178964
+   * https://openslr.org/12/
+2. Samrómur Children 21.09
+   * Mena, Carlos; et al., 2021, Samromur Children 21.09, CLARIN-IS, http://hdl.handle.net/20.500.12537/185.
+   * https://repository.clarin.is/repository/xmlui/handle/20.500.12537/185
+   * https://openslr.org/117/
+3. VoxCeleb 1 and 2
+   * A. Nagrani*, J. S. Chung*, A. Zisserman, "VoxCeleb: a large-scale speaker identification dataset", Interspeech 2017
+   * J. S. Chung*, A. Nagrani*, A. Zisserman, "VoxCeleb2: Deep Speaker Recognition", Interspeech 2018
+   * A. Nagrani*, J. S. Chung*, W. Xie, A. Zisserman, "VoxCeleb: Large-scale speaker verification in the wild", Computer Speech and Language, 2019
+   * https://huggingface.co/datasets/ProgramComputer/voxceleb/tree/main/vox2
+### Training Procedure
+[Notebook 02 of the ClassicVC repository](https://github.com/lyodos/classic-vc) provides the data preparation procedure.
+[Notebook 03 of the ClassicVC repository](https://github.com/lyodos/classic-vc) provides the training code.