Apply for community grant: Personal project (gpu)

#1
by NeoPy - opened

This project is an easy-to-use voice conversion tool with a lot of features.

Project Features

  • Music separation (MDX-Net/Demucs)

  • Voice conversion (File conversion/Batch conversion/Conversion with Whisper/Text-to-speech conversion)

  • Apply effects to audio

  • Generate training data (From linked paths)

  • Model training (v1/v2, high-quality encoders)

  • Model fusion

  • Read model information

  • Export models to ONNX

  • Download from pre-existing model repositories

  • Search for models on the web

  • Pitch extraction

  • Support for audio conversion inference using ONNX models

  • ONNX RVC models also support indexing for inference

  • Multiple model options:

F0: pm, dio, mangio-crepe-tiny, mangio-crepe-small, mangio-crepe-medium, mangio-crepe-large, mangio-crepe-full, crepe-tiny, crepe-small, crepe-medium, crepe-large, crepe-full, fcpe, fcpe-legacy, rmvpe, rmvpe-legacy, harvest, yin, pyin, swipe

F0_ONNX: Some models are converted to ONNX to support accelerated extraction

F0_HYBRID: Multiple options can be combined, such as hybrid[rmvpe+harvest], or you can try combining all options together

EMBEDDERS: contentvec_base, hubert_base, japanese_hubert_base, korean_hubert_base, chinese_hubert_base, portuguese_hubert_base

EMBEDDERS_ONNX: All the above embedding models have ONNX versions pre-converted for accelerated embedding extraction

EMBEDDERS_TRANSFORMERS: All the above embedding models have versions pre-converted to the Hugging Face Transformers format for use as an alternative to Fairseq

SPIN_EMBEDDERS: A new embedding extraction model that may provide higher quality than the older extraction models
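
To illustrate the F0_HYBRID option listed above, here is a minimal sketch of how a spec like hybrid[rmvpe+harvest] could be parsed and how several pitch tracks could be merged. This is not the project's actual code: the function names `parse_hybrid` and `combine_f0` are hypothetical, and taking the per-frame median of the voiced estimates is only an assumption about how the combination might work.

```python
import re
from statistics import median

# Method names copied from the F0 list above; used only to validate the spec.
KNOWN_F0_METHODS = {
    "pm", "dio", "mangio-crepe-tiny", "mangio-crepe-small",
    "mangio-crepe-medium", "mangio-crepe-large", "mangio-crepe-full",
    "crepe-tiny", "crepe-small", "crepe-medium", "crepe-large", "crepe-full",
    "fcpe", "fcpe-legacy", "rmvpe", "rmvpe-legacy",
    "harvest", "yin", "pyin", "swipe",
}


def parse_hybrid(spec):
    """Split a spec such as 'hybrid[rmvpe+harvest]' into its method names.

    Hypothetical helper: the real project may parse this differently.
    """
    match = re.fullmatch(r"hybrid\[([^\[\]]+)\]", spec.strip())
    if match is None:
        raise ValueError(f"not a hybrid spec: {spec!r}")
    methods = [name.strip() for name in match.group(1).split("+")]
    unknown = [name for name in methods if name not in KNOWN_F0_METHODS]
    if unknown:
        raise ValueError(f"unknown F0 method(s): {unknown}")
    return methods


def combine_f0(tracks):
    """Merge several F0 tracks (Hz per frame, 0.0 = unvoiced) into one.

    Assumption: each frame takes the median of the voiced estimates and
    stays 0.0 when every method reports the frame as unvoiced.
    """
    if not tracks:
        raise ValueError("need at least one F0 track")
    if len({len(track) for track in tracks}) != 1:
        raise ValueError("all F0 tracks must have the same number of frames")
    combined = []
    for frame in zip(*tracks):
        voiced = [value for value in frame if value > 0]
        combined.append(median(voiced) if voiced else 0.0)
    return combined


if __name__ == "__main__":
    print(parse_hybrid("hybrid[rmvpe+harvest]"))
    # Two made-up three-frame tracks standing in for real extractor output.
    print(combine_f0([[100.0, 0.0, 220.0], [110.0, 0.0, 0.0]]))
```

With a median, a single outlier from one extractor has less influence on the merged track than it would with a plain average, which is one plausible reason to combine methods at all.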

NeoPy pinned discussion

If this activity is commercial, you should remove all code related to "audioldm2" and "Editing Soundtrack" from the project, because the model weights for this feature are under the cc-by-nc-sa-4.0 license, which only permits non-commercial use.
