---
license: mit
---
# Audio Feature Extraction Models
This repository contains pre-trained models for audio feature extraction:
- **Tempo Detection:** Estimates the tempo of an audio track in beats per minute (BPM).
- **Key Detection:** Classifies the musical key of an audio track into relative key classes.
## Model Details
### Tempo Model
- **Model Type:** Custom CNN architecture for tempo classification.
- **Input:** Audio segments converted to Mel spectrograms followed by autocorrelation.
- **Output:** Predicted tempo in beats per minute (BPM), within the range [85, 170].
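To illustrate why autocorrelation is a useful preprocessing step for tempo, here is a minimal pure-Python sketch on a toy energy envelope. It is not the model's actual preprocessing: the function names (`autocorrelate`, `estimate_bpm`), the frame rate, and the synthetic envelope are all assumptions made for illustration. Only the [85, 170] BPM range comes from the description above.

```python
import math


def autocorrelate(envelope):
    """Return the mean-centered autocorrelation of a 1-D sequence for lags 0..n-1."""
    n = len(envelope)
    mean = sum(envelope) / n
    centered = [x - mean for x in envelope]
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag))
        for lag in range(n)
    ]


def estimate_bpm(envelope, frame_rate, bpm_min=85.0, bpm_max=170.0):
    """Pick the strongest autocorrelation lag whose tempo lies in [bpm_min, bpm_max]."""
    acf = autocorrelate(envelope)
    # Convert the BPM bounds into a lag range measured in frames.
    min_lag = max(1, math.ceil(60.0 * frame_rate / bpm_max))
    max_lag = min(len(acf) - 1, math.floor(60.0 * frame_rate / bpm_min))
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: acf[lag])
    return 60.0 * frame_rate / best_lag


# Hypothetical input: one energy pulse every 50 frames at 100 frames per second,
# i.e. one beat every 0.5 s.
envelope = [1.0 if t % 50 == 0 else 0.0 for t in range(400)]
bpm = estimate_bpm(envelope, frame_rate=100)
print(bpm)  # 120.0
```

A periodic signal produces autocorrelation peaks at lags equal to the beat period, which is the kind of structure a CNN can then classify. In the real pipeline the envelope would come from Mel spectrogram bands rather than a synthetic pulse train.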
### Key Detection Models
- **Key Class Model:** Classifies into 12 relative key classes.
- **Key Quality Model:** Determines if the key is Major or Minor.
- **Input:** Audio segments converted to Mel spectrograms.
- **Output:**
  - Key Class: One of 12 key signatures.
  - Key Quality: Binary classification (0 for Major, 1 for Minor).
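The two outputs can be combined into a single key name. The sketch below assumes that class index 0 corresponds to the C major / A minor signature and that indices ascend chromatically. This ordering is not stated in this repository, so check the models' label mapping before relying on it. The helper name `decode_key` is hypothetical.

```python
# Assumed label order: index 0 = C, ascending by semitone (not confirmed by the repository).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def decode_key(key_class: int, key_quality: int) -> str:
    """Combine a relative key class (0-11) and a quality flag (0=Major, 1=Minor)."""
    if not 0 <= key_class < 12:
        raise ValueError("key_class must be in the range 0-11")
    if key_quality not in (0, 1):
        raise ValueError("key_quality must be 0 (Major) or 1 (Minor)")
    if key_quality == 0:
        return f"{PITCH_CLASSES[key_class]} major"
    # The relative minor's tonic lies 3 semitones below the major tonic,
    # which is the same as 9 semitones above it modulo 12.
    return f"{PITCH_CLASSES[(key_class + 9) % 12]} minor"


print(decode_key(0, 0))  # C major
print(decode_key(0, 1))  # A minor
```

Splitting the prediction this way means that a major key and its relative minor, which share a key signature, fall into the same key class and are separated only by the quality model.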
## Usage
### Prerequisites
- Python 3.7+
- PyTorch
- torchaudio
- transformers
### Loading Models
To use these models with Hugging Face's transformers library:
```python
from transformers import AutoModelForAudioClassification
# Load Tempo Model
tempo_model = AutoModelForAudioClassification.from_pretrained("your_username/tempo_model")
# Load Key Models
key_class_model = AutoModelForAudioClassification.from_pretrained("your_username/key_class_model")
key_quality_model = AutoModelForAudioClassification.from_pretrained("your_username/key_quality_model")
```