---
license: mit
---

# Audio Feature Extraction Models

This repository contains pre-trained models for audio feature extraction, specifically:

- **Tempo Detection:** Predicts the tempo of an audio track in beats per minute (BPM).
- **Key Detection:** Classifies the musical key of an audio track into relative key classes.

## Model Details

### Tempo Model

- **Model Type:** Custom CNN architecture for tempo classification.
- **Input:** Audio segments converted to Mel spectrograms, followed by autocorrelation.
- **Output:** Predicted tempo in beats per minute (BPM), in the range [85, 170].

### Key Detection Models

- **Key Class Model:** Classifies audio into 12 relative key classes.
- **Key Quality Model:** Determines whether the key is Major or Minor.
- **Input:** Audio segments converted to Mel spectrograms.
- **Output:**
  - Key Class: One of 12 key signatures.
  - Key Quality: Binary classification (0 for Major, 1 for Minor).

## Usage

### Prerequisites

- Python 3.7+
- PyTorch
- torchaudio
- transformers

### Loading Models

To use these models with Hugging Face's transformers library:

```python
from transformers import AutoModelForAudioClassification

# Load Tempo Model
tempo_model = AutoModelForAudioClassification.from_pretrained("your_username/tempo_model")

# Load Key Models
key_class_model = AutoModelForAudioClassification.from_pretrained("your_username/key_class_model")
key_quality_model = AutoModelForAudioClassification.from_pretrained("your_username/key_quality_model")
```
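
The two key-model outputs described under Model Details (a key class index and a Major/Minor flag) can be combined into a single readable key label. The following is a minimal sketch, not the repository's own post-processing: it assumes that class index 0 corresponds to the C major / A minor signature and that indices ascend by semitone, which this card does not state. `PITCH_CLASSES` and `decode_key` are hypothetical names introduced for illustration.

```python
# ASSUMPTION: index 0 = C major / A minor, ascending by semitone.
# The actual label ordering of the key class model is not documented here.
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def decode_key(key_class: int, key_quality: int) -> str:
    """Combine a relative key class (0-11) and a quality flag (0 = Major, 1 = Minor)."""
    if not 0 <= key_class < 12:
        raise ValueError("key_class must be in the range 0-11")
    if key_quality not in (0, 1):
        raise ValueError("key_quality must be 0 (Major) or 1 (Minor)")

    if key_quality == 0:
        # Major: the tonic is the pitch class of the key class itself.
        return f"{PITCH_CLASSES[key_class]} major"

    # Minor: the relative minor's tonic lies 3 semitones below the major
    # tonic, which is the same as 9 semitones above it (mod 12).
    return f"{PITCH_CLASSES[(key_class + 9) % 12]} minor"


print(decode_key(0, 0))  # C major
print(decode_key(7, 1))  # E minor (relative minor of G major)
```

If the models were trained with a different label order, only `PITCH_CLASSES` (or the index offset) would need to change; the relative major/minor relationship stays the same.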