Update README.md
--- a/README.md
+++ b/README.md
@@ -25,27 +25,20 @@ Notable differences from other available models include:
 1. Performance: CED with 10M parameters outperforms the majority of previous approaches (~80M).
 
 ### Model Sources
-- **
-- **Repository:** https://github.com/jimbozhang/hf_transformers_custom_model_ced
+- **Repository:** https://github.com/RicherMans/CED
 - **Paper:** [CED: Consistent ensemble distillation for audio tagging](https://arxiv.org/abs/2308.11957)
 - **Demo:** https://huggingface.co/spaces/mispeech/ced-base
 
-## Install
-```bash
-pip install git+https://github.com/jimbozhang/hf_transformers_custom_model_ced.git
-```
-
 ## Inference
 ```python
->>> from
->>> from ced_model.modeling_ced import CedForAudioClassification
+>>> from transformers import AutoModelForAudioClassification, AutoFeatureExtractor
 
 >>> model_name = "mispeech/ced-tiny"
->>> feature_extractor =
->>> model =
+>>> feature_extractor = AutoFeatureExtractor.from_pretrained(model_name, trust_remote_code=True)
+>>> model = AutoModelForAudioClassification.from_pretrained(model_name, trust_remote_code=True)
 
 >>> import torchaudio
->>> audio, sampling_rate = torchaudio.load("
+>>> audio, sampling_rate = torchaudio.load("/path-to/JeD5V5aaaoI_931_932.wav")
 >>> assert sampling_rate == 16000
 >>> inputs = feature_extractor(audio, sampling_rate=sampling_rate, return_tensors="pt")
 