morenolq committed on
Commit 3354fe9 · verified · 1 Parent(s): 5a15d6b

Update README.md

Files changed (1)
  1. README.md +38 -8
README.md CHANGED
@@ -1,26 +1,56 @@
  ---
  license: mit
  ---

- # SSL4PR WavLM Base

- This repository hosts the pre-trained SSL4PR WavLM Base models for Parkinson's Disease detection from speech in real-world operating conditions. These models are based on the work titled "Exploiting Foundation Models and Speech Enhancement for Parkinsons Disease Detection from Speech in Real-World Operative Conditions" by Moreno La Quatra et al.

  ## Repository Link
- [GitHub Repository](https://github.com/K-STMLab/SSL4PR/)

  ## Pre-trained Models
- Pre-trained models are available on the Hugging Face model hub. To use the SSL4PR WavLM Base models, please clone the repository by running the following command:

  ```bash
  git clone https://huggingface.co/morenolq/SSL4PR-hubert-base
  ```

- Ensure you have git lfs installed. Each repository contains the pre-trained models, one per fold, named `fold_1.pt`, `fold_2.pt`, ..., `fold_10.pt`.
- The models are available in PyTorch format.

- - [SSL4PR WavLM Base](https://huggingface.co/morenolq/SSL4PR-wavlm-base)
- - [SSL4PR HuBERT Base](https://huggingface.co/morenolq/SSL4PR-hubert-base) - **this repository**

  ## Citation
  ---
  license: mit
+ tags:
+ - dysarthric speech
+ - classification
+ - audio classification
  ---

+ # SSL4PR WavLM Base and HuBERT Base Models

+ This repository hosts the pre-trained SSL4PR models for Parkinson's Disease detection from speech in real-world operating conditions. These models are based on the work "Exploiting Foundation Models and Speech Enhancement for Parkinson's Disease Detection from Speech in Real-World Operative Conditions" by La Quatra et al., published at Interspeech 2024. [Paper Link](https://www.isca-archive.org/interspeech_2024/laquatra24_interspeech.pdf)

  ## Repository Link
+
+ [GitHub Repository](https://github.com/K-STMLab/SSL4PR/): please refer to the repository for full details on the models, training, and usage.

  ## Pre-trained Models
+
+ Pre-trained models are available on the Hugging Face model hub. To use the SSL4PR models, clone the desired repository by running one of the following commands:

  ```bash
+ # For fold-based models (10-fold cross-validation)
+ git clone https://huggingface.co/morenolq/SSL4PR-wavlm-base
  git clone https://huggingface.co/morenolq/SSL4PR-hubert-base
+
+ # For full training models (trained on complete s-PC-GITA)
+ git clone https://huggingface.co/morenolq/SSL4PR-wavlm-base-full
+ git clone https://huggingface.co/morenolq/SSL4PR-hubert-base-full
  ```

+ Ensure you have `git lfs` installed; the checkpoint files are stored with Git LFS.
+
+ ### Fold-based Models
+
+ The fold-based repositories contain models trained with 10-fold cross-validation on s-PC-GITA. Each repository contains 10 pre-trained models, one per fold, named `fold_1.pt`, `fold_2.pt`, ..., `fold_10.pt`.
+
+ - [SSL4PR WavLM Base](https://huggingface.co/morenolq/SSL4PR-wavlm-base): built on [WavLM Base](https://huggingface.co/microsoft/wavlm-base)
+ - [SSL4PR HuBERT Base](https://huggingface.co/morenolq/SSL4PR-hubert-base): built on [HuBERT Base](https://huggingface.co/facebook/hubert-base-ls960)
+
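As a hedged sketch (not the authors' official loader), the fold checkpoints in a cloned repository can be enumerated and loaded with `torch.load`; the directory name `SSL4PR-wavlm-base` is an assumption (a local clone), and the resulting state dicts must be loaded into the SSL4PR model class from the GitHub repository, not into a plain Transformers model:

```python
# Sketch: enumerate the per-fold checkpoints of a cloned repository and load
# their state dicts. The directory name assumes a local clone; the loaded
# state dicts are meant for the SSL4PR model class from the GitHub repo
# (models/ssl_classification_model.py), not a plain Transformers model.
from pathlib import Path

ckpt_dir = Path("SSL4PR-wavlm-base")  # path to the cloned model repository
fold_files = [ckpt_dir / f"fold_{i}.pt" for i in range(1, 11)]

for ckpt in fold_files:
    if ckpt.exists():  # checkpoint files are only present after `git lfs pull`
        import torch  # imported lazily: only needed when a checkpoint exists

        state_dict = torch.load(ckpt, map_location="cpu")
```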
+ ### Full Training Models
+
+ The full training repositories contain models trained on the complete s-PC-GITA dataset and tested on enhanced e-PC-GITA (as reported in Table 3 of the paper). Each repository contains a single model file named `model.pt`.
+
+ - [SSL4PR WavLM Base Full](https://huggingface.co/morenolq/SSL4PR-wavlm-base-full): built on [WavLM Base](https://huggingface.co/microsoft/wavlm-base)
+ - [SSL4PR HuBERT Base Full](https://huggingface.co/morenolq/SSL4PR-hubert-base-full): built on [HuBERT Base](https://huggingface.co/facebook/hubert-base-ls960)
+
+ All models are available in PyTorch format.
+
+ ⚠️ Please note that the models are not directly compatible with the Hugging Face Transformers library, because they use specific head components (e.g., attention pooling and layer weighting), as defined in the [model class](https://github.com/K-STMLab/SSL4PR/blob/main/models/ssl_classification_model.py).
+
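To illustrate the kind of head component referred to above, a minimal attention-pooling layer over frame-level features might look like the following; this is a hedged sketch, not the repository's actual implementation (see `models/ssl_classification_model.py` for that):

```python
# Minimal sketch of attention pooling over frame-level features; the actual
# SSL4PR head (attention pooling plus layer weighting) lives in the GitHub
# repository and may differ in detail.
import torch
import torch.nn as nn


class AttentionPooling(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.attention = nn.Linear(hidden_size, 1)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, frames, hidden)
        weights = torch.softmax(self.attention(hidden_states), dim=1)
        # Weighted sum over the frame axis -> one vector per utterance
        return (weights * hidden_states).sum(dim=1)  # (batch, hidden)
```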
+ The model architecture is shown below:
+
+ ![Model Architecture](dspeech_arch.png)

  ## Citation