nvidia/ssl_en_nest_large_v1.0

Self-supervised Learning

steveheh committed 16 days ago

Commit a969af3 • 1 Parent(s): 999faf8

Update README.md

Files changed (1)

README.md +3 -3

@@ -58,12 +58,12 @@ License to use this model is covered by the [CC-BY-4.0](https://creativecommons.
 ## How to Use the Model
 The model is available for use in the NVIDIA NeMo Framework [2], and can be used as weight initialization for downstream tasks or as a frozen feature extractor.
-### Loading the whole model
 ```python
 from nemo.collections.asr.models import EncDecDenoiseMaskedTokenPredModel
 nest_model = EncDecDenoiseMaskedTokenPredModel.from_pretrained(model_name="nvidia/ssl_en_nest_large_v1.0")
 ```
-### Using NEST as weight initialization for downstream tasks
 ```bash
 # use ASR as example:
 python <NeMo Root>/examples/asr/asr_ctc/speech_to_text_ctc_bpe.py \
@@ -93,7 +93,7 @@ More details can be found at [maybe_init_from_pretrained_checkpoint()](https://g
 NEST can also be used as a frozen feature extractor for downstream tasks. For example, in the case of speaker verification, embeddings can be extracted from different layers of the NEST model, and a learned weighted combination of those embeddings can be used as input to the speaker verification model.
 Please refer to this example [script](https://github.com/NVIDIA/NeMo/blob/main/examples/asr/speech_pretraining/downstream/speech_classification_mfa_train.py) and [config](https://github.com/NVIDIA/NeMo/blob/main/examples/asr/conf/ssl/nest/multi_layer_feat/nest_titanet_small.yaml) for details.
-### Extracting Audio Features from NEST
 NEST supports extracting audio features from multiple layers of its encoder:
 ```bash

 ## How to Use the Model
 The model is available for use in the NVIDIA NeMo Framework [2], and can be used as weight initialization for downstream tasks or as a frozen feature extractor.
+### Automatically Instantiate the Model
 ```python
 from nemo.collections.asr.models import EncDecDenoiseMaskedTokenPredModel
 nest_model = EncDecDenoiseMaskedTokenPredModel.from_pretrained(model_name="nvidia/ssl_en_nest_large_v1.0")
 ```
+### Using NEST as Weight Initialization for Downstream Tasks
 ```bash
 # use ASR as example:
 python <NeMo Root>/examples/asr/asr_ctc/speech_to_text_ctc_bpe.py \
 NEST can also be used as a frozen feature extractor for downstream tasks. For example, in the case of speaker verification, embeddings can be extracted from different layers of the NEST model, and a learned weighted combination of those embeddings can be used as input to the speaker verification model.
 Please refer to this example [script](https://github.com/NVIDIA/NeMo/blob/main/examples/asr/speech_pretraining/downstream/speech_classification_mfa_train.py) and [config](https://github.com/NVIDIA/NeMo/blob/main/examples/asr/conf/ssl/nest/multi_layer_feat/nest_titanet_small.yaml) for details.
+### Extracting and Saving Audio Features from NEST
 NEST supports extracting audio features from multiple layers of its encoder:
 ```bash