Update README.md
README.md
CHANGED
@@ -58,12 +58,12 @@ License to use this model is covered by the [CC-BY-4.0](https://creativecommons.
## How to Use the Model
The model is available for use in the NVIDIA NeMo Framework [2], and can be used as weight initialization for downstream tasks or as a frozen feature extractor.

-###
```python
from nemo.collections.asr.models import EncDecDenoiseMaskedTokenPredModel
nest_model = EncDecDenoiseMaskedTokenPredModel.from_pretrained(model_name="nvidia/ssl_en_nest_large_v1.0")
```
-### Using NEST as
```bash
# use ASR as example:
python <NeMo Root>/examples/asr/asr_ctc/speech_to_text_ctc_bpe.py \
@@ -93,7 +93,7 @@ More details can be found at [maybe_init_from_pretrained_checkpoint()](https://g
NEST can also be used as a frozen feature extractor for downstream tasks. For example, in the case of speaker verification, embeddings can be extracted from different layers of the NEST model, and a learned weighted combination of those embeddings can be used as input to the speaker verification model.
Please refer to this example [script](https://github.com/NVIDIA/NeMo/blob/main/examples/asr/speech_pretraining/downstream/speech_classification_mfa_train.py) and [config](https://github.com/NVIDIA/NeMo/blob/main/examples/asr/conf/ssl/nest/multi_layer_feat/nest_titanet_small.yaml) for details.

-### Extracting Audio Features from NEST

NEST supports extracting audio features from multiple layers of its encoder:
```bash

## How to Use the Model
The model is available for use in the NVIDIA NeMo Framework [2], and can be used as weight initialization for downstream tasks or as a frozen feature extractor.

+### Automatically Instantiate the Model
```python
from nemo.collections.asr.models import EncDecDenoiseMaskedTokenPredModel
nest_model = EncDecDenoiseMaskedTokenPredModel.from_pretrained(model_name="nvidia/ssl_en_nest_large_v1.0")
```
+### Using NEST as Weight Initialization for Downstream Tasks
```bash
# use ASR as example:
python <NeMo Root>/examples/asr/asr_ctc/speech_to_text_ctc_bpe.py \

NEST can also be used as a frozen feature extractor for downstream tasks. For example, in the case of speaker verification, embeddings can be extracted from different layers of the NEST model, and a learned weighted combination of those embeddings can be used as input to the speaker verification model.
Please refer to this example [script](https://github.com/NVIDIA/NeMo/blob/main/examples/asr/speech_pretraining/downstream/speech_classification_mfa_train.py) and [config](https://github.com/NVIDIA/NeMo/blob/main/examples/asr/conf/ssl/nest/multi_layer_feat/nest_titanet_small.yaml) for details.
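
To make the "learned weighted combination" idea concrete, here is a minimal, framework-free sketch. It is not the NeMo implementation (the linked script and config are the reference for that). The function names, the softmax normalization of the per-layer scores, and the toy numbers are all assumptions made for illustration; in a real setup the per-layer scores would be trainable parameters updated together with the downstream model.

```python
import math


def softmax(logits):
    """Turn unconstrained per-layer scores into weights that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_layers(layer_feats, layer_logits):
    """Weighted sum of per-layer features (hypothetical helper).

    layer_feats:  list of L layers, each a list of T frames of D floats.
    layer_logits: list of L floats, one (normally learnable) score per layer.
    Returns one T x D feature matrix.
    """
    if len(layer_feats) != len(layer_logits):
        raise ValueError("need exactly one logit per layer")
    weights = softmax(layer_logits)
    num_frames = len(layer_feats[0])
    dim = len(layer_feats[0][0])
    return [
        [
            sum(w * layer[t][d] for w, layer in zip(weights, layer_feats))
            for d in range(dim)
        ]
        for t in range(num_frames)
    ]


# Toy input: 2 layers, 2 frames, 3 dims (made-up numbers, not NEST outputs).
feats = [
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    [[3.0, 4.0, 5.0], [6.0, 7.0, 8.0]],
]
combined = combine_layers(feats, [0.0, 0.0])  # equal scores: plain average
```

With equal scores the result is the plain average of the layers; during training, the scores would shift weight toward whichever layers help the downstream task most.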

+### Extracting and Saving Audio Features from NEST

NEST supports extracting audio features from multiple layers of its encoder:
```bash