superb-hidden-set committed
Commit 1ff2c02 · 1 Parent(s): c7d9dc5
move model interface functions description from website to here
README.md CHANGED
@@ -19,13 +19,71 @@ If you are unable to submit the pre-trained model, please [fill this form]

## Quickstart

-### 1.
+### 1. Add model interfaces
+
+#### forward
+
+Extract features from waveforms.
+
+- **Input:** A list of waveforms sampled at 16000 Hz
+
+```python
+import torch
+
+SAMPLE_RATE = 16000
+BATCH_SIZE = 8
+EXAMPLE_SEC = 10
+wavs = [torch.randn(SAMPLE_RATE * EXAMPLE_SEC).cuda() for _ in range(BATCH_SIZE)]
+results = upstream(wavs)  # upstream is your pretrained model instance
+```
+
+- **Output:** A dictionary with a key for each task. If a task-specific key is not present, a "hidden_states" key must be provided as the default. The value for each key is **a list** of padded sequences, each with the same shape **(batch_size, max_sequence_length_of_batch, hidden_size)**, so that weighted-sum works. You are welcome to preprocess the upstream's raw hidden states, including upsampling and downsampling. However, all the values must come from **a single upstream model**:
+
+```python
+assert isinstance(results, dict)
+tasks = ["PR", "SID", "ER", "ASR", "ASV", "SD", "QbE", "ST", "SS", "SE"]
+for task in tasks:
+    # Fall back to the default "hidden_states" key when there is no task-specific key.
+    hidden_states = results.get(task, results["hidden_states"])
+    assert isinstance(hidden_states, list)
+
+    for state in hidden_states:
+        assert isinstance(state, torch.Tensor)
+        assert state.dim() == 3, "(batch_size, max_sequence_length_of_batch, hidden_size)"
+        assert state.shape == hidden_states[0].shape
+```
+
+#### get_downsample_rates
+
+Provide the downsample rate **from 16000 Hz waveforms** for each task's representation in the dict. For the standard 10 ms stride representation, the downsample rate is 160.
+
+```python
+SAMPLE_RATE = 16000
+MSEC_PER_SEC = 1000
+downsample_rate = SAMPLE_RATE * 10 // MSEC_PER_SEC  # 160
+```
+
+The downsample rate will be used to:
+
+1. Calculate the valid representation length of each utterance in the padded output (see the sketch after this list).
+2. Prepare the training materials according to the representation's downsample rate for frame-level tasks, e.g. SD, SE, and SS.
+
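+For instance, a minimal sketch of the length computation (illustrative only; `wavs` and `downsample_rate` come from the snippets above):
+
+```python
+# Number of valid (non-padding) frames per utterance in the padded batch.
+wav_lengths = [len(wav) for wav in wavs]
+valid_feat_lengths = [length // downsample_rate for length in wav_lengths]
+```
+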
+- **Input:** the task key (str)
+- **Output:** the downsample rate (int) of the representation for that task
+
+```python
+for task in tasks:
+    assert isinstance(task, str)
+    downsample_rate = upstream.get_downsample_rates(task)
+    assert isinstance(downsample_rate, int)
+    print(f"The upstream's representation for {task}"
+          f" has a downsample rate of {downsample_rate}.")
+```
+
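+Putting the two interfaces together, a minimal sketch of a conforming upstream could look like this (the class name, layer sizes, and single-conv extractor are illustrative assumptions, not requirements):
+
+```python
+import torch.nn as nn
+
+class SketchUpstream(nn.Module):
+    """A toy upstream producing 10 ms-stride features (downsample rate 160)."""
+
+    def __init__(self, hidden_size=768):
+        super().__init__()
+        self.extractor = nn.Conv1d(1, hidden_size, kernel_size=320, stride=160)
+
+    def get_downsample_rates(self, key: str) -> int:
+        return 160  # this sketch uses the same rate for every task key
+
+    def forward(self, wavs):
+        # Pad the variable-length waveforms into one (batch, time) tensor.
+        padded = nn.utils.rnn.pad_sequence(wavs, batch_first=True)
+        features = self.extractor(padded.unsqueeze(1))  # (batch, hidden, frames)
+        features = features.transpose(1, 2)             # (batch, frames, hidden)
+        # A single-layer list is the minimum; return one tensor per layer for weighted-sum.
+        return {"hidden_states": [features]}
+```
+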
+### 2. Create an account and organization on the Hugging Face Hub

First create an account on the Hugging Face Hub; you can sign up [here](https://huggingface.co/join) if you haven't already! Next, create a new organization and invite the SUPERB Hidden Set Committee to join. You will upload your model to a repository under this organization, so that the committee members can access the model even though it is not publicly available.

* [superb-hidden-set](https://huggingface.co/superb-hidden-set)

-###
+### 3. Create a template repository on your machine

The next step is to create a template repository on your local machine that contains various files and a CLI to help you validate and submit your pretrained models. The Hugging Face Hub uses [Git Large File Storage (LFS)](https://git-lfs.github.com) to manage large files, so first install it if you don't have it already. For example, on macOS you can run:
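
```bash
# The concrete command sits outside this hunk; on macOS with Homebrew it is typically:
brew install git-lfs
```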

@@ -72,7 +130,7 @@ my-superb-submission
└── model.pt <- Your model weights
```

-###
+### 4. Install the dependencies

The final step is to install the project's dependencies:
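
```bash
# The project's actual install command is beyond this excerpt; a typical
# Python workflow (hypothetical here) would be:
pip install -r requirements.txt
```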