Update README.md
README.md
CHANGED
@@ -21,8 +21,11 @@ tags:
|
|
21 |
|
22 |
---
|
23 |
|
|
|
24 |
ASR model + pitch-aware relative positional embeddings.
|
25 |
|
|
|
|
|
26 |
|
27 |
|
28 |
Questions:
|
@@ -36,8 +39,6 @@ Questions:
|
|
36 |
---
|
37 |
|
38 |
|
39 |
-
|
40 |
-
|
41 |
To explore the relationship between pitch and rotary embeddings, the model implements three complementary pitch-based enhancements:
|
42 |
|
43 |
1. Pitch-modulated theta: pitch (f0) is used to modify the theta parameter, dynamically adjusting the rotary frequency.
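
As a rough illustration of item 1, the sketch below scales a base theta by per-frame pitch and builds unit-radius complex rotations with `torch.polar`. The function name `pitch_modulated_rotary`, the 220 Hz reference pitch, and the choice to leave unvoiced frames (f0 = 0) at the base theta are assumptions made for illustration; this is not the model's actual implementation.

```python
import torch


def pitch_modulated_rotary(f0, head_dim, base_theta=10000.0, ref_hz=220.0):
    """Hypothetical sketch: per-frame rotary rotations with pitch-scaled theta.

    f0: float tensor of shape (seq_len,), pitch in Hz, 0 for unvoiced frames.
    Returns a complex tensor of shape (seq_len, head_dim // 2).
    """
    seq_len = f0.shape[0]

    # Assumption: scale theta by pitch relative to a reference frequency;
    # unvoiced frames (f0 == 0) fall back to the unmodified base theta.
    scale = torch.where(f0 > 0, f0 / ref_hz, torch.ones_like(f0))
    theta = base_theta * scale  # (seq_len,)

    # Standard rotary inverse frequencies, computed per frame from theta.
    exponents = torch.arange(0, head_dim, 2, dtype=torch.float32) / head_dim
    inv_freq = theta.unsqueeze(-1) ** (-exponents)  # (seq_len, head_dim // 2)

    positions = torch.arange(seq_len, dtype=torch.float32).unsqueeze(-1)
    angles = positions * inv_freq

    # torch.polar(abs, angle) returns abs * (cos(angle) + i * sin(angle)).
    # The radius is fixed at 1 here, so only the angle depends on pitch.
    return torch.polar(torch.ones_like(angles), angles)


f0 = torch.tensor([0.0, 220.0, 440.0, 110.0])
rotations = pitch_modulated_rotary(f0, head_dim=8)
```

Keeping the radius at 1 isolates the effect of pitch on the rotation angle; a variant that also modulates the radius would pass a pitch-derived tensor as the first argument to `torch.polar` instead of ones.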
|
@@ -47,7 +48,6 @@ To explore the relationship between pitch and rotary embeddings, the model imple
|
|
47 |
|
48 |
|
49 |
|
50 |
-
|
51 |
The function `torch.polar` constructs a complex tensor from polar coordinates:
|
52 |
|
53 |
````python
|
@@ -66,7 +66,7 @@ Here are the abbreviated steps for replacing theta and radius in the rotary forw
|
|
66 |
|
67 |
```python
|
68 |
|
69 |
-
|
70 |
if f0 is not None:
|
71 |
    if f0.dim() == 2:
|
72 |
        f0 = f0.squeeze(0)
|
@@ -220,19 +220,14 @@ The Complex Frequency Result:
|
|
220 |
Transitions: Natural pitch changes
|
221 |
|
222 |
|
223 |
-
<img width="780" alt="cc4" src="https://github.com/user-attachments/assets/165a3f18-659a-4e2e-a154-a3456b667bae" >
|
224 |
|
225 |
|
226 |
|
227 |
-
----
|
228 |
|
229 |
-
|
230 |
|
231 |
-
https://github.com/sine2pi/Maxfactor
|
232 |
|
233 |
-
MaxFactor is a custom PyTorch optimizer with adaptive learning rates and specialized handling for matrix parameters.
|
234 |
|
235 |
-
** model deviates from standard transformer models.
|
236 |
|
237 |
|
238 |
|
|
|
21 |
|
22 |
---
|
23 |
|
24 |
+
|
25 |
ASR model + pitch-aware relative positional embeddings.
|
26 |
|
27 |
+
<img width="1431" height="636" alt="pitch_spectrogram" src="https://github.com/user-attachments/assets/2db80884-7e27-4a24-ad38-c9b8c28f26da" />
|
28 |
+
<img width="233" height="77" alt="legend" src="https://github.com/user-attachments/assets/fad84550-a199-43b3-8471-d011a9fd6f94" />
|
29 |
|
30 |
|
31 |
Questions:
|
|
|
39 |
---
|
40 |
|
41 |
|
|
|
|
|
42 |
To explore the relationship between pitch and rotary embeddings, the model implements three complementary pitch-based enhancements:
|
43 |
|
44 |
1. Pitch-modulated theta: pitch (f0) is used to modify the theta parameter, dynamically adjusting the rotary frequency.
|
|
|
48 |
|
49 |
|
50 |
|
|
|
51 |
The function `torch.polar` constructs a complex tensor from polar coordinates:
|
52 |
|
53 |
````python
|
|
|
66 |
|
67 |
```python
|
68 |
|
69 |
+
|
70 |
if f0 is not None:
|
71 |
    if f0.dim() == 2:
|
72 |
        f0 = f0.squeeze(0)
|
|
|
220 |
Transitions: Natural pitch changes
|
221 |
|
222 |
|
|
|
223 |
|
224 |
|
225 |
|
|
|
226 |
|
227 |
+
----
|
228 |
|
|
|
229 |
|
|
|
230 |
|
|
|
231 |
|
232 |
|
233 |
|