Sin2pi committed
Commit 05b483e · verified · 1 Parent(s): a479bac

Update README.md

Files changed (1)
  1. README.md +5 -10
README.md CHANGED
@@ -21,8 +21,11 @@ tags:

  ---

  ASR model + pitch aware relative positional embeddings.



  Questions:
@@ -36,8 +39,6 @@ Questions:
  ---


-
-
  To explore the relationship between pitch and rotary embeddings, the model implements three complementary pitch based enhancements:

  1. Pitch modulated theta: Pitch f0 is used to modify the theta parameter, dynamically adjusting the rotary frequency.
@@ -47,7 +48,6 @@ To explore the relationship between pitch and rotary embeddings, the model imple



-
  The function `torch.polar` constructs a complex tensor from polar coordinates:

  ````python
@@ -66,7 +66,7 @@ Here are the abbreviated steps for replacing theta and radius in the rotary forw

  ```python

- f0 = f0.to(device, dtype)  # feature extracted during processing
  if f0 is not None:
      if f0.dim() == 2:
          f0 = f0.squeeze(0)
@@ -220,19 +220,14 @@ The Complex Frequency Result:
  Transitions: Natural pitch changes


- <img width="780" alt="cc4" src="https://github.com/user-attachments/assets/165a3f18-659a-4e2e-a154-a3456b667bae" >



- ----

- This model sometimes uses :

- https://github.com/sine2pi/Maxfactor

- MaxFactor is a custom PyTorch optimizer with adaptive learning rates and specialized handling for matrix parameters.

- ** model deviates from standard transformer models.
 
  ---

+
  ASR model + pitch aware relative positional embeddings.

+ <img width="1431" height="636" alt="pitch_spectrogram" src="https://github.com/user-attachments/assets/2db80884-7e27-4a24-ad38-c9b8c28f26da" />
+ <img width="233" height="77" alt="legend" src="https://github.com/user-attachments/assets/fad84550-a199-43b3-8471-d011a9fd6f94" />


  Questions:
 
  ---


  To explore the relationship between pitch and rotary embeddings, the model implements three complementary pitch based enhancements:

  1. Pitch modulated theta: Pitch f0 is used to modify the theta parameter, dynamically adjusting the rotary frequency.
 
  The function `torch.polar` constructs a complex tensor from polar coordinates:

  ````python
 
  ```python

+
  if f0 is not None:
      if f0.dim() == 2:
          f0 = f0.squeeze(0)
 
  Transitions: Natural pitch changes



+ ----