alessandro trinca tornidor committed on
Commit 187549a · 1 Parent(s): 595d5ff
doc: update README.md
README.md
CHANGED
@@ -4,7 +4,7 @@ emoji: 🎤
 colorFrom: red
 colorTo: blue
 sdk: gradio
-sdk_version: 5.
+sdk_version: 5.18.0
 app_file: app.py
 pinned: false
 license: mit
@@ -22,7 +22,7 @@ My [HuggingFace Space](https://huggingface.co/spaces/aletrn/ai-pronunciation-tra
 ## Installation
 
 To run the program locally, you need to install the requirements and run the main python file.
-These commands assume you have an active virtualenv (locally I'm using python 3.12, on HuggingFace the gradio SDK - version 5.
+These commands assume you have an active virtualenv (locally I'm using python 3.12, on HuggingFace the gradio SDK - version 5.6.0 at the moment - uses python 3.10):
 
 ```bash
 pip install -r requirements.txt
@@ -42,7 +42,7 @@ Currently the best way to exec the project is using the Gradio frontend:
 python app.py
 ```
 
-I upgraded the old custom frontend ([email protected], [email protected]) and backend (pytorch==2.
+I upgraded the old custom frontend ([email protected], [email protected]) and backend (pytorch==2.5.1, torchaudio==2.5.1) libraries. On macOS intel it's possible to install from [pypi.org](https://pypi.org/project/torch/) only until the library version [2.2.2](https://pypi.org/project/torch/2.2.2/)
 (see [this github issue](https://github.com/instructlab/instructlab/issues/1469) and [this deprecation notice](https://dev-discuss.pytorch.org/t/pytorch-macos-x86-builds-deprecation-starting-january-2024/1690)).
 
 In case of missing TTS voices needed by the Text-to-Speech in-browser SpeechSynthesis feature (e.g. on Windows 11 you need to install manually the TTS voices for the languages you need), right now the Gradio frontend raises an alert message with a JavaScript message.
@@ -122,16 +122,16 @@ pnpm playwright test --workers 1 --retries 4 --project=chromium
 - Upgraded Speech-to-Text German [Silero](https://github.com/snakers4/silero-models) model that blocked the upgrade to PyTorch > 2.x
 - Upgraded PyTorch > 2.x
 - Improved backend tests with the [mutation test suite](https://en.wikipedia.org/wiki/Mutation_testing) [Cosmic Ray](https://cosmic-ray.readthedocs.io)
-- E2E [playwright](https://playwright.dev) tests
-- Added a new frontend based on [Gradio](https://gradio.app)
--
--
--
-- Gradio frontend version - re-added the Text-to-Speech in-browser (it works only if there are installed the required language packages. In case of failures there is the backend Text-to-Speech feature)
+- Added E2E [playwright](https://playwright.dev) tests
+- Added a new frontend based on [Gradio](https://gradio.app) with an updated online version ([HuggingFace Space](https://huggingface.co/spaces/aletrn/ai-pronunciation-trainer))
+- It's possible to insert custom sentences to read and evaluate
+- Play the isolated words in the recordings, to compare the 'ideal' pronunciation with the learner pronunciation
+- re-added the Text-to-Speech in-browser (it works only if there are installed the required language packages; in case of failures there is the backend Text-to-Speech feature - Gradio frontend version)
 - Fixed a [bug](https://github.com/Thiagohgl/ai-pronunciation-trainer/issues/14) with [whisper](https://huggingface.co/docs/transformers/model_doc/whisper) not properly transcribing the end timestamp for the last word in the recorded audio (in the end I solved it switching to [whisper python pip package](https://pypi.org/project/openai-whisper/))
 - Added [faster whisper](https://pypi.org/project/faster-whisper/) model support:
-
-
+- it avoids `None` values on `end_ts` timestamps for the last elements, unlike the HuggingFace Whisper's output
+- it uses silero-vad to detect long silences within the audio
+- webApp frontend - improved css on mobile devices
 
 ### TODO
 
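
The faster whisper entry in the changelog above mentions word-level `end_ts` timestamps and silero-vad. The following is a minimal sketch of how such a transcription call might look, assuming the `faster-whisper` package is installed. The names `transcribe_words`, `words_missing_end`, the `Word` type and the `tiny` model size are hypothetical, chosen for illustration only, and are not taken from this repository.

```python
from typing import Iterable, Optional, TypedDict


class Word(TypedDict):
    """Hypothetical word record; the field names mirror the changelog's `end_ts`."""
    word: str
    start_ts: float
    end_ts: Optional[float]


def transcribe_words(audio_path: str, model_size: str = "tiny") -> list[Word]:
    """Transcribe an audio file with faster-whisper and return word-level timestamps."""
    # Imported lazily so the helper below can be used without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    # word_timestamps=True requests per-word start/end times;
    # vad_filter=True enables the bundled Silero VAD to skip long silences.
    segments, _info = model.transcribe(
        audio_path, word_timestamps=True, vad_filter=True
    )
    return [
        {"word": w.word.strip(), "start_ts": w.start, "end_ts": w.end}
        for segment in segments
        for w in (segment.words or [])
    ]


def words_missing_end(words: Iterable[Word]) -> list[str]:
    """Return the words whose end timestamp is None."""
    return [w["word"] for w in words if w["end_ts"] is None]
```

`words_missing_end` works on plain dictionaries, so it can be tried without downloading a model: `words_missing_end([{"word": "welt", "start_ts": 0.5, "end_ts": None}])` returns `["welt"]`.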