---
title: Pronunciation Trainer
emoji: 🗣️
colorFrom: blue
colorTo: red
sdk: gradio
app_file: src/pronunciation_trainer/app.py
---
# Pronunciation Trainer 🗣️
This repository/app showcases how a [phoneme-based pronunciation trainer](docs/phoneme_based_solution.md) (including personalized LLM-based feedback) overcomes the limitations of a [grapheme-based approach](docs/grapheme_based_solution.md).
| Feature | Grapheme-Based Solution | Phoneme-Based Solution |
|-----------------------------------|----------------------------------------------------------|---------------------------------------------------------|
| **Input Type** | Text transcriptions of speech | Audio files and phoneme transcriptions |
| **Feedback Mechanism** | Comparison of grapheme sequences | Comparison of phoneme sequences and advanced LLM-based feedback |
| **Technological Approach** | Simple text comparison using `SequenceMatcher` | Advanced ASR models like Wav2Vec2 for phoneme recognition |
| **Feedback Detail** | Basic similarity score and diff | Detailed phoneme comparison, LLM-based feedback including motivational and corrective elements |
| **Error Sensitivity** | Sensitive to homophones and transcription errors | More accurate in capturing pronunciation nuances |
| **Suprasegmental Features** | Not captured (stress, intonation) | Potentially captured through phoneme dynamics and advanced evaluation |
| **Personalization** | Limited to error feedback based on text similarity | Advanced personalization considering learner's native language and target language proficiency |
| **Scalability** | Easy to scale with basic text processing tools | Requires more computational resources for ASR and LLM processing |
| **Cost** | Lower, primarily involves basic computational resources | Higher, due to usage of advanced APIs and model processing |
| **Accuracy** | Lower, prone to misinterpreting homophones | Higher, better at handling diverse pronunciation patterns (though the LLM feedback can contain hallucinations) |
| **Feedback Quality** | Basic, often not linguistically rich | Rich, detailed, personalized, and linguistically informed |
| **Potential for Learning** | Limited to recognizing text differences | High, includes phonetic and prosodic feedback, as well as resource and practice recommendations |
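
The contrast in the table can be illustrated with a minimal sketch. It uses `difflib.SequenceMatcher`, which the grapheme-based solution is described as using, and applies the same comparison to phoneme sequences. The helper names `similarity` and `diff_ops` and the hand-written IPA sequences are assumptions made for illustration; the app itself obtains phonemes from an ASR model such as Wav2Vec2, which is not reproduced here.

```python
from difflib import SequenceMatcher


def similarity(expected, actual):
    """Return a ratio in [0, 1]; 1.0 means the two sequences are identical."""
    return SequenceMatcher(None, expected, actual).ratio()


def diff_ops(expected, actual):
    """Return the non-matching edits as (tag, expected_part, actual_part)."""
    matcher = SequenceMatcher(None, expected, actual)
    return [
        (tag, expected[i1:i2], actual[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Grapheme level: the homophones "their" and "there" are spelled differently,
# so a text comparison reports a mismatch even if the learner said it correctly.
grapheme_score = similarity("their", "there")

# Phoneme level (hypothetical, hand-written transcriptions): both words share
# one phoneme sequence, so the comparison reports a perfect match.
phoneme_score = similarity(["ð", "ɛ", "ɹ"], ["ð", "ɛ", "ɹ"])

# A real mispronunciation ("think" spoken as "sink") shows up as a single
# phoneme substitution that corrective feedback can point at.
think_vs_sink = diff_ops(["θ", "ɪ", "ŋ", "k"], ["s", "ɪ", "ŋ", "k"])

if __name__ == "__main__":
    print(grapheme_score, phoneme_score, think_vs_sink)
```

Because `SequenceMatcher` accepts any sequence of hashable items, the same two helpers work on strings and on lists of phoneme symbols.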
## Quickstart
### Click here to try out the app directly:
[**Pronunciation Trainer App**](https://pwenker-pronunciation-trainer.hf.space/)
### Inspect the code at:
- **GitHub:** [pwenker/pronunciation_trainer](https://github.com/pwenker/pronounciation_trainer)
- **Hugging Face Spaces:** [pwenker/pronunciation_trainer](https://huggingface.co/spaces/pwenker/pronounciation_trainer)
## Local Deployment
### Prerequisites
#### Rye 🌾
[Install `Rye`](https://rye-up.com/guide/installation/#installing-rye)
> Rye is a comprehensive tool designed for Python developers. It simplifies your workflow by managing Python installations and dependencies. Simply install Rye, and it takes care of the rest.
#### OpenAI API Token
Create a `.env` file in the `pronunciation_trainer` folder and add the following variable:
```
OPENAI_TOKEN=... # Token for the OpenAI API
```
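How the app reads this variable is not shown in this README. As a rough sketch only, the token could be looked up in the process environment first and in the `.env` file second. The helpers `load_env_file` and `get_openai_token` are hypothetical names, and the parser below is a deliberately minimal stand-in, not the project's actual loading code.

```python
import os
from pathlib import Path


def load_env_file(path):
    """Parse KEY=VALUE lines from a .env file (minimal, illustrative parser).

    Everything after a '#' is treated as a comment, so values that contain
    '#' are not supported by this sketch.
    """
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip()
    return values


def get_openai_token(env_path=".env"):
    """Return OPENAI_TOKEN from the environment, falling back to the .env file."""
    token = os.environ.get("OPENAI_TOKEN")
    if not token and Path(env_path).exists():
        token = load_env_file(env_path).get("OPENAI_TOKEN")
    if not token:
        raise RuntimeError("OPENAI_TOKEN is not set; add it to the .env file")
    return token
```

Failing early with a clear error keeps a missing token from surfacing later as an opaque API failure.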
### Set-Up 🛠️
Clone the repository:
```
git clone [repository-url] # Replace [repository-url] with the actual URL of the repository
```
Navigate to the directory:
```
cd pronunciation_trainer
```
Create a virtual environment in `.venv` and install the project's dependencies:
```
rye sync
```
For more details, visit: [Basics - Rye](https://rye-up.com/guide/basics/)
### Start the App
Launch the app using:
```
rye run python src/pronunciation_trainer/app.py
```
Then, open your browser and visit [http://localhost:7860](http://localhost:7860/) to start practicing!