Commit · 5c6e992
1 Parent(s): d35de09
Readme edit
README.md
CHANGED
@@ -9,13 +9,29 @@ app_file: app.py
 pinned: false
 short_description: Convert text/image/audio/video from src language to English
 ---
+****************************
+<p align="center">
+Liked the setup? Put a like on top left, it takes only 2 seconds.
+</p>
 
-
+****************************
+Replication
+- Requirements
+- Free API Key from https://detectlanguage.com/ for automatic language detection from text.
+- GPU for `Whisper` model inference. It's slower in CPU.
+- Notes
+- `pytesseract` library (For image-to-text) is easier to install in linux machines.
+- If you have GPU, you can go for more sophisticated image-to-text models.
+- The image-to-text setup works best for non-decorative and normal sized fonts.
+*******
+
+The space consists of 3-4 parts: -
+
+- Text translator - Input (Input Text, Target language), Output (Translated text in target language, Source language name)
+- Image translator - Input (Image with any text, Source language, Target language), Output (Image text in source language, Image text translated to target language)
+- Audio translator - Input (Audio in any language, Model size, Target language), Output (Transcribed original text, Transcribed text translated to target language, Original language name)
+- Video translator - Input (Video, Model size, Target language), Output (Translated text version of the audio) [Not yet implemented]
 
-- Text translator - Input (Text), Output (Translated text in English)
-- Image translator - Input (Image with any text), Output (English Translated text version of the text in the image)
-- Audio translator - Input (Audio in any language), Output (English Translated text version of the audio)
-- Video translator - Input (Video), Output (English Translated text version of the audio) [Not yet implemented]
 ********************************************************
 
 Demo
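
The Requirements entry added to the README says that `Whisper` inference wants a GPU and runs slower on CPU. A minimal sketch of that fallback follows. It assumes PyTorch is the backend and that the `openai-whisper` package is the one in use, which the README does not state. `pick_device` is a hypothetical helper name, not something taken from the Space's code.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu".

    Hypothetical helper for illustration; not part of the Space's code.
    """
    # Without PyTorch installed there is nothing to query, so fall back to CPU.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    device = pick_device()
    print(f"Whisper would run on: {device}")
    # Assumption: with the openai-whisper package installed, the model could
    # then be loaded on that device, for example:
    #   import whisper
    #   model = whisper.load_model("base", device=device)
```

On a CPU-only machine, choosing a smaller model size is the usual way to keep transcription time tolerable.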