DebasishDhal99 committed on
Commit 61bf6ca · 1 Parent(s): 7eeef17

Update README.md

  1. README.md +13 -9
README.md CHANGED
@@ -14,7 +14,10 @@ Use this application on HuggingFace🤗 :- https://huggingface.co/spaces/Debasis
14
  Blog discussing the results :- https://medium.com/@debasishdhaldd99/simplifying-language-through-python-aae6ee7113d9
15
 
16
  This space aims to help people get familiar with the Polish, Turkish, Hungarian, Serbo-Croatian-Bosnian (both Latin- and Cyrillic-based) and Romanian spelling systems.
17
- These languages use a modified Latin script with a lot of diacritic marks and digraphs, thus making them difficult for non-native speakers to pronounce or read the words
 
 
 
18
  properly. This space offers simplified spellings of words and sentences in these languages. More languages are in the pipeline.
19
 
20
  For example, an English speaker who isn't familiar with Polish orthography will pronounce the Polish word Jarosław as Jaroslav, while its actual Polish pronunciation
@@ -29,16 +32,17 @@ Features added as of now:-
29
  - Option for the user to generate a random but coherent sentence and pass it as input to the model. Acts as a nice playground for the user.
30
 
31
  # Results in brief
 
32
 
33
  ## Polish
34
  Polish spelling => Simplified form
35
 
36
  - Wojciech Szczęsny => Voytsiekh Shensny
37
- - Grzegorz Krychowiak => Gzhegozh Krykhoviak
38
- - Żółć => Zhuwch
39
  - Szeleścić => Sheleshtsich
40
 
41
- # Hungarian
42
  Hungarian spelling => Simplified form
43
 
44
  - Dominik Szoboszlai => Dominik Soboslai
@@ -46,7 +50,7 @@ Hungarian spelling => Simplified form
46
  - Debrecen => Debretsen
47
  - Pozsony => Pozhony
48
 
49
- # Turkish
50
  Turkish spelling => Simplified form
51
 
52
  - Azerbaycan => Azerbayjan
@@ -54,7 +58,7 @@ Turkish spelling => Simplified form
54
  - Recep Tayyip Erdoğan => Rejep Tayyip Erdo’an
55
  - Barış Alper Yılmaz => Barış Alper Yelmaz
56
 
57
- # Serbo-Croatian-Bosnian
58
  Serbo-Croatian-Bosnian spelling => Simplified form
59
 
60
  - Novak Đoković => Novak Jokovich
@@ -62,14 +66,14 @@ Serbo-Croatian-Bosnian spelling => Simplified form
62
  - Edin Džeko => Edin Jeko
63
  - Artiljerija => Artilyeriya
64
 
65
- # Romanian
66
  Romanian spelling => Simplified form
67
 
68
  - Cluj-Napoca => Kluzh Napoka
69
- - București - Bukureshti (Bucharest)
70
  - Angela Gheorghiu => Anjela Georgiu
71
  - Constantin Brâncuși => Konstantin Brunkushi
72
 
73
 
74
  *************************************************************************************************
75
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
14
  Blog discussing the results :- https://medium.com/@debasishdhaldd99/simplifying-language-through-python-aae6ee7113d9
15
 
16
  This space aims to help people get familiar with the Polish, Turkish, Hungarian, Serbo-Croatian-Bosnian (both Latin- and Cyrillic-based) and Romanian spelling systems.
17
+
18
+ **Why?**
19
+
20
+ The languages mentioned above use a modified Latin script with many diacritics and digraphs, which often makes it difficult for non-native speakers to read or pronounce words
21
  properly. This space offers simplified spellings of words and sentences in these languages. More languages are in the pipeline.
22
 
23
  For example, an English speaker who isn't familiar with Polish orthography will pronounce the Polish word Jarosław as Jaroslav, while its actual Polish pronunciation
 
32
  - Option for the user to generate a random but coherent sentence and pass it as input to the model. Acts as a nice playground for the user.
33
 
34
  # Results in brief
35
+ For each language, some personal and place names were given to this web app as input; the simplified outputs are presented below.
36
 
37
  ## Polish
38
  Polish spelling => Simplified form
39
 
40
  - Wojciech Szczęsny => Voytsiekh Shensny
41
+ - Grzegorz Krychowiak => Gzhegozh Krykhoviak (zh is pronounced like the "s" in measure/vision)
42
+ - Łódź => Wuj
43
  - Szeleścić => Sheleshtsich
44
 
45
+ ## Hungarian
46
  Hungarian spelling => Simplified form
47
 
48
  - Dominik Szoboszlai => Dominik Soboslai
 
50
  - Debrecen => Debretsen
51
  - Pozsony => Pozhony
52
 
53
+ ## Turkish
54
  Turkish spelling => Simplified form
55
 
56
  - Azerbaycan => Azerbayjan
 
58
  - Recep Tayyip Erdoğan => Rejep Tayyip Erdo’an
59
  - Barış Alper Yılmaz => Barış Alper Yelmaz
60
 
61
+ ## Serbo-Croatian-Bosnian
62
  Serbo-Croatian-Bosnian spelling => Simplified form
63
 
64
  - Novak Đoković => Novak Jokovich
 
66
  - Edin Džeko => Edin Jeko
67
  - Artiljerija => Artilyeriya
68
 
69
+ ## Romanian
70
  Romanian spelling => Simplified form
71
 
72
  - Cluj-Napoca => Kluzh Napoka
73
+ - București => Bukureshti (Bucharest)
74
  - Angela Gheorghiu => Anjela Georgiu
75
  - Constantin Brâncuși => Konstantin Brunkushi
76
 
77
 
78
  *************************************************************************************************
79
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
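
The README does not show how the app produces these simplified forms, so the following is only a minimal sketch of one possible approach: a longest-match replacement table. The `RULES` table and the `simplify` function are hypothetical names, and the rules are inferred solely from the three Serbo-Croatian-Bosnian examples listed in the results (Đoković, Džeko, Artiljerija); the actual application may work quite differently.

```python
import unicodedata

# Hypothetical rule table, inferred only from the README's
# Serbo-Croatian-Bosnian examples; not the app's real rule set.
RULES = {
    "dž": "j",   # Džeko -> Jeko
    "đ": "j",    # Đoković -> Jokovich
    "ć": "ch",   # Đoković -> Jokovich
    "j": "y",    # Artiljerija -> Artilyeriya
}

# Try longer keys first so the digraph "dž" wins over single letters.
KEYS = sorted(RULES, key=len, reverse=True)


def simplify(text: str) -> str:
    """Replace each matched letter or digraph, preserving initial capitals."""
    # Compose characters so "z" + combining caron becomes a single "ž".
    text = unicodedata.normalize("NFC", text)
    out = []
    i = 0
    while i < len(text):
        for key in KEYS:
            chunk = text[i:i + len(key)]
            if chunk.lower() == key:
                repl = RULES[key]
                out.append(repl.capitalize() if chunk[0].isupper() else repl)
                i += len(key)
                break
        else:
            # No rule matched at this position: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


for name in ("Novak Đoković", "Edin Džeko", "Artiljerija"):
    print(name, "=>", simplify(name))
```

Matching longer keys first matters for digraphs: without it, "dž" would be consumed letter by letter and the digraph rule would never fire.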