Update README.md
README.md
CHANGED
@@ -7,23 +7,19 @@ sdk: static
 pinned: false
 ---
 
-**
+# **GlossAPI**
 
-**
+GlossAPI is a project by [GFOSS – Open Technologies Alliance](https://gfoss.eu), focused on building foundational infrastructure for Greek Natural Language Processing. Our work centers on the **creation of high-quality, open-access datasets** and the development of a robust, modular **processing pipeline** tailored for academic and domain-specific documents.
 
-
-sources to gather an extensive corpus of high-quality texts that are provided under a Creative Commons license.
-glossAPI covers a broad range of subject areas, from science and literature to legal texts,
-with data that undergo careful processing and indexing.
+We aim to lay the groundwork for **open, collaborative, and reproducible NLP research** in the Greek language, supporting researchers, students, and developers in the digital humanities, computational linguistics, and AI communities.
 
-
-All the tools it develops are freely available through its repository on GitHub.
+Our pipeline covers every stage of document processing, from **automated downloading and text extraction** to **section segmentation, classification**, and **annotation**. It supports documents in multiple formats and includes dedicated tools for Greek-language content, preserving structure and metadata throughout.
 
-
-
-
+GlossAPI contributes to the long-term vision of a sustainable, open ecosystem for Greek NLP by:
+- Publishing open-source tools and datasets under permissive licenses
+- Promoting interoperability and data transparency
+- Encouraging community contributions and reuse
 
-
+📂 All datasets are released under **Creative Commons licenses**, and our source code is publicly available on [GitHub](https://github.com/eellak/glossapi).
 
-
-Contact at: [email protected]
+📬 Contact: [email protected]