Stopwords Corpus
This corpus contains lists of stop words for several languages. These
are high-frequency grammatical words which are usually ignored in text
retrieval applications.
The lists were obtained from:
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/snowball/stopwords/
The stop words for the Romanian language were obtained from:
http://arlc.ro/resources/
The English list has been augmented:
https://github.com/nltk/nltk_data/issues/22
The German list has been corrected:
https://github.com/nltk/nltk_data/pull/49
A Kazakh list has been added:
https://github.com/nltk/nltk_data/pull/52
A Nepali list has been added:
https://github.com/nltk/nltk_data/pull/83
An Azerbaijani list has been added:
https://github.com/nltk/nltk_data/pull/100
A Greek list has been added:
https://github.com/nltk/nltk_data/pull/103
An Indonesian list has been added:
https://github.com/nltk/nltk_data/pull/112
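
As a rough illustration of how stop words are typically ignored in a
text retrieval setting, the sketch below filters them out of a token
list. The name remove_stopwords and the small SAMPLE_ENGLISH_STOPWORDS
set are hypothetical and exist only for this example; the actual lists
in this corpus are considerably longer.

```python
# Illustrative subset only (an assumption for this sketch); the real
# English list in the corpus contains many more entries.
SAMPLE_ENGLISH_STOPWORDS = frozenset(
    {"a", "an", "and", "are", "in", "is", "of", "the", "to", "which"}
)


def remove_stopwords(tokens, stopwords=SAMPLE_ENGLISH_STOPWORDS):
    """Return the tokens whose lowercase form is not a stop word."""
    return [token for token in tokens if token.lower() not in stopwords]


if __name__ == "__main__":
    text = "The stop words are ignored in text retrieval applications"
    # Naive whitespace tokenization keeps the example dependency-free.
    print(remove_stopwords(text.split()))
    # prints ['stop', 'words', 'ignored', 'text', 'retrieval', 'applications']
```

With NLTK installed and this corpus downloaded via
nltk.download("stopwords"), the full list for a language can be passed
in instead, for example set(nltk.corpus.stopwords.words("english")).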