Stopwords Corpus

This corpus contains lists of stop words for several languages.  These
are high-frequency grammatical words which are usually ignored in text
retrieval applications.
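
As a rough illustration of how such a list is typically used, the sketch
below drops stop words from a tokenized sentence. The function name
remove_stop_words and the six-word sample set are hypothetical and are not
part of this corpus; in practice the words would come from one of the
language lists described here, for example via
nltk.corpus.stopwords.words('english') once the corpus is installed.

```python
# Hypothetical sample for illustration only; the real lists are much longer.
SAMPLE_STOP_WORDS = {"the", "is", "a", "of", "in", "and"}


def remove_stop_words(tokens, stop_words):
    """Return the tokens that are not stop words (case-insensitive match)."""
    stop_set = {word.lower() for word in stop_words}
    return [token for token in tokens if token.lower() not in stop_set]


# Naive whitespace tokenization, which is enough for this sketch.
tokens = "The corpus is a collection of lists".split()
content_tokens = remove_stop_words(tokens, SAMPLE_STOP_WORDS)
print(content_tokens)  # ['corpus', 'collection', 'lists']
```

Building a set first keeps each membership check constant-time, which
matters when filtering large documents against lists of a few hundred words.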

They were obtained from:
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/src/backend/snowball/stopwords/

The stop words for the Romanian language were obtained from:
http://arlc.ro/resources/

The English list has been augmented:
https://github.com/nltk/nltk_data/issues/22

The German list has been corrected:
https://github.com/nltk/nltk_data/pull/49

A Kazakh list has been added:
https://github.com/nltk/nltk_data/pull/52

A Nepali list has been added:
https://github.com/nltk/nltk_data/pull/83

An Azerbaijani list has been added:
https://github.com/nltk/nltk_data/pull/100

A Greek list has been added:
https://github.com/nltk/nltk_data/pull/103

An Indonesian list has been added:
https://github.com/nltk/nltk_data/pull/112