A dataset containing a character vector of Russian stopwords, based on the list in Python's NLTK library and augmented by Michail Kalinskiy with a filtered version of the stopwords-ru list.

Usage

data(sw_nltk_ru)
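
As a minimal sketch of typical use, the vector can be matched against tokens to drop stopwords. The token vector, the `content_tokens` name, and the four-word fallback list below are illustrative assumptions, not part of the dataset; the fallback only exists so the snippet runs when the package that ships `sw_nltk_ru` is not attached.

```r
# Load the dataset; data() only warns if it is not found, so silence that here.
suppressWarnings(data(sw_nltk_ru))

# Hypothetical stand-in (NOT the real 259-element list), used only when the
# package providing sw_nltk_ru is not attached.
if (!exists("sw_nltk_ru")) {
  sw_nltk_ru <- c("и", "в", "не", "на")
}

# Illustrative tokens from an already tokenized Russian sentence.
tokens <- c("кошка", "и", "собака", "не", "спят", "на", "диване")

# Keep only the tokens that are not in the stopword vector.
content_tokens <- tokens[!tolower(tokens) %in% sw_nltk_ru]
print(content_tokens)
```

Because `sw_nltk_ru` is a plain character vector, `%in%` is enough for filtering; no special accessor is needed.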

Format

A character vector with 259 elements.

Source

Compiled from several sources; see References.

License

The stopword lists from Python's NLTK library and JavaScript's stopwords-iso NPM package are published under the MIT License.

References

Python's NLTK library stopwords list: https://github.com/mitmedialab/DataBasic/blob/master/nltk_data/corpora/stopwords/russian

JavaScript's stopwords-iso NPM package stopwords list: https://github.com/stopwords-iso/stopwords-ru

Michail Kalinskiy's filtered version of the stopwords list from JavaScript's stopwords-iso NPM package: https://dev.kmint21.info/posts/python-summa