The polarity table of the filtered general Russian sentiment lexicon RuSentiLex, version 2017. The table contains 13342 words or phrases, of which 12508 have non-neutral sentiment scores.

Usage

data(hash_sentiment_rusentilex_2017)

Format

A data table with 13342 rows and 2 variables:

token

the textual token (word or phrase)

score

the sentiment score: −1 for negative, 0 for neutral, 1 for positive
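As a minimal sketch of how a two-column table of this shape can be used, the snippet below scores a few words by lookup. The rows of `toy` are illustrative assumptions, not entries copied from the lexicon, and treating unknown words as neutral is a choice made for this example only. With the package installed, the same lookup should work on `hash_sentiment_rusentilex_2017` after `data(...)`, assuming it exposes the `token` and `score` columns described above.

```r
# Toy stand-in with the documented layout (token, score).
# The rows are illustrative only, not taken from the lexicon.
toy <- data.frame(
  token = c("хороший", "плохой", "стол"),
  score = c(1, -1, 0),
  stringsAsFactors = FALSE
)

# Words to score; "окно" is deliberately absent from the toy table.
words <- c("хороший", "стол", "окно")

# Look up each word; match() returns NA for words not in the table.
scores <- toy$score[match(words, toy$token)]

# Assumption for this sketch: treat out-of-lexicon words as neutral.
scores[is.na(scores)] <- 0

# Simple aggregate polarity of the word list.
total <- sum(scores)
```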

Details

The polarity table was generated from the original lexicon table (see hash_rusentilex_2017) according to the following rules:

  • only the first occurrence of each unique lemmatized token was kept (other duplicate tokens, which differ in sentiment source or sense, were deleted);

  • entries with the "positive/negative" sentiment (indefinite, depends on the context) were deleted to be on the safe side;

  • positive sentiment was mapped to a score of +1, negative sentiment to -1, and neutral sentiment to 0.
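The rules above can be sketched in base R as follows. This is not the maintainer's actual code: the toy rows, the column names `lemma` and `sentiment`, and the order in which the rules are applied are all assumptions made for illustration.

```r
# Toy stand-in for the original lexicon table.
# Rows and column names are hypothetical, not taken from hash_rusentilex_2017.
original <- data.frame(
  lemma = c("хороший", "хороший", "плохой", "острый", "стол"),
  sentiment = c("positive", "negative", "negative",
                "positive/negative", "neutral"),
  stringsAsFactors = FALSE
)

# Rule 1: keep only the first occurrence of each lemmatized token.
filtered <- original[!duplicated(original$lemma), ]

# Rule 2: drop the context-dependent "positive/negative" entries.
filtered <- filtered[filtered$sentiment != "positive/negative", ]

# Rule 3: map the remaining sentiment labels to numeric scores.
score_map <- c(positive = 1, negative = -1, neutral = 0)
polarity <- data.frame(
  token = filtered$lemma,
  score = unname(score_map[filtered$sentiment]),
  stringsAsFactors = FALSE
)
```

In this toy input, the second "хороший" row is discarded as a duplicate and the "острый" row is discarded as context-dependent, leaving three scored tokens.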

In addition, some minor mistakes in the original lexicon table were fixed by the rulexicon package maintainer.

License

According to information from Natalya Loukachevitch, the base lexicon RuSentiLex is published under the Creative Commons "Attribution-NonCommercial-ShareAlike" 4.0 International License (CC BY-NC-SA 4.0).

References

Description of the original lexicon table of RuSentiLex (version 2017): hash_rusentilex_2017

Loukachevitch N., Levchik A., 2016. Creating a General Russian Sentiment Lexicon. In Proceedings of Language Resources and Evaluation Conference LREC-2016. URL: http://www.lrec-conf.org/proceedings/lrec2016/pdf/285_Paper.pdf

RuSentiLex project web page: https://www.labinform.ru/pub/rusentilex/index.htm