A dataset containing a dictionary (taxonomy) for estimating the disclosure of innovation activities in companies’ annual reports. The table contains 46 words and phrases.

Usage

data(key_innov_activ)

Format

A data frame with 46 rows and 3 variables:

main_token

main or grouping token (word or phrase)

token

token (word or phrase); either identical to main_token or a synonym of it

regex

regular expression matching all word forms of the token (including the "е"/"ё" spelling variation)

Details

The dictionary makes it possible to estimate the level of innovation disclosure in companies’ annual reports. It builds on the lexicons of Garechana et al. (2017) and Libaers et al. (2016). In addition, Fedorova et al. examined the annual reports of 74 Russian publicly traded companies for the period 2013-2019 and, through expert analysis, added further innovation-related words and their synonyms, together with universal regular expressions.
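
One way such a dictionary might be applied is to count regex matches in a report and sum them per grouping token. The sketch below is illustrative only: `toy_dict` is a hypothetical stand-in that merely mirrors the three documented columns (its rows are invented, not taken from the dataset), and `report_text` and `count_matches` are names assumed for this example.

```r
# Toy stand-in with the same three columns as the dataset.
# These rows are illustrative assumptions, not actual dictionary entries.
toy_dict <- data.frame(
  main_token = c("innovation", "innovation", "patent"),
  token      = c("innovation", "novelty", "patent"),
  regex      = c("innovat\\w*", "novelt\\w*", "patent\\w*"),
  stringsAsFactors = FALSE
)

# Hypothetical fragment of an annual report
report_text <- paste(
  "The company filed two patents and expanded its innovation programme.",
  "Innovative products grew."
)

# Count case-insensitive matches of one regular expression in a text
count_matches <- function(pattern, text) {
  hits <- gregexpr(pattern, text, ignore.case = TRUE, perl = TRUE)[[1]]
  if (hits[1] == -1L) 0L else length(hits)
}

# Number of matches for every row of the dictionary
toy_dict$n <- vapply(
  toy_dict$regex, count_matches, integer(1),
  text = report_text, USE.NAMES = FALSE
)

# Sum synonym counts under their grouping token
disclosure <- aggregate(n ~ main_token, data = toy_dict, FUN = sum)
print(disclosure)
```

With the real dataset loaded, the same steps would presumably run on `key_innov_activ` in place of `toy_dict`. Whether raw counts, counts normalised by report length, or simple presence flags best represent the disclosure level is an analytical choice that this sketch does not settle.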

Languages

English: key_innov_activ_en

License

The dictionary is published under the Creative Commons "Attribution-NonCommercial-ShareAlike" 4.0 International License (CC BY-NC-SA 4.0). For additional permissions (including commercial use), please contact Elena Fedorova <ecolena@mail.ru>.

References

Garechana, G., Río-Belver, R., Bildosola, I., Rodríguez-Salvador, M. (2017). Effects of innovation management system standardization on firms: evidence from text mining annual reports. Scientometrics, 111(3), 1987–1999. DOI: https://doi.org/10.1007/s11192-017-2345-7.

Libaers, D., Hicks, D., Porter, A.L. (2016). A taxonomy of small firm technology commercialization. Industrial and Corporate Change, 25(3), 371–405. DOI: https://doi.org/10.1093/icc/dtq039.