Milkova M., Rudnev M., Okolskaya L. Detecting value-expressive text posts in Russian social media. arXiv:2312.08968 [cs.CL]
DOI: 10.48550/arXiv.2312.08968
Posted on the site: 28.12.23
Cite as: Milkova M., Rudnev M., Okolskaya L. Detecting value-expressive text posts in Russian social media. arXiv:2312.08968 [cs.CL] DOI: 10.48550/arXiv.2312.08968.
Authors: Milkova M., Rudnev M.G., Okolskaya L.A.
Abstract
Basic values are concepts or beliefs which pertain to desirable end-states and transcend specific situations. Studying personal values in social media can illuminate how and why societal values evolve, especially when stimulus-based methods such as surveys are inefficient, for instance in hard-to-reach populations. On the other hand, user-generated content is driven by the massive use of stereotyped, culturally defined speech constructions rather than authentic expressions of personal values. We aimed to find a model that can accurately detect value-expressive posts on the Russian social media platform VKontakte. A training dataset of 5,035 posts was annotated by three experts, 304 crowd-workers, and ChatGPT. Crowd-workers and experts showed only moderate agreement in categorizing posts. ChatGPT was more consistent but struggled with spam detection. We applied an ensemble of human- and AI-assisted annotation involving an active learning approach, subsequently trained several LLMs, selected a model based on embeddings from a pre-trained, fine-tuned rubert-tiny2, and reached a high quality of value detection with F1 = 0.75 (F1-macro = 0.80).
This model provides a crucial step toward the study of values within and between Russian social media users.
Keywords: values, ChatGPT, social media, crowd-workers, annotation, NLP, active learning
Categories: Sociology of communications; Sociology of culture
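The abstract reports two scores, F1 = 0.75 and F1-macro = 0.80. A minimal sketch of how such a pair can differ is given below, assuming (this is not stated in the abstract) that the first figure is the F1 of the value-expressive class alone and the second is the unweighted mean of the per-class F1 scores. The labels are hypothetical toy data, not the study's posts or results, and the helper names f1_for_class and macro_f1 are introduced here only for illustration.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 score treating `cls` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    # By convention here, a class with no true or predicted items scores 0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes that occur."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Hypothetical toy labels: 1 = value-expressive post, 0 = any other post.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

f1_pos = f1_for_class(y_true, y_pred, 1)  # positive-class F1
f1_macro = macro_f1(y_true, y_pred)       # mean over both classes

print(f"F1 (class 1): {f1_pos:.3f}")
print(f"F1-macro:     {f1_macro:.3f}")
```

In this toy case the majority class is easier to predict, so the macro average comes out above the positive-class score, which is one way a macro figure can exceed a single-class figure.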