Milkova M., Rudnev M., Okolskaya L. Detecting value-expressive text posts in Russian social media. arXiv:2312.08968 [cs.CL]



DOI: 10.48550/arXiv.2312.08968

Posted on the site: 28.12.23


Cite as:

Milkova M., Rudnev M., Okolskaya L. Detecting value-expressive text posts in Russian social media. arXiv:2312.08968 [cs.CL] DOI: 10.48550/arXiv.2312.08968.

Authors:

Milkova M., Rudnev M.G., Okolskaya L.A.

Abstract

Basic values are concepts or beliefs that pertain to desirable end-states and transcend specific situations. Studying personal values in social media can illuminate how and why societal values evolve, especially when stimuli-based methods such as surveys are inefficient, for instance in hard-to-reach populations. On the other hand, user-generated content is driven by the massive use of stereotyped, culturally defined speech constructions rather than authentic expressions of personal values. We aimed to find a model that can accurately detect value-expressive posts on the Russian social network VKontakte. A training dataset of 5,035 posts was annotated by three experts, 304 crowd-workers, and ChatGPT. Crowd-workers and experts showed only moderate agreement in categorizing posts. ChatGPT was more consistent but struggled with spam detection. We applied an ensemble of human- and AI-assisted annotation involving an active learning approach, subsequently trained several LLMs, and selected a model based on embeddings from a pre-trained, fine-tuned rubert-tiny2, reaching a high quality of value detection with F1 = 0.75 (F1-macro = 0.80). This model provides a crucial step toward a study of values within and between Russian social media users.

Keywords:

values, ChatGPT, social media, crowd-workers, annotation, NLP, active learning

Categories:

Sociology of communications
Sociology of culture
