Журнал «Современная Наука» (Modern Science journal)


Goryachkin Boris Sergeevich (Candidate of Technical Sciences, Associate Professor, Bauman Moscow State Technical University)

Korenkova Tatyana Vyacheslavovna (Bauman Moscow State Technical University)

Chernykh Yulia Sergeevna (Bauman Moscow State Technical University)

News is one of the most important sources for analyzing social events and processes: it reflects almost every aspect of them and makes it possible to build a complete picture of social reality. Such analysis requires a preliminary classification of news by social topic, and the original section headings used by various news resources are poorly suited to this task. This paper therefore develops and tests in practice a methodology for determining the optimal categories for classifying news texts, in particular for the social sphere. The methodology comprises three steps: defining new preliminary news categories with the Word2Vec algorithm, repeated topic modeling using Zero-Shot classification, and semi-automatic modification of the categories until the desired thresholds of the derived metric are reached. The result is an optimal list of categories reflecting social reality, whose advantage over the initial categories was demonstrated.
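
The refinement loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made only to keep the example runnable: `toy_zero_shot` is a hypothetical keyword-overlap stand-in for a real NLI-based Zero-Shot classifier, the candidate categories are written by hand rather than derived with Word2Vec, the "derived metric" is assumed to be the share of texts whose best category score reaches a threshold, and a greedy automatic choice replaces the semi-automatic (human-guided) modification step.

```python
from typing import Callable, Dict, List, Tuple

Categories = Dict[str, List[str]]          # category name -> descriptive keywords
Scorer = Callable[[str, Categories], Dict[str, float]]


def toy_zero_shot(text: str, categories: Categories) -> Dict[str, float]:
    """Hypothetical stand-in for a Zero-Shot classifier.

    Scores each category by the fraction of its keywords found in the text.
    A real pipeline would query an NLI model here instead.
    """
    words = set(text.lower().split())
    scores = {}
    for name, keywords in categories.items():
        hits = sum(1 for k in keywords if k in words)
        scores[name] = hits / len(keywords) if keywords else 0.0
    return scores


def coverage(texts: List[str], categories: Categories,
             scorer: Scorer, min_score: float) -> float:
    """Assumed metric: share of texts whose best score reaches min_score."""
    if not texts:
        return 0.0
    confident = 0
    for text in texts:
        scores = scorer(text, categories)
        if scores and max(scores.values()) >= min_score:
            confident += 1
    return confident / len(texts)


def refine_categories(texts: List[str], initial: Categories,
                      candidates: Categories, scorer: Scorer,
                      min_score: float = 0.5,
                      target: float = 0.9) -> Tuple[Categories, List[float]]:
    """Add candidate categories one at a time until the metric reaches target.

    Each round re-classifies the whole corpus once per remaining candidate
    and keeps the candidate that raises the metric the most.
    """
    current = dict(initial)
    remaining = dict(candidates)
    history = [coverage(texts, current, scorer, min_score)]
    while history[-1] < target and remaining:
        best_name, best_value = None, history[-1]
        for name, keywords in remaining.items():
            trial = {**current, name: keywords}
            value = coverage(texts, trial, scorer, min_score)
            if value > best_value:
                best_name, best_value = name, value
        if best_name is None:      # no candidate improves the metric; stop
            break
        current[best_name] = remaining.pop(best_name)
        history.append(best_value)
    return current, history


# Illustrative data only; not taken from the paper.
texts = [
    "school teachers demand higher pay",
    "hospital opens new clinic for children",
    "city court hears pension fraud case",
    "university students protest tuition",
]
initial = {"education": ["school", "university"]}
candidates = {
    "health": ["hospital", "clinic"],
    "law": ["court", "fraud"],
    "sport": ["football", "match"],
}

final, history = refine_categories(texts, initial, candidates, toy_zero_shot)
print(list(final), history)
```

On this toy corpus the loop adds two of the three candidate categories and stops once every text has a confident label. Passing the scorer as a parameter is a design choice of the sketch: it lets the keyword stand-in be swapped for a real Zero-Shot model without changing the loop.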

Keywords: news topic modeling, news classification, social modeling, Word2Vec, Zero-Shot classification, NLI



Citation link:
Goryachkin B. S., Korenkova T. V., Chernykh Y. S. METHODOLOGY FOR DETERMINING OPTIMAL CATEGORIES FOR CLASSIFYING A NEWS ARRAY // Современная наука: актуальные проблемы теории и практики. Серия: Естественные и Технические Науки. -2024. -№04. -С. 55-61 DOI 10.37882/2223-2966.2024.04.08
Reproduction of materials is permitted only for non-commercial purposes with reference to the original publication. Protected by the laws of the Russian Federation. Any violations of the law are prosecuted.
© ООО "Научные технологии"