CLASSIFICATION STAGES OF “XALQ SOʻZI” NEWSPAPER TEXTS BY PERIOD
DOI:
https://doi.org/10.47390/SP1342V5SI3Y2025N16
Keywords:
newspaper corpus, representativeness, metadata, lemmatization, morphological analysis, text classification, language change, newspaper linguistics.
Abstract
This article analyzes in detail how the texts of the Xalq Soʻzi newspaper are classified by period. The study
covers the stages of building a newspaper corpus: text collection, editing, morphological analysis,
classification, and metadata creation. Classifying texts by period makes it possible to trace historical,
political, and cultural changes and to determine how linguistic changes have affected newspaper materials.
The analysis also considers how Xalq Soʻzi disseminated information in different periods and what role the
newspaper played in society. The study further addresses ensuring the representativeness of the newspaper
corpus, monitoring lexical change, and preserving texts in electronic format. The findings are relevant to
linguistic, historical, and sociological research.