Dzhurov A. A. (Postgraduate student, Don State Technical University)
This article discusses the use of the sklearn library and the WordNet database for text classification. It explains how stemming works and gives an example implementation in Python. The pipeline steps for building the model and processing the data are described. Several approaches to text preprocessing are considered, including tokenization, stopword removal, and lemmatization. Using the WordNet database made it possible to carry out semantic analysis of the text and improve the quality of classification. Experimental results showed that combining sklearn methods with the WordNet database is an effective approach to text classification. The article demonstrates the developed module with its output and presents a general diagram of the module together with a description of how it works.
Keywords: sklearn, WordNet, Stemmer, text classification, Python, Pipeline.
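The principle of stemming mentioned in the abstract can be illustrated with a minimal sketch. This is not the article's implementation and not the Porter algorithm: the function names `toy_stem` and `stem_tokens`, the suffix list, and the minimum stem length are all assumptions chosen for illustration. It only shows the core idea of reducing word forms to a shared stem by stripping common endings.

```python
from typing import List

# Hypothetical suffix list and threshold, assumed for this sketch only.
SUFFIXES = ("ing", "ed", "es", "s")  # checked in this order
MIN_STEM_LEN = 3                     # do not strip if the stem would be shorter


def toy_stem(word: str) -> str:
    """Lowercase a word and strip the first matching suffix, if any."""
    word = word.lower()
    for suffix in SUFFIXES:
        stem_len = len(word) - len(suffix)
        if word.endswith(suffix) and stem_len >= MIN_STEM_LEN:
            return word[:stem_len]
    return word


def stem_tokens(text: str) -> List[str]:
    """Split text on whitespace and stem every token."""
    return [toy_stem(token) for token in text.split()]
```

For example, `stem_tokens("Searching texts")` maps both words to their stems, so different forms of the same word end up as one feature before classification. A production module would more likely rely on an established stemmer or lemmatizer than on a hand-written suffix list.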
Citation link: Dzhurov A. A. SEARCHING FOR DESTRUCTIVE CONTENT IN TEXT // Современная наука: актуальные проблемы теории и практики. Серия: Естественные и Технические Науки. 2024. No. 03/2. pp. 46-50. DOI 10.37882/2223-2966.2024.3-2.07