Thematic Clustering and Classification of Research in Digital Library Perspectives (2000–2024): A Machine Learning Approach
Date
2025
Publisher
Indian Institute of Technology Kharagpur
Abstract
This study identifies and classifies research themes in the journal “Digital Library Perspectives” (2000–2024) using k-means clustering and machine learning–based classification models. Bibliographic data were retrieved from Dimensions (n = 715), and the abstracts in particular were used for analysis. The results show the journal’s publication trends, with an annual average of 28.6 publications. Cluster analysis reveals five clusters, with “Digitization and Metadata” emerging as the top cluster in the dataset; this cluster remained dominant throughout the period. The support vector machine (SVM) was the most effective model for classifying documents into clusters. A confusion matrix is also included to examine the correct classifications and misclassifications made by the classifiers. The study’s results are unique and offer implications for librarians, researchers, and policymakers.
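The workflow described in the abstract (vectorize abstracts, cluster them with k-means, then inspect classifier errors with a confusion matrix) can be illustrated with a minimal, standard-library sketch. It is not the study's implementation: the four toy "abstracts", the plain term-frequency weighting, the hand-picked initial centroids, and the `predicted` labels are all hypothetical, and no SVM is trained here (the predictions simply stand in for a classifier's output). The only figures taken from the record are n = 715 and the 2000–2024 range.

```python
from collections import Counter

def tf_vectors(docs):
    """Map each document to a term-frequency vector over a shared vocabulary."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([float(counts[term]) for term in vocab])
    return vocab, vectors

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, initial_centroids, max_iter=100):
    """Lloyd's algorithm: assign points to the nearest centroid, then recompute means."""
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda j: squared_distance(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # leave a centroid in place if its cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix

# Hypothetical toy abstracts (not from the dataset).
docs = [
    "digitization metadata archives",
    "metadata digitization collections",
    "user survey services",
    "services user satisfaction",
]
vocab, vectors = tf_vectors(docs)

# Assumption: seed the two clusters with documents 0 and 2 for determinism.
cluster_labels, centroids = kmeans(vectors, [vectors[0], vectors[2]])

# Hypothetical classifier output, compared against the cluster labels.
predicted = [0, 1, 1, 1]
cm = confusion_matrix(cluster_labels, predicted, n_classes=2)

# Annual average reported in the abstract: 715 records over 25 years (2000-2024).
annual_average = round(715 / (2024 - 2000 + 1), 1)
```

In the confusion matrix, diagonal entries count documents whose predicted class matches the cluster label, and off-diagonal entries count misclassifications. A real analysis would use five clusters and a proper weighting scheme chosen by the analyst; both are simplified here.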
Keywords
SOCIAL SCIENCES::Statistics, computer and systems science::Informatics, computer and systems science; SOCIAL SCIENCES::Other social sciences::Library and information science