An Analysis the Document Classification by Using Learning Techniques

Enhancing Document Retrieval with Learning Techniques

Authors

  • Ankur Pandey
  • Dr. Anoop Kumar Chaturvedi

Keywords:

document classification, learning techniques, textual representation, machine learning classifiers, supervised, semi-supervised, unsupervised, random forest, XGBoost, naive Bayes classifier, logistic regression, neural network, training, large datasets, precision, query, features, evaluation metrics, retrieval performance

Abstract

Text classification aims to build high-quality textual representations of the digital documents available online and to train high-quality classifiers on them. Current research explores machine learning classifiers for text classification. Classification algorithms are used to extract models that describe important data classes. Classification of documents can be supervised, semi-supervised, or unsupervised. Text classification methods such as random forest (RF), XGBoost, the naive Bayes classifier, and logistic regression are considered. Neural-network-based models are widely used and outperform other models, but they take longer to train, which limits their use on large datasets. Precision is defined as the percentage of retrieved documents that are relevant to the query. The process of text classification begins with identifying suitable features and selecting machine learning classifiers. All evaluation metrics show that the proposed work improves retrieval performance.
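
The precision measure mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function name `precision`, the document IDs, and the convention of returning 0 when nothing is retrieved are all assumptions made for illustration.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant to the query."""
    retrieved = list(retrieved)
    if not retrieved:
        # Assumed convention: an empty result list scores 0.
        return 0.0
    relevant = set(relevant)
    # Count retrieved documents that appear in the relevant set.
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    return hits / len(retrieved)


# Hypothetical document IDs for a single query.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = {"d1", "d3", "d7"}

score = precision(retrieved_docs, relevant_docs)
print(f"precision = {score:.0%}")
```

Here two of the four retrieved documents are relevant, so the precision is 50%. The relevant document `d7` was never retrieved and does not affect precision; it would lower recall instead.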

Published

2021-09-01