Text categorization using bibliographic records: beyond document content

  1. Montejo Ráez, Arturo
  2. Ureña López, Luis Alfonso
  3. Steinberger, Ralf
Journal:
Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2005

Issue: 35

Pages: 119-126

Type: Article

Other publications in: Procesamiento del lenguaje natural

Abstract

This paper studies the use of different sources of information for a text classification task. The growing number of digital libraries calls for a review of the data available in their databases. Experiments were carried out on several of these sources, applying different base classifiers within a multi-label classifier in the domain of High Energy Physics. Results show that using metadata is almost as good as using the full text of the papers.