Statistical models for language representation

  1. Dorado, Rubén
Journal:
Revista ONTARE

ISSN: 2745-2220, 2382-3399

Year of publication: 2013

Issue title: Avances tecnológicos en ingeniería

Volume: 1

Issue: 1

Pages: 29-39

Type: Article

DOI: 10.21158/23823399.V1.N1.2013.1208

Abstract

This paper discusses several models for the computational representation of language. First, some n-gram models based on Markov models are introduced. Second, a family of models known as exponential models is considered; this family in particular allows several features to be incorporated into the model. Third, a recent line of research, the probabilistic Bayesian approach, is discussed. In models of this kind, language is modeled as a probability distribution. Several distributions and probabilistic processes, such as the Dirichlet distribution and the Pitman-Yor process, are used to approximate linguistic phenomena. Finally, the problem of the sparseness of language and its common solution, known as smoothing, is discussed.