Language modeling using structured penalties.

Authors
  • NELAKANTI Anil Kumar
  • BACH Francis
  • ARCHAMBEAU (first name not provided)
  • ARTIERES (first name not provided)
  • AMINI (first name not provided)
  • BOUCHARD (first name not provided)
Publication date
2014
Publication type
Thesis
Summary
Natural language modeling is one of the fundamental challenges of artificial intelligence and of the design of interactive systems, with applications in dialogue systems, text generation, and machine translation. We propose a discriminatively trained log-linear model of the distribution of words that follow a given context. Because the data are sparse, we introduce a penalty term that encodes the structure of the feature space, so as to avoid overfitting and improve generalization while still capturing long-range dependencies. The result is an efficient model that captures long-range dependencies without a large increase in time or space requirements.

In a log-linear model, the training and testing phases grow more expensive as the number of classes increases. In a language model the number of classes is the vocabulary size, which is typically very large. A common trick is to apply the model in two steps: the first step identifies the most likely cluster of words, and the second picks the most likely word within that cluster. This idea generalizes to a deeper hierarchy with several levels of clustering. The performance of the resulting hierarchical model, however, depends on the application domain and on the quality of the hierarchy. We investigate different strategies for constructing the hierarchy of categories from observed data.
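To make the summary concrete, the LaTeX sketch below spells out the two ideas in our own notation; the feature map phi, weight vectors theta, penalty Omega, regularization weight lambda, and cluster map C are illustrative assumptions, not the thesis's exact formulation.

  % Log-linear next-word model: probability of word w given context c,
  % with a feature map phi and one weight vector theta_w per word.
  \[
    p(w \mid c) \;=\; \frac{\exp\!\big(\theta_w^\top \phi(c)\big)}
                           {\sum_{v \in V} \exp\!\big(\theta_v^\top \phi(c)\big)}
  \]
  % Penalized maximum-likelihood training: Omega is a structured penalty
  % encoding the structure of the feature space, lambda its weight.
  \[
    \min_{\theta} \;\; -\!\!\sum_{(c,\,w) \in \mathcal{D}} \log p(w \mid c)
    \;+\; \lambda\,\Omega(\theta)
  \]
  % Two-step decomposition over word clusters C(w): the expensive
  % normalization over the full vocabulary V is replaced by two smaller ones.
  \[
    p(w \mid c) \;=\; p\big(C(w) \mid c\big)\; p\big(w \mid C(w),\, c\big)
  \]

With K balanced clusters, each normalization sums over roughly |V|/K + K terms instead of |V|, which is minimized near K = sqrt(|V|); deepening the hierarchy pushes the per-prediction cost further down, toward log |V| for a binary tree, which is why the multi-level clustering mentioned above is attractive.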