Language modeling using structured penalties.

Authors
  • NELAKANTI Anil Kumar
  • BACH Francis
  • ARCHAMBEAU (first name not provided)
  • ARTIERES (first name not provided)
  • AMINI (first name not provided)
  • BOUCHARD (first name not provided)
Publication date
2014
Publication type
Thesis
Summary
Natural language modeling is one of the fundamental challenges of artificial intelligence and of the design of interactive systems, with applications in dialogue systems, text generation, and machine translation. We propose a discriminatively trained log-linear model of the distribution of words that follow a given context. Because the data are sparse, we introduce a penalty term that encodes the structure of the feature space, so as to avoid overfitting and improve generalization while still capturing long-range dependencies. The result is an efficient model that captures long-range dependencies without a large increase in time or space requirements.

In a log-linear model, the training and testing phases grow more expensive as the number of classes increases. In a language model the number of classes is the vocabulary size, which is typically very large. A common trick is to apply the model in two steps: the first step identifies the most likely cluster of words, and the second picks the most likely word within that cluster. This idea generalizes to a deeper hierarchy with several levels of clustering. The performance of the resulting hierarchical model, however, depends on the application domain and on the quality of the hierarchy. We investigate different strategies for constructing the hierarchy of categories from observed data.
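To make the summary concrete, the LaTeX sketch below spells out the two ideas in our own notation; the feature map phi, weight vectors theta, penalty Omega, regularization weight lambda, and cluster map C are illustrative assumptions, not the thesis's exact formulation.

  % Log-linear next-word model: probability of word w given context c,
  % with a feature map phi and one weight vector theta_w per word.
  \[
    p(w \mid c) \;=\; \frac{\exp\!\big(\theta_w^\top \phi(c)\big)}
                           {\sum_{v \in V} \exp\!\big(\theta_v^\top \phi(c)\big)}
  \]
  % Penalized maximum-likelihood training: Omega is a structured penalty
  % encoding the structure of the feature space, lambda its weight.
  \[
    \min_{\theta} \;\; -\!\!\sum_{(c,\,w) \in \mathcal{D}} \log p(w \mid c)
    \;+\; \lambda\,\Omega(\theta)
  \]
  % Two-step decomposition over word clusters C(w): the expensive
  % normalization over the full vocabulary V is replaced by two smaller ones.
  \[
    p(w \mid c) \;=\; p\big(C(w) \mid c\big)\; p\big(w \mid C(w),\, c\big)
  \]

With K balanced clusters, each normalization sums over roughly |V|/K + K terms instead of |V|, which is minimized near K = sqrt(|V|); deepening the hierarchy pushes the per-prediction cost further down, toward log |V| for a binary tree, which is why the multi-level clustering mentioned above is attractive.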