Gene expression modeling from DNA sequence data.

Authors
  • TAHA May
  • BESSIERE Chloe
  • PETITPREZ Florent
  • VANDEL Jimmy
  • MARIN Jean-Michel
  • BREHELIN Laurent
  • LEBRE Sophie
  • LECELLIER Charles-Henri
Publication date
2017
Publication type
Proceedings Article
Summary
Gene expression is tightly controlled to ensure a wide variety of functions and cell types. The development of diseases, especially cancers, is invariably linked to the deregulation of these controls. Our goal is to model the link between gene expression and the nucleotide composition of different regulatory regions of the genome. We address this problem in a regression framework, using a Lasso approach coupled with a regression tree. We use sequence data exclusively and learn a separate model for each cell type. We show (i) that the different regulatory regions provide different and complementary information and (ii) that their nucleotide composition alone allows us to predict gene expression with an error comparable to that obtained using experimental data. Furthermore, the learned linear model does not perform equally well for all genes: it better models certain classes of genes with particular nucleotide compositions.
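As a rough illustration of the regression step described in the summary, the sketch below computes nucleotide (or k-mer) composition features from a sequence and fits a Lasso model by coordinate descent. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `composition` and `lasso_fit`, the penalty value `alpha`, the objective scaling, and the toy sequences and expression values are all hypothetical, and the regression tree mentioned in the summary is not covered.

```python
from itertools import product


def composition(seq, k=1):
    """Return the frequency of every k-mer over ACGT in seq (sliding window)."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # windows containing N or other symbols are skipped
            counts[word] += 1
            total += 1
    return [counts[w] / total if total else 0.0 for w in kmers]


def soft_threshold(z, t):
    """Shrink z toward zero by t; this is what sets Lasso coefficients to 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_fit(X, y, alpha=0.01, n_iter=200):
    """Minimise (1/2n)*||y - Xw - b||^2 + alpha*||w||_1 by coordinate descent.

    Assumed objective scaling; returns (weights, intercept).
    """
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    # Centre the data so the intercept can be recovered after fitting.
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    resid = [v - y_mean for v in y]  # residual for w = 0
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            col = [Xc[i][j] for i in range(n)]
            norm = sum(c * c for c in col) / n
            if norm == 0.0:
                continue  # constant feature carries no information
            # Correlation of feature j with the residual excluding its own term.
            rho = sum(col[i] * (resid[i] + col[i] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / norm
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= col[i] * delta
                w[j] = new_w
    intercept = y_mean - sum(w[j] * x_mean[j] for j in range(p))
    return w, intercept


# Toy data, invented purely for illustration: one regulatory sequence per gene
# and a made-up expression value for a single cell type.
toy_sequences = ["ACGCGCGG", "ATATTAAT", "GGCCGCAT", "TTAACGAT"]
toy_expression = [5.1, 1.2, 4.3, 2.0]
features = [composition(s, k=1) for s in toy_sequences]
weights, intercept = lasso_fit(features, toy_expression, alpha=0.01)
print(weights, intercept)
```

In practice one would concatenate the composition vectors of several regulatory regions per gene and fit one such model per cell type, as the summary describes; how the features and penalty are actually chosen in the paper is not stated here.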