The U1162 Bioinformatics Expertise

Team Leader

PhD - CR2

Past Projects

In the bioinformatics group, we analyze different genomic datasets (whole-exome and whole-genome sequencing, RNA-seq, DNA methylation data), and we develop innovative computational tools to better understand the origin and the molecular diversity of tumors. We explore several research axes, as described below.


Identification of genes driving hepatocellular carcinogenesis

Tumor cells harbor numerous molecular alterations (mutations, chromosome gains and losses, translocations) that can alter the function or the activity of target genes. We develop computational approaches to integrate these alterations and identify recurrently altered genes, which are likely to play a key role in oncogenesis. By analyzing the exome sequences of 250 hepatocellular carcinomas, we recently identified 161 putative driver genes belonging to 11 major cellular pathways (Schulze, Imbeaud, Letouzé et al., Nat Genet 2015). We now analyze whole-genome sequencing data to identify non-coding mutations likely to affect the regulatory sequences of target genes, such as the activating mutations of the TERT promoter (Nault et al., Nat Commun 2013).
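
The idea of recurrence can be illustrated with a minimal Python sketch. It is not the method used in the cited studies: the function names, the input layout (pairs of tumor identifier and gene symbol) and the per-tumor background probability are assumptions made for illustration. It counts how many distinct tumors carry an alteration in each gene, then asks how surprising that count would be under a simple binomial background.

```python
from collections import defaultdict
from math import comb


def altered_samples_per_gene(alterations):
    """Count the distinct tumors in which each gene is altered.

    alterations: iterable of (sample_id, gene) pairs (hypothetical layout).
    Several alterations of one gene in one tumor count once.
    """
    samples = defaultdict(set)
    for sample_id, gene in alterations:
        samples[gene].add(sample_id)
    return {gene: len(ids) for gene, ids in samples.items()}


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p).

    n: number of tumors analyzed.
    p: assumed probability that a tumor carries a passenger alteration
       in the gene (a placeholder; real background models are richer).
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    toy = [("T1", "TP53"), ("T1", "TP53"), ("T2", "TP53"), ("T2", "CTNNB1")]
    counts = altered_samples_per_gene(toy)
    for gene, k in counts.items():
        print(gene, k, binomial_tail(k, n=2, p=0.05))
```

A real analysis would also need to account for gene length, sequence composition and local mutation rate, and to correct for multiple testing across genes.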



Mutational signatures

Somatic mutations that drive cancer progression are the consequence of spontaneous enzymatic conversions, replication errors, or mutagenic exposures such as tobacco or UV light. These mutational processes leave imprints on the tumor genome that can be identified as mutational signatures, characterized by specific types of mutations or by mutations occurring in specific genomic contexts. For instance, tobacco carcinogens induce mostly C>A mutations, whereas defects in DNA mismatch repair genes lead to an enriched frequency of C>T mutations in the NCG trinucleotide context. By analyzing a large series of liver tumors by whole-exome sequencing, we have identified 2 new mutational signatures characteristic of liver tumors (Schulze, Imbeaud, Letouzé et al., Nat Genet 2015). One of these signatures, characterized by frequent C>A mutations at GCC trinucleotides, could be related to exposure to aflatoxin B1, a toxin produced by a fungus found in warm, humid regions of Africa and Asia. We now analyze whole-genome sequencing data from a new series of tumors, associated with diverse risk factors, to identify new signatures and to distinguish the mutational processes operating in the early and late steps of oncogenesis. In collaboration with Prof. Pierre Laurent-Puig’s team (UMR-S 1147), we also analyze the mutational signatures of other tumor types, such as lung and colorectal cancer.
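
The notion of "mutation type in its trinucleotide context" can be made concrete with a short Python sketch. It is only an illustration, not the pipeline used in the cited work: the function names and the label format (for example G[C>A]C) are assumptions. Each substitution is reported on the strand where the mutated reference base is a pyrimidine (C or T), so that a G>T change and its complementary C>A change fall into the same category.

```python
from collections import Counter

# Watson-Crick complements, used to fold purine-reference mutations
# onto the pyrimidine strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def mutation_category(context, alt):
    """Return a label such as 'G[C>A]C' for a single-base substitution.

    context: reference trinucleotide centered on the mutated base (5'->3').
    alt: mutated base, given on the same strand as context.
    """
    context = context.upper()
    alt = alt.upper()
    if len(context) != 3:
        raise ValueError("context must be a trinucleotide")
    if context[1] == alt:
        raise ValueError("alt must differ from the reference base")
    if context[1] in "AG":
        # Purine reference: describe the event on the opposite strand.
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


def count_categories(mutations):
    """Count mutations per category from (context, alt) pairs."""
    return Counter(mutation_category(ctx, alt) for ctx, alt in mutations)


if __name__ == "__main__":
    toy = [("GCC", "A"), ("GGC", "T"), ("ACG", "T")]
    print(count_categories(toy))
```

The resulting per-tumor count vectors are the usual input of signature extraction methods; the extraction step itself is not shown here.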


Clonal evolution of liver tumors

Tumors develop through the expansion of cell clones that have acquired genomic alterations giving them a proliferative advantage over surrounding cells. Several clones may coexist within a single tumor, including a dominant clone and one or more minor subclones. Understanding tumor heterogeneity is essential, as subclones may harbor specific genetic defects conferring resistance to treatment. In addition, reconstructing the clonal architecture of a tumor makes it possible to distinguish early from late genetic alterations and to better understand the role of each driver gene. We currently analyze whole-genome sequencing data to reconstruct the clonal architecture of 50 liver tumors. In particular, we analyze cases of adenomas that have progressed to carcinomas (Pilati et al., Cancer Cell 2014), to identify the molecular events triggering malignant transformation.
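
One common way to separate clonal from subclonal mutations is to convert the variant allele fraction (VAF) of each mutation into a cancer cell fraction (CCF). The Python sketch below shows the standard relation under stated assumptions; it is not necessarily the approach used in our analyses. It assumes that normal cells carry 2 copies of the locus, that tumor purity and local tumor copy number are known, and that the number of mutated copies per tumor cell (multiplicity) is given. The function names and the 0.9 clonality threshold are hypothetical.

```python
def cancer_cell_fraction(vaf, purity, tumor_copy_number=2, multiplicity=1):
    """Estimate the fraction of tumor cells carrying a mutation.

    vaf: fraction of reads supporting the mutation.
    purity: fraction of tumor cells in the sample, in (0, 1].
    tumor_copy_number: total copies of the locus in tumor cells.
    multiplicity: mutated copies per mutated tumor cell (>= 1).
    Normal cells are assumed to carry 2 copies of the locus.
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if multiplicity < 1:
        raise ValueError("multiplicity must be at least 1")
    # Average number of copies of the locus per cell in the sample.
    mean_copies = purity * tumor_copy_number + 2 * (1 - purity)
    return vaf * mean_copies / (purity * multiplicity)


def is_clonal(ccf, threshold=0.9):
    """Call a mutation clonal when its CCF reaches an assumed threshold."""
    return ccf >= threshold


if __name__ == "__main__":
    for vaf in (0.25, 0.10):
        ccf = cancer_cell_fraction(vaf, purity=0.5)
        print(vaf, round(ccf, 2), is_clonal(ccf))
```

Because read counts are noisy, point estimates can exceed 1; in practice mutations are usually clustered by CCF rather than thresholded one by one.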



Epigenetic signatures of oncogenic processes

Epigenetic profiles (DNA methylation, histone modifications, chromatin conformation) are highly rearranged in tumor cells. In addition, numerous epigenetic regulators, implicated in DNA (de)methylation or chromatin remodeling, are frequently mutated, in particular in liver cancers. However, the mechanistic link between these molecular alterations and the epigenetic profiles of tumors remains poorly understood. By analyzing the methylome of a large series of liver tumors previously characterized by whole-exome sequencing, we aim to identify specific DNA methylation signatures. We will then correlate these signatures with clinical annotations (exposure to risk factors) and molecular annotations (mutations in epigenetic regulators) to identify the cause of each pattern. This project aims to understand how altered epigenetic profiles are established, and how they affect the transcriptome of tumor cells. In collaboration with the INSERM UMR970 team (Dr Judith Favier), we also study the hypermethylator phenotype induced by succinate dehydrogenase mutations in paraganglioma (Letouzé et al., Cancer Cell 2013).



Insertional mutagenesis by AAV and HBV

Viruses, including hepatitis B virus (HBV) and hepatitis C virus (HCV), are major causes of hepatocellular carcinoma (HCC) worldwide. HBV is a well-known oncogenic DNA virus in liver tumors that induces insertional mutagenesis, chromosome instability and expression of oncogenic viral proteins. Currently, only 2 other DNA viruses (Merkel cell polyomavirus and human papillomavirus) are known to induce oncogenic insertional mutagenesis in humans. We recently showed that genomic approaches can help identify new risk factors for HCC, which has changed our understanding of the disease. In particular, we showed that adeno-associated virus type 2 (AAV2) is involved in the development of HCC arising in normal liver, through insertional mutagenesis in cancer genes (1). AAV is currently the eighth virus known to be associated with human cancer. This first identification of recurrent oncogenic AAV2 insertions in liver tumors has led us to develop a research project aiming to identify viral sequences, and possibly new AAV integration sites, in different types of liver tumors. As recombinant AAV is used as a vector for gene therapy in human clinical trials, a detailed understanding of the carcinogenic mechanisms of the wild-type virus is needed to assess the potential risk of cancer development after gene therapy. The objective is to quantify the contribution of AAV infection to the development of liver tumors in patients with liver disease of various etiologies or without liver disease. More broadly, we aim to identify new risk factors for hepatocellular carcinoma by analyzing tumor genomes, and to validate them in epidemio-molecular studies through a comprehensive analysis of the tumor, host and viral genomes.



Transcriptome deregulation in liver tumors

Transcriptome profiling of liver tumors using DNA microarrays allowed us to identify homogeneous molecular groups of tumors, associated with distinct molecular alterations and risk factors (Boyault et al., Hepatology 2007). With the advent of high-throughput sequencing, we can now study the transcriptome by directly sequencing the RNAs extracted from a tumor sample (RNA-seq). In addition to gene expression levels, RNA-seq data give access to the structure and sequence of transcripts, and allow the discovery of new genes. We have generated in the lab the RNA-seq profiles of a large series of liver tumors. We now develop innovative bioinformatic approaches to (1) refine the molecular classification of liver tumors, (2) analyze the deregulation of non-coding RNAs, (3) detect alternative splicing and allele-specific expression in tumors, and (4) identify RNA editing events and the mutations that are actually expressed (and therefore more likely to play an oncogenic role).
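
As an illustration of point (3), allele-specific expression at a heterozygous site can be screened by testing whether RNA-seq reads support both alleles equally. The Python sketch below is a generic exact binomial test written with the standard library only; it is an assumption about how such a screen could look, not a description of our own tools, and the function name is hypothetical.

```python
from math import comb


def allelic_imbalance_pvalue(ref_reads, alt_reads):
    """Two-sided exact binomial test of balanced allelic expression.

    ref_reads, alt_reads: RNA-seq read counts supporting each allele
    at a heterozygous site. The null hypothesis is that each read
    comes from either allele with probability 0.5.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("at least one read is required")
    # Probability of every possible alt-read count under the null.
    pmf = [comb(n, i) * 0.5**n for i in range(n + 1)]
    observed = pmf[alt_reads]
    # Sum the outcomes that are at most as likely as the observed one;
    # the small tolerance guards against floating-point ties.
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


if __name__ == "__main__":
    print(allelic_imbalance_pvalue(5, 5))
    print(allelic_imbalance_pvalue(10, 0))
```

Such a test ignores reference mapping bias, overdispersion between reads and tumor copy-number changes, all of which would need to be handled before interpreting an imbalance as regulatory.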



Development of an annotation database

The development of genetics and of new in-depth, high-throughput technologies results in massive data production. Moreover, the multiplicity of programs and genomic annotations available through “genome browsers” has created a “jungle” of terminology, which requires the establishment and use of a unified vocabulary. In this context, we design and develop a database for structuring, annotating and exploiting the mass of data generated by genomics programs. In collaboration with the EBCI Company, this project will implement an efficient IT infrastructure to support our mixed data model, which includes clinical, histological and molecular datasets. The work covers (1) optimizing the architecture of our data model, (2) managing the annotations, including their updates, and (3) optimizing a web server so that biologists can query the data efficiently.



PhD - CR2
Sandrine IMBEAUD
Post Doc
Post Doc Cancéropole
PhD student
University Paris 7
Quentin BAYARD
PhD student
University Paris 7

