Supplementary material for the manuscript:
"Motif discovery in promoters of genes co-localized and co-expressed during myeloid cells differentiation."
Alessandro Coppe, Francesco Ferrari, Andrea Bisognin, Gian Antonio Danieli, Sergio Ferrari, Silvio Bicciato and Stefania Bortoluzzi. Submitted to NAR
Co-regulated genes may be co-expressed because they are under similar regulatory circuits, whether genetic (promoter-based) and/or epigenetic (position-based). Although data on the expression, position, regulation and function of most human genes are available, the true integration of these different levels of information remains a challenge for computational biology, and this hampers the identification of regulatory circuits.
We developed a computational framework for the integrative analysis of gene expression, genomic position, functional annotation and regulatory sequences in gene promoters. Promoter sequence analysis was conducted with a novel multi-step method for the discovery of putative regulatory elements overrepresented in a selected set of promoters, as compared with a background model. The integration of transcriptional, structural and functional data allowed us to define sets of promoters derived from groups of genes co-expressed and co-localized in specific regions of the human genome. Moreover, co-expressed and co-localized gene sets could be grouped into two main co-expressed genomic meta-regions, possibly representing functional domains of higher-level expression regulation.
The motif discovery and analysis procedure was applied to:
- 44 groups of genes co-expressed in myeloid cells (CEG: Co-Expressed Genes);
- 26 groups of genes co-expressed during myeloid cell differentiation and co-localized in specific chromosomal regions (CER: Co-Expressed Regions);
- 2 groups of CER, showing similar expression profiles (CEMR: Co-Expressed Meta-Regions).
A higher number of significantly overrepresented motifs were found in promoters of co-expressed and co-localized genes than in those of simply co-expressed genes.
Motifs were annotated with information about their similarity to known transcription factor binding sequences, the non-uniformity of their distribution along promoter sequences and/or their occurrence in highly co-expressed subsets of genes.
Contact: Stefania Bortoluzzi <stefibo@bio.unipd.it>
Software
The scripts for performing the motif discovery described above are freely available and can be downloaded here. The archive contains the scripts, documentation and datasets needed to carry out sample analyses.
DAS Annotations
Using the Distributed Annotation System (DAS) we have created an annotation resource
available at our web page:
http://compgen.bio.unipd.it/Annotations/das
where users can access information about CERs and CEMRs by querying the MyDas server.
For example, to obtain the list of CERs on chromosome 12, the following query can be used:
http://compgen.bio.unipd.it/Annotations/das/cer/features?segment=12
To restrict the list to only a specific segment of the chromosome:
http://compgen.bio.unipd.it/Annotations/das/cer/features?segment=12:1,90100998
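The two queries above follow the same pattern, so they can also be assembled programmatically. The following is a minimal Python sketch, not part of the distributed scripts: the helper name `das_features_url` and its parameters are hypothetical, and it only builds the query string, assuming the `cer` data source and the `segment=chromosome:start,stop` form shown in the examples.

```python
from urllib.parse import quote

# Base address of the DAS server, as given in the text above.
BASE_URL = "http://compgen.bio.unipd.it/Annotations/das"


def das_features_url(source, chromosome, start=None, stop=None, base_url=BASE_URL):
    """Build a DAS 'features' query URL (hypothetical helper).

    source     -- DAS data source name, e.g. "cer" as in the examples above
    chromosome -- chromosome identifier, e.g. 12
    start/stop -- optional 1-based coordinates restricting the segment;
                  they must be given together
    """
    if (start is None) != (stop is None):
        raise ValueError("start and stop must be given together")
    segment = str(chromosome)
    if start is not None:
        if start < 1 or stop < start:
            raise ValueError("expected 1 <= start <= stop")
        segment += f":{start},{stop}"
    # Keep ':' and ',' unescaped so the URL matches the form shown above.
    return f"{base_url}/{source}/features?segment={quote(segment, safe=':,')}"


if __name__ == "__main__":
    print(das_features_url("cer", 12))
    print(das_features_url("cer", 12, 1, 90100998))
```

The resulting URL can then be opened in a browser or passed to any HTTP client; the server answers with an XML document describing the features in the requested segment.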
Results of geneset analysis
Use the navigation tree on the left to browse the results.