Discovering molecular functions significantly related to phenotypes by combining gene expression data and biological information

Title: Discovering molecular functions significantly related to phenotypes by combining gene expression data and biological information
Publication Type: Journal Article
Year of Publication: 2005
Authors: Al-Shahrour F, Diaz-Uriarte R, Dopazo J
Journal Title: Bioinformatics
Keywords: babelomics; Breast Neoplasms/genetics/*metabolism; Computer Simulation; *Database Management Systems; *Databases, Protein; Documentation/methods; Gene Expression Profiling/*methods; Humans; *Models, Biological; Neoplasm Proteins/genetics/*metabolism; Phenotype; Software; Structure-Activity Relationship; Systems Integration; Tumor Markers, Biological/genetics/*metabolism

MOTIVATION: The analysis of genome-scale data from different high-throughput techniques can be used to obtain lists of genes ordered according to their behaviour under distinct experimental conditions corresponding to different phenotypes (e.g. differential gene expression between diseased samples and controls, or different responses to a drug). The order in which the genes appear in the list is a consequence of the biological roles that the genes play within the cell, which account, at the molecular scale, for the macroscopic differences observed between the phenotypes studied. Typically, two steps are followed to understand the biological processes that differentiate phenotypes at the molecular level: first, genes with significant differential expression are selected on the basis of their experimental values; subsequently, the functional properties of these genes are analysed. Instead, we present a simple procedure that combines experimental measurements with available biological information so that genes are tested simultaneously in groups related by common functional properties. The proposed method constitutes a very sensitive tool for selecting genes with significant differential behaviour in the experimental conditions tested.

RESULTS: We propose the use of a method to scan ordered lists of genes. The method helps to reveal the biological processes operating at the molecular level behind the macroscopic experiment from which the list was generated. This procedure can be useful in situations where it is not possible to obtain statistically significant differences based on the experimental measurements alone (e.g. low-prevalence diseases). Two examples demonstrate its application to two microarray experiments and the type of information that can be extracted.
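
The idea of scanning an ordered gene list for groups that share a functional property can be illustrated with a minimal sketch. The abstract does not state which statistic the authors use, so everything below is an assumption made for illustration: the function names `hypergeom_upper_tail` and `scan_ranked_list`, the evenly spaced cut points, and the choice of a one-sided hypergeometric test for over-representation of an annotated group in the top part of the list.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the annotation."""
    total = comb(N, n)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)
    )
    return favourable / total


def scan_ranked_list(ranked_genes, annotated, n_partitions=4):
    """Scan a ranked gene list at evenly spaced cut points (illustrative choice).

    At each cut, test whether genes in `annotated` (a set of gene IDs sharing
    one functional term) are over-represented among the genes above the cut.
    Returns a list of (cut_index, hits_above_cut, p_value) tuples.
    """
    N = len(ranked_genes)
    K = sum(1 for g in ranked_genes if g in annotated)
    results = []
    for j in range(1, n_partitions):
        cut = (N * j) // n_partitions
        hits = sum(1 for g in ranked_genes[:cut] if g in annotated)
        results.append((cut, hits, hypergeom_upper_tail(hits, K, cut, N)))
    return results


if __name__ == "__main__":
    # Hypothetical toy data: eight genes ranked by differential expression,
    # three of which share a made-up functional annotation.
    ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]
    term_members = {"g1", "g2", "g3"}
    for cut, hits, p in scan_ranked_list(ranked, term_members):
        print(f"top {cut} genes: {hits} annotated, p = {p:.4f}")
```

In a real analysis one such scan would be run per functional term, so the resulting p-values would need a multiple-testing correction before any term is called significant.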