maSigPro

maSigPro (MicroArray Significant Profiles) [1] applies linear regression to model gene expression in single- or multiple-series time-course microarray data and selects differentially expressed genes with a two-step algorithm. First, responsive genes are identified by fitting a generic regression model with time as a quantitative variable and series as dummy variables. Second, stepwise regression is applied to the selected genes to adjust the models and identify gene-specific variation patterns. maSigPro returns lists of genes with statistically significant changes over time and across the different series. Each list can be further investigated in the maSigVisualization module, where a clustering algorithm groups the selected genes by similarity of expression pattern and displays their profiles as trajectory charts.
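The generic regression model of the first step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the maSigPro implementation: the function names (design_matrix, least_squares, r_squared) are hypothetical, the time-by-series interaction columns are an assumption about how series differences in slope are modelled, and the sketch reports the goodness of fit only, without the significance test or the stepwise step. The expression values and design are the ones used in the example tables on this page.

```python
# Example design and expression values taken from the tables on this page.
TIME = [3, 3, 3, 9, 9, 9, 27, 27, 27]
TREATMENT = ["Ctr", "TrA", "TrB"] * 3
GENES = {
    "gene1": [0.5, 0.2, 0.7, 1.3, 1.4, 1.0, 2.1, 2.4, 2.6],
    "gene2": [0.5, 0.3, 0.4, 0.3, 0.4, 0.1, 0.1, 0.4, 0.5],
}


def design_matrix(time, series, control, degree):
    """Rows of [1, dummies, t, t*dummies, ..., t^degree, t^degree*dummies]."""
    others = sorted(set(series) - {control})  # one dummy per non-control series
    rows = []
    for t, s in zip(time, series):
        dummies = [1.0 if s == name else 0.0 for name in others]
        row = [1.0] + dummies
        for power in range(1, degree + 1):
            tp = float(t) ** power
            row.append(tp)
            row.extend(tp * d for d in dummies)  # assumed interaction terms
        rows.append(row)
    return rows


def least_squares(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    n = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(n)]
         + [sum(r[i] * v for r, v in zip(X, y))] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def r_squared(X, y):
    """Goodness of fit: 1 - residual sum of squares / total sum of squares."""
    coef = least_squares(X, y)
    fitted = [sum(c * x for c, x in zip(coef, row)) for row in X]
    mean = sum(y) / len(y)
    sse = sum((v - f) ** 2 for v, f in zip(y, fitted))
    sst = sum((v - mean) ** 2 for v in y)
    return 1.0 - sse / sst


X = design_matrix(TIME, TREATMENT, control="Ctr", degree=1)
fit = {gene: r_squared(X, values) for gene, values in GENES.items()}
for gene, r2 in fit.items():
    print(f"{gene}: R-squared = {r2:.3f}")
```

With the control series as reference, each dummy column measures how one of the other series departs from the control, so a gene whose coefficients are all close to zero shows neither a time trend nor a difference between series.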

Parameters for maSigPro gene selection:

  • Data: txt file with expression data, genes in rows and arrays in columns. The file must contain an additional row with the array names and an additional column with the gene names.


    Name   Array1  Array2  Array3  Array4  Array5  Array6  Array7  Array8  Array9
    gene1  0.5     0.2     0.7     1.3     1.4     1.0     2.1     2.4     2.6
    gene2  0.5     0.3     0.4     0.3     0.4     0.1     0.1     0.4     0.5


  • Covariates: txt file with the experimental design, containing as many columns as arrays and one row per experimental factor. Each cell gives the value of the corresponding array for that factor. For a single series, only one experimental factor (e.g. Time) is included. Multiple series are indicated by more than one level in the second experimental factor (e.g. Treatment):


    Time       3    3    3    9    9    9    27   27   27
    Treatment  Ctr  TrA  TrB  Ctr  TrA  TrB  Ctr  TrA  TrB

  • Quantitative factor: name of the numerical variable of the experimental design, normally the time.
  • Qualitative factor: name of the categorical variable of the experimental design.
  • Control group: name of the reference series in the regression model (Ctr in the example).
  • Polynomial degree: degree of the polynomial regression model. The maximum allowed degree is the number of time points minus 1 (2 in the example, which has three time points).
  • Alpha: significance level for gene selection.
  • R-squared cut-off: minimum goodness of fit (R-squared) required of the regression model. The value lies between 0 and 1; higher values indicate better-fitting models. We recommend values between 0.4 and 0.8.
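The constraints stated in the parameter list can be checked before a run. The following is a minimal sketch, assuming tab-delimited text files (the delimiter is not specified on this page) and using hypothetical helper names (read_rows, is_number, check_settings); it is not part of maSigPro. It verifies that every row has one value per array, that the polynomial degree does not exceed the number of time points minus 1, that the control group is one of the series, and that alpha and the R-squared cut-off are in range.

```python
# Example files from this page, written as tab-delimited text (an assumption).
DATA_TXT = (
    "Name\tArray1\tArray2\tArray3\tArray4\tArray5\tArray6\tArray7\tArray8\tArray9\n"
    "gene1\t0.5\t0.2\t0.7\t1.3\t1.4\t1.0\t2.1\t2.4\t2.6\n"
    "gene2\t0.5\t0.3\t0.4\t0.3\t0.4\t0.1\t0.1\t0.4\t0.5\n"
)
COVARIATES_TXT = (
    "Time\t3\t3\t3\t9\t9\t9\t27\t27\t27\n"
    "Treatment\tCtr\tTrA\tTrB\tCtr\tTrA\tTrB\tCtr\tTrA\tTrB\n"
)


def read_rows(text):
    """Parse tab-delimited text into (label, values) pairs, one per row."""
    rows = []
    for line in text.strip().splitlines():
        label, *values = line.split("\t")
        rows.append((label, values))
    return rows


def is_number(text):
    """True if the cell can be read as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def check_settings(data_txt, covariates_txt, quantitative, qualitative,
                   control, degree, alpha, rsq):
    """Return a list of problems; an empty list means the inputs look consistent."""
    (_, arrays), *genes = read_rows(data_txt)
    factors = dict(read_rows(covariates_txt))
    problems = []
    # Every gene row and every factor row needs one value per array.
    for label, values in genes + list(factors.items()):
        if len(values) != len(arrays):
            problems.append(f"row {label!r}: {len(values)} values for {len(arrays)} arrays")
    times = factors.get(quantitative)
    if times is None:
        problems.append(f"quantitative factor {quantitative!r} not found")
    elif not all(is_number(t) for t in times):
        problems.append(f"quantitative factor {quantitative!r} must be numerical")
    else:
        max_degree = len(set(map(float, times))) - 1
        if degree > max_degree:
            problems.append(f"polynomial degree {degree} exceeds the maximum of {max_degree}")
    # For a single series there is no qualitative factor (pass None).
    if qualitative is not None and control not in factors.get(qualitative, []):
        problems.append(f"control group {control!r} is not a level of {qualitative!r}")
    if not 0 < alpha < 1:
        problems.append("alpha must be between 0 and 1")
    if not 0 <= rsq <= 1:
        problems.append("R-squared cut-off must be between 0 and 1")
    return problems


print(check_settings(DATA_TXT, COVARIATES_TXT, "Time", "Treatment", "Ctr",
                     degree=2, alpha=0.05, rsq=0.6))
```

Returning a list of messages instead of stopping at the first error lets all problems in the two files be reported at once.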


Parameters for maSigPro visualization:
To plot the trajectories, the maSigPro visualization applies a clustering method that groups genes with similar trends, so that the graphical display summarizes each group of genes.

  • Series to see: name of the series to visualize from the available series.
  • Clustering method: available methods are:
    • 'hclust': hierarchical clustering
    • 'kmeans': k-means
  • Number of clusters: number of groups into which the selected genes are split for display.
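How the three visualization parameters interact can be illustrated with a small sketch. It is written under several assumptions and is not the module's implementation: the function names (series_profiles, kmeans) are hypothetical, the k-means below is a plain textbook version with a deterministic start, and gene3 and gene4 are made-up values added only so that there is something to cluster. gene1, gene2 and the Treatment row come from the example tables on this page.

```python
TREATMENT = ["Ctr", "TrA", "TrB"] * 3
GENES = {
    "gene1": [0.5, 0.2, 0.7, 1.3, 1.4, 1.0, 2.1, 2.4, 2.6],
    "gene2": [0.5, 0.3, 0.4, 0.3, 0.4, 0.1, 0.1, 0.4, 0.5],
    "gene3": [0.4, 0.3, 0.6, 1.2, 1.5, 1.1, 2.0, 2.5, 2.4],  # hypothetical values
    "gene4": [0.4, 0.2, 0.5, 0.4, 0.3, 0.2, 0.2, 0.3, 0.4],  # hypothetical values
}


def series_profiles(genes, treatment, series):
    """'Series to see': keep only the arrays that belong to one series."""
    columns = [i for i, name in enumerate(treatment) if name == series]
    return {gene: [values[i] for i in columns] for gene, values in genes.items()}


def kmeans(profiles, k, iterations=20):
    """Plain k-means on equal-length profiles; returns one label per profile."""
    centers = [list(p) for p in profiles[:k]]  # deterministic start: first k profiles
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assign each profile to the nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in profiles
        ]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


selected = series_profiles(GENES, TREATMENT, series="Ctr")
names = list(selected)
labels = dict(zip(names, kmeans([selected[g] for g in names], k=2)))
for gene in names:
    print(gene, "cluster", labels[gene], selected[gene])
```

Each cluster would then be drawn as one trajectory chart, so the number of clusters controls how many charts are shown and how similar the genes within each chart are.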


masigpro.txt · Last modified: 2010/05/04 23:22 by aconesa
CC Attribution-Noncommercial-Share Alike 3.0 Unported