This shows you the differences between two versions of the page.
pcamasigpro [2009/12/30 13:34] aconesa |
pcamasigpro [2014/05/12 12:42] (current) jcarbonell |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== PCA-maSigFun ====== | ====== PCA-maSigFun ====== | ||
- | **PCA-maSigFun** [4] identifies the major(s) gene expression changes within each functional class and evaluates whether these changes are significantly associated to the time. In summary, PCA is applied to the gene-expression submatrix associated to the genes belonging to each functional category. The scores of the relevant principal component(s) of these PCAs are taken as joined expression profile(s) for the functional class. The regression based time-course analysis methodology **maSigPro** [1] is then applied to the joined profiles (PC scores) to identify function-related subset of genes with expression changes significantly associated to the time. Note that each functional class can result in more than one joined profile when more than one subset of correlated genes exist within that functional category. that we consider that a functional block might contain several patterns of coordinative gene expression. This program returns lists of significant functional clases (an their representative joined profiles) for each of the series included in the experiment. \\ | + | **PCA-maSigFun** [[references |[4]]] identifies the major gene expression change(s) within each functional class and evaluates whether these changes are significantly associated with time. In summary, PCA is applied to the gene-expression submatrix of the genes belonging to each functional category. The scores of the relevant principal component(s) of these PCAs are taken as joined expression profile(s) for the functional class. The regression-based time-course analysis methodology **maSigPro** [[references |[1]]] is then applied to the joined profiles (PC scores) to identify function-related subsets of genes with expression changes significantly associated with time. Note that each functional class can yield more than one joined profile when more than one subset of correlated genes exists within that functional category; that is, we consider that a functional block might contain several patterns of coordinated gene expression. This program returns lists of significant functional classes (and their representative joined profiles) for each of the series included in the experiment. \\ |
\\ | \\ | ||
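The joined-profile idea described above can be illustrated with a minimal sketch. It assumes a single functional category, uses only the first principal component, and computes it by plain power iteration; the function name `first_pc_scores` and the grouping of the two example genes into one category are hypothetical and are not taken from PCA-maSigFun itself.

```python
import math

def first_pc_scores(submatrix, n_iter=200):
    """Return (loadings, scores) of the first principal component.

    submatrix: list of gene rows (one value per array) holding the genes
    of a single functional category. Hypothetical helper, not the tool's API.
    """
    n_genes = len(submatrix)
    n_arrays = len(submatrix[0])

    # Center each gene across arrays.
    centered = []
    for row in submatrix:
        mean = sum(row) / n_arrays
        centered.append([x - mean for x in row])

    # Gene-by-gene covariance matrix.
    cov = [[sum(a * b for a, b in zip(centered[i], centered[j])) / (n_arrays - 1)
            for j in range(n_genes)]
           for i in range(n_genes)]

    # Power iteration from a non-uniform start vector (an arbitrary choice).
    v = [1.0 + 0.1 * i for i in range(n_genes)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(n_genes)) for i in range(n_genes)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # the category shows no variability at all
            break
        v = [x / norm for x in w]

    # One score per array: this vector plays the role of the joined profile.
    scores = [sum(v[g] * centered[g][a] for g in range(n_genes))
              for a in range(n_arrays)]
    return v, scores

# Illustrative category built from the two example genes of the Data table.
category = [
    [0.5, 0.2, 0.7, 1.3, 1.4, 1.0, 2.1, 2.4, 2.6],  # gene1
    [0.5, 0.3, 0.4, 0.3, 0.4, 0.1, 0.1, 0.4, 0.5],  # gene2
]
loadings, joined_profile = first_pc_scores(category)
```

The sign of the loadings (and therefore of the scores) is arbitrary in PCA, so a joined profile and its mirror image describe the same pattern. In the workflow described above, the scores would then be passed to the regression step in place of a single gene's expression values.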
- | __Parameters for **PCA-maSigPro** gene selection__: | + | __Parameters for //PCA-maSigFun gene selection//__: |
*//Data//: txt file with expression data, genes in rows, arrays in columns. The file must contain an additional row with array names and a column with gene names.\\ | *//Data//: txt file with expression data, genes in rows, arrays in columns. The file must contain an additional row with array names and a column with gene names.\\ | ||
+ | \\ | ||
+ | |Name|Array1|Array2|Array3|Array4|Array5|Array6|Array7|Array8|Array9|…| | ||
+ | |gene1|0.5|0.2|0.7|1.3|1.4|1.0|2.1|2.4|2.6|…| | ||
+ | |gene2|0.5|0.3|0.4|0.3|0.4|0.1|0.1|0.4|0.5|…| | ||
+ | |...|...|...|...|...|...|...|...|...|...|…| | ||
+ | \\ | ||
*//Covariates//: txt file with experimental design information, containing as many columns as arrays and as many rows as experimental factors. Each cell contains the value of the array in the experimental factor. E.g.: | *//Covariates//: txt file with experimental design information, containing as many columns as arrays and as many rows as experimental factors. Each cell contains the value of the array in the experimental factor. E.g.: | ||
\\ | \\ | ||
|Time|3|3|3|9|9|9|27|27|27|…| | |Time|3|3|3|9|9|9|27|27|27|…| | ||
|Treatment|Ctr|TrA|TrB|Ctr|TrA|TrB|Ctr|TrA|TrB|…| | |Treatment|Ctr|TrA|TrB|Ctr|TrA|TrB|Ctr|TrA|TrB|…| | ||
- | *//Annotations//: a two columns (gene tab annotation) txt file with functional data. | + | \\ |
- | *//Control.group//: name of the reference series in the model (in the example is Ctr) | + | *//Quantitative factor//: name of the numerical variable of the experimental design, normally the time. |
- | *//degree//: polynomial degree for the regression model (max. is # time-points – 1). \\ | + | *//Qualitative factor//: name of the categorical variable of the experimental design. |
- | *//alpha//: significant level for gene selection.\\ | + | *//Control group//: name of the reference series in the regression model (Ctr in the example). |
- | *//rsq//: cut-off value at the R-squared (goodness of fit) regression parameter. \\ | + | * //Annotations//: the annotations can be uploaded as a two-column (gene tab annotation) txt file. |
- | *//var.cutoff//: Variability level to select Principal Components. | + | *//Polynomial degree//: degree for the regression model. The maximum allowed degree is #time_points – 1. \\ |
- | *//fac.sel//: criterion to select components can be: | + | *//Alpha//: significance level for gene selection.\\ |
- | *“%accum”: percentage of accumulated variability | + | *//R-Squared cut-off//: minimum required goodness of fit (R-squared) of the regression model. This parameter lies between 0 and 1; higher values indicate better-fitted models. We recommend values in [0.4,0.8]. \\ |
- | *“single%”: percentage of variability of that PC | + | *//Cut-off//: Variability level to select Principal Components in each category. |
- | *“abs.val”: absolute value of the variabily of that PC | + | *//Selection factor//: criterion to select components can be: |
- | *“rel.abs”: fold variability of tot.var/rank(X) | + | * Proportion of accumulated variability. Possible cut-off values are in (0,1). |
- | \\ | + | * Proportion of variability of each PC. Possible cut-off values are in (0,1). |
- | __Parameters for PCA-maSigPro visualization__\\ | + | * Average: components are selected that explain more than "cut-off" times the average component variability. The recommended "cut-off" values are in [1,1.5]. |
- | *//k//: number of clusters to split gene selection. | + | \\ |
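The three //Selection factor// criteria can be sketched as follows. This is only an illustration under stated assumptions: the function name `select_components` and the criterion labels are hypothetical, the component variances are assumed to be sorted in decreasing order, and the accumulated criterion is assumed to keep leading components until their cumulative proportion reaches the cut-off.

```python
def select_components(variances, criterion, cutoff):
    """Return the indices of the principal components to keep.

    variances: variance explained by each PC, sorted in decreasing order.
    criterion: "accumulated", "single" or "average" (hypothetical labels).
    """
    total = sum(variances)
    proportions = [v / total for v in variances]

    if criterion == "accumulated":
        # Assumed semantics: keep leading PCs until the cumulative
        # proportion of variability reaches the cut-off.
        selected, accumulated = [], 0.0
        for index, proportion in enumerate(proportions):
            selected.append(index)
            accumulated += proportion
            if accumulated >= cutoff:
                break
        return selected

    if criterion == "single":
        # Keep every PC whose own proportion of variability exceeds the cut-off.
        return [i for i, p in enumerate(proportions) if p > cutoff]

    if criterion == "average":
        # Keep PCs explaining more than cutoff times the average PC variability.
        average = total / len(variances)
        return [i for i, v in enumerate(variances) if v > cutoff * average]

    raise ValueError("unknown criterion: %r" % (criterion,))

example_variances = [5.0, 3.0, 1.5, 0.5]
kept_accumulated = select_components(example_variances, "accumulated", 0.7)
kept_single = select_components(example_variances, "single", 0.2)
kept_average = select_components(example_variances, "average", 1.5)
```

With these made-up variances, the accumulated and single criteria both keep the first two components, while the average criterion with a cut-off of 1.5 keeps only the first, since the average component variability is 2.5 and only the first component exceeds 3.75.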
- | *//cluster.method//: clustering method. Possible values are: | + | __Parameters for //PCA-maSigFun visualization//__\\ |
- | * "hclust": hierarchical clustering\\ | + | To show the trajectories in plots //maSig visualization// applies clustering methods to group |
- | * "kmeans": k-means\\ | + | functional categories with similar trends and summarize the graphical display. |
- | *//series.to.see//: number of the series to visualize from the available series.\\ | + | *//Series to see//: name of the series to visualize from the available series.\\ |
+ | *//Clustering method//: available methods are: | ||
+ | * 'hclust': hierarchical clustering\\ | ||
+ | * 'kmeans': k-means\\ | ||
+ | *//Number of clusters//: number of groups into which the gene selection is split to show results. | ||
+ | __PCA parameters:__ | ||
+ | Threshold for significant gene contribution for the PCA model. This threshold allows the identification | ||
+ | of the genes that most contribute to the selected components. It can be computed | ||
+ | by applying several procedures: | ||
+ | *//Resampling//: where a null leverage distribution is created by permuting columns of expression data and genes are selected at the "alpha" percentile of the null distribution. | ||
+ | *//minAS//: where a density function is calculated on the data and genes are selected on a local minimum basis [[references |[7]]]. | ||
+ | *//Gamma//: where a gamma distribution is fitted to the distribution of the gene loadings, and genes are selected at the "alpha" percentile of the gamma distribution [[references |[7]]]. | ||
+ | *//Custom//: where the user can decide the threshold. | ||
\\ | \\ | ||
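The //Resampling// option can be illustrated with a rough sketch. Several simplifications are assumed here and none of them is confirmed by the tool: the squared loading on the first principal component stands in for the leverage, "permuting columns" is read as shuffling each gene's values across arrays, and the threshold is taken as the (1 - alpha) quantile of the pooled null values. All names and the example data are hypothetical.

```python
import math
import random

def first_pc_loadings(rows, n_iter=100):
    """Unit-norm loadings of the first PC, computed by power iteration."""
    n_genes, n_arrays = len(rows), len(rows[0])
    centered = []
    for row in rows:
        mean = sum(row) / n_arrays
        centered.append([x - mean for x in row])
    cov = [[sum(a * b for a, b in zip(centered[i], centered[j])) / (n_arrays - 1)
            for j in range(n_genes)]
           for i in range(n_genes)]
    v = [1.0 + 0.1 * i for i in range(n_genes)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(n_genes)) for i in range(n_genes)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def resampling_threshold(rows, alpha=0.05, n_perm=200, seed=0):
    """Null-based cut-off for squared loadings (a stand-in for leverage)."""
    rng = random.Random(seed)
    null_values = []
    for _ in range(n_perm):
        # Assumed reading of "permuting columns": shuffle each gene's
        # values independently across arrays to break gene correlations.
        permuted = [rng.sample(row, len(row)) for row in rows]
        null_values.extend(x * x for x in first_pc_loadings(permuted))
    null_values.sort()
    # Take the (1 - alpha) quantile of the pooled null distribution.
    index = min(len(null_values) - 1,
                int(math.ceil((1.0 - alpha) * len(null_values))) - 1)
    return null_values[index]

# Made-up expression values for three genes measured on nine arrays.
data = [
    [0.5, 0.2, 0.7, 1.3, 1.4, 1.0, 2.1, 2.4, 2.6],
    [0.5, 0.3, 0.4, 0.3, 0.4, 0.1, 0.1, 0.4, 0.5],
    [0.4, 0.3, 0.8, 1.1, 1.5, 0.9, 2.0, 2.2, 2.7],
]
threshold = resampling_threshold(data, alpha=0.05, n_perm=50, seed=1)
observed = [x * x for x in first_pc_loadings(data)]
contributing_genes = [i for i, value in enumerate(observed) if value > threshold]
```

Fixing the seed makes the threshold reproducible between runs. The //Custom// option corresponds to skipping the permutation step and supplying `threshold` directly.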
+ | |||