Meta-analysis of lung cancer studies

A. Description of study

Lung cancer is the most common cause of cancer death worldwide, and 85% of patients have a subtype known as non-small cell lung cancer (NSCLC). Although it is strongly associated with smoking, lung cancer in never-smokers is more common in women than in men. Understanding its biology and molecular mechanisms is crucial for developing effective therapies and improving diagnosis.

To this end, we carried out a systematic review and selected several transcriptomic studies with similar characteristics; they are listed in the work plan below.


B. Work plan

Now we want to combine the selected datasets in a single meta-analysis, using the techniques available in ImaGEO:

  1. GSE10072. Gene expression signature of cigarette smoking and its role in lung adenocarcinoma development and survival.
  2. GSE31210. Identification of genes up-regulated in ALK-positive and EGFR/KRAS/ALK-negative lung adenocarcinomas.
  3. GSE19188. Expression data for early stage NSCLC.

To start, we recommend that you have a look at the description of each study (see the previous GEO links). The next step is to find out how many samples each study contains: detailed information.

After checking this information, you are ready to start working in ImaGEO. We want to run two meta-analyses, one for men and one for women, each comparing case and control samples.

After finishing these activities in ImaGEO, we will assess the meta-analysis results with several functional characterization approaches to understand the role of these interesting genes.


C. Questions

C.1. Working from ImaGEO

To run each meta-analysis (men and women), first select all three studies and assign their samples to each group (control and case), then set the values for all parameters and run the analysis.
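To make the idea of combining studies more concrete, here is a minimal sketch of one common technique, a fixed-effect (inverse-variance) meta-analysis for a single gene. It is only an illustration under stated assumptions: the effect sizes and variances are hypothetical, the function name `fixed_effect_meta` is ours, and the method you actually select in ImaGEO may differ.

```python
import math


def fixed_effect_meta(effects, variances):
    """Pool per-study effect sizes using inverse-variance weights.

    effects   -- one effect size per study (e.g. case vs. control difference)
    variances -- the sampling variance of each effect size
    Returns (pooled effect, standard error, z statistic).
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    std_error = math.sqrt(1.0 / total_weight)
    z_stat = pooled / std_error
    return pooled, std_error, z_stat


if __name__ == "__main__":
    # Hypothetical values for one gene in three studies (not real results).
    effects = [0.8, 1.1, 0.6]
    variances = [0.04, 0.09, 0.05]
    pooled, std_error, z_stat = fixed_effect_meta(effects, variances)
    print(f"pooled effect = {pooled:.3f}")
    print(f"standard error = {std_error:.3f}")
    print(f"z statistic = {z_stat:.2f}")
```

Studies with smaller variance (usually those with more samples) receive more weight, which is one reason why the number of samples per group matters in the questions below.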

Some questions about the report of results:

  1. Summary / Datasets info.
    • Were there any problems with the selected platforms?
    • Did you have any difficulties assigning the samples of interest?
    • What do you think about the number of samples in each group? Are they balanced?
  2. Interstudy Quality Control.
    • Datasets Boxplots. What is the range of the expression values?
    • Missing values. Is there a large percentage of missing values in each study?
  3. Results.
    • Differentially expressed genes summary. Are there large differences between the significant results of each individual study and those of the combined studies?
    • Meta-analysis results. Could you explain the meaning of these indicators: fdr_pval, pval, zval?
    • Heatmaps. Briefly explain these graphical representations in the context of your analysis.
    • R code. We would like to repeat the process using the R code provided by ImaGEO.
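As a hint for the question about the indicators, the following sketch shows how a z statistic, a p-value and an FDR-adjusted p-value are usually related. It assumes a two-sided test against the standard normal distribution and the Benjamini-Hochberg correction; we have not checked that ImaGEO computes fdr_pval in exactly this way. The gene names and z values are hypothetical.

```python
import math


def two_sided_pval(z):
    """Two-sided p-value of a z statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values never decrease as the raw p-values increase.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical z statistics for three placeholder genes.
    zvals = {"GENE_A": 4.2, "GENE_B": -2.1, "GENE_C": 0.3}
    genes = list(zvals)
    pvals = [two_sided_pval(zvals[g]) for g in genes]
    fdrs = benjamini_hochberg(pvals)
    for gene, p, q in zip(genes, pvals, fdrs):
        print(f"{gene}: zval={zvals[gene]:+.2f} pval={p:.4g} fdr_pval={q:.4g}")
```

The sign of the z statistic gives the direction of the change, while the p-value depends only on its magnitude. The FDR adjustment accounts for the fact that thousands of genes are tested at once.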

C.2. Comparing meta-analysis results between men and women

  1. Intersection. Do the meta-analyses for men and women share significant genes? We need the number of common significant genes, of women-specific significant genes and of men-specific significant genes. Could you prepare a clear figure of this intersection? (Clue: Venny)
  2. Functional Enrichment. For each specific group of significant genes in men (and likewise in women), we would like to know the functional profile. Could you generate this information from:
  3. Conclusions. Are there differences in the meta-analysis results when comparing men and women? Does the functional characterization help to explain these differences?
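If you prefer to check the intersection counts yourself before drawing the diagram, the three groups can be obtained with simple set operations. This is a minimal sketch: the function name `partition_genes` and the gene identifiers are placeholders, and in practice the two lists would come from the significant genes reported by each meta-analysis.

```python
def partition_genes(women_genes, men_genes):
    """Split two lists of significant genes into common and sex-specific sets."""
    women = set(women_genes)
    men = set(men_genes)
    return {
        "common": women & men,
        "women_only": women - men,
        "men_only": men - women,
    }


if __name__ == "__main__":
    # Placeholder identifiers, not real results.
    women_significant = ["GENE_A", "GENE_B", "GENE_C"]
    men_significant = ["GENE_B", "GENE_C", "GENE_D"]
    groups = partition_genes(women_significant, men_significant)
    for name, genes in groups.items():
        print(f"{name}: {len(genes)} genes -> {sorted(genes)}")
```

The two sex-specific sets are the gene lists to use as input for the functional enrichment step.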