Exploratory analysis

 

 

This script allows you to explore and display the data from the selected studies. The raw data were downloaded from GEO and ArrayExpress, and the image data from the microarrays were converted to unnormalized counts.

Given the variability of the information, the studies were unified according to the following criteria, taking into account these characteristics and the levels present in each experiment:

  • Sample identifier (Sample):
    Name assigned to the sample according to a simple ascending numbering.
    Levels: S1...Sn
  • Diabetes (Diabetes):
    Medical condition that categorizes the patient based on response to insulin and the presence of T2D.
      · IS: Insulin sensitive (control)
      · IR: Insulin resistant (diabetic)
    Levels: IS, IR
  • Obesity (Obesity):
    Condition that categorizes the patient in terms of obesity. It is defined according to BMI; for this study, a limit of 30 kg/m2 has been set in all cases, based on previously defined and documented medical criteria.
      · Np: Normal weight (BMI < 30 kg/m2)
      · Ob: Obesity (BMI >= 30 kg/m2)
    Levels: Np, Ob
  • Group (Group):
    Combination of the patient's obesity status and T2D, which facilitates the analysis and comparison of different patient subgroups. In this analysis, three levels are defined, obtained by combining the two previous variables and distributed among the different studies. They have been recoded to correspond to the nomenclature of the paper.
      · C: Control (Np_IS)
      · Ob: Obesity (Ob_IS)
      · T2D: Type 2 diabetes (Ob_IR)
    Levels: C, Ob, T2D
  • Sex (Sex):
    Sex of the patient from whom the sample originated.
      · M: Male
      · F: Female
    Levels: M, F
  • Tissue (Tissue):
    Classification of the tissue of origin of the sample. Based on the origin of the sample and the previously described clinical and physiological differences between the tissues and their location, two clinically relevant groups have been defined: SAT and VAT.
      · SAT: Subcutaneous Adipose Tissue
      · VAT: Visceral Adipose Tissue
    Levels: SAT, VAT
  • Age group (AgeGroup):
    Classification of patients by age group. Because of the prevalence and importance of T2D in advanced age, and the metabolic and hormonal differences typical of age and development, patients were divided into two groups.
      · Adult: Adult
      · Non adult: Children under the age of 18
    Levels: Adult, Non adult
  • Patient identifier (Patient):
    Identifier code assigned to each sample according to the patient of origin. If two or more samples come from the same patient, they share the same identifier.
    Levels: P1...Pn
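
The recoding rules above can be illustrated with a minimal Python sketch. It is not the code used in this report: the function names are hypothetical, and raising an error for a combination with no defined Group level (for example Np_IR) is an assumption made here for illustration only.

```python
BMI_OBESITY_CUTOFF = 30.0  # kg/m2, the limit stated in the criteria above

# Recoding of the Obesity_Diabetes combination into the three Group levels
GROUP_CODES = {("Np", "IS"): "C", ("Ob", "IS"): "Ob", ("Ob", "IR"): "T2D"}


def obesity_level(bmi):
    """Return 'Ob' for BMI >= 30 kg/m2, otherwise 'Np'."""
    return "Ob" if bmi >= BMI_OBESITY_CUTOFF else "Np"


def group_level(obesity, diabetes):
    """Combine the Obesity and Diabetes levels into C, Ob or T2D."""
    try:
        return GROUP_CODES[(obesity, diabetes)]
    except KeyError:
        # Assumption: combinations without a defined level are rejected
        raise ValueError(f"no Group level defined for {obesity}_{diabetes}") from None


def age_group(age_years):
    """Return 'Non adult' for patients under 18, otherwise 'Adult'."""
    return "Adult" if age_years >= 18 else "Non adult"


# Hypothetical patient: BMI 32.5 kg/m2, insulin resistant, 45 years old
obesity = obesity_level(32.5)
print(obesity, group_level(obesity, "IR"), age_group(45))
```

Keeping the cut-off and the recoding table as named constants makes it easy to check that every study is harmonized with the same rules.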

After unification, the data from each study were processed and stored in objects of the SummarizedExperiment and ExpressionSet classes.

 


 

Summary

Transcriptomic study using arrays on the Affymetrix GeneChip Human Genome U133 Plus 2.0 [HG-U133_Plus_2] platform.

The aim of this study was to investigate the global transcriptomic profiles of twin pairs discordant for body mass index (BMI). The profiles were obtained from abdominal subcutaneous adipose tissue (SAT), and the accompanying BMI data allow patients to be classified as obese or control.

 

 

Figure 1: Summary of the number of samples comprising the E-MEXP-1425 study by group and sex.

 


 

 

Boxplot

Figure 2: Boxplot of the study E-MEXP-1425. Samples are colored by medical condition, with color intensity indicating sex.
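
The boxplots compare the distribution of expression values across samples. As a minimal sketch, assuming one list of values per sample (the sample names and numbers below are hypothetical, and the plotting code behind the figures is not shown in this report), the five-number summary underlying each box could be computed as follows:

```python
from statistics import quantiles


def five_number_summary(values):
    """Return min, quartiles and max for one sample's expression values."""
    # method="inclusive" treats the values as the complete population
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": q2, "q3": q3, "max": max(values)}


# Hypothetical expression values for two samples
expression = {
    "S1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "S2": [2.0, 4.0, 6.0, 8.0, 10.0],
}
summaries = {name: five_number_summary(vals) for name, vals in expression.items()}
for name, summary in summaries.items():
    print(name, summary)
```

Samples whose medians or interquartile ranges differ markedly from the rest are the ones a boxplot makes visible at a glance.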

 


 

PCA

Figure 3: Principal component analysis of the study E-MEXP-1425. Samples are colored by medical condition, with color intensity indicating sex.

 


 

Clustering

Figure 4: Hierarchical clustering of the study E-MEXP-1425. Samples are colored by medical condition, with color intensity indicating sex. Clustering was performed on the basis of both the Euclidean distance (A) and the correlation between samples (B).
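
The two dissimilarity measures used for the clustering panels behave differently, which the following sketch illustrates. It assumes that the correlation-based dissimilarity is 1 minus the Pearson correlation; the report does not state the exact definition or the linkage method, and the sample vectors are hypothetical.

```python
import math


def euclidean_distance(x, y):
    """Euclidean distance between two expression vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_correlation(x, y):
    """Pearson correlation; undefined (division by zero) for constant vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_distance(x, y):
    """Assumed dissimilarity: 1 - r, ranging from 0 (r = 1) to 2 (r = -1)."""
    return 1.0 - pearson_correlation(x, y)


def distance_matrix(samples, metric):
    """Pairwise distances between all samples, keyed by (name, name)."""
    return {(a, b): metric(samples[a], samples[b]) for a in samples for b in samples}


# Hypothetical samples: S2 has the same profile as S1 at twice the intensity
samples = {"S1": [1.0, 2.0, 3.0, 4.0], "S2": [2.0, 4.0, 6.0, 8.0], "S3": [4.0, 3.0, 2.0, 1.0]}
d_euc = distance_matrix(samples, euclidean_distance)
d_cor = distance_matrix(samples, correlation_distance)
print(d_euc[("S1", "S2")], d_cor[("S1", "S2")], d_cor[("S1", "S3")])
```

Here S1 and S2 are far apart in Euclidean terms but essentially identical under the correlation-based measure, which is why the two panels can group samples differently.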

 


 

Summary

Transcriptomic study by arrays using the following platforms:

  • [HG_U95A] Affymetrix Human Genome U95A Array.
  • [HG_U95B] Affymetrix Human Genome U95B Array.
  • [HG_U95C] Affymetrix Human Genome U95C Array.
  • [HG_U95D] Affymetrix Human Genome U95D Array.
  • [HG_U95E] Affymetrix Human Genome U95E Array.
  • [HG_U95Av2] Affymetrix Human Genome U95 Version 2 Array.

This study compares the transcriptomic profiles of adipocytes derived from abdominal subcutaneous adipose tissue of 39 healthy, non-insulin-resistant Indians: 20 obese and 19 of normal weight. Subjects with a BMI greater than or equal to 30 kg/m2 were considered obese.

 

 

Figure 5: Summary of the number of samples comprising the GSE2508 study by group and sex.

 


 

 

Boxplot

Figure 6: Boxplot of the study GSE2508. Samples are colored by medical condition, with color intensity indicating sex.

 


 

PCA

Figure 7: Principal component analysis of the study GSE2508. Samples are colored by medical condition, with color intensity indicating sex.

 


 

Clustering

Figure 8: Hierarchical clustering of the study GSE2508. Samples are colored by medical condition, with color intensity indicating sex. Clustering was performed on the basis of both the Euclidean distance (A) and the correlation between samples (B).

 


 

 

PCA

Figure 9: Principal component analysis of the study GSE2508 with respect to the batch effect.

 


 

Clustering

Figure 10: Hierarchical clustering of the study GSE2508. Samples are colored by batch, and shape denotes sex. Clustering was performed for each tissue on the basis of both the Euclidean distance (A) and the correlation between samples (B).

 


 

 

Boxplot

Figure 11: Boxplot of the study GSE2508. Samples are colored by medical condition, with color intensity indicating sex.

 


 

PCA

Figure 12: Principal component analysis of the study GSE2508. Samples are colored by medical condition, with color intensity indicating sex.

 


 

Clustering

Figure 13: Hierarchical clustering of the study GSE2508. Samples are colored by medical condition, with color intensity indicating sex. Clustering was performed on the basis of both the Euclidean distance (A) and the correlation between samples (B).

 


 

Summary

Transcriptomic study by means of arrays using the [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array (GPL570) platform.

This study highlights obesity as a risk factor for metabolic diseases and disorders, and more precisely for insulin resistance. To this end, the expression profiles of obese insulin-resistant and insulin-sensitive patients were analyzed in samples of subcutaneous abdominal adipose tissue (SAT) and omental adipose tissue (VAT) from 20 adult patients with an average age of 42 years.

 

 

Figure 14: Summary of the number of samples comprising the GSE20950 study by group and sex.

 


 

 

Boxplot

Figure 15: Boxplot of the study GSE20950. Samples are colored by medical condition, with color intensity indicating sex.

 


 

PCA

Figure 16: Principal component analysis of the study GSE20950. Samples are colored by medical condition, with color intensity indicating sex.

 


 

Clustering

Figure 17: Hierarchical clustering of the study GSE20950. Samples are colored by medical condition, with color intensity indicating sex. Clustering was performed on the basis of both the Euclidean distance (A) and the correlation between samples (B).

 


 

 

PCA

Figure 18: Principal component analysis of the study GSE20950 with respect to the batch effect.

 


 

Clustering

Figure 19: Hierarchical clustering of the study GSE20950. Samples are colored by batch, and shape denotes sex. Clustering was performed for each tissue on the basis of both the Euclidean distance (A) and the correlation between samples (B).

 


 

 

Boxplot

Figure 20: Boxplot of the study GSE20950. Samples are colored by medical condition, with color intensity indicating sex.

 


 

PCA

Figure 21: Principal component analysis of the study GSE20950. Samples are colored by medical condition, with color intensity indicating sex.

 


 

Clustering

Figure 22: Hierarchical clustering of the study GSE20950. Samples are colored by medical condition, with color intensity indicating sex. Clustering was performed on the basis of both the Euclidean distance (A) and the correlation between samples (B).

 


 

Summary

Transcriptomic study using the [HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array [transcript (gene) version] platform (GPL6244).

In an attempt to elucidate the genetic mechanisms behind obesity, which may subsequently lead to the development of type 2 diabetes, this study investigates the differences between obese and non-obese children, with the goal of minimizing the effects of puberty.
For this purpose, 15 samples of abdominal subcutaneous adipose tissue (SAT) and 5 of visceral adipose tissue (VAT) were taken from children under 9 years of age.

 

 

Figure 23: Summary of the number of samples comprising the GSE29718 study by group and sex.

 


 

 

Boxplot

Figure 24: Boxplot of the study GSE29718. Samples are colored by medical condition, with color intensity indicating sex.

 


 

PCA

Figure 25: Principal component analysis of the study GSE29718. Samples are colored by medical condition, with color intensity indicating sex.

 


 

Clustering

Figure 26: Hierarchical clustering of the study GSE29718. Samples are colored by medical condition, with color intensity indicating sex. Clustering was performed on the basis of both the Euclidean distance (A) and the correlation between samples (B).

 


 

 

PCA

Figure 27: Principal component analysis of the study GSE29718 with respect to the batch effect.

 


 

Clustering

Figure 28: Hierarchical clustering of the study GSE29718. Samples are colored by batch, and shape denotes sex. Clustering was performed for each tissue on the basis of both the Euclidean distance (A) and the correlation between samples (B).

 


 

 

Boxplot

Figure 29: Boxplot of the study GSE29718. Samples are colored by medical condition, with color intensity indicating sex.

 


 

PCA

Figure 30: Principal component analysis of the study GSE29718. Samples are colored by medical condition, with color intensity indicating sex.

 


 

Clustering

Figure 31: Hierarchical clustering of the study GSE29718. Samples are colored by medical condition, with color intensity indicating sex. Clustering was performed on the basis of both the Euclidean distance (A) and the correlation between samples (B).

 


 

Summary

Transcriptomic study through arrays using the Illumina HumanHT-12 V4.0 expression beadchip platform (GPL10558).
In an attempt to elucidate the transcriptional changes that lead to the development of obesity, and the possible later association between obesity and type 2 diabetes, this study took samples of abdominal subcutaneous tissue from 64 unrelated, non-diabetic adult subjects with different body mass indexes (BMI).
Based on the BMI data collected in this study, and taking BMI >= 30 kg/m2 as the cut-off, the subjects can be separated into two groups: obese (n=35) and control (n=29).

 

 

Figure 32: Summary of the number of samples comprising the GSE64567 study by group and sex.

 


 

 

Boxplot

Figure 33: Boxplot of the study GSE64567. Samples are colored by medical condition, with color intensity indicating sex.

 


 

PCA

Figure 34: Principal component analysis of the study GSE64567. Samples are colored by medical condition, with color intensity indicating sex.

 


 

Clustering

Figure 35: Hierarchical clustering of the study GSE64567. Samples are colored by medical condition, with color intensity indicating sex. Clustering was performed on the basis of both the Euclidean distance (A) and the correlation between samples (B).

 


 

Summary

Transcriptomic study using the Affymetrix Human Genome U133 Plus 2.0 Array platform [CDF: Brainarray HGU133Plus2_Hs_ENTREZG_v18].

This study investigates the global transcriptomic profiles of twin pairs with discordant body mass indexes (differing by more than 3 kg/m2). These profiles were obtained from abdominal subcutaneous adipose tissue (SAT).

 

 

Figure 36: Summary of the number of samples comprising the GSE92405 study by group and sex.

 


 

 

Boxplot

Figure 37: Boxplot of the study GSE92405. Samples are colored by medical condition, with color intensity indicating sex.

 


 

PCA

Figure 38: Principal component analysis of the study GSE92405. Samples are colored by medical condition, with color intensity indicating sex.

 


 

Clustering

Figure 39: Hierarchical clustering of the study GSE92405. Samples are colored by medical condition, with color intensity indicating sex. Clustering was performed on the basis of both the Euclidean distance (A) and the correlation between samples (B).

 


 

Summary

Transcriptomic study by high throughput sequencing.

This study examined the development of type 2 diabetes in relation to obesity. For this purpose, samples of abdominal subcutaneous adipose tissue were taken from three groups of patients: healthy subjects without obesity (controls, n=9), healthy obese subjects (n=8) and obese subjects with type 2 diabetes (n=8).

 

 

Figure 40: Summary of the number of samples comprising the GSE141432 study by group and sex.

 


 

 

Boxplot

Figure 41: Boxplot of the study GSE141432. Samples are colored by medical condition, with color intensity indicating sex.

 


 

PCA

Figure 42: Principal component analysis of the study GSE141432. Samples are colored by medical condition, with color intensity indicating sex.

 


 

Clustering

Figure 43: Hierarchical clustering of the study GSE141432. Samples are colored by medical condition, with color intensity indicating sex. Clustering was performed on the basis of both the Euclidean distance (A) and the correlation between samples (B).

 


 

Summary

Transcriptomic study by high throughput sequencing.

The main objective is to elucidate the mechanisms that generate the UPV (‘unexplained’ phenotypic variation) observed in previous human and murine studies, which in this case could be related to obesity. For this purpose, a total of 61 abdominal subcutaneous tissue samples were taken from children under 18 years of age belonging to two experimental groups: obesity (n=26) and control (n=35).

 

 

Figure 44: Summary of the number of samples comprising the GSE205668 study by group and sex.

 


 

 

Boxplot

Figure 45: Boxplot of the study GSE205668. Samples are colored by medical condition, with color intensity indicating sex.

 


 

PCA

Figure 46: Principal component analysis of the study GSE205668. Samples are colored by medical condition, with color intensity indicating sex.

 


 

Clustering

Figure 47: Hierarchical clustering of the study GSE205668. Samples are colored by medical condition, with color intensity indicating sex. Clustering was performed on the basis of both the Euclidean distance (A) and the correlation between samples (B).

 


 

Summary of the gene identification and annotation.

 

| Study | Platform | Package | Origin | No. of genes (pre-annotation) | No. of genes (post-annotation) |
|---|---|---|---|---|---|
| E_MEXP_1425 | Affymetrix GeneChip Human Genome U133 Plus 2.0 [HG-U133_Plus_2] | hgu133plus2.db | Bioconductor | 54,675 | 21,367 |
| GSE2508 | [HG_U95Av2] Affymetrix Human Genome U95 Version 2 Array; [HG_U95A] Affymetrix Human Genome U95A Array; [HG_U95B] Affymetrix Human Genome U95B Array; [HG_U95C] Affymetrix Human Genome U95C Array; [HG_U95D] Affymetrix Human Genome U95D Array; [HG_U95E] Affymetrix Human Genome U95E Array | hgu95av2.db; hgu95a.db; hgu95b.db; hgu95c.db; hgu95d.db; hgu95e.db | Bioconductor | 62,881 | 19,891 |
| GSE20950 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | hgu133plus2.db | Bioconductor | 54,675 | 21,367 |
| GSE29718 | [HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array [transcript (gene) version] | hugene10sttranscriptcluster.db | Bioconductor | 32,321 | 19,975 |
| GSE64567 | Illumina HumanHT-12 V4.0 expression beadchip | illuminaHumanv4.db | Bioconductor | 31,545 | 16,589 |
| GSE92405 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array [CDF: Brainarray HGU133Plus2_Hs_ENTREZG_v18] | hgu133plus2.db | Bioconductor | 54,675 | 21,367 |
| GSE141432 | Illumina NextSeq 500 (Homo sapiens) | org.Hs.eg.db | Bioconductor | 58,298 | 35,795 |
| GSE205668 | Illumina HiSeq 2500 (Homo sapiens) | org.Hs.eg.db | Bioconductor | 63,676 | 35,949 |
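
The drop from pre-annotation to post-annotation counts arises because several features can map to the same gene and some features have no annotation at all. The sketch below shows one way such a reduction could work. It is an illustration under stated assumptions: the report does not say how duplicated genes were resolved, so taking the per-sample median is an assumption, and the probe and gene identifiers are hypothetical.

```python
from collections import defaultdict
from statistics import median


def collapse_to_genes(expression, probe_to_gene):
    """Collapse probe-level rows to one row per gene.

    expression: {probe_id: [value per sample]}
    probe_to_gene: {probe_id: gene_id, or None when unannotated}
    """
    rows_by_gene = defaultdict(list)
    for probe, values in expression.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue  # unannotated probes are dropped
        rows_by_gene[gene].append(values)
    # Assumption: probes of the same gene are summarized by the per-sample median
    return {
        gene: [median(column) for column in zip(*rows)]
        for gene, rows in rows_by_gene.items()
    }


# Hypothetical data: four probes, two samples
expression = {"p1": [1, 2], "p2": [3, 4], "p3": [5, 6], "p4": [7, 8]}
probe_to_gene = {"p1": "geneA", "p2": "geneA", "p3": "geneB", "p4": None}
collapsed = collapse_to_genes(expression, probe_to_gene)
print(len(expression), "features before,", len(collapsed), "genes after")
```

In this toy example four features reduce to two genes, mirroring on a small scale the reduction reported in the table.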

 


 

 

UpSet

 

Figure 48: UpSet plot summary of study annotations.

 

Summary of the studies and samples.

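| Tissue | Group | E_MEXP_1425 | GSE20950 | GSE29718 | GSE64567 | GSE92405 | GSE141432 | GSE205668 |
|---|---|---|---|---|---|---|---|---|
| SAT | C | 13 | 0 | 8 | 29 | 26 | 9 | 35 |
| SAT | Ob | 14 | 10 | 7 | 35 | 26 | 8 | 26 |
| SAT | T2D | 0 | 9 | 0 | 0 | 0 | 8 | 0 |
| VAT | C | 0 | 0 | 2 | 0 | 0 | 0 | 0 |
| VAT | Ob | 0 | 10 | 3 | 0 | 0 | 0 | 0 |
| VAT | T2D | 0 | 10 | 0 | 0 | 0 | 0 | 0 |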
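
A table of this shape can be derived directly from the unified sample metadata. The following is a minimal sketch with hypothetical sample records and study names; it only assumes that each sample carries the Tissue, Group and study labels defined at the start of this report.

```python
from collections import Counter


def count_samples(samples):
    """Count samples per (Tissue, Group, Study) combination."""
    return Counter((s["Tissue"], s["Group"], s["Study"]) for s in samples)


# Hypothetical sample records; real metadata would come from the study objects
samples = [
    {"Sample": "S1", "Study": "StudyA", "Tissue": "SAT", "Group": "C"},
    {"Sample": "S2", "Study": "StudyA", "Tissue": "SAT", "Group": "Ob"},
    {"Sample": "S3", "Study": "StudyA", "Tissue": "SAT", "Group": "Ob"},
    {"Sample": "S4", "Study": "StudyB", "Tissue": "VAT", "Group": "T2D"},
]
counts = count_samples(samples)

# One row per Tissue/Group combination, one column per study
studies = ["StudyA", "StudyB"]
for tissue in ("SAT", "VAT"):
    for group in ("C", "Ob", "T2D"):
        row = [counts[(tissue, group, study)] for study in studies]
        print(tissue, group, *row)
```

A `Counter` returns 0 for combinations that never occur, which matches the zero cells in the table without any special handling.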

 


 

 

BarPlot

 

Figure 49: Summary BarPlot.