2.1 Primary Analysis - Sequence processing

After sequencing, reads are mapped to the human reference genome (GRCh37). Several filtering steps then reduce the number of reads before variant calling. Here, we describe how this reduction occurs in each pipeline.

BIER's pipeline

In this pipeline, the number of reads decreases at four stages: mapping, filtering by mapping quality, duplicate removal, and interval realignment. The table reports the reads remaining after each stage, with the following columns:

  • N_reads_forward and N_reads_reverse: initial number of forward and reverse reads obtained in the exome sequencing process.
  • N_mapped_read_pairs: number of read pairs mapped to the human genome reference.
  • %_mapped_read_pairs: percentage of initial read pairs mapped to the human genome reference.
  • N_mapped_reads_mapq>10: number of mapped reads whose mapping quality (mapq) is higher than 10.
  • %_mapped_reads_mapq>10: percentage of initial mapped reads whose mapping quality (mapq) is higher than 10.
  • N_reads_single_hit: number of reads uniquely mapped to the human genome reference.
  • %_reads_single_hit: percentage of initial reads uniquely mapped to the human genome reference.
  • N_reads_single_hit_realigned: number of reads realigned in the exome capture kit targets.
  • %_reads_single_hit_realigned: percentage of initial reads realigned in the exome capture kit targets.
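
The successive reduction can be pictured as a chain of filters, with each stage counted against the initial total. The following is a minimal sketch only: the read records, their fields (mapped flag, mapq, duplicate flag) and the function name `stage_counts` are hypothetical and do not come from BIER's pipeline, and the realignment stage is left out because it depends on the capture kit target intervals.

```python
# Hypothetical read records: (is_mapped, mapq, is_duplicate).
# These values are invented for illustration; they are not pipeline data.
reads = [
    (True, 60, False),
    (True, 5, False),
    (False, 0, False),
    (True, 37, True),
    (True, 42, False),
]


def stage_counts(reads, min_mapq=10):
    """Count the reads that survive each successive filtering stage."""
    mapped = [r for r in reads if r[0]]
    # Keep reads whose mapping quality is strictly higher than min_mapq.
    high_quality = [r for r in mapped if r[1] > min_mapq]
    # Drop reads flagged as duplicates.
    deduplicated = [r for r in high_quality if not r[2]]
    return {
        "initial": len(reads),
        "mapped": len(mapped),
        "mapq_filtered": len(high_quality),
        "deduplicated": len(deduplicated),
    }


counts = stage_counts(reads)
for stage, n in counts.items():
    # Each percentage is relative to the initial number of reads.
    print(f"{stage}: {n} ({100.0 * n / counts['initial']:.2f}%)")
```

Each stage filters the output of the previous one, so the counts can only stay equal or decrease from one stage to the next.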

CNAG's pipeline

In contrast with BIER's pipeline, the number of reads decreases at only two stages in this pipeline: mapping and duplicate removal.

  • N_reads_forward and N_reads_reverse: initial number of forward and reverse reads obtained in the exome sequencing process.
  • N_mapped_read_pairs: number of read pairs mapped to the human genome reference.
  • %_mapped_read_pairs: percentage of initial read pairs mapped to the human genome reference.
  • N_read_pairs_single_hit: number of read pairs uniquely mapped to the human genome reference.
  • %_read_pairs_single_hit: percentage of initial read pairs uniquely mapped to the human genome reference.

Sample  N_reads_forward  N_reads_reverse  N_mapped_read_pairs  %_mapped_read_pairs  N_read_pairs_single_hit  %_read_pairs_single_hit
SGT038  31471997         31471997         27098380             86.10                26533415                 84.31
SGT077  27308034         27308034         23450991             85.88                22904031                 83.87
SGT161  27566691         27566691         23668265             85.86                23170780                 84.05
SGT187  29730857         29730857         25554609             85.95                24894597                 83.73
SGT230  30415770         30415770         26257368             86.33                25639411                 84.30
SGT238  29472514         29472514         25386147             86.13                24773292                 84.06
SGT241  29513223         29513223         25365246             85.95                24539542                 83.15
SGT274  30394832         30394832         26242677             86.34                25693406                 84.53
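
The percentage columns can be reproduced from the counts. The sketch below assumes that each percentage is the count divided by the initial number of read pairs and rounded to two decimals. The initial number is taken here as N_reads_forward, since the forward and reverse counts are equal. The function name `percent_of_initial` is illustrative and not part of either pipeline.

```python
def percent_of_initial(count, initial):
    """Return count as a percentage of initial, rounded to two decimals."""
    if initial <= 0:
        raise ValueError("initial must be positive")
    return round(100.0 * count / initial, 2)


# Counts for sample SGT038, copied from the table above.
n_reads_forward = 31471997
n_mapped_read_pairs = 27098380
n_read_pairs_single_hit = 26533415

pct_mapped = percent_of_initial(n_mapped_read_pairs, n_reads_forward)
pct_single_hit = percent_of_initial(n_read_pairs_single_hit, n_reads_forward)

# Prints the two percentage columns for SGT038: 86.10 84.31
print(f"{pct_mapped:.2f} {pct_single_hit:.2f}")
```

Under this assumption the computed values agree with the reported %_mapped_read_pairs and %_read_pairs_single_hit for SGT038.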
espinos/results.primary.analysis.1438882988.txt.gz · Last modified: 2015/08/06 19:43 by cespinos