2.1 Primary Analysis - Sequence processing
After sequencing, reads are mapped to the human reference genome (GRCh37). Several filtering steps then reduce the number of reads before variant calling. This section describes how that reduction occurs in each pipeline.
BIER's pipeline
In this pipeline, the number of reads decreases at four stages: mapping, filtering by mapping quality, duplicate removal and interval realignment. The table shows the reads remaining after each of these stages.
N_reads_forward and N_reads_reverse: initial numbers of forward and reverse reads obtained from exome sequencing.
N_mapped_read_pairs: number of read pairs mapped to the human genome reference.
%_mapped_read_pairs: percentage of initial read pairs mapped to the human genome reference.
N_mapped_reads_mapq>10: number of mapped reads whose mapping quality (mapq) is higher than 10.
%_mapped_reads_mapq>10: percentage of initial mapped reads whose mapping quality (mapq) is higher than 10.
N_reads_single_hit: number of reads uniquely mapped to the human genome reference.
%_reads_single_hit: percentage of initial reads uniquely mapped to the human genome reference.
N_reads_single_hit_realigned: number of reads realigned in the exome capture kit targets.
%_reads_single_hit_realigned: percentage of initial reads realigned in the exome capture kit targets.
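As an illustration of the kind of read-level filtering described above, the following is a minimal sketch that keeps only alignments that are mapped, have mapq greater than 10 and are not flagged as duplicates. It assumes the alignments are available as SAM text lines, where column 2 is the bitwise FLAG and column 5 is the mapping quality. The function name `passes_filters` and the toy records are hypothetical, and this is not necessarily how BIER's pipeline implements these steps.

```python
# Minimal sketch, not BIER's actual tooling: filter SAM alignment lines
# by mapping status, mapping quality and duplicate flag.

UNMAPPED_FLAG = 0x4     # SAM FLAG bit: segment unmapped
DUPLICATE_FLAG = 0x400  # SAM FLAG bit: PCR or optical duplicate


def passes_filters(sam_line, min_mapq=10):
    """Return True if the read is mapped, has MAPQ > min_mapq and is not a duplicate."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2: bitwise FLAG
    mapq = int(fields[4])  # column 5: MAPQ
    if flag & UNMAPPED_FLAG:
        return False
    if flag & DUPLICATE_FLAG:
        return False
    return mapq > min_mapq


# Hypothetical toy records (not real data from this study).
toy_sam = [
    "r1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",     # kept
    "r2\t0\tchr1\t200\t5\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",      # low MAPQ
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",             # unmapped
    "r4\t1024\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",  # duplicate
]

kept = [line.split("\t")[0] for line in toy_sam if passes_filters(line)]
print(kept)
```

Counting the reads that survive each condition separately would give per-stage numbers analogous to the columns defined above.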
CNAG's pipeline
In contrast to BIER's pipeline, the number of reads decreases at only two stages: mapping and duplicate removal.
N_reads_forward and N_reads_reverse: initial numbers of forward and reverse reads obtained from exome sequencing.
N_mapped_read_pairs: number of read pairs mapped to the human genome reference.
%_mapped_read_pairs: percentage of initial read pairs mapped to the human genome reference.
N_read_pairs_single_hit: number of read pairs uniquely mapped to the human genome reference.
%_read_pairs_single_hit: percentage of initial read pairs uniquely mapped to the human genome reference.
| Sample | N_reads_forward | N_reads_reverse | N_mapped_read_pairs | %_mapped_read_pairs | N_read_pairs_single_hit | %_read_pairs_single_hit |
| SGT038 | 31471997 | 31471997 | 27098380 | 86.10 | 26533415 | 84.31 |
| SGT077 | 27308034 | 27308034 | 23450991 | 85.88 | 22904031 | 83.87 |
| SGT161 | 27566691 | 27566691 | 23668265 | 85.86 | 23170780 | 84.05 |
| SGT187 | 29730857 | 29730857 | 25554609 | 85.95 | 24894597 | 83.73 |
| SGT230 | 30415770 | 30415770 | 26257368 | 86.33 | 25639411 | 84.30 |
| SGT238 | 29472514 | 29472514 | 25386147 | 86.13 | 24773292 | 84.06 |
| SGT241 | 29513223 | 29513223 | 25365246 | 85.95 | 24539542 | 83.15 |
| SGT274 | 30394832 | 30394832 | 26242677 | 86.34 | 25693406 | 84.53 |
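The percentage columns in this table are consistent with dividing each read-pair count by the initial number of read pairs, taken here to be N_reads_forward (one forward read per pair). The sketch below recomputes them under that assumption. The helper `pct` and the variable names are illustrative only and are not part of either pipeline.

```python
# Recompute the percentage columns of the CNAG table, assuming that the
# initial number of read pairs equals N_reads_forward.

# (sample, N_reads_forward, N_mapped_read_pairs, N_read_pairs_single_hit,
#  reported %_mapped_read_pairs, reported %_read_pairs_single_hit)
rows = [
    ("SGT038", 31471997, 27098380, 26533415, 86.10, 84.31),
    ("SGT077", 27308034, 23450991, 22904031, 85.88, 83.87),
    ("SGT161", 27566691, 23668265, 23170780, 85.86, 84.05),
    ("SGT187", 29730857, 25554609, 24894597, 85.95, 83.73),
    ("SGT230", 30415770, 26257368, 25639411, 86.33, 84.30),
    ("SGT238", 29472514, 25386147, 24773292, 86.13, 84.06),
    ("SGT241", 29513223, 25365246, 24539542, 85.95, 83.15),
    ("SGT274", 30394832, 26242677, 25693406, 86.34, 84.53),
]


def pct(count, initial):
    """Percentage of the initial read pairs that remain."""
    return 100.0 * count / initial


recomputed = {}
for sample, n_initial, n_mapped, n_single, rep_mapped, rep_single in rows:
    recomputed[sample] = (pct(n_mapped, n_initial), pct(n_single, n_initial))
    print(f"{sample}\t{recomputed[sample][0]:.2f}\t{recomputed[sample][1]:.2f}")
```

Under this assumption, the difference between the two percentages for a sample is the share of initial read pairs lost at the second stage, roughly 2 percentage points for most samples in the table.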