Bioinformatics Advance Access originally published online on February 26, 2010
Bioinformatics 2010 26(8):1029-1035; doi:10.1093/bioinformatics/btq092
“We define sequence map-ability at any location as the number of different reads of 36 bases that can be uniquely mapped to cover this location. Considering forward and reverse complement, sequence map-ability for any location is an integer in the range of [0 … 72].
To facilitate identifying unique matches, we also define a sequence commonness factor as the number of bases starting at any chromosomal position that are needed to define a unique location on the reference. Reference sequence commonness and sequence map-ability for every point in the reference are calculated and stored during the reference database construction process. Figure 1 shows the percentage of human genome with an equals or higher sequence commonness.”
I agree this is fine for a SNP caller designed to work on any genome. But what makes SNP calls wrong is not a lack of uniqueness; it is the degeneracy factor, which can be more devastating yet is computationally more expensive to account for.
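To make the two quoted definitions concrete, here is a minimal Python sketch on a toy string. It is an illustration under stated assumptions, not the paper's implementation: the function names `mappability` and `commonness` are hypothetical, uniqueness means exact match only (no mismatches or indels), a read that is its own reverse complement is treated as non-unique, and `commonness` looks at the forward strand only. The loops are naive and only suitable for short sequences.

```python
from collections import Counter

_COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMP)[::-1]


def mappability(ref, k=36):
    """For each position, count the k-base reads (both strands) that
    cover it and occur exactly once in the reference.
    Assumption: exact matching; the maximum per position is 2 * k."""
    n = len(ref)
    starts = range(n - k + 1)
    # Count every k-mer as seen on the forward and the reverse strand.
    counts = Counter()
    for s in starts:
        kmer = ref[s:s + k]
        counts[kmer] += 1
        counts[revcomp(kmer)] += 1
    score = [0] * n
    for s in starts:
        # A count of 1 means the forward read is unique; its reverse
        # complement then has a count of 1 as well, hence += 2.
        if counts[ref[s:s + k]] == 1:
            for p in range(s, s + k):
                score[p] += 2
    return score


def commonness(ref):
    """For each start position, the smallest number of bases whose
    sequence occurs only there (forward strand only, by assumption).
    None means no prefix starting at that position is unique."""
    n = len(ref)
    result = []
    for p in range(n):
        needed = None
        for length in range(1, n - p + 1):
            sub = ref[p:p + length]
            # Unique if the first hit is at p and there is no later hit.
            if ref.find(sub) == p and ref.find(sub, p + 1) == -1:
                needed = length
                break
        result.append(needed)
    return result
```

On the toy reference `AAAC` with `k=2`, `mappability` returns `[0, 0, 2, 2]` (only the read `AC` and its reverse complement are unique), and `commonness` returns `[3, 3, 2, 1]`. With the quoted read length of 36, an interior position covered only by unique reads would score 72, which matches the range given in the quote.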