Advanced Dataset Quality Control
Successful discovery demands the highest quality data. For this reason, our genotyping and statistics teams conduct a range of rigorous assessments of the data we generate for you, ensuring that when we analyze it, you will get the clearest and strongest signals.
In every genotyping or sequencing run we do, we perform several steps of quality control and correction before we begin analysis for your project. These include:
• Identifying SNPs that may have low yield or are not in Hardy-Weinberg equilibrium. Where necessary we conduct manual inspection or adjustment of the cluster files used in calling the genotypes.
• Checking the gender and inheritance structure of the samples.
• Performing principal component analysis (PCA) on the genotypes to identify population substructures in the sample set and comparing this to the reported ethnicity.
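As an illustration of the Hardy-Weinberg check in the list above, the following is a minimal sketch that assumes a simple one-degree-of-freedom chi-square test on the genotype counts of a single biallelic SNP. The text does not say which test is used in practice, so the choice of test and the function name `hwe_chi_square` are assumptions made for this example only.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (chi2, p_value) for a 1-df chi-square Hardy-Weinberg test.

    n_aa, n_ab, n_bb are the observed genotype counts for one biallelic SNP.
    This is an illustrative sketch, not the pipeline described in the text.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no called genotypes for this SNP")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip classes with zero expected count (monomorphic SNPs).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A SNP with a very small p-value deviates from the expected genotype proportions and would be a candidate for manual inspection of its cluster file. The cutoff for "very small" is a project-specific choice and is not given here.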
deCODE has many years of experience managing the secured IT systems required to safely house and protect your sensitive research information.
Following these checks, we can identify low-yield samples that might need to be rerun. After doing so, our criteria remain stringent. Samples with fewer than 98% of the SNPs called are excluded from the association analysis, and SNPs are excluded from imputation analysis if their fail rate across the entire cohort exceeds 2%. We also exclude from association analysis any SNPs that show significant differences in yield between the entire patient and entire control datasets (or the largest defined case-control groups). This is done to decrease the likelihood of artifacts.
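The 98% sample threshold and the 2% SNP fail-rate threshold can be sketched as two simple filters. This is only an illustration: the data layout (a dictionary mapping sample IDs to lists of genotype calls, with `None` marking an uncalled genotype) and all function names are hypothetical, and the case-control yield comparison is left out because the text does not specify the test used for it.

```python
MISSING = None  # hypothetical marker for an uncalled genotype

def fail_rate(calls):
    """Fraction of entries in `calls` that are MISSING."""
    return sum(c is MISSING for c in calls) / len(calls)

def filter_samples(genotypes, min_call_rate=0.98):
    """Keep samples with at least `min_call_rate` of their SNPs called."""
    kept = {}
    for sample_id, calls in genotypes.items():
        called = sum(c is not MISSING for c in calls)
        if called / len(calls) >= min_call_rate:
            kept[sample_id] = calls
    return kept

def snps_within_fail_rate(genotypes, max_fail_rate=0.02):
    """Return indices of SNPs whose cohort-wide fail rate is at most `max_fail_rate`."""
    rows = list(genotypes.values())
    n_snps = len(rows[0])
    keep = []
    for j in range(n_snps):
        column = [row[j] for row in rows]  # calls for SNP j across all samples
        if fail_rate(column) <= max_fail_rate:
            keep.append(j)
    return keep
```

For example, a sample with 49 of 50 SNPs called (98%) is kept, while one with 48 of 50 called (96%) is dropped.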



