Abstract

Introduction
Methods
Sample preparation
Genomic DNA samples (n = 380) used in this study were obtained from the National Biobank of Korea. They were randomly selected from among the genomic DNA samples collected through the Korean Genome and Epidemiology Study. Genomic DNA was isolated from blood samples using the Gentra Puregene Blood Kit (Qiagen, Chatsworth, CA) in accordance with the manufacturer's instructions.
SNP genotyping
We selected 20 SNPs included in the Korean Chip (produced through the Korea Biobank Array Project) from among 169 SNP markers for sample identification selected from 1000 Genomes data.6 SNP genotyping was performed using the SNPtype Assay (STA) (Fluidigm, San Francisco, CA), according to the manufacturer's instructions. In brief, genomic DNA (50 ng) was amplified by polymerase chain reaction (PCR) with Qiagen 2 × Multiplex PCR Master Mix (Qiagen) and the STA primer set (in a final volume of 2.5 μL). The PCRs were carried out as follows: 1 cycle of 95°C for 15 minutes, followed by 14 cycles of 95°C for 15 seconds and 60°C for 4 minutes. After amplification, 1.9 μL of the diluted STA products was added to a Sample Pre-Mix (containing 2.25 μL of 2 × Fast Probe Master Mix, 0.225 μL of the SNPtype 20 × Sample Loading Reagent, 0.075 μL of the SNPtype Reagent, and 0.027 μL of ROX™). After the Assay Pre-Mix and the Sample Pre-Mix were loaded into the 192.24 Genotyping Dynamic Array, the SNPtype Assay reaction was performed. Analysis was performed using Fluidigm SNP Genotyping Analysis software (version 4.0.1; Fluidigm). We assessed genotype frequency, allele frequency, call rate, match probability, and heterozygosity for each SNP. The match probability of each SNP was calculated as previously described.7,8
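The per-SNP statistics listed above can be illustrated with a minimal Python sketch. It assumes a biallelic SNP with genotype counts for AA, AB, and BB, and it takes the match probability to be the sum of squared observed genotype frequencies, which is a commonly used definition; the exact formula in the cited references (7, 8) is not reproduced here. The function name, its arguments, and the example counts are hypothetical.

```python
def snp_metrics(n_aa, n_ab, n_bb, n_samples):
    """Summary statistics for one biallelic SNP (illustrative sketch).

    n_aa, n_ab, n_bb: counts of called genotypes; n_samples: samples assayed.
    """
    called = n_aa + n_ab + n_bb
    if called == 0 or n_samples < called:
        raise ValueError("genotype counts are inconsistent with n_samples")

    # Observed genotype frequencies among called samples
    f_aa, f_ab, f_bb = (n / called for n in (n_aa, n_ab, n_bb))

    # Frequency of allele A: each AA carries two copies, each AB carries one
    p = (2 * n_aa + n_ab) / (2 * called)

    return {
        "call_rate": called / n_samples,
        "allele_freq_a": p,
        "allele_freq_b": 1 - p,
        "observed_heterozygosity": f_ab,
        # Expected heterozygosity under Hardy-Weinberg equilibrium (assumption)
        "expected_heterozygosity": 2 * p * (1 - p),
        # Assumed definition: chance that two random individuals share a genotype
        "match_probability": f_aa**2 + f_ab**2 + f_bb**2,
    }


# Hypothetical counts, not data from this study
example = snp_metrics(n_aa=95, n_ab=190, n_bb=95, n_samples=380)
```

With these hypothetical counts (allele frequency 0.5, genotypes in Hardy-Weinberg proportions), the match probability is 0.375 and the expected heterozygosity is 0.5.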
Results
We genotyped 22 SNPs in 380 blood-derived genomic DNA samples. The SNPs included 20 candidate SNP markers for sample identification (rs10055677, rs2470209, rs6785504, rs11208131, rs6796688, rs12286769, rs6840524, rs12876644, rs2613019, rs7307697, rs12965342, rs2736966, rs9268831, rs1433811, rs4027132, rs970022, rs1487602, rs6106856, rs1790875, and rs6596805) and 2 SNP markers for sex determination (rs2563387 and rs2571795). Call rates of the 2 SNP markers for sex determination were 1.0, and 244 of the 380 samples were identified as being from female donors (data not shown). Call rates of the 20 candidate SNP markers for sample identification were >0.99 (Table 1). The match probability of each SNP ranged from 0.364 to 0.425, and the heterozygosity ranged from 0.419 to 0.500. The combined mean match probability of the 20 SNPs was 4.51 × 10⁻⁹.
SNPs, single nucleotide polymorphisms.
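As a rough illustration of how per-SNP values relate to a combined match probability, the sketch below multiplies the per-SNP match probabilities. This assumes the SNPs are statistically independent (no linkage between markers); the text does not state the exact formula used for the combined value, so the product rule and the function name here are assumptions.

```python
import math


def combined_match_probability(per_snp_match_probs):
    """Product of per-SNP match probabilities (assumes independent SNPs)."""
    probs = list(per_snp_match_probs)
    if not probs or any(not 0.0 < p <= 1.0 for p in probs):
        raise ValueError("each match probability must be in (0, 1]")
    return math.prod(probs)


# Bounds implied by the reported per-SNP range (0.364 to 0.425) for 20 SNPs
lower = combined_match_probability([0.364] * 20)
upper = combined_match_probability([0.425] * 20)
```

Under this assumption, the reported combined value of 4.51 × 10⁻⁹ lies between the two bounds, which are roughly 1.7 × 10⁻⁹ and 3.7 × 10⁻⁸.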
Discussion
Blood-derived genomic DNA samples are increasingly used in genetic studies based on next-generation sequencing and array-based genotyping technologies. These technologies enable the production of high-throughput data for the discovery of disease markers and the analysis of disease mechanisms.9 However, preanalytical errors can occur because of DNA contamination or labeling mistakes during sample processing. Genetic sample identification through analysis of SNPs can minimize or eliminate these preanalytical errors. For example, the results for sex-linked SNP markers in a genomic DNA sample can be compared with the sex recorded in the donor's epidemiological questionnaire and clinical examination report. Genetic sample identification through analysis of SNPs can also be used for quality control of next-generation sequencing and SNP genotyping data. The 20 SNPs tested in this study show slightly higher performance in sample identification (combined mean match probability, 4.51 × 10⁻⁹) than other SNP markers. The combined mean match probability is <3.20 × 10⁻⁹ and 4.18 × 10⁻⁹ when sets of 20 SNPs are selected from the sample identification SNPs reported by Kim et al.4 and Lee et al.,5 respectively. We propose these SNPs as sample identification markers for quality control of genomic DNA samples obtained from Korean individuals.
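The sex-based check described above can be sketched as a simple comparison between the sex inferred from genotyping and the sex recorded for the donor. This is a minimal illustration only: the record layout, the key names, and the sample IDs are hypothetical, and how sex is inferred from the two sex-determination SNPs is not specified here, so the inferred value is taken as an input.

```python
def find_sex_mismatches(samples):
    """Return IDs of samples whose genotype-inferred sex disagrees with the record.

    samples: iterable of dicts with hypothetical keys 'sample_id',
    'recorded_sex', and 'inferred_sex' ('F', 'M', or None for no call).
    """
    mismatches = []
    for sample in samples:
        inferred = sample["inferred_sex"]
        if inferred is None:
            continue  # no genotype call, so the record can be neither confirmed nor refuted
        if inferred != sample["recorded_sex"]:
            mismatches.append(sample["sample_id"])
    return mismatches


# Hypothetical records, not data from this study
records = [
    {"sample_id": "S001", "recorded_sex": "F", "inferred_sex": "F"},
    {"sample_id": "S002", "recorded_sex": "M", "inferred_sex": "F"},
    {"sample_id": "S003", "recorded_sex": "M", "inferred_sex": None},
]
flagged = find_sex_mismatches(records)
```

A flagged sample indicates only a discrepancy between the two sources; whether it reflects a labeling error, contamination, or a recording error in the questionnaire would need follow-up.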
Footnotes
Acknowledgments
This work was supported by the Korea Biobank Project (Grant No. 4851-307-210-13) in the Korea National Institute of Health, Korea Centers for Disease Control and Prevention.
Author Disclosure Statement
No conflicting financial interests exist.
