Problem: Complex diseases are largely unsolved
Solution: Find causative & protective genetic variants
Hard-to-solve genetics, such as gene-environment interactions, can be more easily
Problem: Common variants are not the primary causes of disease
Solution: Find high effect size rare variants
CNV-based gene discovery dramatically reduces the ‘genome search space’
Problem: Large number of variants and most are benign
- Human diversity is 15X greater and more complex than previously thought, comprising a spectrum of variants that includes single nucleotide variants (SNVs), insertions/deletions (indels), copy number variants (CNVs), etc.
- Each individual harbors 3-4 million SNVs relative to the reference genome, but the vast majority of these variants are benign and do not cause or contribute to disease.
- Most non-synonymous SNVs are difficult to interpret as disease-relevant without functional validation, while intronic and intergenic SNVs are even more challenging!
Solution: Reduce the genome search space
- Even with high-resolution CNV analysis methods, there are only 100-200 CNVs in an individual’s genome vs. 3-4 million SNVs.
- With fewer CNVs per genome, a much smaller number of cases (‘disease’ genomes) and controls (‘normal’ genomes) needs to be assessed to discover disease-relevant genes.
- CNVs, due to their large size (1,000 - 1,000,000 nucleotides vs. only 1 nucleotide for an SNV), are more likely to result in loss of function (LOF) of a gene, and they are far easier to classify as benign or pathogenic.
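To put the reduction in search space in numbers, the per-genome counts quoted above can be compared directly. This is only a back-of-the-envelope sketch: the ranges are the ones stated in the bullets, and the function name `fold_reduction` is made up for illustration.

```python
# Per-genome variant counts quoted in the text (ranges, not measurements).
SNVS_PER_GENOME = (3_000_000, 4_000_000)
CNVS_PER_GENOME = (100, 200)


def fold_reduction(snvs, cnvs):
    """Return (min, max) fold reduction in variants to evaluate per genome."""
    lowest = min(snvs) / max(cnvs)    # fewest SNVs vs. most CNVs
    highest = max(snvs) / min(cnvs)   # most SNVs vs. fewest CNVs
    return lowest, highest


low, high = fold_reduction(SNVS_PER_GENOME, CNVS_PER_GENOME)
print(f"Roughly {low:,.0f}x to {high:,.0f}x fewer variants to evaluate per genome")
```

Under these figures, the number of variants to interpret per genome drops by roughly four orders of magnitude.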
Common, complex diseases are amalgams of rare diseases
Problem: Disease heterogeneity has been greatly underestimated
- Common diseases (e.g., Autism and Alzheimer’s) are not caused by mutations impacting one or a few genes. For example, Autism is now thought to be caused by mutations in 500-1,000 genes!
- It is likely that most individuals with a genetic disease will have one primary cause (i.e., 1-2 mutations impacting 1 gene), alongside numerous modifier variants of weak effect (which can be deleterious or protective). For example, cystic fibrosis is a single-gene disorder caused by mutations in CFTR, but the severity of lung disease is quite variable even in patients with identical mutations (e.g., p.F508del homozygotes).
- In summary, a common disease can have several biologically unrelated causes and thus require different therapies. For example, anemia caused by iron deficiency needs a different treatment than the genetic disorder sickle cell anemia, which is caused by mutations in HBB.
Solution: Genetic subtyping to dissect a single disease into multiple rare diseases
- Since the majority of common diseases are 50-90% heritable (i.e., caused by genetics), there is a rationale for genetically dissecting them into multiple rare subtypes (e.g., Disease X = subtype X1 + subtype X2 + subtype X3, caused by mutations in gene 1, 2, or 3).
- Clinical trial cohorts assembled on the basis of genetic subtypes enable smaller, more cost-effective clinical trials that have higher success rates. [The 'likelihood of approval' for a drug is presently only ~10%, with >50% of failures due to lack of efficacy: Nat Biotechnol. 2014 Jan;32(1):40-51. PMID: 24406927]
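The subtyping idea (Disease X = subtype X1 + subtype X2 + subtype X3) can be sketched as a simple grouping of cases by the gene carrying their primary causal variant. Everything in this example is hypothetical: the case IDs, the gene labels `GENE1` to `GENE3`, and the helper `subtype_cohort` are placeholders for illustration, not data or methods from any study.

```python
from collections import defaultdict

# Hypothetical cohort: (case ID, gene carrying the primary causal variant).
# None marks a case with no causal gene identified yet.
cohort = [
    ("case01", "GENE1"),
    ("case02", "GENE2"),
    ("case03", "GENE1"),
    ("case04", None),
    ("case05", "GENE3"),
    ("case06", "GENE2"),
]


def subtype_cohort(cases):
    """Group case IDs by causal gene; unresolved cases go under 'unassigned'."""
    subtypes = defaultdict(list)
    for case_id, gene in cases:
        key = gene if gene is not None else "unassigned"
        subtypes[key].append(case_id)
    return dict(subtypes)


subtypes = subtype_cohort(cohort)
for gene, members in sorted(subtypes.items()):
    print(f"{gene}: {len(members)} case(s) -> {members}")
```

Each resulting group would correspond to one candidate rare subtype, and a trial cohort could in principle be drawn from a single group rather than from the disease population as a whole.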
STEP 1: Assemble disease cohort
STEP 2: Perform genome-wide CNV analysis
Gene X2, 1 of 5 cases
Gene X3, 3 of 6 cases
STEP 3: Perform targeted sequencing
Gene X2, 5 of 5 cases
Gene X3, 6 of 6 cases