The genetic dissection of complex traits
Karl W Broman<br />
Biostatistics & Medical Informatics<br />
University of Wisconsin – Madison<br />
www.biostat.wisc.edu/~kbroman
The goal<br />
Identify genes that contribute to complex diseases<br />
Complex disease = one that’s hard to figure out<br />
Many genes + environment + other stuff<br />
The genetic approach<br />
• Start with the trait; find genes that influence it<br />
– Allelic differences at the gene result in phenotypic differences<br />
• Value: Need not know anything in advance<br />
• Goal<br />
– Understand the disease etiology (pathways/mechanisms)<br />
– Prediction/prevention<br />
– Identify possible drug targets<br />
Approaches<br />
• Experimental crosses in model organisms<br />
• Mutagenesis in model organisms<br />
• Association analysis with inbred strains<br />
• Linkage analysis in human pedigrees<br />
– A few large pedigrees<br />
– Many small families (e.g., sib pairs)<br />
• Association analysis in human populations<br />
– Candidate genes vs. whole genome<br />
Human vs mouse<br />
www.daviddeen.com<br />
Mutagenesis<br />
Advantages<br />
+ Can find things<br />
+ At the gene<br />
Disadvantages<br />
– Need cheap phenotype screen<br />
– Mutations must have large effect<br />
– Genes found may be irrelevant<br />
– Still need to map the mutation<br />
– Recessive mutations are hard to see<br />
Intercross<br />
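[Figure: intercross breeding scheme. Parental strains P1 and P2 are crossed to give the F1; F1 individuals are intercrossed to give the F2.]<br />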
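The intercross scheme can be sketched as a small simulation. This is a minimal sketch under stated assumptions, not the author's code: recombination between adjacent markers is treated as independent (no crossover interference), every adjacent pair of markers shares one recombination fraction, and the function names `simulate_gamete` and `simulate_f2` are hypothetical.

```python
import random

def simulate_gamete(n_markers, rec_frac, rng):
    """One gamete from an F1 that is heterozygous at every marker.

    Alleles are coded 0 (from P1) and 1 (from P2). Moving along the
    chromosome, the parental origin switches between adjacent markers
    with probability rec_frac (assumption: no crossover interference).
    """
    allele = rng.randint(0, 1)          # parental strand at the first marker
    gamete = [allele]
    for _ in range(n_markers - 1):
        if rng.random() < rec_frac:     # recombination in this interval
            allele = 1 - allele
        gamete.append(allele)
    return gamete

def simulate_f2(n_ind, n_markers, rec_frac, seed=0):
    """Genotypes (number of P2 alleles: 0, 1, or 2) for n_ind F2 individuals."""
    rng = random.Random(seed)
    f2 = []
    for _ in range(n_ind):
        maternal = simulate_gamete(n_markers, rec_frac, rng)
        paternal = simulate_gamete(n_markers, rec_frac, rng)
        f2.append([m + p for m, p in zip(maternal, paternal)])
    return f2

if __name__ == "__main__":
    genotypes = simulate_f2(n_ind=1000, n_markers=5, rec_frac=0.1)
    first_marker = [ind[0] for ind in genotypes]
    # Genotype frequencies at one marker should be near 1:2:1.
    for g in (0, 1, 2):
        print(g, first_marker.count(g) / len(first_marker))
```

Each F2 genotype is the sum of two independent F1 gametes, which is why heterozygotes are expected at about half of the individuals at any marker.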
LOD curves<br />
[Figure: genome-wide LOD curves. x-axis: chromosome, 1 to 19; y-axis: LOD score, 0 to 8; a horizontal line is labeled 5%.]<br />
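A LOD score at a single marker, and a threshold like the 5% line, can be sketched as follows. This is a hedged illustration, not the method behind the figure: it assumes a normal phenotype model with one mean per genotype group (marker regression), and it assumes the threshold comes from a permutation test, which the slide does not state. The names `marker_lod` and `permutation_threshold` and all data are hypothetical.

```python
import math
import random

def marker_lod(genotypes, phenotypes):
    """LOD score at one marker: (n/2) * log10(RSS_null / RSS_qtl).

    RSS_null uses a single overall mean; RSS_qtl uses one mean per
    genotype group. Assumes normally distributed residuals.
    """
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    groups = {}
    for g, y in zip(genotypes, phenotypes):
        groups.setdefault(g, []).append(y)
    rss_qtl = 0.0
    for ys in groups.values():
        group_mean = sum(ys) / len(ys)
        rss_qtl += sum((y - group_mean) ** 2 for y in ys)
    if rss_null == 0.0:
        return 0.0                      # no phenotypic variation at all
    if rss_qtl == 0.0:
        return float("inf")             # genotype explains everything
    return (n / 2) * math.log10(rss_null / rss_qtl)

def permutation_threshold(marker_genotypes, phenotypes,
                          n_perm=200, alpha=0.05, seed=0):
    """Genome-wide threshold: the (1 - alpha) quantile of the maximum
    LOD over all markers, after shuffling phenotypes across individuals."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        max_lods.append(max(marker_lod(g, shuffled) for g in marker_genotypes))
    max_lods.sort()
    index = math.ceil((1 - alpha) * n_perm) - 1
    return max_lods[index]

if __name__ == "__main__":
    # Invented data: two markers, eight individuals.
    markers = [[0, 1, 2, 1, 0, 2, 1, 1], [2, 1, 0, 1, 2, 0, 1, 1]]
    pheno = [1.0, 2.1, 3.3, 1.9, 0.8, 3.1, 2.2, 1.7]
    print([round(marker_lod(m, pheno), 2) for m in markers])
    print(round(permutation_threshold(markers, pheno), 2))
```

Taking the maximum over markers within each permutation is what makes the threshold genome-wide rather than per-marker.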
Joys of QTL mapping<br />
• Genotype → phenotype<br />
• Recombination is cool<br />
• Simple correlation structure<br />
• Lots of opportunity for collaboration<br />
• Lots of open problems<br />
• Not many competitors<br />
Chromosome 4<br />
[Figure: LOD curve for chromosome 4 with the 1.5-LOD support interval marked. x-axis: map position (cM), 0 to 60; y-axis: LOD score, 0 to 8.]<br />
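The 1.5-LOD support interval can be read off a LOD curve as the region around the peak where the LOD stays within 1.5 of its maximum. The sketch below is a simplified version under stated assumptions: the curve is given at discrete positions, the interval ends at the outermost evaluated positions still above the cutoff (some software extends it one position further), and the function name and numbers are hypothetical.

```python
def lod_support_interval(positions, lods, drop=1.5):
    """Return (left, right) map positions of the LOD support interval.

    Starting at the peak, extend in each direction while the LOD stays
    at or above (peak LOD - drop).
    """
    peak = max(range(len(lods)), key=lambda i: lods[i])
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[right]

# Invented LOD curve evaluated every 10 cM.
positions = [0, 10, 20, 30, 40, 50, 60]
lods = [0.5, 2.0, 5.0, 7.0, 6.0, 3.0, 1.0]
print(lod_support_interval(positions, lods))  # prints (30, 40)
```

A denser grid of positions gives a finer interval; with only a few evaluated positions the interval can only be as precise as the grid spacing.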
Traditional approach<br />
F2/BC → QTL → congenic → subcongenics →<br />
• coding/regulatory polymorphism<br />
• expression/function difference<br />
• knock-in / transgenic<br />
• knock-out<br />
• homology to other species<br />
Issues:<br />
• Large QTL regions<br />
• Time consuming and expensive<br />
A congenic line<br />
[Figure: genome of a congenic line, showing chromosomes 1 to 19, X, and Y.]<br />
More modern approaches<br />
• Gene expression data<br />
– and proteins, metabolites, epigenetic marks<br />
– and eQTL analyses<br />
• Consomics (aka chromosome substitution strains)<br />
• Panels of congenics<br />
• Multiple crosses + haplotype analysis<br />
• Cross-species comparisons<br />
• Advanced intercross lines / Heterogeneous Stock<br />
• Recombinant inbred lines / Collaborative Cross<br />
• Association mapping with inbred lines<br />
Consomics<br />
[Figure: nine genome panels, each showing chromosomes 1 to 19, X, and Y.]<br />
Consomics<br />
Advantages<br />
+ Just phenotyping can get you to the chromosomes<br />
+ Eliminate the effects of other QTL<br />
+ Easy to create congenics<br />
Disadvantages<br />
– Time-consuming, expensive to create<br />
– Lots of phenotyping required<br />
– Cannot see interactions<br />
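The point that phenotyping alone can localize an effect to a chromosome can be sketched as a strain-by-strain comparison against the host strain. This is a minimal sketch, not the analysis used in any cited study: the Welch t statistic is a stand-in choice, the phenotype values are invented, and `welch_t` and `rank_consomics` are hypothetical helper names.

```python
import math
import statistics

def welch_t(sample, reference):
    """Welch t statistic for the difference in means (unequal variances)."""
    mean_diff = statistics.mean(sample) - statistics.mean(reference)
    se = math.sqrt(statistics.variance(sample) / len(sample)
                   + statistics.variance(reference) / len(reference))
    return mean_diff / se

def rank_consomics(host, consomics):
    """Rank substituted chromosomes by |t| versus the host strain.

    consomics maps a chromosome label to the phenotypes measured in the
    strain carrying that substituted chromosome.
    """
    scores = {chrom: welch_t(values, host) for chrom, values in consomics.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

if __name__ == "__main__":
    # Invented phenotypes; labels are placeholders, not real results.
    host = [1.0, 1.1, 0.9, 1.0]
    consomics = {
        "chrA": [1.0, 1.2, 0.9, 1.1],
        "chrB": [0.5, 0.6, 0.4, 0.5],
    }
    for chrom, t in rank_consomics(host, consomics):
        print(chrom, round(t, 2))
```

With one test per substituted chromosome, some correction for multiple comparisons would be needed before calling any chromosome significant; the sketch only ranks them.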
B.A on low fat diet<br />
[Figure: weight gain by strain for the C57BL/6J-Chr A/J/NaJ chromosome substitution panel. x-axis: strain (A/J, B6, and substitution strains labeled 7, 10, 14, 11, 6, 1, Y, 9, 4, 16, 8, X, 12, 2, 19, 3, 17, 13, 15, M, 5, 18); y-axis: weight gain, 0.00 to 0.10.]<br />
Shao et al., PNAS, 105:19910–19914, 2008<br />
Advanced intercross lines<br />
[Figure: advanced intercross breeding scheme, from parental strains A and B (P) through generations F2, F3, F4, F7, and F10.]<br />
AIL<br />
Advantages<br />
+ Many more breakpoints ⇒ more precise mapping<br />
+ Straightforward to create<br />
Disadvantages<br />
– Time and cost<br />
– Each individual genetically distinct<br />
– Useful largely for fine-mapping known loci<br />
– Relationships among individuals in final generation<br />
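Why more generations give more breakpoints can be sketched with the expected recombinant fraction between two loci. This is a sketch under idealized assumptions that the slides do not state: random mating, a large population (no drift), and no selection. Under those assumptions the association between two loci shrinks by a factor of (1 - r) per generation, so the fraction of recombinant haplotypes among gametes forming generation t satisfies 1 - 2·r_t = (1 - r)^(t - 2)·(1 - 2r). Function names are hypothetical.

```python
def ail_recomb_fraction(r, generation):
    """Expected recombinant fraction in an AIL at generation F_t.

    r is the single-meiosis recombination fraction between two loci;
    generation is t (t = 2 is the ordinary F2). Assumes random mating,
    no drift, no selection.
    """
    if generation < 2:
        raise ValueError("generation must be 2 (the F2) or later")
    return 0.5 * (1.0 - (1.0 - r) ** (generation - 2) * (1.0 - 2.0 * r))

def map_expansion(r, generation):
    """How many times larger the apparent recombination fraction is than r."""
    return ail_recomb_fraction(r, generation) / r

if __name__ == "__main__":
    for t in (2, 4, 7, 10):
        print(t, round(ail_recomb_fraction(0.01, t), 4), round(map_expansion(0.01, t), 2))
```

For tightly linked loci the expansion is roughly t/2, which is the sense in which an F10 offers several times the mapping resolution of an F2 for the same number of individuals.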
Sibships at F8<br />
AIL: percent matching genotypes<br />
AIL: genotype data<br />
Heterogeneous stock<br />
[Figure: heterogeneous stock breeding scheme, from eight founder strains A to H at G0 through generations G1, G2, G10, G20, and G40.]<br />
HS<br />
Advantages<br />
+ Super-dense breakpoints<br />
+ Many alleles<br />
+ Heterozygous<br />
Disadvantages<br />
– Must be satisfied with what is available<br />
– Inbreeding: loss of alleles<br />
– Each individual unique<br />
– Like AIL, maybe best for fine-mapping known loci<br />
– Like AIL, relationships among individuals in the last generations<br />
Recombinant inbred lines<br />
[Figure: recombinant inbred line breeding scheme, from parental strains P1 and P2 through F1, F2, F3, F4, and continued inbreeding to F∞.]<br />
RIL<br />
Advantages<br />
+ High density of breakpoints<br />
+ Just genotype once<br />
+ Phenotype multiple individuals to reduce environmental/individual variation<br />
+ Multiple phenotypes on the same genomes<br />
+ Longitudinal phenotypes<br />
+ Genotype × environment interactions<br />
Disadvantages<br />
– Time-consuming, expensive to create<br />
– Available panels generally too small<br />
– Only homozygotes<br />
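The benefit of phenotyping several individuals per line can be made concrete with the variance of a line mean. This is a sketch under a simple additive model that the slides do not spell out: phenotype = line effect + independent individual/environmental deviation, so a mean of n individuals has variance var_genetic + var_env / n. The function name and the numbers are hypothetical.

```python
def line_mean_heritability(var_genetic, var_env, n_per_line):
    """Share of the variance among line means that is genetic.

    Assumes independent environmental deviations with variance var_env,
    averaged over n_per_line individuals from the same inbred line.
    """
    return var_genetic / (var_genetic + var_env / n_per_line)

if __name__ == "__main__":
    # Invented variance components: genetic 1, environmental 3.
    for n in (1, 3, 10):
        print(n, line_mean_heritability(1.0, 3.0, n))
```

The gain from replication flattens out as n grows, so beyond some point additional lines help more than additional individuals per line.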
RIX<br />
[Figure: RIX design. RI strains 1 to 5 are crossed in all pairs to give the RIX: 1×2, 1×3, 1×4, 1×5, 2×3, 2×4, 2×5, 3×4, 3×5, 4×5.]<br />
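The set of RIX crosses is every unordered pair of RI strains, which can be enumerated directly. A minimal sketch, assuming reciprocal crosses (1×2 versus 2×1) are not distinguished; the function name is hypothetical.

```python
from itertools import combinations

def rix_crosses(ri_strains):
    """All unordered pairs of RI strains, i.e. the possible RIX crosses."""
    return list(combinations(ri_strains, 2))

crosses = rix_crosses([1, 2, 3, 4, 5])
print(len(crosses))  # prints 10
print(crosses[:3])   # prints [(1, 2), (1, 3), (1, 4)]
```

With n RI strains there are n(n - 1)/2 possible RIX, so the number of distinct, reproducible genomes grows much faster than the number of RI strains.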
Collaborative Cross<br />
[Figure: Collaborative Cross breeding scheme. G0: eight founder strains A to H. G1: founder genomes A to H combined in pairs. G2: ABCD and EFGH. G3, G4, and continued inbreeding to G∞.]<br />
A CC genome<br />
[Figure: a Collaborative Cross genome, showing chromosomes 1 to 19, X, and Y.]<br />
Association mapping<br />
• Phenotype available inbred strains<br />
• Make use of available SNP data<br />
• Need to account for the correlations among strains<br />
• Likely want to work with haplotypes rather than just individual SNPs<br />
• Be careful about wild-derived strains<br />
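One simple way to quantify the correlations among strains is the fraction of SNPs at which two strains carry the same allele. The sketch below is an illustration, not a method the slides prescribe: it assumes fully inbred strains (one allele per SNP), no missing data, and equal weight for every SNP; strain names, alleles, and the function name are invented.

```python
def allele_sharing_matrix(genotypes):
    """Pairwise fraction of matching SNP alleles between inbred strains.

    genotypes maps a strain name to a list of alleles, one per SNP, with
    all lists in the same SNP order. Returns a dict keyed by strain pairs.
    """
    strains = sorted(genotypes)
    sharing = {}
    for s1 in strains:
        for s2 in strains:
            a, b = genotypes[s1], genotypes[s2]
            matches = sum(x == y for x, y in zip(a, b))
            sharing[(s1, s2)] = matches / len(a)
    return sharing

if __name__ == "__main__":
    # Invented alleles at four SNPs for three hypothetical strains.
    snps = {
        "S1": [0, 0, 1, 1],
        "S2": [0, 0, 1, 0],
        "S3": [1, 1, 0, 0],
    }
    sharing = allele_sharing_matrix(snps)
    for pair in [("S1", "S2"), ("S1", "S3"), ("S2", "S3")]:
        print(pair, sharing[pair])
```

A matrix like this could then enter the association model as a covariance structure among strains; pairs such as S1 and S2 that share most alleles should not be treated as independent observations.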
Association mapping<br />
Advantages<br />
+ Once you’ve done a strain survey, no further data needed<br />
+ Potentially very high resolution<br />
Disadvantages<br />
– All the usual problems with association mapping<br />
– Power is unpredictable<br />
– How to account for relationships among strains<br />
CC vs HS vs association mapping<br />
These approaches have many similarities.<br />
Key differences:<br />
• CC, HS: pattern of association along chromosomes by design<br />
• HS: each individual unique<br />
Summary<br />
• QTL mapping in mice is fun and useful<br />
• But we need to be able to get to the gene<br />
• There are lots of new strategies to speed things along<br />
• Numerous interesting statistical problems remain<br />
• Need to consider human GWAS<br />