Developing 384-plex SNP marker sets for breeding and ... - icrisat
Developing 384-plex SNP marker sets for<br />
breeding and genetics applications in rice<br />
Dr. Michael J. Thomson<br />
Molecular Geneticist<br />
International Rice Research Institute, Philippines<br />
m.thomson@cgiar.org<br />
2nd National Workshop on Marker-Assisted Selection for Crop Improvement,<br />
ICRISAT, India<br />
October 27, 2010
What is the potential for molecular marker technology?<br />
• Enables novel breeding strategies<br />
o Novel genes and alleles with beneficial effects can be identified and quickly introduced into varieties<br />
o QTL pyramiding to combine essential traits<br />
o Genomic selection to increase the rate of genetic gain<br />
• Increases speed and efficiency<br />
o Reduces the time needed to remove negative linkage drag after transferring genes from unimproved sources<br />
o Accelerates the breeding process: precise selection, improved screening, fewer generations
SSR markers have some disadvantages<br />
• High polymorphism rate, but having many alleles makes precise scoring difficult<br />
• SSR data is difficult to merge across labs and groups<br />
• Not easy to run in a high-throughput system due to limitations in multiplex levels<br />
www.gramene.org
SNPs are now the marker of choice<br />
• SNPs are abundant across the genome<br />
• Large pools of SNPs can be used to identify sets of polymorphic markers<br />
• SNP markers are bi-allelic, making allele calling simpler<br />
• SNP data from different systems or groups can be easily merged in a database<br />
• SNP genotyping can be automated, allowing for rapid, high-throughput marker genotyping<br />
[Figure: SNP genotyping with allele-specific oligos at a SNP locus in genomic DNA]
Rice sequence has enabled SNP discovery<br />
High-quality BAC-by-BAC sequence of O. sativa japonica (Nipponbare)<br />
(< 1 error in 10K bases)<br />
International Rice Genome Sequencing Project (IRGSP) 2005 Nature 436:793-800
Resources for SNP genotyping in rice<br />
SNP discovery:<br />
• Nipponbare/93-11 SNPs (Feltus et al. 2004; 384k SNPs)<br />
• OryzaSNP resequencing (20 rice varieties; 160k SNPs)<br />
• Next-gen resequencing (in progress; millions of SNPs)<br />
High density genotyping (association genetics):<br />
• 44k SNP chip (S. McCouch et al., Cornell Univ.)<br />
• Future high density SNP chips<br />
Low density genotyping (QTL mapping, marker-assisted breeding, genetic diversity analysis, DNA fingerprinting):<br />
• Illumina 1536 (CIRAD)<br />
• Illumina 1536 (Cornell Univ.)<br />
• Illumina BeadXpress 96 and 384-plex (IRRI, Cornell, USDA)
OryzaSNP: 160K SNPs detected across 20 varieties<br />
McNally et al. 2009 (PNAS)<br />
[Figure: rice varieties labeled IR64, IAC 165, M202, Moroberekan, Dom Sufid, Cypress, Pokkali, Aswina, Swarna, Inia Tocuari, Co 39, Patbyeo, Gerdeh, Dular, Sadu-cho]
OryzaSNP data available online<br />
www.oryzasnp.plantbiology.msu.edu
Illumina GoldenGate SNP genotyping<br />
Illumina Veracode Technology on the BeadXpress Reader
Genome coverage with rice SNP chips<br />
[Figure: marker positions (Mb) along each chromosome (Chr) for three SNP sets: 384-plex (Thomson, in prep.), 44K (Tung et al., in prep.), 1536-plex (Zhao et al. 2010)]
BeadXpress 384-plex SNP sets for rice<br />
Illumina BeadXpress Reader:<br />
• 96 samples x 384 SNP markers per run<br />
• Less than $0.10 per marker data point<br />
• Automated marker scoring (AA, AB, BB calls)<br />
Working with Susan McCouch (Cornell University) to develop custom 384-plex SNP sets for different subgroups:<br />
• 384-plex for indica x japonica populations<br />
• 384-plex for indica and aus germplasm
Two custom 384-plex SNP sets for optimal polymorphism rates in target germplasm<br />
Parent A | Parent B | Cross | indica/japonica (GS0011862) | indica-indica (GS0011861)<br />
93-11 | Nipponbare | indica x japonica | 311 | 200<br />
IR 64 | Moroberekan | indica x japonica | 256 | 191<br />
IR 64 | Basmati 370 | indica x aromatic | 131 | 188<br />
IR 64 | N 22 | indica x aus | 86 | 280<br />
IR 64 | Dular | indica x aus | 80 | 278<br />
IR 64 | Pokkali | indica x indica | 21 | 204<br />
IR 64 | Mahsuri | indica x indica | 20 | 136<br />
Comparison of the number of polymorphic markers (out of 384) for 7 mapping populations across two OPAs
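
The counts in the comparison above can be turned into polymorphism rates to make the choice of SNP set per cross explicit. The following is a minimal sketch in Python: the marker counts are copied from the slide, while the function name `summarize` and the rule "prefer the set with more polymorphic markers" are illustrative assumptions, not part of the original talk.

```python
# Polymorphic marker counts (out of 384) copied from the slide:
# (parent A, parent B, indica/japonica set GS0011862, indica-indica set GS0011861)
TOTAL_MARKERS = 384

CROSSES = [
    ("93-11", "Nipponbare", 311, 200),
    ("IR 64", "Moroberekan", 256, 191),
    ("IR 64", "Basmati 370", 131, 188),
    ("IR 64", "N 22", 86, 280),
    ("IR 64", "Dular", 80, 278),
    ("IR 64", "Pokkali", 21, 204),
    ("IR 64", "Mahsuri", 20, 136),
]


def summarize(crosses, total=TOTAL_MARKERS):
    """Return polymorphism rates (%) per cross and the set with more polymorphic markers.

    The 'more polymorphic markers wins' rule is a simplifying assumption.
    """
    rows = []
    for parent_a, parent_b, ij_count, ii_count in crosses:
        rows.append({
            "cross": f"{parent_a} x {parent_b}",
            "ij_pct": round(100 * ij_count / total, 1),
            "ii_pct": round(100 * ii_count / total, 1),
            "best_set": "indica/japonica" if ij_count >= ii_count else "indica-indica",
        })
    return rows


summary = summarize(CROSSES)
for row in summary:
    print(f'{row["cross"]:22s} {row["ij_pct"]:5.1f}% {row["ii_pct"]:5.1f}%  {row["best_set"]}')
```

Under this rule only the two indica x japonica crosses favor the indica/japonica set; every other cross in the table has more polymorphic markers with the indica-indica set.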
Diversity analysis with 384 indica/indica set<br />
[Figure: groups labeled indica, trop. japonica, temp. japonica, Group V (aromatic) and aus]
Diversity analysis with 384 indica/japonica set<br />
[Figure: groups labeled indica, temp. japonica, Group V (aromatic), trop. japonica and aus]<br />
• Each 384-plex SNP set has inherent biases due to the SNP selection process<br />
• The appropriate SNP set must be chosen for each group of germplasm and specific application
Diversity analysis for salinity tolerance<br />
118 rice varieties, including 62 salt tolerant Bangladesh landraces<br />
384-plex indica/indica OPA could distinguish subgroups within indica germplasm<br />
[Figure: dendrogram with variety names at the tips and clusters labeled aus, indica, japonica and aromatic; scale bar 0.1]
Graphical genotyping of genetic lines<br />
384-plex mini-chips can be used to track introgressions<br />
[Figure: graphical genotypes across Chr. 1 to Chr. 12, shown in two panels: NILs and RILs]
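
To illustrate what tracking introgressions from SNP calls can look like, here is a minimal Python sketch that scans ordered markers and reports contiguous runs of donor-parent alleles. The slides do not describe an algorithm, so the input layout (chromosome, position in Mb, call), the "A"/"B" coding for recurrent and donor alleles, and the function name `donor_segments` are all assumptions made for illustration.

```python
from itertools import groupby


def donor_segments(markers, donor="B"):
    """Find runs of consecutive donor calls on each chromosome.

    markers: list of (chrom, position_mb, call), sorted by chrom then position.
    Returns a list of (chrom, start_mb, end_mb, n_markers).
    Any non-donor call (recurrent, heterozygous, missing) ends a run;
    this is a simplification for the sketch.
    """
    segments = []
    for chrom, chrom_markers in groupby(markers, key=lambda m: m[0]):
        run = []  # positions of the current run of donor calls
        for _, pos, call in chrom_markers:
            if call == donor:
                run.append(pos)
            elif run:
                segments.append((chrom, run[0], run[-1], len(run)))
                run = []
        if run:  # close a run that reaches the chromosome end
            segments.append((chrom, run[0], run[-1], len(run)))
    return segments


# Hypothetical calls for one near-isogenic line (values invented for the example)
example_line = [
    (1, 1.2, "A"), (1, 5.0, "B"), (1, 7.5, "B"), (1, 12.0, "A"),
    (2, 3.0, "A"), (2, 9.0, "B"),
]
segments = donor_segments(example_line)
print(segments)
```

The true introgression boundaries lie somewhere between the flanking markers, so the reported start and end positions are only as precise as the marker spacing of the 384-plex set.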
SNP versus SSR throughput<br />
Manual PAGE genotyping: 2 or 3 researchers<br />
• 16 PCR plates per day (8 gels morning, 8 afternoon)<br />
• 8 plates = 768 genotypes/day<br />
• 16 plates = 1536 genotypes/day<br />
• QTL study: 288 lines x 100 SSRs = 300 PCR plates = 19 days<br />
BeadXpress genotyping: 1 researcher<br />
• 96 x 384 SNPs in 2 days = 18,432 genotypes/day<br />
• QTL study: 288 lines x 384 SNPs = 6 days
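
The throughput figures on this slide follow from simple plate arithmetic, which the short Python sketch below reproduces. It assumes 96-well plates and that BeadXpress runs are processed one after another (both consistent with the numbers on the slide); the function names are illustrative.

```python
import math

WELLS_PER_PLATE = 96  # assumption: standard 96-well PCR plates


def ssr_study(lines, markers, plates_per_day=16):
    """Return (PCR plates, working days) for a gel-based SSR study."""
    genotypes = lines * markers
    plates = math.ceil(genotypes / WELLS_PER_PLATE)
    return plates, math.ceil(plates / plates_per_day)


def snp_study_days(lines, samples_per_run=96, days_per_run=2):
    """Return working days for a BeadXpress study, with runs done sequentially."""
    runs = math.ceil(lines / samples_per_run)
    return runs * days_per_run


ssr_plates, ssr_days = ssr_study(lines=288, markers=100)
snp_days = snp_study_days(lines=288)
snp_genotypes_per_day = 96 * 384 // 2   # one 96 x 384 run takes 2 days
ssr_genotypes_per_day = 16 * WELLS_PER_PLATE

print(ssr_plates, ssr_days)             # plates and days for the SSR study
print(snp_days, snp_genotypes_per_day)  # days and daily output for the SNP study
```

Note that the two QTL studies are not the same size: the SNP study yields 288 x 384 = 110,592 data points, against 28,800 for the 100-SSR study, in about a third of the time.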
SNP marker throughput at IRRI<br />
Over 7,000 rice DNA samples (2.3 million SNP data points) genotyped in the past year<br />
Genetic Resources Center (GRC): 1,640,000 data points<br />
Other IRRI (plant breeding, genetics, quality): 510,000 data points<br />
Outside partners: 170,000 data points<br />
• Diversity analysis<br />
• DNA fingerprinting<br />
• QTL mapping<br />
• Marker-assisted breeding<br />
• Testing genetic integrity of germplasm collection
Activities to develop SNP tools for breeding applications<br />
• We need to validate functional SNPs for important traits<br />
• Develop trait-specific SNPs diagnostic for desired alleles needed for breeding programs<br />
• Test functional SNPs using new high-throughput platforms<br />
• Identify targeted SNP haplotypes to select for specific QTL alleles<br />
• Establish a comprehensive “Rice Diversity Platform”<br />
• Interface with the Rice SNP Consortium to coordinate high density SNP genotyping of 2,000+ varieties and lines and organize precise phenotyping efforts for association studies<br />
• Organize training and support for using SNPs in breeding<br />
• Help design relevant tools for SNP data analysis<br />
• Offer workshops on SNP deployment and data analysis
Rice Genetic Diversity Platform<br />
• SNP discovery by re-sequencing<br />
• Illumina GAIIx re-sequencing of >80 varieties<br />
• High density genotyping for association studies<br />
www.ricesnp.org<br />
• Susan McCouch to develop a high density Affymetrix SNP chip<br />
• Select and purify >2,000 diverse accessions (IRRI, USDA)
Relevant informatics tools are still needed<br />
Flapjack: http://bioinf.scri.ac.uk/flapjack/<br />
• Select optimal subsets of SNPs from large data sets<br />
• Filter data based on polymorphism rates, allele frequency, quality scores, distribution across the genome<br />
• Visualize SNP haplotypes at genomic regions<br />
• Integrate with gene annotation and expression data<br />
• Visualize donor introgressions<br />
• Develop tools to assist molecular breeding programs
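
As one concrete example of the filtering step listed above, the Python sketch below keeps only SNPs that pass a call-rate threshold and a minor allele frequency threshold. It is not taken from Flapjack or any tool named in the slides: the AA/AB/BB coding follows the automated scoring shown earlier, while the missing-call symbol, the thresholds and the function names are assumptions chosen for illustration.

```python
VALID_CALLS = ("AA", "AB", "BB")  # any other value is treated as a missing call


def allele_stats(calls):
    """Return (call_rate, minor_allele_frequency) for one SNP across samples."""
    called = [c for c in calls if c in VALID_CALLS]
    call_rate = len(called) / len(calls) if calls else 0.0
    if not called:
        return call_rate, 0.0
    b_alleles = sum(c.count("B") for c in called)   # AB adds 1, BB adds 2
    freq_b = b_alleles / (2 * len(called))
    return call_rate, min(freq_b, 1.0 - freq_b)


def filter_snps(genotypes, min_call_rate=0.9, min_maf=0.05):
    """Keep SNP ids whose call rate and MAF meet the (assumed) thresholds.

    genotypes: dict mapping SNP id -> list of calls, one per sample.
    """
    kept = []
    for snp_id, calls in genotypes.items():
        call_rate, maf = allele_stats(calls)
        if call_rate >= min_call_rate and maf >= min_maf:
            kept.append(snp_id)
    return kept


# Invented example data: three SNPs scored on ten samples
example = {
    "snp_1": ["AA", "AB", "BB", "AA", "BB", "AB", "AA", "BB", "AA", "BB"],
    "snp_2": ["AA"] * 10,                                   # monomorphic
    "snp_3": ["AA", "BB", "--", "--", "--", "AB", "AA", "BB", "AA", "BB"],
}
kept = filter_snps(example)
print(kept)
```

Quality scores and genome distribution, the other criteria on the slide, would need additional inputs (per-call scores and marker positions) and are left out of this sketch.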
Training for marker applications<br />
GAMMA Lab, a central facility for research and training:<br />
- Training for partners in national agricultural research institutes<br />
- Molecular Breeding, Rice: Research to Production courses<br />
- Future focus: SNP marker data analysis (March 2011)
For further information:<br />
• Zhao et al. 2010 (PLoS ONE 5: e10780)<br />
• Illumina 1536 SNP chip on 395 O. sativa accessions<br />
• Tung et al. 2010 (RICE, online first)<br />
• Review of rice genetic diversity platform<br />
• Wright et al. 2010 (Bioinformatics Journal, online first)<br />
• ALCHEMY: improved algorithm for SNP allele calling<br />
• www.ricediversity.org<br />
• Susan McCouch’s diversity projects and data downloads<br />
• www.gramene.org<br />
• Genetic diversity module with SNP data sets<br />
• www.ricesnp.org<br />
• Rice SNP consortium: updates on resequencing/SNP chips