plant dna barcoding
PLANT DNA BARCODING:<br />
PROBLEMS AND APPLICATIONS<br />
R<br />
Massimo Labra<br />
massimo.labra@unimib.it<br />
ZooPlantLab
PLANT BIODIVERSITY AND<br />
AGROBIODIVERSITY<br />
RELATIONSHIP BETWEEN PLANT AND<br />
ENVIRONMENT<br />
DEVELOPMENTAL BIOLOGY AND<br />
REGULATION MECHANISMS
TOOL FOR PLANT<br />
BIODIVERSITY ANALYSIS
MOLECULAR MARKERS<br />
RFLP = Restriction Fragment Length Polymorphism<br />
RAPD = Random Amplification of Polymorphic DNA<br />
AFLP = Amplified Fragment Length Polymorphism<br />
SSR = Simple Sequence Repeat<br />
cpSSR = Chloroplast Simple Sequence Repeat<br />
SNP = Single Nucleotide Polymorphism
1<br />
2<br />
3<br />
THE MOLECULAR MARKERS<br />
ANALYSIS<br />
PLANT SELECTION<br />
IDENTIFICATION MARKER(S)<br />
SETTING TOOL(S)<br />
ANALYSIS<br />
DATA ANALYSIS
THE MOST FREQUENT QUESTIONS<br />
> PLANT IDENTIFICATIONS<br />
> RELATIONSHIPS AMONG<br />
CLOSELY RELATED TAXA<br />
(SPECIES, CULTIVAR, ETC.)
ANSWER<br />
ATCGGATCAATGA<br />
PLANT DNA BARCODING
Cytochrome c Oxidase 1<br />
DOES NOT WORK IN PLANTS!<br />
The plant mitochondrial genome is structurally complex.<br />
Plant mitochondrial genomes frequently undergo<br />
rearrangement.<br />
Wolfe et al., 1987 PNAS, 84: 9054; Mower et al., 2007 BMC Evol. Biol 7: 135.
IN FLOWERING PLANTS, THE MITOCHONDRIAL GENES EVOLVE:<br />
- A FEW TIMES MORE SLOWLY THAN CHLOROPLAST GENES;<br />
- ABOUT TEN TIMES MORE SLOWLY THAN PLANT AND MAMMAL<br />
NUCLEAR GENES;<br />
- 50-100 TIMES MORE SLOWLY THAN MAMMALIAN<br />
MITOCHONDRIAL GENES.
must be universal for all plants.<br />
IN PLANTS, LOW RATES OF DNA CHANGE WERE DETECTED IN<br />
MITOCHONDRIAL GENES.<br />
HOWEVER, SOME GROUPS (E.G. PLANTAGO, PELARGONIUM)<br />
SHOWED A DRAMATIC INCREASE IN THE MITOCHONDRIAL<br />
RATE OF SYNONYMOUS SUBSTITUTION.<br />
SOME OF THESE RATE INCREASES WERE TEMPORARY, WITH<br />
RATES APPROACHING OR RETURNING TO NORMALLY LOW<br />
LEVELS IN CERTAIN DESCENDANT LINEAGES.<br />
SOME PLANTS CONTAIN A MIXTURE OF BOTH QUICKLY AND<br />
SLOWLY EVOLVING MITOCHONDRIAL GENES.<br />
Cho Y et al. 2004. Mitochondrial substitution rates are extraordinarily elevated and variable in a<br />
genus of flowering plants. PNAS 101:17741-46.<br />
Parkinson CL et al., 2005. Multiple major increases and decreases in mitochondrial substitution<br />
rates in the plant family Geraniaceae. BMC Evol Biol 5:73.
Second International Barcode of<br />
Life Conference<br />
16 to 21 September 2007<br />
Taipei, Taiwan.<br />
ELIZABETH PENNISI SCIENCE 2007
Characteristics of DNA barcoding<br />
region(s)<br />
- POLYMORPHIC DNA REGION<br />
- CONSERVED REGIONS FOR DESIGNING UNIVERSAL PRIMERS<br />
- EASY AMPLIFICATION<br />
- SEQUENCE LENGTH SUITABLE FOR SEQUENCING<br />
- LOW INTRASPECIFIC VARIABILITY<br />
- HIGH INTERSPECIFIC DIVERSITY<br />
From Meyer and Paulay, PLoS Biology, 2005
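The last two criteria (low intraspecific variability, high interspecific diversity) are what is usually called the "barcoding gap". A minimal sketch of how it can be checked on aligned sequences follows; the sequences and species labels are toy placeholders, and the uncorrected p-distance is only one possible distance measure, chosen here as an assumption.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)


def barcoding_gap(samples):
    """samples: list of (species_label, aligned_sequence).

    Returns (max_intraspecific, min_interspecific, gap_present).
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(samples, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    max_intra = max(intra) if intra else 0.0
    min_inter = min(inter)
    return max_intra, min_inter, min_inter > max_intra


# Toy data (hypothetical, not real barcode sequences).
toy = [
    ("species_A", "ATCGGATCAATGA"),
    ("species_A", "ATCGGATCAATGA"),
    ("species_B", "ATCGGTTCTATGC"),
    ("species_B", "ATCGGTTCTATGA"),
]

max_intra, min_inter, gap = barcoding_gap(toy)
print(f"max intraspecific = {max_intra:.3f}, "
      f"min interspecific = {min_inter:.3f}, gap = {gap}")
```

A marker is a good barcode candidate for a group when the smallest between-species distance is clearly larger than the largest within-species distance.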
Additional characteristics for plants<br />
> the gene sequences used for barcoding should be short<br />
enough to be easily PCR-amplified even from degraded DNA!
Additional characteristics for plants<br />
> Plants hybridize frequently! The selected<br />
markers should be able to distinguish hybrids<br />
from parental species!<br />
NUCLEAR OR PLASTIDIAL<br />
MARKERS ?<br />
R. hirsutum L.<br />
R. ferrugineum L.<br />
R. x intermedium Tausch
Additional characteristics for plants<br />
> PLANTS ARE OFTEN POLYPLOID<br />
(ALLOPOLYPLOIDS, AUTOPOLYPLOIDS)<br />
NUCLEAR MARKERS ?
SEVERAL COMPLETE NUCLEAR<br />
GENOMES OF PLANTS<br />
NUCLEAR DNA BARCODING<br />
ITS > 40,000 SEQUENCES IN<br />
GENBANK.<br />
MORE THAN 90 PLASTIDIAL<br />
GENOMES<br />
(RAVI ET AL., 2008. PL SYST EVOL 271: 101)<br />
PLASTIDIAL DNA BARCODING<br />
- GENES (matK, rpoB, rbcL)<br />
- Spacer region (trnH-psbA )
FIRST HYPOTHESIS:<br />
NUCLEAR AND PLASTIDIAL REGION<br />
ITS<br />
trnH-psbA<br />
THE PROBLEM OF ITS AND<br />
NUCLEAR MARKERS!
NUCLEAR GENOME SEQUENCE<br />
ITS<br />
CONSIDERATIONS:<br />
NUCLEAR DNA SEGMENTS PROVIDE MORE INFORMATION ON<br />
SPECIES IDENTITY, INCLUDING HYBRIDIZATION EVENTS;<br />
DNA DATABASES CONTAIN A LOT OF ITS SEQUENCES<br />
PROBLEMS:<br />
PROBLEMS ARISING FROM PARALOGOUS SEQUENCES AND<br />
PSEUDOGENES<br />
LOW DISCRIMINATORY POWER AT THE<br />
SPECIES LEVEL IN SOME ANGIOSPERM GROUPS<br />
TECHNICAL PROBLEMS:<br />
DIFFICULT TO OBTAIN UNIVERSAL PCR AMPLIFICATION,<br />
ESPECIALLY FROM DEGRADED AND LOW-QUALITY DNA<br />
DNA CONTAMINATION (E.G. FUNGAL ENDOPHYTES IN PLANTS;<br />
FUNGI ON HERBARIUM SPECIMENS).
THE CBOL PLANT WORKING GROUP DOES NOT REGARD nrITS AS<br />
SUITABLE FOR A UNIVERSAL PLANT DNA BARCODE<br />
IT MAY BE A SUPPLEMENTARY LOCUS FOR TAXONOMIC GROUPS<br />
WHICH HAVE LESS RESOLUTION WITH cpDNA AND WHERE<br />
DIRECT SEQUENCING OF ITS IS POSSIBLE.
Different plastidial markers<br />
- Universal (easy amplification, sequencing)<br />
- Size<br />
- Polymorphism levels at the species, genus and family level<br />
- Sequence structure<br />
rbcL<br />
atpB<br />
rpoC1<br />
ndhJ<br />
trnH-psbA<br />
matK
70%<br />
LIMITS OF DNA BARCODING ?<br />
LIMITS OF (MORPHO)SPECIES ?
Nature (2006) 440, 524-527
907 SAMPLES:<br />
445 ANGIOSPERM,<br />
38 GYMNOSPERM,<br />
67 CRYPTOGAM
ATCGGATCAATGA<br />
rbcL<br />
matK<br />
CBOL Plant Working Group<br />
rbcL + matK<br />
rbcL:<br />
- It is the best characterized gene<br />
- There are universal primers, not only for angiosperms<br />
- It produces high-quality bidirectional sequences<br />
matK:<br />
- It is one of the most rapidly evolving plastid<br />
coding regions<br />
- It shows high levels of discrimination among<br />
angiosperm species<br />
A list of supplementary loci, including the noncoding plastid<br />
regions (trnH-psbA, atpF-atpH), an intron (trnL) and ITS, could be<br />
used in cases of degraded tissue or in the analysis of<br />
critical taxonomic groups.
May 2011 | Volume 6 | Issue 5 | e19254
rbcL: discrimination ability<br />
Kress et al. (PNAS 2005): by comparison, ITS had a much higher<br />
divergence value (13.6%) than any of the plastid regions, and<br />
rbcL was by far the lowest in divergence (0.83%).
rbcL IS EASY TO AMPLIFY, SEQUENCE, AND ALIGN IN MOST LAND<br />
PLANTS AND PROVIDES A USEFUL BACKBONE TO THE BARCODE DATASET,<br />
DESPITE HAVING ONLY MODEST DISCRIMINATORY POWER (CBOL<br />
PLANT WORKING GROUP).
matK: Universal amplification<br />
From: Dunning and Savolainen 2010
LIMITS: - SMALL NUMBER OF SPECIES PER FAMILY<br />
- NOT SUITABLE FOR THE BARCODING IDEA
- IDENTIFICATION OF CLADE-SPECIFIC PRIMERS<br />
- MODIFICATION OF THE PRIMERS AND REACTION CONDITIONS<br />
TO INCREASE AMPLIFICATION SUCCESS<br />
- DEFINITION OF PRIMER COCKTAILS AROUND EXISTING<br />
matK BARCODE PRIMING SITES.<br />
WORK IN PROGRESS!
trnH-psbA<br />
- GOOD AMPLIFICATION ACROSS LAND PLANTS WITH A<br />
SINGLE PAIR OF PRIMERS (>90% FOR ANGIOSPERMS)<br />
- HIGH LEVELS OF SPECIES DISCRIMINATION (more than rbcL<br />
+ matK in Ficus, Alnus and complex groups of Salix and<br />
Quercus)<br />
PROBLEMS<br />
- DIFFICULT TO OBTAIN HIGH-QUALITY BIDIRECTIONAL<br />
SEQUENCES.<br />
- MEDIAN LENGTH OF trnH-psbA IS ABOUT 420 BP IN<br />
EUDICOTS, BUT IT CAN EXCEED 1,000 BP IN SOME<br />
MONOCOT AND CONIFER SPECIES: TOO LONG!<br />
- DIFFERENCES IN trnH-psbA LENGTH AMONG TAXA<br />
CAUSE ALIGNMENT PROBLEMS WHEN ESTIMATING GENETIC<br />
DISTANCES
Hollingsworth et al., 2011
STATE OF THE ART<br />
PLANT DNA BARCODING IS BASED ON TWO<br />
CODING REGIONS: matK and rbcL<br />
HOWEVER<br />
THE matK PRIMERS NEED IMPROVEMENT, AND THE ABSOLUTE<br />
LEVEL OF DISCRIMINATORY POWER OF rbcL+matK IS<br />
UNCERTAIN<br />
THUS<br />
THE STANDARD CORE BARCODE FOR LAND PLANTS BY CBOL IS<br />
SUBJECT TO REVIEW<br />
DURING THIS REVIEW PHASE, CONTINUED SEQUENCING AND<br />
EXPLORATION OF THE PROPERTIES OF OTHER NON-CODING<br />
MARKERS ARE RECOMMENDED (PARTICULARLY trnH-psbA AND<br />
THE INTERNAL TRANSCRIBED SPACERS OF NUCLEAR<br />
RIBOSOMAL DNA, nrITS/nrITS2).
FACTORS INFLUENCING THE DISCRIMINATION SUCCESS<br />
OF PLANT BARCODES<br />
Hollingsworth 2011, Plos One 6: e19254<br />
The DNA barcoding approach does not work in:<br />
- Clades where speciation has been very recent.<br />
- Woody species with long generation times and/or slow<br />
mutation rates<br />
- Polyploid speciation can lead to incongruence between<br />
barcode sequences and taxon concepts;<br />
- Taxonomically complex groups (TCGs)
Species dispersal: in species with poor dispersal that consist of<br />
isolated populations, neutral mutational<br />
variants can be slow to spread throughout the range.<br />
Thus the DNA barcode is not necessarily uniform within the same species!<br />
A secondary consequence of poor dispersal is that the<br />
permeability of a species to inter-specific gene flow may be<br />
increased. No DNA barcoding gap!<br />
Hollingsworth et al., 2011.
PLANT DNA BARCODE APPLICATIONS<br />
A- SPECIES-LEVEL TAXONOMY<br />
B- IDENTIFYING UNKNOWN SPECIMENS TO<br />
KNOWN SPECIES
A- SPECIES-LEVEL TAXONOMY
TAXONOMICALLY COMPLEX GROUPS (TCGs)<br />
Thymus (215 species)<br />
TAXA N<br />
Thymus brevicalyx Strobl 5<br />
Thymus catharinae Camarda 1<br />
Thymus oenipontanus Heinr.Braun 4<br />
Thymus paronychioides Celak. 2<br />
Thymus praecox subsp. polytrichus (A.Kern ex Borbàs) Jalas 3<br />
Thymus pulegioides L. 5<br />
Thymus spinulosus Ten. 3<br />
Thymus striatus Vahl 8<br />
Thymus vulgaris L. 3<br />
Thymus dolomiticus H.J. Coste 1<br />
Thymbra capitata (L.) Cav (=Thymus capitatus (L.)) 3<br />
11 species; 38 samples;<br />
Different geographical locations: IT; FR,<br />
SV; SPA
Marker | Mean % variation between sp. (S.E. %) | Range % between sp. | Mean % variation within sp. (S.E. %) | Range % within sp.<br />
matK | 0.66 (0.20) | 0.29-1.84 | 0.48 (0.10) | 0.00-0.75<br />
trnH-psbA | 2.39 (0.50) | 1.33-3.53 | 2.18 (0.50) | 0.74-3.60<br />
rbcL | 0.17 (0.10) | 0.00-0.42 | 0.07 (0.05) | 0.00-0.16
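The ranges in this table can be compared directly to see whether any marker leaves a barcoding gap in Thymus. The sketch below copies the values from the table; the criterion used (smallest between-species value strictly larger than the largest within-species value) is a common simplification and is stated here as an assumption.

```python
# Percent variation ranges copied from the Thymus table: (min, max).
thymus_ranges = {
    "matK":      {"between": (0.29, 1.84), "within": (0.00, 0.75)},
    "trnH-psbA": {"between": (1.33, 3.53), "within": (0.74, 3.60)},
    "rbcL":      {"between": (0.00, 0.42), "within": (0.00, 0.16)},
}


def has_gap(between, within):
    """True only if every between-species value exceeds every within-species value."""
    return between[0] > within[1]


gap_by_marker = {
    marker: has_gap(r["between"], r["within"])
    for marker, r in thymus_ranges.items()
}

for marker, gap in gap_by_marker.items():
    r = thymus_ranges[marker]
    print(f"{marker}: min between = {r['between'][0]}%, "
          f"max within = {r['within'][1]}%, gap = {gap}")
```

For all three markers the within-species range overlaps the between-species range, so none of them separates every Thymus species in this sample.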
High intraspecific variability!<br />
19 haplotypes with matK<br />
TAXA N<br />
Thymus brevicalyx Strobl 5<br />
Thymus oenipontanus Heinr.Braun 4<br />
Thymus pulegioides L. 5<br />
Thymus striatus Vahl 8<br />
Thymbra capitata (L.) Cav 3
B- IDENTIFYING UNKNOWN SPECIMENS TO KNOWN SPECIES
DNA BARCODING WORKS WELL IN<br />
A- PLANT IDENTIFICATION STARTING FROM FRAGMENTS<br />
OF PLANT MATERIAL (LEAVES, FRUITS)<br />
B - PLANT IDENTIFICATION (INCLUDING CULTIVARS)<br />
STARTING FROM A SET OF TARGET SPECIES.<br />
C- IDENTIFICATION PROBLEMS RELATED TO UNFAMILIARITY<br />
WITH A GIVEN SPECIES (E.G. EXOTIC SPECIES).<br />
D- IDENTIFICATION OF PLANT SPECIES OF A SPECIFIC<br />
GEOGRAPHIC AREA WHERE THE SPECIES ARE NOT<br />
NECESSARILY CLOSELY RELATED.
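All four scenarios reduce to matching a query sequence against a reference library of known species. A minimal sketch of that matching step is given below. The reference sequences, the species labels and the `max_distance` threshold are hypothetical placeholders; real workflows typically rely on curated databases and dedicated search tools rather than this simple closest-match rule.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def identify(query, reference, max_distance=0.02):
    """Assign a query barcode to the closest reference species.

    reference: list of (species_label, aligned_sequence).
    Returns (candidates, distance). candidates is None when the closest
    reference is farther than max_distance (an assumed, illustrative
    threshold); it holds several labels when species tie for closest.
    """
    scored = [(p_distance(query, seq), species) for species, seq in reference]
    best = min(d for d, _ in scored)
    if best > max_distance:
        return None, best
    candidates = sorted({species for d, species in scored if d == best})
    return candidates, best


# Hypothetical reference library (toy sequences, not real barcodes).
reference = [
    ("species_A", "ATCGGATCAATGA"),
    ("species_B", "ATCGGTTCTATGC"),
]

print(identify("ATCGGATCAATGA", reference))   # exact match to species_A
print(identify("TTTTTTTTTTTTT", reference))   # too distant: no assignment
```

Returning every tied species instead of an arbitrary one makes cases with no sequence polymorphism between species visible as ambiguous rather than silently misassigned.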
A- PLANT IDENTIFICATION STARTING FROM PLANT FRAGMENTS
GROUP I<br />
ORNAMENTAL PLANTS WITH TOXIC METABOLITES<br />
Nandina domestica Thunb.<br />
Ilex aquifolium L.<br />
Aucuba japonica Thunb.<br />
Arum italicum Mill.<br />
Arum maculatum L.<br />
Convallaria majalis L.<br />
Euphorbia pulcherrima Willd. ex Klotzsch<br />
Spathiphyllum wallisii Regel<br />
Sansevieria trifasciata Prain<br />
Anthurium andreanum Linden
GROUP II<br />
CONGENERIC TAXA WITH/WITHOUT TOXIC METABOLITES<br />
Group IIa<br />
Aconitum lycoctonum L.<br />
Aconitum napellus L.<br />
Aconitum degenii Gàyer subsp. paniculatum<br />
(Arcang.) Mucher<br />
Aconitum anthora L.<br />
Group IIb<br />
Sambucus ebulus L.<br />
Sambucus racemosa L.<br />
Sambucus nigra L.
GROUP III<br />
EDIBLE AND INEDIBLE PLANTS<br />
Group IIIa<br />
Prunus laurocerasus L. TOX<br />
Prunus armeniaca L. (apricot)<br />
Prunus avium L. (sweet cherry)<br />
Prunus persica (L.) Batsch (peach)<br />
Prunus cerasus L. (sour cherry)<br />
Prunus domestica L. (plum)<br />
Group IIIb<br />
Solanum dulcamara L. TOX<br />
Solanum nigrum L. TOX<br />
Solanum lycopersicum L.<br />
Solanum tuberosum L.
PLASTIDIAL: rpoB, matK, trnH-psbA<br />
NUCLEAR: Sqd1, At103<br />
trnH-psbA and matK
B - PLANT IDENTIFICATION (INCLUDING CULTIVARS)<br />
STARTING FROM A SET OF TARGET SPECIES.<br />
DNA BARCODING AND FOOD TRACEABILITY
Group 1<br />
Mentha piperita L.<br />
Mentha aquatica L.<br />
Mentha spicata L.<br />
<br />
Group 2<br />
Ocimum gratissimum L.<br />
Ocimum tenuiflorum L.<br />
Ocimum basilicum L. (cultivars)<br />
<br />
Group 3<br />
Origanum majorana L.<br />
Origanum vulgare L.<br />
Origanum pseudodictamnius Sieber<br />
Origanum heracleoticum<br />
<br />
Group 4<br />
Salvia officinalis L.<br />
Salvia rutilans<br />
Salvia sclarea<br />
Salvia uliginosa<br />
<br />
Group 5<br />
Thymus vulgaris L.<br />
<br />
Group 6<br />
Rosmarinus officinalis L.
Markers<br />
PLASTIDIAL: matK, trnH-psbA, rbcL<br />
NUCLEAR: ITS, At103, Agt1<br />
[Figure: PCR gel with lanes ms, oc, ts, mns, ro, mnc, bs; specific and nonspecific bands marked]
The non-coding trnH-psbA intergenic spacer and matK are the<br />
most suitable markers for the molecular identification of spices.<br />
In the context of food traceability, the two markers are useful to<br />
identify commercially processed spice species (sold as dried<br />
material).<br />
Basil: the sequence divergences of the<br />
markers trnH-psbA, matK and rbcL<br />
clearly distinguish O. gratissimum L.<br />
and O. tenuiflorum L. from common<br />
basil (O. basilicum L.), while only<br />
trnH-psbA and matK showed<br />
appreciable differences among the<br />
basil cultivars
Origanum samples (Group 3) did not show any<br />
sequence polymorphism!<br />
M. piperita L. is a sterile hybrid of M. aquatica L. × M. spicata L.<br />
The uniparentally inherited chloroplast markers used in this study confirm<br />
that M. spicata L. is the maternal parent of M. piperita L.,<br />
because both species showed the same plastidial DNA profile.
C- IDENTIFICATION PROBLEMS RELATED TO UNFAMILIARITY<br />
WITH A GIVEN SPECIES (E.G. EXOTIC SPECIES).<br />
SMART DRUGS
MORPHOLOGICAL<br />
ANALYSIS<br />
CHEMICAL<br />
ANALYSIS<br />
DNA BARCODING<br />
ANALYSIS<br />
ATCGGATCAATGA
DNA BARCODING ANALYSIS: rbcL, matK and trnH-psbA<br />
SOME COMMERCIAL PLANT MIXES CONTAINED FRAGMENTS<br />
OF MARIJUANA LEAVES<br />
SOME COMMERCIAL PLANT<br />
MIXES CONTAINED FRAGMENTS OF<br />
ALKALOID-RICH PLANTS<br />
SUCH AS Turnera diffusa L<br />
IN SOME COMMERCIAL MIXES THE<br />
PLANTS ARE ONLY A CARRIER FOR<br />
SYNTHETIC DRUGS
D- IDENTIFICATION OF PLANT SPECIES OF A SPECIFIC<br />
GEOGRAPHIC AREA WHERE THE SPECIES ARE NOT<br />
NECESSARILY CLOSELY RELATED.<br />
INTEGRATED TAXONOMIC APPROACH: FLORA OF MONTE<br />
VALERIO (TRIESTE).<br />
R<br />
ZooPlantLab<br />
PLANT SAMPLING - DNA ANALYSIS AND<br />
DIGITAL IDENTIFICATION KEY<br />
STATISTICAL ANALYSIS (in progress)
Calluna vulgaris (L.) Hull<br />
Family: ERICACEAE<br />
Italian names: Brentoli, Brughiera, Brugo,<br />
Calluna, Erica falsa, Grecchia, Scopetti,<br />
Sorcelli.
343 PLANTS COLLECTED<br />
300 WERE AMPLIFIED WITH ALL MARKERS<br />
trnH-psbA: 316 MOTUs from 322 amplified samples<br />
matK: 304 MOTUs from 323 amplified samples<br />
rbcL: 293 MOTUs from 337 amplified samples<br />
Group (n) | trnH-psbA | matK | rbcL<br />
Gr1 - Acer (4) | 4 | 1 | 1<br />
Gr2 - Euphorbia (6) | 6 | 6 | 4<br />
Gr3 - Geranium (4) | 4 | 4 | 2<br />
Gr4 - Medicago (4) | 4 | 3 | 2<br />
Gr5 - Prunus (4) | 3 | 4 | 2<br />
Gr6 - Senecio (3) | 3 | 2 | 2<br />
Gr7 - Solanum (3) | 3 | 3 | 3<br />
Gr8 - Trifolium (5) | 4 | 5 | 3
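The counts above can be turned into two simple ratios per marker: the share of collected plants that amplified, and the number of MOTUs per amplified sample. The sketch below uses only the figures reported on this slide; reading MOTUs per amplified sample as a rough proxy for discriminatory power is an interpretation, not a statistic stated in the source.

```python
COLLECTED = 343  # plants collected

# marker: (MOTUs, amplified samples), as reported on the slide
results = {
    "trnH-psbA": (316, 322),
    "matK": (304, 323),
    "rbcL": (293, 337),
}

summary = {
    marker: {
        "amplification_success": amplified / COLLECTED,
        "motus_per_amplified_sample": motus / amplified,
    }
    for marker, (motus, amplified) in results.items()
}

for marker, s in summary.items():
    print(f"{marker}: amplification {s['amplification_success']:.1%}, "
          f"MOTUs per amplified sample {s['motus_per_amplified_sample']:.1%}")
```

On these figures rbcL amplifies in the most samples but yields the fewest MOTUs per sample, while trnH-psbA shows the opposite pattern, in line with the trade-off between universality and resolution described earlier.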
MOLECULAR TAXONOMY<br />
MORPHOLOGICAL TAXONOMY
Interactive keys<br />
DNA barcoder
Fabrizio De Mattia Ilaria Bruni Alessia Losa<br />
R<br />
ZooPlantLab<br />
www.zooplantlab.btbs.unimib.it<br />
Maurizio Casiraghi Massimo Labra<br />
Michela Barbuto<br />
Andrea Galimberti<br />
Emanuele Ferri<br />
Sara Baccei<br />
Anna Sandionigi<br />
Silvia Federici