<strong>Gene</strong> <strong>gain</strong> <strong>and</strong> loss: <strong>a<strong>CGH</strong></strong><br />
<strong>ISA</strong><strong>CGH</strong><br />
IV Course on Microarray Data Analysis<br />
March 13, Valencia<br />
Rafael C. Jimenez
Genetic variation in human genomes<br />
From chromosome anomalies to single nucleotide changes<br />
Chromosome <strong>Gene</strong> Nucleotide
CNV/CNP<br />
Copy number variation of DNA segments<br />
● Deletions<br />
● Insertions<br />
● Duplications<br />
Kb<br />
Mb<br />
b(nt)
Comparative Genomic Hybridization<br />
● Method for analyzing genomic DNA for unbalanced genetic alterations (CNV)<br />
● Based on fluorescent hybridization of DNA<br />
Techniques<br />
● FISH, PCR, Southern<br />
● Array <strong>CGH</strong><br />
● Clones (BAC, YAC, PAC)<br />
● PCR non-redundant<br />
● Oligonucleotides<br />
● cDNA
Example: Array <strong>CGH</strong> with BAC probes<br />
1. Clones to cover a genomic region<br />
2. Extraction and purification<br />
3. Array DNA onto glass slides<br />
4. Hybridization of labeled normal and tumor genomic DNA to the microarray<br />
5. Analyze fluorescence ratio
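As a minimal sketch of step 5, the fluorescence ratio of each spot is commonly summarized as a log2 ratio of tumor to normal signal, so that gains are positive and losses negative. The function name and the intensity values are hypothetical, and a real analysis would apply background correction and normalization first.

```python
import math

def log2_ratios(tumor, normal):
    """Return log2(tumor / normal) for paired spot intensities."""
    if len(tumor) != len(normal):
        raise ValueError("tumor and normal must have the same length")
    return [math.log2(t / n) for t, n in zip(tumor, normal)]

# Hypothetical intensities for four spots (not taken from the slides)
tumor = [200.0, 400.0, 100.0, 800.0]
normal = [200.0, 200.0, 200.0, 200.0]

ratios = log2_ratios(tumor, normal)
print(ratios)  # [0.0, 1.0, -1.0, 2.0]
```

A ratio of 0 means equal signal in both samples, 1 a two-fold excess in the tumor sample, and -1 a two-fold deficit.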
Array <strong>CGH</strong><br />
Advantages:<br />
● Analysis of the whole genome in a single experiment<br />
● Higher resolution than conventional <strong>CGH</strong> (5-10 Kb)<br />
Limitations:<br />
● Inability to detect mosaicism, balanced chromosomal translocations, inversions and whole-genome ploidy changes<br />
● Depending on the platform chosen:<br />
- cDNA: limited by the genes encoded on chromosomes, cross-hybridizations<br />
- BACs, PACs: DNA amplifications are necessary
<strong>ISA</strong><strong>CGH</strong><br />
<strong>ISA</strong><strong>CGH</strong> allows visualizing array <strong>CGH</strong> data and/or expression arrays on human or mouse chromosomal coordinates (found automatically through their standard identifiers) and represents the copy number alterations found by different methods.<br />
Correlations between copy number and gene expression level can easily be observed in a plot. The program also finds minimal common regions with altered copy number across different arrays.<br />
http://isacgh.bioinfo.cipf.es
<strong>ISA</strong><strong>CGH</strong><br />
Although <strong>ISA</strong><strong>CGH</strong> can be used alone, it is tightly integrated in the GEPAS package. Thus, normalization and different data pre-processing operations can be performed directly within the same environment.<br />
<strong>ISA</strong><strong>CGH</strong> is also connected to different tools for functional annotation, so enrichment in functionally relevant terms (gene ontology, pathways, etc.) in amplified or lost chromosomal regions can be studied directly.
<strong>Gene</strong>ral outline<br />
INPUT: gene expression and/or genomic hybridization values.<br />
OUTPUT: prediction of regions with alterations in the number of copies, plotted in the same figure as the gene expression values to show the relationship between both variables.<br />
∙ Four methods for the estimation of genomic copy number<br />
∙ Other parameters:<br />
- Scale<br />
- Management of multiple probes<br />
http://isacgh.bioinfo.cipf.es
Probe identifiers<br />
Different array platforms are used for <strong>CGH</strong> measurements (BACs, cDNA clones and oligonucleotides as array spots).<br />
Normally, users must collect and keep up to date the chromosomal coordinates of the probes.<br />
<strong>ISA</strong><strong>CGH</strong> retrieves them automatically and plots the hybridization values over the corresponding positions in the chromosomes.<br />
Different IDs can be uploaded (Ensembl IDs, accession, Unigene, HUGO, RefSeq, Affy, BAC names...)<br />
Chr1:203465273-20348412<br />
User-defined information on chromosomal coordinates can alternatively be supplied to the program
Genomic copy number estimation<br />
Smoothing: a variation of the Adaptive Weights Smoothing method implemented in the program GLAD.<br />
Binary segmentation: iteratively checks whether each single point in the data set is a breakpoint. It uses a permutation distribution to test for mean differences between groups, drawing robust inferences that do not rely upon any model of the data.<br />
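The permutation idea can be sketched as follows for one candidate breakpoint: the values on each side are pooled and reshuffled many times, and the p-value is the fraction of shuffles whose mean difference is at least as large as the observed one. This is only an illustration of the test, not the full segmentation algorithm; the function name, the number of permutations and the data are assumptions.

```python
import random

def permutation_pvalue(left, right, n_perm=2000, seed=0):
    """Permutation p-value for a difference in means between two groups."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(sum(left) / len(left) - sum(right) / len(right))
    pooled = list(left) + list(right)
    n_left = len(left)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        l, r = pooled[:n_left], pooled[n_left:]
        diff = abs(sum(l) / len(l) - sum(r) / len(r))
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_perm + 1)

# Hypothetical log-ratios on each side of a candidate breakpoint
p_shift = permutation_pvalue([0.0, 0.1, -0.1, 0.05, -0.05, 0.02],
                             [1.0, 0.9, 1.1, 0.95, 1.05, 1.02])
p_flat = permutation_pvalue([0.0, 0.1, -0.1, 0.05],
                            [0.02, -0.05, 0.08, -0.02])
print(p_shift < 0.05, p_flat < 0.05)
```

No distribution is assumed for the intensities, which is the sense in which the inference does not rely on a model of the data.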
Regression: uses some characteristics of the linear regression of the intensity measure on position along the genome. Such a regression line has a slope close to zero when fitted in regions with homogeneous intensity levels, but the slope differs from zero when intensity measures on the left-hand side are higher or lower than intensity measures on the right-hand side. Given the intensity measurements ordered by their position in the chromosome, our procedure fits a regression line for each N consecutive points. Thus a vector of slopes, equivalent in some way to a derivative curve of the original data, is obtained. A t-test is used to identify the slope peaks that are big enough to indicate a breakpoint in the intensity levels.<br />
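A minimal sketch of the sliding-window slopes, assuming ordinary least squares on each run of N consecutive points. The t-test on the slope peaks is left out; the function name and the toy data are hypothetical.

```python
def window_slopes(positions, values, n):
    """Least-squares slope of values on positions for each window of n points."""
    slopes = []
    for i in range(len(values) - n + 1):
        xs = positions[i:i + n]
        ys = values[i:i + n]
        mx = sum(xs) / n
        my = sum(ys) / n
        sxx = sum((x - mx) ** 2 for x in xs)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        slopes.append(sxy / sxx)
    return slopes

# Toy profile: five probes at level 0 followed by five probes at level 1
positions = list(range(10))
values = [0.0] * 5 + [1.0] * 5

slopes = window_slopes(positions, values, n=4)
peak = max(range(len(slopes)), key=lambda i: abs(slopes[i]))
print(peak)  # 3: the window covering probes 3 to 6 straddles the step
```

Windows that lie entirely inside one level have slope zero, and the slope grows as the window moves across the step, which is why the vector of slopes behaves like a derivative of the profile.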
Isowindow: tries to identify borders between regions with a significant change in the values of hybridization intensity. Given the intensity measurement points from the array ordered by their position in the chromosome, a first step finds those that are good candidates for being such borders. Roughly speaking, a point will be a good candidate if the p-value of a t-test comparing some close points located in its left and right neighborhoods is low enough.<br />
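The candidate-finding step can be sketched as a t-test between the k points to the left and the k points to the right of each position. As a simplifying assumption, a threshold on the absolute Welch t statistic stands in for the p-value cutoff, because the Python standard library has no Student t distribution; the window size, threshold, names and data are all hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(left, right):
    """Welch t statistic for the difference in means of two small samples."""
    se = math.sqrt(variance(left) / len(left) + variance(right) / len(right))
    return (mean(left) - mean(right)) / se

def candidate_borders(values, k=3, t_threshold=4.0):
    """Indices i where the k points before i differ strongly from the k points from i on."""
    borders = []
    for i in range(k, len(values) - k + 1):
        left, right = values[i - k:i], values[i:i + k]
        if variance(left) == 0 and variance(right) == 0:
            continue  # statistic undefined when both windows are constant
        if abs(welch_t(left, right)) >= t_threshold:
            borders.append(i)
    return borders

# Hypothetical profile with a change of level between probes 4 and 5
values = [0.1, -0.1, 0.0, 0.1, -0.1, 1.0, 1.1, 0.9, 1.0, 1.1]
borders = candidate_borders(values)
print(borders)  # [5]
```

Only local neighborhoods enter each test, in contrast to a method that uses the distribution of the whole data set.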
The binary segmentation method uses the global distribution of the dataset, while the others are based more on the local distributions of the points.
Plotting data <strong>and</strong> copy number estimation<br />
One array / All chromosomes<br />
Expression data<br />
Breakpoint estimation
Plotting data <strong>and</strong> copy number estimation<br />
One chromosome / All arrays
Studying breakpoints <strong>and</strong> details at gene level<br />
Click to get a detailed representation of the chromosomal region spanning 4 Mb that includes the corresponding array probes located therein, as well as the genes mapped by Ensembl.<br />
A blue line represents the estimate of the genomic copy number.
Minimal common regions with consistent copy number alterations across several arrays<br />
<strong>ISA</strong><strong>CGH</strong> estimates the genomic copy number in each individual array and then merges this information into a single plot.<br />
The plot represents those regions that are consistently <strong>gain</strong>ed or lost in all the arrays. The average hybridisation intensity is also represented.
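One way to picture the merging step, assuming each array has already been reduced to a per-probe call of +1 (gain), -1 (loss) or 0 (no change) over the same ordered probes. The encoding, the function name and the data are assumptions for illustration and are not taken from the program.

```python
def common_regions(calls_per_array):
    """Return (first_probe, last_probe, label) runs where every array agrees on gain or loss."""
    n_probes = len(calls_per_array[0])
    consensus = []
    for j in range(n_probes):
        states = {calls[j] for calls in calls_per_array}
        # Keep the call only when all arrays share it; otherwise mark as 0
        consensus.append(states.pop() if len(states) == 1 else 0)

    regions = []
    start = 0
    for j in range(1, n_probes + 1):
        if j == n_probes or consensus[j] != consensus[start]:
            if consensus[start] != 0:
                label = "gain" if consensus[start] > 0 else "loss"
                regions.append((start, j - 1, label))
            start = j
    return regions

# Hypothetical calls for three arrays over eight ordered probes
arrays = [
    [0, 1, 1, 1, 0, -1, -1, 0],
    [0, 0, 1, 1, 1, -1, -1, -1],
    [1, 1, 1, 1, 0, 0, -1, -1],
]
regions = common_regions(arrays)
print(regions)  # [(2, 3, 'gain'), (6, 6, 'loss')]
```

Each reported region is the overlap shared by all arrays, so it is never longer than the alteration in any single array.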
Correlation between genomic <strong>and</strong> expression data<br />
When studying the effect of genomic copy number alterations on the expression level of genes, it is quite useful to have a simple representation of the relationship between both variables.
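Besides plotting, the relationship can be summarized numerically. The sketch below computes a Pearson correlation coefficient between copy number estimates and expression values of the same genes; the choice of Pearson correlation, the function name and the values are assumptions, since the slides only describe a plot.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical copy number log-ratios and expression log-ratios for five genes
copy_number = [-1.0, -0.5, 0.0, 0.5, 1.0]
expression = [-0.8, -0.6, 0.1, 0.4, 0.9]

r = pearson(copy_number, expression)
print(round(r, 2))
```

A value close to 1 would indicate that genes in gained regions tend to be overexpressed and genes in lost regions underexpressed.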
Functional annotation of altered copy number regions<br />
Functional annotation of chromosomal regions is based on the gene ontology (GO) annotations.<br />
Annotations for regions of <strong>gain</strong> or <strong>loss</strong> are compared to the background of annotations corresponding to the rest of the genes present in the array, in order to see whether the region has any characteristic functionality.<br />
A Fisher exact test for contingency tables is performed to look for GO terms significantly overrepresented in the chromosomal region studied compared to the background abundance of GO terms in the rest of the genome.<br />
P-values are adjusted for multiple testing using the FDR method.
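A minimal sketch of the two statistical steps, assuming a one-sided Fisher exact test (the upper tail of the hypergeometric distribution) for over-representation and the Benjamini-Hochberg procedure as the FDR method. The slides do not say which FDR variant is used, and the function names and gene counts are hypothetical.

```python
from math import comb

def fisher_overrep(region_with, region_total, rest_with, rest_total):
    """One-sided Fisher exact p-value: chance of seeing at least region_with annotated genes."""
    annotated = region_with + rest_with    # genes carrying the GO term
    total = region_total + rest_total      # all genes on the array
    upper = min(region_total, annotated)
    tail = sum(comb(annotated, i) * comb(total - annotated, region_total - i)
               for i in range(region_with, upper + 1))
    return tail / comb(total, region_total)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):           # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical counts: (annotated in region, region size, annotated in rest, rest size)
terms = {
    "GO term A": (5, 10, 5, 90),
    "GO term B": (1, 10, 9, 90),
    "GO term C": (0, 10, 20, 90),
}
raw = [fisher_overrep(*counts) for counts in terms.values()]
adjusted = benjamini_hochberg(raw)
for name, p, q in zip(terms, raw, adjusted):
    print(name, f"p={p:.4g}", f"adjusted={q:.4g}")
```

In this toy input only the first term is concentrated in the region, so it is the one expected to keep a small adjusted p-value.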
GEPAS, the integrated environment<br />
Normalization<br />
Preprocess<br />
Clustering<br />
Differential expression
An example<br />
1. Choose an organism<br />
2. <strong>CGH</strong> array data<br />
3. Class labels (optional)<br />
4. Gender labels (optional)<br />
5. Choose the copy number estimation method<br />
Fast<br />
∙ Isowindow<br />
∙ Binary segmentation<br />
∙ Regression<br />
∙ Smoothing<br />
Precision
6. <strong>Gene</strong> expression data<br />
7. Position data (optional)<br />
8. Scale value (optional)<br />
Scale: default (max)<br />
Scale: 5<br />
9. Reference lines (optional)<br />
10. If you have replicates<br />
∙ Average<br />
∙ Max<br />
∙ Min
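The three replicate options can be pictured as different ways of collapsing repeated measurements of the same probe into one value. This is a sketch under the assumption that replicates share a probe identifier; the function name, option strings and data are hypothetical and do not reflect the program's actual input format.

```python
from collections import defaultdict
from statistics import mean

def collapse_replicates(measurements, how="average"):
    """Collapse (probe_id, value) pairs into one value per probe."""
    reducers = {"average": mean, "max": max, "min": min}
    reduce_values = reducers[how]
    grouped = defaultdict(list)
    for probe_id, value in measurements:
        grouped[probe_id].append(value)
    return {probe_id: reduce_values(vals) for probe_id, vals in grouped.items()}

# Hypothetical replicated measurements
data = [("probeA", 0.5), ("probeA", 1.5), ("probeB", -1.0)]

print(collapse_replicates(data, "average"))  # {'probeA': 1.0, 'probeB': -1.0}
print(collapse_replicates(data, "max"))      # {'probeA': 1.5, 'probeB': -1.0}
print(collapse_replicates(data, "min"))      # {'probeA': 0.5, 'probeB': -1.0}
```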
10. Define the type of representation<br />
11. Choose the representation<br />
12. You can download map data<br />
13. Download data from regions<br />
14. Check the expression-copy number correlation<br />
15. Zoom and functional analysis<br />
<strong>ISA</strong> zoom<br />
Ensembl DAS zoom<br />
Functional analysis (FatiGO)<br />
15. All arrays plotting
http://gepas.bioinfo.cipf.es/cgibin/tutoXX?c=/isacgh/isacgh.config
Acknowledgments<br />
Joaquin Dopazo<br />
● Group leader<br />
Lucia Conde<br />
● <strong>ISA</strong><strong>CGH</strong> developer<br />
● Tutorial<br />
● Slides<br />
Ignacio Medina<br />
● Support, guidance <strong>and</strong> advice<br />
David Montaner<br />
● Support, guidance <strong>and</strong> advice<br />
Francisco Garcia<br />
● Support, guidance <strong>and</strong> advice