29.11.2014 Views

Gene gain and loss: aCGH ISA CGH


Gene gain and loss: aCGH
ISA CGH
IV Course on Microarray data analysis
March 13, Valencia
Rafael C. Jimenez


Genetic Variation in human genomes
From chromosome anomalies to single nucleotide changes
Chromosome Gene Nucleotide


CNV/CNP
Copy number variation of DNA segments
● Deletions
● Insertions
● Duplications
Kb  Mb  b (nt)


Comparative Genomic Hybridization
● Method for analyzing genomic DNA for unbalanced genetic alterations (CNV)
● Based on fluorescent hybridization of DNA
Techniques
● FISH, PCR, Southern
● Array CGH
  ● Clones (BAC, YAC, PAC)
  ● PCR non-redundant
  ● Oligonucleotides
  ● cDNA


Example: Array CGH
BAC probes
1. Clones to cover a genomic region
2. Extraction and purification
3. Array DNA onto glass slides
4. Hybridization of labeled normal and tumor genomic DNA to the microarray
5. Analyze fluorescence ratio
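Step 5 can be illustrated with a minimal sketch. It assumes each spot yields one tumor and one normal intensity and that the ratio is expressed on a log2 scale, a common convention that the slides do not state. The function names and the 0.3 cut-off are hypothetical and chosen only for illustration.

```python
import math

def log2_ratios(tumor, normal):
    """Return log2(tumor / normal) per spot, or None where an intensity is not positive."""
    ratios = []
    for t, n in zip(tumor, normal):
        ratios.append(math.log2(t / n) if t > 0 and n > 0 else None)
    return ratios

def call_spot(log_ratio, threshold=0.3):
    """Label a spot as gain, loss or normal; the 0.3 cut-off is illustrative only."""
    if log_ratio is None:
        return "missing"
    if log_ratio >= threshold:
        return "gain"
    if log_ratio <= -threshold:
        return "loss"
    return "normal"

# Hypothetical intensities for four spots.
tumor = [200.0, 100.0, 50.0, 0.0]
normal = [100.0, 100.0, 100.0, 100.0]
ratios = log2_ratios(tumor, normal)
calls = [call_spot(r) for r in ratios]
print(ratios)
print(calls)
```

A doubled tumor signal gives a log2 ratio of 1, a halved signal gives -1, and equal signals give 0, which is why the log scale treats gains and losses symmetrically.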


Example: Array CGH
BAC probes


Array CGH
Advantages:
● Analysis of the whole genome in a single experiment
● Higher resolution than conventional CGH (5-10 Kb)
Limitations:
● Inability to detect mosaicism, balanced chromosomal translocations, inversions and whole-genome ploidy changes
● Depending on the platform chosen:
  - cDNA: limited by the genes encoded on chromosomes, cross-hybridizations
  - BACs, PACs: DNA amplifications are necessary


Array CGH


ISA CGH
ISACGH allows visualizing array CGH data and/or expression arrays on human or mouse chromosomal coordinates (automatically found through their standard identifiers) and represents the copy number alterations found by using different methods.
Correlations between copy number and gene expression level can easily be observed in a plot. The program allows finding minimal common regions with altered copy number across different arrays.
http://isacgh.bioinfo.cipf.es


ISA CGH
Although ISACGH can be used alone, it is tightly integrated in the GEPAS package. Thus, normalization and different data pre-processing operations can directly be performed within the same environment.
ISACGH is also connected to different tools for functional annotation, so enrichment in functionally relevant terms (gene ontology, pathways, etc.) in amplified or lost chromosomal regions can directly be studied.


General outline
INPUT: gene expression and/or genomic hybridization values.
OUTPUT: prediction of regions with alterations in the number of copies, plotted in the same figure as gene expression values to view the relationship between both variables.
∙ Four methods for the estimation of genomic copy number
∙ Other parameters:
  - Scale
  - Management of multiple probes
http://isacgh.bioinfo.cipf.es
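The "management of multiple probes" parameter can be pictured with a small sketch. It assumes that replicated probes are collapsed to one value per identifier using the Average, Max and Min options listed later in the worked example. The function name and data layout are hypothetical and do not describe how ISACGH itself stores its input.

```python
from statistics import mean

def collapse_replicates(measurements, how="average"):
    """Collapse (probe_id, value) pairs to one value per probe.

    `how` is one of "average", "max" or "min" (names chosen for this sketch).
    """
    combine = {"average": mean, "max": max, "min": min}[how]
    grouped = {}
    for probe_id, value in measurements:
        grouped.setdefault(probe_id, []).append(value)
    return {probe_id: combine(values) for probe_id, values in grouped.items()}

# Hypothetical spots: probe_1 is printed twice on the array.
spots = [("probe_1", 0.2), ("probe_1", 0.4), ("probe_2", -1.0)]
for option in ("average", "max", "min"):
    print(option, collapse_replicates(spots, option))
```

Averaging dampens noise between replicate spots, while Max and Min keep the most extreme replicate, which may be preferable when one spot is suspected to have failed.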


Probe identifiers
Different array platforms for CGH measurements (BACs, cDNA clones and oligonucleotides for array spots)
Users must collect and keep updated the chromosomal coordinates of the probes
ISACGH automatically retrieves them and plots the hybridization values over the corresponding positions in chromosomes.
Different IDs can be uploaded (Ensembl IDs, accession, Unigene, HUGO, RefSeq, Affy, BAC names...)
Chr1:203465273-20348412
User-defined information on chromosomal coordinates can alternatively be supplied to the program
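When supplying user-defined coordinates, it can help to validate them first. The sketch below assumes a `chrN:start-end` notation like the one shown on this slide. The exact input format ISACGH expects is not given here, so the pattern, the function name and the sample strings are all hypothetical.

```python
import re

# Assumed notation: "chr" + chromosome name + ":" + start + "-" + end.
_REGION = re.compile(r"^chr(\w+):(\d+)-(\d+)$", re.IGNORECASE)

def parse_region(text):
    """Split a coordinate string into (chromosome, start, end)."""
    match = _REGION.match(text.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a chromosomal region: {text!r}")
    chromosome = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"start is after end in {text!r}")
    return chromosome, start, end

# Hypothetical coordinates, not taken from any real probe.
print(parse_region("Chr1:1000-2000"))
print(parse_region("chrX: 5 - 10"))
```

Rejecting malformed or inverted ranges early avoids probes being silently plotted at the wrong position.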


Genomic copy number estimation
Smoothing: variation of the Adaptive Weights Smoothing method implemented in the program GLAD.

Binary segmentation: checks whether every single point in the data set is a breakpoint in an iterative way. It uses a permutation distribution to test for mean differences between groups, drawing robust inferences that do not rely upon any model of the data.
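A single step of this idea can be sketched as follows. This is an assumption-laden illustration rather than the algorithm used by ISACGH: it tries every split point, scores it by a scaled difference of means, and compares the best score with scores from shuffled data. The statistic, the function names and the sample values are hypothetical. A full implementation would repeat the step on each resulting segment.

```python
import math
import random

def best_split(values):
    """Return (index, statistic) for the split with the largest scaled mean difference."""
    best_index, best_stat = None, 0.0
    for i in range(2, len(values) - 1):  # keep at least two points on each side
        left, right = values[:i], values[i:]
        diff = abs(sum(left) / len(left) - sum(right) / len(right))
        stat = diff / math.sqrt(1 / len(left) + 1 / len(right))
        if stat > best_stat:
            best_index, best_stat = i, stat
    return best_index, best_stat

def permutation_p_value(values, n_perm=500, seed=0):
    """Estimate how often shuffled data give a split at least as strong as the observed one."""
    rng = random.Random(seed)
    observed = best_split(values)[1]
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if best_split(shuffled)[1] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical log ratios: ten probes near 0 followed by ten probes near 1.
log_ratios = [0.05, -0.02, 0.01, -0.04, 0.03, 0.00, -0.01, 0.02, -0.03, 0.04,
              1.02, 0.97, 1.01, 0.96, 1.03, 1.00, 0.99, 1.02, 0.98, 1.04]
index, stat = best_split(log_ratios)
p_value = permutation_p_value(log_ratios)
print(index, round(stat, 2), p_value)
```

Because the reference distribution comes from shuffling the observed values, the test makes no assumption about their distribution, which is the robustness the slide refers to.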

Regression: uses some characteristics of linear regression of intensity measure on position along the genome. Such a regression line has a slope close to zero when fitted in regions with homogeneous intensity levels, but the slope differs from zero when intensity measures on the left-hand side are higher or lower than intensity measures on the right-hand side. Given the points of intensity measurements ordered by their position in the chromosome, our procedure fits a regression line for each N consecutive points. Thus a vector of slopes, equivalent in some way to a derived curve of the original data, is obtained. A t-test is used to identify the slope peaks that are big enough to indicate a breakpoint in the intensity levels.
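The vector of slopes can be sketched with ordinary least squares over a sliding window. This is only an illustration under stated assumptions: the window size of 4, the function names and the data are hypothetical, and the t-test on slope peaks is not reproduced, so the sketch simply reports the window with the steepest slope.

```python
def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def window_slopes(positions, intensities, n=4):
    """Fit one regression line per run of n consecutive points and collect the slopes."""
    return [slope(positions[i:i + n], intensities[i:i + n])
            for i in range(len(positions) - n + 1)]

# Hypothetical profile: six probes at level 0, then six probes at level 1.
positions = list(range(12))
intensities = [0.0] * 6 + [1.0] * 6
slopes = window_slopes(positions, intensities)
peak = max(range(len(slopes)), key=lambda i: abs(slopes[i]))
print(slopes)
print("steepest window starts at index", peak)
```

Windows lying entirely inside a homogeneous region give a slope of zero, and the slope is largest for the window centred on the step, which matches the description of the slopes as a derivative-like curve.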

Isowindow: tries to identify borders between regions with a significant change in the values of intensity of hybridization. Given the intensity measurement points from the array ordered by their position in the chromosome, a first step finds those that are good candidates of being such borders. Roughly speaking, a point will be a good candidate if the p-value of a t-test comparing some close points located at its left and right neighborhoods is low enough.
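The candidate-finding step can be sketched as below, with several assumptions. The sketch uses a pooled-variance Student t statistic on equal-sized neighborhoods, so a fixed cut-off on |t| stands in for a p-value threshold. With three points per side there are 4 degrees of freedom, for which |t| >= 4.604 corresponds to a two-sided p of about 0.01. The window size, names and data are hypothetical.

```python
import math
from statistics import mean, variance

def t_statistic(left, right):
    """Absolute Student two-sample t statistic with pooled variance."""
    n_left, n_right = len(left), len(right)
    pooled = ((n_left - 1) * variance(left)
              + (n_right - 1) * variance(right)) / (n_left + n_right - 2)
    diff = abs(mean(left) - mean(right))
    if pooled == 0:
        return math.inf if diff > 0 else 0.0
    return diff / math.sqrt(pooled * (1 / n_left + 1 / n_right))

def candidate_borders(values, window=3, t_cut=4.604):
    """Return indices i where values[i-window:i] and values[i:i+window] differ strongly."""
    borders = []
    for i in range(window, len(values) - window + 1):
        if t_statistic(values[i - window:i], values[i:i + window]) >= t_cut:
            borders.append(i)
    return borders

# Hypothetical profile with a level change between the 6th and 7th probe.
values = [0.0, 0.1, -0.1, 0.05, -0.05, 0.0, 1.0, 1.1, 0.9, 1.05, 0.95, 1.0]
print(candidate_borders(values))
```

Only local neighborhoods enter each test, so a border is judged against nearby noise rather than against the whole chromosome.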

The binary segmentation method uses the global distribution of the dataset while the others are more based on the local distributions of the points.


Plotting data and copy number estimation
One array / All chromosomes
Expression data
Breakpoint estimation


Plotting data and copy number estimation
One chromosome / All arrays


Studying breakpoints and details at gene level
Click to get a detailed representation of the chromosomal region spanning 4 Mb that includes the corresponding probes of the array located therein, as well as genes mapped by Ensembl.
A blue line represents the estimation of the genomic copy number.


Minimal common regions with consistent copy number alterations across several arrays
ISACGH estimates the genomic copy number in each individual array and then merges this information in a single plot.
The plot represents those regions that are either consistently gained or lost in all the arrays. The average hybridisation intensity is also represented.
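The merging step can be sketched as an intersection of per-array calls. The sketch assumes each array has already been reduced to one call per probe (1 for gain, -1 for loss, 0 for no change) on a shared, ordered probe list. That encoding and the function name are hypothetical, and the averaging of hybridisation intensities is not shown.

```python
def common_regions(calls_per_array):
    """Return (first_probe, last_probe, state) runs where every array agrees on gain or loss."""
    n_probes = len(calls_per_array[0])
    consensus = []
    for j in range(n_probes):
        states = {calls[j] for calls in calls_per_array}
        # Keep the call only if all arrays share it; otherwise treat the probe as unchanged.
        consensus.append(states.pop() if len(states) == 1 else 0)

    regions, start = [], None
    for j, state in enumerate(consensus + [0]):  # trailing 0 closes a run at the end
        if start is not None and state != consensus[start]:
            regions.append((start, j - 1, consensus[start]))
            start = None
        if start is None and state != 0:
            start = j
    return regions

# Hypothetical calls for three arrays over eight ordered probes.
array_1 = [0, 1, 1, 1, 0, -1, -1, 0]
array_2 = [1, 1, 1, 0, 0, -1, -1, -1]
array_3 = [0, 1, 1, 1, 0, 0, -1, -1]
print(common_regions([array_1, array_2, array_3]))
```

Each array shows a wider altered segment than the reported one, and only the overlap survives, which is what makes the region "minimal".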


Correlation between genomic and expression data
When studying the effect of genomic copy number alterations on the expression level of genes, it is quite useful to have a simple representation of the relationship between both variables.
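Alongside the plot, the relationship can be summarised numerically. The sketch below computes a Pearson correlation over the genes measured in both data sets. The slides do not say which statistic, if any, ISACGH reports, so this is an assumption, and the gene names and values are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-gene values; only genes present in both tables are compared.
copy_number = {"gene_a": -1.0, "gene_b": -0.5, "gene_c": 0.0, "gene_d": 0.5, "gene_e": 1.0}
expression = {"gene_a": -2.0, "gene_b": -1.0, "gene_c": 0.0, "gene_d": 1.0, "gene_f": 3.0}
shared = sorted(copy_number.keys() & expression.keys())
r = pearson([copy_number[g] for g in shared], [expression[g] for g in shared])
print(shared, r)
```

Matching on gene identifiers before correlating matters because the CGH and expression platforms rarely cover exactly the same genes.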


Functional annotation of altered copy number regions
Functional annotation of chromosomal regions based on the gene ontology (GO) annotations.
Annotations for regions of gain or loss are compared to the background of annotations corresponding to the rest of genes present in the array, in order to see whether this region has any characteristic functionality or not.
A Fisher exact test for contingency tables is performed to look for GO terms significantly overrepresented in the chromosomal region studied compared to the background abundance of GO terms in the rest of the genome.
P-values are adjusted for multiple testing using the FDR method.
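The enrichment test can be sketched in plain Python. The sketch assumes a one-sided test for over-representation, computed as a hypergeometric tail, and uses the Benjamini-Hochberg procedure as the FDR adjustment. The slide names only "the FDR method", so that choice, the function names and the counts are assumptions made for illustration.

```python
from math import comb

def fisher_overrepresentation(k, n_region, n_annotated, n_total):
    """One-sided Fisher exact p-value: P(at least k annotated genes in the region).

    n_total genes on the array, n_annotated carry the GO term,
    n_region lie in the altered region, k of those carry the term.
    """
    upper = min(n_annotated, n_region)
    favourable = sum(comb(n_annotated, i) * comb(n_total - n_annotated, n_region - i)
                     for i in range(k, upper + 1))
    return favourable / comb(n_total, n_region)

def benjamini_hochberg(p_values):
    """Return FDR-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical counts: 20 genes on the array, 5 with the term, 4 in the region, 3 of them annotated.
p = fisher_overrepresentation(3, 4, 5, 20)
print(p)
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

One p-value is produced per GO term, so the adjustment is applied across all terms tested for a region rather than to a single test.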


GEPAS, the integrated environment
Normalization
Preprocess
Clustering
Differential expression


An example
1. Choose an organism
2. CGH array data
3. Class labels (optional)
4. Gender labels (optional)
5. Choose the copy number estimation method
   Fast
   ∙ Isowindow
   ∙ Binary Segmentation
   ∙ Regression
   ∙ Smoothing
   Precision
6. Gene expression data
7. Position data (optional)
8. Scale value (optional)
   Scale: default (max)
   Scale: 5
9. Reference lines (optional)
10. If you have replicates
   ∙ Average
   ∙ Max
   ∙ Min
10. Define the type of representation
11. Choose the representation
12. You can download map data
13. Download data from regions
14. Check correlation expression-copy number
15. Zoom and functional analysis
   ISA zoom
   Ensembl DAS zoom
   Functional analysis (FatiGO)
15. All arrays plotting

http://gepas.bioinfo.cipf.es/cgi-bin/tutoXX?c=/isacgh/isacgh.config


Acknowledgments
Joaquin Dopazo
● Group leader
Lucia Conde
● ISACGH developer
● Tutorial
● Slides
Ignacio Medina
● Support, guidance and advice
David Montaner
● Support, guidance and advice
Francisco Garcia
● Support, guidance and advice
