
KNIME Bioinformatics Extensions


<strong>KNIME</strong> <strong>Bioinformatics</strong> <strong>Extensions</strong><br />

Karol Kozak<br />

ETH Zurich<br />

January 2011, Zurich, <strong>KNIME</strong> SIG<br />

your name


Start HCS<br />



HCS Experiments and<br />

Informatics<br />



HCS & Open Source<br />

<strong>KNIME</strong><br />

[Figure: HCS and open-source software overview. Component labels: LIMS; HCDB - OpenBIS & Library Database; Cell Classifiers; WEKA Nodes; <strong>Bioinformatics</strong> Off-Target; Spotfire. Tool labels: <strong>KNIME</strong>, Matlab, Java]<br />



Research interest<br />

Role<br />

- Software engineering + modeling<br />

- Database understanding<br />

- RNA technology<br />

- Classification<br />

- Pattern recognition<br />

- Annotation database<br />

- Off-target prediction<br />

- 2D/3D structure relationships<br />

- Gene functionality<br />

Role of <strong>Bioinformatics</strong> in RNAi technology to detect functionality of genes in mammalian cells<br />

- New prediction algorithms, including kernel methods<br />

- Post-analysis of existing hits<br />

- Traditional bioinformatics: homology, BLAST, alignment score<br />

- Database development<br />
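The "kernel methods" item above can be illustrated with a k-mer spectrum kernel, one common string kernel for sequence data. This is a minimal sketch and not necessarily the method used in the talk; the function names, the choice k = 3 and the idea of comparing two oligo sequences this way are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int = 3) -> Counter:
    """Count all overlapping k-mers in a sequence (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a: str, b: str, k: int = 3) -> int:
    """Spectrum kernel: dot product of the two k-mer count vectors."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    return sum(count * cb[kmer] for kmer, count in ca.items())


def normalized_kernel(a: str, b: str, k: int = 3) -> float:
    """Cosine-normalized kernel value between 0 and 1."""
    denom = sqrt(spectrum_kernel(a, a, k) * spectrum_kernel(b, b, k))
    return spectrum_kernel(a, b, k) / denom if denom else 0.0


# Hypothetical toy sequences, not taken from any library.
print(normalized_kernel("ACGTACGT", "ACGTTTTT"))
```

A kernel matrix built from such pairwise values could then be passed to any kernel-based learner; which learner fits the off-target problem is left open here.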



RNAi Libraries<br />

- Genome wide<br />

- Functional groups (Kinases..)<br />

- n - Oligonucleotides<br />

- Pooled<br />



RNAi Libraries<br />

- Qiagen<br />

- Thermo Fisher Scientific (Dharmacon)<br />

- Applied Biosystems (Ambion)<br />

- Sigma-Aldrich (esiRNA)<br />



RNAi library evolution<br />

[Figure: timeline (TIME axis) from purchase to today, showing library versions V1, V2 and V3 and the vendor annotation databases: Sigma esiRNA, Applied Biosystems, Qiagen, Thermo Fisher, Dharmacon]<br />

Labels along the timeline:<br />

- Fewer genes<br />

- More known oligos<br />

- New design<br />

- Transcript analysis<br />

- Off-target information<br />



Annotation of human genome &<br />

Reliability of siRNA libraries<br />

Qiagen genome wide siRNA library<br />

(HsDgV3 (Human Druggable Genome siRNA Set V3); HsNmV1 (Human Refseq Xm siRNA Set V1); HsXmV1 (Human Predicted genome Set V1))<br />

2006: 22’832 genes / 90’728 siRNAs<br />

2010: 16’199 genes<br />

Meier Roger<br />

[Figure: charts of the library composition. Percentage labels as extracted, not matched to categories: 16.5%, 12.5%, 71%, 41%, 39%, 20%, 0.11%, 0.23%, 10.3%, 25.6%, 0.07%, 33%, 30.6%, 0.01%, 0.01%]<br />

% of genes targeted by: 1, 2, 3, 4, 5, 6, 7, 8 or 9 siRNAs<br />

Gene categories: target genes; target genes with off-target(s); wrongly predicted genes<br />

siRNA categories: siRNAs w/o off-target(s); siRNAs with off-target(s); siRNAs against wrongly predicted genes<br />



Based on GENEID<br />



Off-targets annotated by companies<br />

How did companies select these off-target siRNAs?<br />

- Many ribosomal siRNAs were eliminated<br />

- Many siRNAs against "membrane" proteins were eliminated (2061 in old list, 945 in clean list)<br />

- Fewer siRNAs against "kinase" proteins were eliminated (2624 in old list, 1605 in clean list)<br />

- In addition, the 2800 proteins that were discarded are slightly enriched in virus-related pathways<br />
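The counts above can be turned into removal fractions to compare the two protein classes directly. A small worked sketch, assuming the second kinase number (1605) refers to the same clean list as the membrane number; the variable and function names are illustrative.

```python
# Counts taken from the slide: siRNAs in the old list vs. the clean list.
counts = {
    "membrane": (2061, 945),
    "kinase": (2624, 1605),  # assumption: 1605 is the clean-list count
}


def fraction_removed(old: int, clean: int) -> float:
    """Share of the old list that is missing from the clean list."""
    return (old - clean) / old


removed = {name: fraction_removed(old, clean) for name, (old, clean) in counts.items()}
for name, frac in removed.items():
    print(f"{name}: {frac:.1%} removed")
```

Under that assumption, about 54% of the membrane-protein siRNAs were removed against about 39% of the kinase siRNAs.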



Library reduction<br />

• siRNA without geneID<br />

• The web site shows the gene that the siRNA matched at the time it was selected. The database table from which we get the current annotation shows what the siRNA matches in the current RefSeq.<br />
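The two points above suggest a simple reduction step: drop oligos that have no current gene ID, and flag those whose current annotation differs from the gene shown at design time. A minimal sketch under that reading; the record layout, field names and example records are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SiRNA:
    oligo_id: str
    gene_at_design: Optional[str]  # gene shown on the vendor web site
    gene_current: Optional[str]    # gene in the current RefSeq-based table


def reduce_library(library):
    """Split a library into kept, dropped (no current gene ID) and re-annotated oligos."""
    kept, dropped, changed = [], [], []
    for s in library:
        if s.gene_current is None:
            dropped.append(s)
            continue
        kept.append(s)
        if s.gene_current != s.gene_at_design:
            changed.append(s)
    return kept, dropped, changed


# Invented example records, for illustration only.
library = [
    SiRNA("oligo_1", "GENE_A", "GENE_A"),
    SiRNA("oligo_2", "GENE_B", None),
    SiRNA("oligo_3", "GENE_C", "GENE_D"),
]
kept, dropped, changed = reduce_library(library)
print(len(kept), len(dropped), len(changed))
```

Keeping the re-annotated oligos in a separate list, rather than silently updating them, makes it possible to review whether old hits still point at the intended gene.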



Library handling<br />

From pooled to oligo-based<br />

Library analysis<br />



<strong>Bioinformatics</strong> - RNAi<br />

-Off-target effect<br />

UUGCCGUACAGGAUGGACGtg<br />

UUAACUGAUGUUCCAAUCCtg<br />
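One common way to reason about off-target effects is seed matching, where positions 2 to 8 of the guide strand pair with sites in unintended transcripts. The sketch below assumes the two sequences above are guide (antisense) strands written 5' to 3' with a two-nucleotide DNA overhang in lower case; that reading, the function names and the toy transcript are assumptions and are not stated on the slide.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def to_rna(seq: str) -> str:
    """Upper-case a sequence and write DNA bases (T) as RNA (U)."""
    return seq.upper().replace("T", "U")


def seed(guide: str, start: int = 2, end: int = 8) -> str:
    """Return the seed region (1-based positions start..end) of a guide strand."""
    return to_rna(guide)[start - 1:end]


def seed_site(guide: str) -> str:
    """Sequence on the target mRNA that is complementary to the seed."""
    return "".join(COMPLEMENT[base] for base in reversed(seed(guide)))


def find_seed_sites(guide: str, transcript: str):
    """0-based start positions of all seed sites in a transcript."""
    site, rna = seed_site(guide), to_rna(transcript)
    return [i for i in range(len(rna) - len(site) + 1) if rna[i:i + len(site)] == site]


guides = ["UUGCCGUACAGGAUGGACGtg", "UUAACUGAUGUUCCAAUCCtg"]
toy_transcript = "GGAUACGGCAUUUCAGUUAGG"  # invented sequence containing both seed sites
for g in guides:
    print(seed(g), seed_site(g), find_seed_sites(g, toy_transcript))
```

A real screen would search 3' UTR sequences of all transcripts rather than a single toy string.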



Sandra Kaestner<br />

<strong>Bioinformatics</strong><br />



Workflow<br />



Project Ribosome biogenesis<br />

[Figure: ribosome biogenesis pathway across nucleolus, nucleoplasm and cytoplasm]<br />

- Transcription of ribosomal RNAs: Pol I, rDNA, 35S rRNA; Pol III, 5S rRNA<br />

- Transcription of ribosomal mRNAs (Pol II, mRNA); maturation and export of ribosomal mRNAs; translation of ribosomal proteins<br />

- Large ribosomal proteins, small ribosomal proteins, trans-acting factors<br />

- 90S particle, pre-40S particle, pre-60S particle<br />

- Maturation and export of pre-40S particle; maturation and export of pre-60S particle; final maturation<br />

- Mature 40S subunit, mature 60S subunit, 80S ribosome<br />

- rRNA labels: 18S, 5.8S, 25S<br />

- 18S rRNA maturation: 35S rRNA, 23S rRNA, 20S rRNA, 18S rRNA<br />



The Rps2-YFP read out<br />

80S ribosome<br />

ribosomal proteins<br />

Thomas Wild<br />

80S ribosome<br />

mature 40S subunit<br />

mature 60S subunit<br />

Nucleolus Nucleoplasm Cytoplasm<br />



Biogenesis<br />

siRNA 84062- DTNBP1<br />

AACCTTCAAAGCTGAACTAGA<br />

DTNBP1<br />

HCDC<br />

RPS3<br />

Transcript NM_001005<br />



Results<br />

[Figure: RNAi library (e.g. 4 oligos) + biogenesis results = hit list. Labels: Hit, Wonder, Real, Off-target in hit list, Potential off-targets]<br />



Analysis Off targets<br />

Known Off-targets<br />

Build model<br />



[Figure: uncoating and acidification panels for AllStars, MVP_si03 and MVP_si06; MVP and tubulin panels for AllStars, si03 and si06]<br />



Virus screen<br />

1 out of 3<br />

siRNA<br />

TGGGCCTGAGATGCAGGTAAA<br />

MVP NM_003010<br />

HCDC<br />

MAP2K4<br />

6416|NM_003010|2890|AS<br />

Qiagen off-target SM<br />

MAP2K4<br />
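The annotation string `6416|NM_003010|2890|AS` above looks pipe-delimited. A minimal parsing sketch, assuming the four fields are gene ID, RefSeq transcript accession, match position in the transcript and strand; these field meanings are a guess from the slide and not a documented vendor format.

```python
from typing import NamedTuple


class OffTargetHit(NamedTuple):
    gene_id: int      # assumed: gene ID
    transcript: str   # assumed: RefSeq accession
    position: int     # assumed: match position in the transcript
    strand: str       # assumed: "AS" = antisense strand


def parse_hit(record: str) -> OffTargetHit:
    """Parse one 'gene|transcript|position|strand' record."""
    fields = record.strip().split("|")
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {record!r}")
    gene_id, transcript, position, strand = fields
    return OffTargetHit(int(gene_id), transcript, int(position), strand)


hit = parse_hit("6416|NM_003010|2890|AS")
print(hit)
```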



mRNA 2D variants<br />

mRNA<br />

siRNA<br />

- Relation of 2D structure to off-target effects<br />

- We can model 2D structures quite robustly (Metaserver, Python)<br />

- We can predict potential off-target effects<br />

- We must find the relation between the two<br />
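To relate 2D structure to off-target effects, the predicted structure first has to be in a form that can be queried. A minimal sketch, assuming the structure is available in dot-bracket notation (a common output of RNA folding tools) and that site accessibility is approximated by the fraction of unpaired bases in a window; the measure, the function names and the toy structure are assumptions.

```python
def base_pairs(dot_bracket: str) -> dict:
    """Map each paired position to its partner (0-based) from dot-bracket notation."""
    stack, pairs = [], {}
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {i}")
            j = stack.pop()
            pairs[i], pairs[j] = j, i
    if stack:
        raise ValueError(f"unbalanced '(' at position {stack[-1]}")
    return pairs


def unpaired_fraction(dot_bracket: str, start: int, length: int) -> float:
    """Fraction of unpaired bases in the window [start, start + length)."""
    window = dot_bracket[start:start + length]
    return window.count(".") / len(window)


structure = "((((....))))...."  # invented toy structure
pairs = base_pairs(structure)
print(pairs[0])                             # partner of position 0
print(unpaired_fraction(structure, 10, 6))  # window overlapping the end of the stem
```

Comparing such a fraction between confirmed and suspected off-target sites would be one way to start looking for the relation the slide asks for.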



mRNA<br />

We want to identify structural motifs in a set of mRNA sequences<br />
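As one concrete reading of "structural motifs", the sketch below looks for hairpin loops in dot-bracket structures and counts how many structures in a set share each loop size. The motif definition, the function names and the three toy structures are assumptions; the slides do not say which motifs are meant.

```python
import re
from collections import Counter

# A base pair that directly encloses a run of unpaired bases is a hairpin loop.
HAIRPIN = re.compile(r"\((\.+)\)")


def hairpin_loop_sizes(dot_bracket: str) -> list:
    """Sizes of all hairpin loops in one dot-bracket structure."""
    return [len(m.group(1)) for m in HAIRPIN.finditer(dot_bracket)]


def shared_loop_sizes(structures: dict) -> Counter:
    """Count in how many structures each hairpin loop size occurs."""
    seen = Counter()
    for sizes in map(hairpin_loop_sizes, structures.values()):
        seen.update(set(sizes))
    return seen


# Invented toy structures, for illustration only.
structures = {
    "mRNA_1": "((...))..(((....)))",
    "mRNA_2": "..((((....))))..",
    "mRNA_3": "(((...)))",
}
print(shared_loop_sizes(structures))
```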



mRNA 3D Structure<br />

Nature Movie<br />

- Dicer, tRNA<br />

- Off-target relations: RNA-RNA and RNP-RNA<br />

- Model how RISC relates to microRNA/siRNA, and how it finds its own target and binds to it<br />



RNA 3D<br />



Workflow<br />

ModeRNA<br />

Python<br />

BioPython for parsing structural data from the PDB format<br />
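The workflow uses BioPython (its `Bio.PDB` module) for this step. To show what parsing the PDB format involves without requiring that dependency, the sketch below reads the fixed-column ATOM records with the standard library only. It is an illustration of the file layout, not the workflow's code; the helper names and the two sample atoms are invented.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Parse one fixed-column ATOM/HETATM record of a PDB file."""
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )


def parse_pdb(text: str) -> list:
    """Collect all ATOM and HETATM records from PDB-formatted text."""
    return [parse_atom_line(l) for l in text.splitlines() if l.startswith(("ATOM", "HETATM"))]


def atom_line(serial, name, res_name, chain, res_seq, x, y, z) -> str:
    """Build a column-aligned ATOM record (used here only to make sample data)."""
    return (f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Invented coordinates for two atoms of an RNA residue, for illustration only.
sample = "\n".join([
    atom_line(1, "P", "G", "A", 1, 50.193, 51.190, 50.534),
    atom_line(2, "C4'", "G", "A", 1, 51.000, 52.500, 49.250),
    "END",
])
atoms = parse_pdb(sample)
print(atoms[0])
```

For real structures a full parser is preferable, since it also handles models, alternate locations and insertion codes that this sketch ignores.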



Database architecture (LIMS)<br />

HCS and databases<br />

- Library and annotation<br />

- Screening experiment<br />

- Sample, results, management, view<br />

Done, but needs maintenance<br />

- Public database<br />

- Phenotypic data<br />
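The three areas named above (library and annotation, screening experiment, results) map naturally onto linked tables. The sketch below shows one possible layout using an in-memory SQLite database. It is not the schema of OpenBIS or of any database mentioned in the talk; all table names, column names and rows are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE library_oligo (
    oligo_id   TEXT PRIMARY KEY,
    vendor     TEXT NOT NULL,
    gene_id    INTEGER            -- annotation may be missing
);
CREATE TABLE experiment (
    experiment_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE well_result (
    experiment_id INTEGER NOT NULL REFERENCES experiment(experiment_id),
    plate         TEXT NOT NULL,
    well          TEXT NOT NULL,
    oligo_id      TEXT NOT NULL REFERENCES library_oligo(oligo_id),
    score         REAL,
    PRIMARY KEY (experiment_id, plate, well)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Invented rows, for illustration only.
conn.execute("INSERT INTO library_oligo VALUES ('oligo_1', 'vendor_x', 101)")
conn.execute("INSERT INTO experiment VALUES (1, 'toy screen')")
conn.execute("INSERT INTO well_result VALUES (1, 'P001', 'A01', 'oligo_1', 0.42)")

# Join each measured well back to the library annotation.
rows = conn.execute(
    """SELECT e.name, r.plate, r.well, o.gene_id, r.score
       FROM well_result r
       JOIN experiment e USING (experiment_id)
       JOIN library_oligo o USING (oligo_id)"""
).fetchall()
print(rows)
```

Keeping the annotation in its own table means a re-annotated library changes one row per oligo and leaves the screening results untouched.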



OpenBIS - Open Source database<br />



Screen DB<br />

Open Source database<br />

HCDB - OpenBIS<br />



Open Source database<br />

OpenBIS (ETH)<br />

- Web client<br />

- Command line client<br />

- Java technology, Google Web Toolkit (GWT)<br />



Screen DB<br />

Open Source database<br />

HCDB - OpenBIS<br />

Adam Srebniak<br />



Image Processing and<br />

<strong>KNIME</strong><br />


Adam Srebniak


Open Source<br />

<strong>KNIME</strong><br />

WEB<br />

DESKTOP<br />



[Figure: data flow into the cross-reference database]<br />

- Vendor annotation databases: Dharmacon, Qiagen, Ambion<br />

- Library files: Dharmacon oligos with target gene, Qiagen oligos with target gene, Ambion oligos with target gene<br />

- External databases (www): Gene<br />

- Cross-reference database for gene/oligo annotation + workflow for off-target prediction<br />

- OpenBIS<br />
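The core of such a cross-reference is grouping oligos from different vendors under a shared gene identifier. A minimal sketch, assuming each library file can be reduced to (vendor, oligo ID, target gene ID) records; the record layout, IDs and function name are invented for illustration.

```python
from collections import defaultdict

# Invented records in the form (vendor, oligo ID, target gene ID); real library
# files will have vendor-specific columns.
library_files = [
    ("Dharmacon", "D-0001", 1001),
    ("Qiagen", "Q-0001", 1001),
    ("Ambion", "A-0001", 1002),
    ("Qiagen", "Q-0002", 1002),
]


def build_cross_reference(records):
    """Group oligos from all vendors under their target gene ID."""
    xref = defaultdict(lambda: defaultdict(list))
    for vendor, oligo_id, gene_id in records:
        xref[gene_id][vendor].append(oligo_id)
    # Convert to plain dicts so lookups of unknown genes raise KeyError.
    return {gene: dict(vendors) for gene, vendors in xref.items()}


xref = build_cross_reference(library_files)
print(xref[1001])
```

With this structure, a hit from one vendor's oligo can be checked against every other vendor's oligos for the same gene before it is accepted.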



Image Processing<br />

One of the first open-source tools: CellProfiler<br />



High Content<br />

Image Processing (HCIP)<br />



Teaching module<br />

Slawek Mazur<br />



HCDC-HITS<br />

Bio-Formats, developed by OME (Jason Swedlow), UW-Madison LOCI and Glencoe Software.<br />


Gabor Bakos<br />

HCDC-HITS<br />



Visualization Improvement<br />

Lukasz Zwolinski, ETH – PWR Student<br />



Acknowledgement<br />

Bioquant Heidelberg and ETH<br />

Holger Erfle, Karol Kozak, Berend Rind<br />

Juergen Reymann, Gabor Csucs, Adam Srebniak<br />

Slawek Mazur, Sandra Kaestner<br />

Welcome to join<br />

TU Konstanz<br />

Dorit Merhof<br />

Trinity College IR<br />

Anthony Davies<br />

MPI-IB, Berlin DE<br />

Peter Braun<br />

Andre Maeurer<br />

TU Breslau:<br />

Karol Kozak<br />

Lukasz Miroslaw<br />
