Up Front - CHI Conferences - Cambridge Healthtech Institute


Cambridge Healthtech Media Group

www.bio-itworld.com<br />

Indispensable Technologies Driving Discovery, Development, and Clinical Trials NOVEMBER | DECEMBER 2011 • VOL. 10, NO. 6<br />

Start Spreading<br />

the News<br />

A remarkable coalition<br />

gives rise to the New York<br />

Genome Center.<br />

Page 8<br />

THE CLOUD COMPUTING<br />

HORIZON 32<br />

MEDCO’S X PRIZE MISSION 40<br />

OPEN SEASON AT<br />

BIO-IT WORLD EUROPE 11<br />

Download a PDF<br />

of This Issue<br />

SEE PAGE 4<br />

Nancy Kelley, the<br />

founding executive<br />

director of the New<br />

York Genome Center




Contents [11–12•11]<br />

Feature Story<br />

The Utility of Cloud Computing<br />

How cloud computing is being used today—<br />

and tomorrow—in life sciences. 32<br />

Up Front

8 A Genome Center for Gotham City<br />

11 Open Data and Patient Modeling in Europe<br />

14 New Technologies for Patient-Physician<br />

Interaction at Medicine 2.0<br />

Bush Doctrine<br />

15 Clouds for GLP Environments?<br />

Skeptical Outsider<br />

16 Science in Thrall to the FDA<br />

Insights | Outlook<br />

17 NGS by the Numbers<br />

12 Briefs<br />

Clinical Trials<br />

21 An Inside Perspective on PAREXEL<br />

24 Survey Says Tool Up for Smarter Studies

26 NuMedii ‘De-Risks’ Drug Repositioning Work

Computational Biology<br />

27 PerkinElmer Targets Holistic Data Solutions<br />

30 What’s in a Name: Systems Biology<br />

[4] BIO•IT WORLD NOVEMBER | DECEMBER 2011 www.bio-itworld.com<br />

Next-Gen Data<br />

40 New Rules for Archon Genomics X Prize<br />

41 DNAnexus to Mirror SRA Database<br />

in Google Cloud<br />

42 Inova/Complete Deal for ‘Next-Gen Medicine’<br />

43 Genia’s Nanopore/Microchip Technology Gains<br />

Life Technologies’ Support<br />

IT/Workflow<br />

45 Sophia Technology: Making Semantic Sense of<br />

Unstructured Data<br />

48 BT Connecting People and Patients<br />

In Every Issue<br />

First Base BY KEVIN DAVIES<br />

5 Dr. Watson, I Presume?<br />

The Russell Transcript BY JOHN RUSSELL<br />

54 Crescendo Bioscience’s Aspirations<br />

6 Company & Advertiser Index<br />

50 New: Free Trials & Downloads<br />

51 New Products<br />

52 Educational Opportunities<br />


Cover photo by Katja Heinemann


First Base<br />

Dr. Watson,<br />

I Presume?<br />

KEVIN DAVIES<br />

Normally in October, I travel with colleagues to Germany<br />

to participate in the annual Bio-IT World<br />

Europe conference (see p. 11). This year, I felt obliged<br />

to beg off. The International Congress of Human<br />

Genetics had called to ask if I’d be interested in moderating<br />

a plenary panel discussion to open the 2011<br />

conference. The title was “Whole Genome Sequencing: To do it<br />

or not to do it?” and the guest of honor<br />

was James D. Watson, Nobel laureate.<br />

It wasn’t much of a decision.<br />

We last featured the doyen of DNA<br />

in this magazine in 2003, on the occasion<br />

of the 50th anniversary of the publication

of the double helix paper in<br />

Nature (see “Genes, Girls, and Honest<br />

Jim,” Bio•IT World April 2003). Watson<br />

gave a press conference in Miami,<br />

where he reflected on that iconic<br />

achievement, even suggesting that if<br />

the BBC were ever to remake the film<br />

of that discovery, he’d like to be played<br />

by Ben Stiller (more recently, he suggested<br />

Sacha Baron Cohen).<br />

Made in Manhattan<br />

We seldom, if ever, feature a non-scientist on the<br />

cover of Bio•IT World, but this issue proves a worthy<br />

exception. As revealed in her first major interview,<br />

Nancy Kelley and friends have pulled off something<br />

remarkable—the creation of the New York Genome<br />

Center (see p. 8), supported and funded by a remarkable<br />

coalition of academic institutions, corporate<br />

partners and private philanthropists.<br />

If this were easy, somebody would have figured

out a way to pull this off a long time ago. It speaks<br />

volumes of Kelley’s administrative skills and determination<br />

that, as the founding executive director of<br />

the Center, she was able to engineer such a feat. The<br />

irony is that it took a Bostonian to bring a world-class

genomics center to the heart of New York.<br />

IPI PHOTO, MONTREAL<br />

L to R: James D. Watson (Cold Spring Harbor Laboratory),<br />

Kevin Davies, and Marjolein Kriek (University<br />

of Leiden, The Netherlands).<br />

Now 83, Watson is the Chancellor Emeritus of the Cold<br />

Spring Harbor Laboratory on Long Island. He was also the<br />

first person sequenced using a next-generation sequencing<br />

platform, in 2007. As Jonathan Rothberg, the founder of 454<br />

Life Sciences, the company that sequenced Watson’s DNA, remarked,<br />

“You’re the first genome for the rest of us.”<br />

Watson spoke candidly about his personal genome sequence.<br />

What had he learned from his sequence? Not much,<br />

it turns out. A slight adjustment to his beta blockers. From the<br />

outset, he asked not to be told about his APOE status, which<br />

could reveal an increased risk for Alzheimer’s disease, and discussed<br />

the latest theory about the neighboring gene that might<br />

have a physiological role in the disease.<br />

Any audience with “Honest Jim” is likely to be colorful, and<br />

this was no exception. He elicited cheers when he besmirched a well-known biotech company defending a controversial gene patent and spoke movingly about his son’s diagnosis with schizophrenia, while making others uncomfortable (or worse) in talking (albeit compassionately) about “genetic losers.”

Once the discussion was over, he was<br />

mobbed by photograph and autograph<br />

seekers clutching their well-worn

copies of The Double Helix.<br />

The other panelists—Jim Lupski<br />

(Baylor), Seong-Jin Kim (CHA University,<br />

Korea), and Marjolein Kriek<br />

(Leiden, The Netherlands)—are all<br />

genome pioneers in their own right,<br />

and contributed to a fascinating<br />

discussion.<br />

The Digital World<br />

Watch the video of James D.<br />

Watson and fellow panelists<br />

from ICHG 2011.<br />

After nearly 10 years of printing a hard copy of<br />

Bio•IT World magazine, this issue will become something<br />

of a collector’s item. Starting in January 2012,<br />

we will publish exclusively as a digital magazine.<br />

We’ve weighed the costs of printing and mailing<br />

many thousands of issues versus our keen desire to expand the frequency<br />

and global reach of Bio•IT World.<br />

Earlier this year, we launched a free<br />

iPad app for Bio•IT World that not only<br />

shows off the printed content but also<br />

features video and other enhancements<br />

that will only grow in 2012. If you haven’t<br />

checked out or taken out a free subscription<br />

to our digital content, simply go to:<br />

http://bit.ly/bioitnew<br />


Company Index<br />

454 Life Sciences . . . . . . . . . . . . 5<br />

Accenture . . . . . . . . . . . . . . . . . 48<br />

Active Motif . . . . . . . . . . . . . . . . 51<br />

Affymetrix . . . . . . . . . . . . . . . . . 43<br />

Amazon . . . . . . . . . . . . . . .11, 33<br />

Amylin Pharmaceuticals . . .33, 37<br />

APEX International . . . . . . . . . . . 22<br />

Appistry . . . . . . . . . . . . . . . . . . 38<br />

Applied Biosystems . . . . . . . . . . 48<br />

Archon Genomics X Prize . . . . . . 40<br />

Argonne National Laboratory . . . 11<br />

Aspera . . . . . . . . . . . . . . . . . . . 37<br />

Assay Depot . . . . . . . . . . . . . . . 37<br />

Baylor . . . . . . . . . . . . . . . . . . . . . 5<br />

BGI . . . . . . . . . . . . . . . .12, 29, 42<br />

BIOBASE . . . . . . . . . . . . . . . . . . 50<br />

BioTeam . . . . . . . . . . . . . . .36, 49<br />

Brainloop . . . . . . . . . . . . . . . . . 50<br />

BT Global Services . . . . . . . . . . 48<br />

CambridgeSoft . . . . . . . . . . . . . 27

Celera Genomics . . . . . . . . . . . . 48<br />

Center for Computational and<br />

Systems Biology Center . . . . . 11<br />

CHA University . . . . . . . . . . . . . . 5<br />

ChemAxon . . . . . . . . . . . . .38, 50<br />

CLC bio . . . . . . . . . . . . . . . . . . . 50<br />

ClearTrial . . . . . . . . . . . . . . . . . . 24<br />

Cold Spring Harbor Laboratory . . 5<br />

Complete Genomics . . . . . .38, 42<br />

Crescendo Bioscience . . . . . . . . 54<br />

Cycle Computing . . . . . . . . . . . . 36<br />

Definiens . . . . . . . . . . . . . . . . . 51<br />

Dell . . . . . . . . . . . . . . . . . . . . . . 48<br />

DNAnexus . . . . . . . . . . . . . .36, 41<br />

Eagle Genomics . . . . . . . . .33, 49<br />

Enlis Genomic . . . . . . . . . . . . . . 50<br />

Advertiser Index<br />

VOLUME 10, NO. 6<br />

Entelos . . . . . . . . . . . . . . . . . . . 54<br />

Eucalyptus . . . . . . . . . . . . . . . . 33<br />

European Bioinformatics Institute . . . . . . . . . . . . . . . . . 11

Food and Drug Administration . . 16<br />

Gartner . . . . . . . . . . . . . . . . . . . 24<br />

Genentech . . . . . . . . . . . . . . . . 36<br />

Genia . . . . . . . . . . . . . . . . . . . . 43<br />

GenoLogics . . . . . . . . . . . . . . . . 12<br />

Geospiza . . . . . . . . . . . . . . . . . . 27<br />

GlaxoSmithKline . . . . . . . . . . . . 38<br />

goBalto . . . . . . . . . . . . . . . . . . . 50<br />

GoGrid . . . . . . . . . . . . . . . . . . . 33<br />

Google . . . . . . . . . . . . .33, 36, 41<br />

GSK . . . . . . . . . . . . . . . . . . . . . 26<br />

Harvard Medical School . . .34, 36<br />

HealthTap . . . . . . . . . . . . . . . . . 14<br />

Helicos . . . . . . . . . . . . . . . . . . . 17<br />

HP . . . . . . . . . . . . . . . . . . . . . . 48<br />

IBM . . . . . . . . . . . . . . . . . . .11, 48<br />

Illumina . . . . . . . . .12, 17, 29, 51<br />

Ingenuity Systems . . . . . . . . . . . 54<br />

Inova Translational Medicine Institute . . . . . . . . . . . . . . . . . 42

Institute for Systems Biology . . . 30

Intel . . . . . . . . . . . . . . . . . . . . . 11

IO Informatics . . . . . . . . . . . . . . 51

Ion Torrent . . . . . . . . . . . . . . . . . 17

Johnson & Johnson . . . . . . . . . . 33

Karlsruhe Institute of Technology . . . . . . . . . . . . . . . 11

Max Planck Institute for Molecular Genetics . . . . . . . . 11

Max Planck Institute of Molecular Cell Biology and Genetics . . . 11

Medco Health Solutions . . . . . . 40<br />


Myriad Genetics . . . . . . . . . . . . 54<br />

NABsys . . . . . . . . . . . . . . . . . . . 44<br />

NCBI . . . . . . . . . . . . . . . . . . . . . 41<br />

New York Genome Center . . . . . , 5<br />

NimbleGen . . . . . . . . . . . . . . . . 28<br />

Nimbus . . . . . . . . . . . . . . . . . . . 33<br />

Nirvanix Cloud . . . . . . . . . . . . . . 37<br />

NobleGen . . . . . . . . . . . . . . . . . 44<br />

Novartis . . . . . . . . . . . . . . . . . . 26<br />

NuMedii . . . . . . . . . . . . . . . . . . 26<br />

Omixon . . . . . . . . . . . . . . . . . . . 50<br />

Opscode . . . . . . . . . . . . . . . . . . 37<br />

Oracle . . . . . . . . . . . . . . . . . . . . 51<br />

Oxford Nanopore . . . . . . . . . . . . 44<br />

Pacific Biosciences . . . .17, 36, 43<br />

Penguin . . . . . . . . . . . . . . . . . . . 33<br />

PerkinElmer . . . . . . . . . . . . . . . . 27<br />

Pfizer . . . . . . . . . . . . . . . . . . . . . 26<br />

Praxeon . . . . . . . . . . . . . . . . . . . 50<br />

Rackspace . . . . . . . . . . . . . . . . 33<br />

Real Time Genomics . . . . . . . . . 50<br />

Roche . . . . . . . . . . . . . . . . . . . . 28<br />

San Diego Supercomputer<br />

Center . . . . . . . . . . . . . . . . . . 38<br />

Schrodinger . . . . . . . . . . . . . . . . 49<br />

Siemens . . . . . . . . . . . . . . . . . . 11<br />

Stanford University . . . . . . .14, 26<br />

SYM-BIO Life Sciences . . . . . . . 28<br />

Univa UD . . . . . . . . . . . . . . . . . 36<br />

University of California<br />

Davis . . . . . . . . . . . . . . . . . . . 12<br />

University of California<br />

Santa Cruz . . . . . . . . . . . . . . . 43<br />

University of Chicago . . . . . . . . . 11<br />

University of Leiden . . . . . . . . . . . 5<br />

University of Manchester . . .12, 12<br />

Advertiser . . . Page #

Appistry . . . 31
www.Appistry.com

Barnett Educational Services: PAREXEL Biopharmaceutical R&D Statistical Sourcebook 2011/2012 . . . 47
www.barnettinternational.com

Bio-IT World Asia . . . 39
Bio-ITWorldAsia.com

Bio-IT World Best Practices Awards . . . 35
Bio-ITWorld.com/bp_awards.aspx

Bio-IT World Conference & Expo . . . 18-19
Bio-ITWorldExpo.com

BioPartnering North America . . . 55
Techvision.com/bpn

Dell Compellent . . . 7
Inetpresent.com/w/Dell/136/reg/

Educational Opportunities . . . 52-53
Bio-ITWorld.com

Life Sciences Innovation Forum . . . 56
www.LSInnovation.com

Molecular Med TRI-CON 2012 . . . 2
TriConference.com

Panasas . . . 3
Panasas.com

RSA Conference . . . 23
RSAConference.com/bioit

SciEngines . . . 20
www.sciengines.com

SGI . . . 13
www.sgi.com/go/bio-it

This index is provided as an additional service. The publisher does not assume any liability for errors or omissions.<br />

Editorial, Advertising, and Business Offices: 250 First Avenue, Suite 300, Needham, MA 02494; (781) 972-5400<br />

Bio•IT World (ISSN 1538-5728) is published bi-monthly by Cambridge Bio Collaborative, 250 First Avenue, Suite 300, Needham, MA 02494. Bio•IT World is free to qualified life science professionals. Periodicals postage paid at Boston, MA, and at additional post offices. The one-year subscription rate is $199 in the U.S., $240 in Canada, and $320 in all other countries (payable in U.S. funds on a U.S. bank only).

Subscriptions: Address inquiries to Bio-IT World, 250 First Avenue, Suite 300, Needham, MA 02494; 888-999-6288; or e-mail kfinnell@healthtech.com.

Reprints: Copyright © 2011 by Bio-IT World. All rights reserved. Reproduction of material printed in Bio•IT World is forbidden without written permission. For reprints and/or copyright permission, please contact Tim McLucas, (781) 972-1342, tmclucas@healthtech.com.

Indispensable Technologies Driving<br />

Discovery, Development, and Clinical Trials<br />

EDITOR-IN-CHIEF
Kevin Davies (781) 972-1341
kevin_davies@bio-itworld.com

MANAGING EDITOR
Allison Proffitt (617) 233-8280
aproffitt@healthtech.com

ART DIRECTOR
Mark Gabrenya (781) 972-1349
mark_gabrenya@bio-itworld.com

VP BUSINESS DEVELOPMENT
Angela Parsons (781) 972-5467
aparsons@healthtech.com

VP SALES, LEAD GENERATION PROGRAMS
Alan El Faye (213) 300-3886
alan_elfaye@bio-itworld.com

ACCOUNT MANAGER
Tim McLucas (781) 972-1342
tmclucas@healthtech.com

CORPORATE MARKETING COMMUNICATIONS DIRECTOR
Lisa Scimemi (781) 972-5446
lscimemi@healthtech.com

WEB ADVERTISING OPERATIONS MANAGER
Franscena Schandelmayer-Davis (781) 972-1351
fsdavis@healthtech.com

ADVERTISING OPERATIONS COORDINATOR
Stephanie Cline (781) 972-5465
scline@healthtech.com

Contributing Editors<br />

Michael Goldman, Karen Hopkin,<br />

Deborah Janssen, John Russell,<br />

Salvatore Salamone, Deborah Borfitz<br />

Ann Neuer, Tracy Smith Schmidt<br />

Advisory Board<br />

Jeffrey Augen, Mark Boguski,<br />

Steve Dickman, Kenneth Getz,<br />

Jim Golden, Andrew Hopkins,<br />

Caroline Kovac, Mark Murcko,<br />

John Reynders, Bernard P. Wess Jr.<br />

Cambridge Healthtech Institute

PRESIDENT<br />

Phillips Kuhl<br />

Contact Information<br />

editor@healthtech.com<br />

250 First Avenue, Suite 300<br />

Needham, MA 02494<br />

Follow us on Twitter, LinkedIn, and Facebook
http://twitter.com/bioitworld
www.linkedin.com/groupRegistration?gid=3141702
www.facebook.com/bioitworld




Up Front News

A Genome Center for Gotham City<br />

Nancy Kelley nurtures the New York Genome Center.<br />

It is fair to say that Nancy Kelley, the executive director of the New York Genome Center (NYGC), did not expect to be leading such an ambitious effort 12 months ago. But convinced of the potential to create a major genomics institute in the center of New York City, Kelley relentlessly lobbied, marshaled and cajoled a consortium of all the major New York institutes and medical centers, along with private philanthropists and corporate partners, persuading them that it was time to seize the day.

Kelley’s name may not be widely known in scientific circles because, as she freely<br />

admits, she’s not a scientist. But this economist, lawyer, and commercial real estate<br />

developer has been working in science for more than 25 years, landing on the

Board of the Jackson Laboratory and helping numerous biotech companies grow their<br />

business through stints at Spaulding & Slye Colliers and Alexandria Real Estate. Two<br />

years ago, she began working with a client building a personalized medicine institute.<br />

Those discussions took her to New York, where Kelley, aided by Columbia University<br />

molecular biologist Tom Maniatis and others, began fleshing out plans for NYGC.<br />

For the past year, Kelley has been working as NYGC’s executive director, helping to<br />

raise some $120 million in advance of the center’s official opening in Manhattan in<br />

early 2012.<br />

Prior to the formal announcement of NYGC on November 3 in Manhattan, Kelley<br />

gave her first in-depth interview to Bio•IT World editor Kevin Davies and talked

about the extraordinary effort to bring a world-class genome center to the Big Apple,<br />

and what it means for the New York scientific and medical establishment.<br />

(Remarks have been edited for brevity. The full interview can be found at:<br />

www.bio-itworld.com/news/11/2011/new-york-genome-center.html)<br />

Bio•IT World: Nancy, how has your<br />

diverse background led you to your<br />

current role as executive director?<br />

Kelley: I’ve worked in the private sector,<br />

non-profit and public sector at various senior<br />

levels throughout my career. I started<br />

as a young lawyer at Hale & Dorr as a<br />

corporate securities lawyer, representing<br />

start-up companies. My standard client<br />

was a doctor or scientist who walked out<br />

of MIT or Harvard and we’d help build<br />

a company around it. I had the good<br />

fortune of seeing a lot of life science and<br />

health care companies develop from infancy. We’d help negotiate IP, raise money,

put the management teams together, financing<br />

etc. If they were successful they’d<br />

grow up to go public and develop their<br />

drug…. It not only gave me a really nice<br />

education about how organizations grow<br />

and develop over time, but it helped me to<br />

understand the science. It was my job as<br />

the lawyer to sit down with the scientists<br />

and write down what they were doing in<br />

English for investors, so people buying<br />

their securities would actually understand what they were doing!

It was also a wonderful chance to<br />

work with some big companies that were<br />

babies back then, like Genetics Institute.

Eventually, I found myself working with<br />

start-ups to raise money and to help run<br />

them, whether for profit or non-profit...<br />

In the last ten years, I became involved<br />

in life science real estate development.<br />

I helped negotiate a major transaction<br />

on behalf of my hometown—Belmont,<br />

Mass—with McLean Hospital… It turned<br />

out that life science real estate is a technologically<br />

very sophisticated product that<br />

has to be marketed and developed in a way<br />

completely different than other real estate.<br />

I thought it was an intriguing challenge.<br />

Tell us about the origins of the New York<br />

Genome Center.<br />

I’d been working in the city for nearly ten<br />

years in life sciences with Spaulding &<br />

Slye Colliers as well as Alexandria Real<br />

Estate. As senior VP of strategic operations for Alexandria, I was responsible for responding to the New York City RFP and creating the development of the East River Science Park, which is 1 million square feet in Manhattan… Manhattan isn’t an easy place to develop anything in. It was a very large project: three towers, nearly $1 billion and 1 million square feet.

KATJA HEINEMANN

The combination of legal, financial,<br />

operational and real estate with the scientific<br />

background all came together to<br />

create an opportunity in New York. I’d<br />

been an independent consultant for two<br />

years after Alexandria. I’d been working<br />

on another large sequencing project for a<br />

client for almost 12 months, which didn’t<br />

materialize. But it opened up the possibility of introducing a large sequencing operation in New York.

Given the fact I’d been working there for<br />

ten years and had many long-standing<br />

relationships with scientists there, I<br />

brought it to New York to see if there was<br />

interest. So we started discussing it and<br />

things developed from there.<br />

Obviously there was… a lot of skepticism<br />

about this, and it’s continued right<br />

through until today. But as I say, the only<br />

thing you can do is try. After consulting<br />

with a few individual scientists, we pulled<br />

together a very large meeting with representatives<br />

from all the institutions [in<br />

August 2010], and asked them to participate<br />

in the creation of a center like this. It


started with Columbia, went to Sloan Kettering<br />

and Rockefeller University… They<br />

enthusiastically endorsed it, to the extent<br />

that within 30 days, we had eight institutions<br />

putting seed money into a feasibility<br />

study and agreeing to move forward…<br />

THE MISSION<br />

How would you describe the vision,<br />

the mission of NYGC?<br />

The vision for NYGC is really to achieve<br />

transformational results for health care<br />

and research. In the end, our success will be measured by how the center affects medical delivery and the way research is being done. For New York in particular, this is

a very exciting development. For a long<br />

time, they’ve had the leading global institutions<br />

in health care, but for whatever<br />

reason, haven’t always come together to<br />

collaborate and leverage that strength.<br />

This enterprise will allow them to do that and take their role on the global stage, as they should…

It will involve a very large high-throughput

sequencing center, offering<br />

services not only to founding members<br />

but to other institutions and companies.<br />

It will also have a very strong research<br />

element, driving it forward with both<br />

genomics and bioinformatics involved.<br />

Who else has played a key role in<br />

organizing NYGC?<br />

The first and probably most important is<br />

Tom Maniatis at Columbia, a renowned<br />

and accomplished scientist, someone I’ve<br />

known from the Jackson Lab. He’s also a<br />

serial entrepreneur, having built some of<br />

the best known companies in the life sciences<br />

field, including Genetics Institute.

He was the first person whom I talked<br />

to about this, and has been instrumental<br />

at every level in guiding this forward. So<br />

we’ve really done this together.<br />

But I’d also point to other key individuals. Tom Kelly (Sloan Kettering) has been involved from the beginning. New leaders at Sloan Kettering and Rockefeller, Craig Thompson and Marc Tessier-Lavigne respectively, were recruited and understood the need for inter-institutional and private sector collaboration. Russ Carson (founder/chair, Welsh Carson and the New York City Investment Fund) immediately saw the benefits of this to the city. He is now the Chairman of NYGC’s Board of Directors. Tony Evnin (senior partner, VenRock) also saw the immediate benefit to New York; he was our first individual philanthropic donor and has also played a critical role. Many, many people throughout the New York community have been instrumental.

New York Genome Center Charter Members

Institutional Founding Members:
Cold Spring Harbor Laboratory
Columbia University
Cornell University/Weill Cornell Medical College
The Jackson Laboratory
Memorial Sloan-Kettering Cancer Center
Mount Sinai School of Medicine
New York-Presbyterian Hospital
New York University/NYU School of Medicine
North Shore-Long Island Jewish Health System
The Rockefeller University
Stony Brook University

Associate Founding Member:
Hospital for Special Surgery

Additional Members/Collaborators:
F. Hoffmann-La Roche
Illumina

Support from:
Empire State Development Corp.
New York City Economic Development Corp.
New York City Investment Fund

Philanthropic Donors:
The Simons Foundation (Founding Partner)
Bloomberg Philanthropies
Russell L. Carson
Anthony B. Evnin
WilmerHale

Why did you choose Illumina as NYGC’s<br />

sequencing partner?<br />

We went through a very extensive technology<br />

selection process, which involved<br />

all the scientists on the scientific advisory<br />

committee, and the technology and bioinformatics<br />

working group committees. We<br />

invited the two leading companies, Illumina and Life Technologies, to have their senior management come in and present how they would approach a partnership with a large sequencing center like this... In the end, there’s just been

enormous progress made by Illumina in<br />

their productivity and turnaround times<br />

this year, and that proved to be one of the<br />

deciding factors—the continuing evolution<br />

of their technology and widespread<br />

adoption.<br />

We will start with approximately 30<br />

sequencers in year one and build up to<br />

a fairly significant number over a 5-year<br />

period. There are many terms to the<br />

Illumina collaboration, and obviously<br />

pricing is one of them. But this will not<br />

be an exclusive technology in any way. In<br />

the innovation center, we’ll be testing new<br />

technologies and making them available<br />

to the scientists in New York.<br />

BUILDING THE COALITION<br />

How have you managed to keep all these<br />

New York institutions happy during<br />

negotiations?<br />

The center was legally incorporated and<br />

raised its initial seed financing in August<br />

2010. The original feasibility study was<br />

completed last December and underwent<br />

extensive review by all the institutions<br />

in January 2011. We started a planning<br />

process—meaning implementation—<br />

which concluded at the end of June. In<br />

the weeks that followed, we finalized the<br />

governance and closing documents for 9<br />

Institutional Founding Members, now 11,<br />

and transitioned to our build-out process.<br />

We now are opening a small development<br />

office, expanding our launch team, and<br />

negotiating the other agreements that<br />

will serve as the foundation for our official<br />

opening in 2012.<br />

Getting to this point has been very,<br />

very challenging! A few things helped a<br />

great deal: first, the commitment from<br />

the leadership of the various institutions<br />

to actually sit down and discuss this,<br />

knowing how important this was going<br />

to be for New York in general. New York<br />

is behind in the genomics area, so these<br />

leaders saw NYGC as a real opportunity to<br />

catch up. Their commitment to the process<br />

was very important and helped overcome<br />

some bumps when we met them.<br />

What were some of the concerns and<br />

hesitations you heard regarding NYGC?<br />

I think there were three substantive<br />

issues. First, would the institutional demand<br />

be sufficient to sustain a center of<br />

this magnitude and what would be the<br />

commitment for those institutions to the<br />

center and the commitment of the center<br />

to them?<br />

Traditionally, New York has lagged<br />

behind other areas in sequencing capacity,<br />

as have all the institutions we’ve been<br />

speaking to—although one or two, like<br />

Mount Sinai, have been building a large<br />

genomics capability in recent years. So<br />

that was a big contentious question that<br />

continued through the end. Obviously<br />

with a start-up, nothing is certain and<br />

everyone has a different idea. We had to<br />

do as much research as we could and go<br />

with our gut. We’ll have to adjust as we go<br />

along, obviously.<br />

A second major issue was, how would<br />

the major institutions participate together?<br />

Would there be equal representation<br />

or proportional participation with different<br />

sizes of institutions? In the end, after<br />

enormous debate, a principle of equal<br />

representation was chosen. The initial<br />

contribution by all the founding members<br />

is the same, even though their use of the<br />

center may vary dramatically. The idea<br />

that won out was that the value of participation<br />

in a scientific community and<br />

endeavor created by NYGC was worth the<br />

small initial investment and that everyone<br />

would share in this scientific enterprise<br />

equally.<br />

The third issue was, how would this<br />

initiative interact with the initiatives<br />

going on within each of the institutions?<br />

Competitive or supportive? How would<br />

that work? This was especially important<br />

in defining the research element of the<br />

center, because there is a plan and process<br />

underway to recruit a world-renowned<br />

scientist to be the scientific director.<br />

There are a lot of questions about how the<br />

research portion would interface with the<br />

other institutions.<br />

At one point, Richard Gibbs (Director,<br />

BCM genome center) said something to<br />

me: ‘You have to be absolutely fearless<br />

to do this.’ He’s right—if one stopped to<br />

think too much about all the things that<br />

could go wrong at any moment, you’d<br />

never get up in the morning. So the only<br />

thing you can do is to work at it the best<br />

you can, manage the issues as they come<br />

up. I don’t think there was one day this<br />

year when I had full confidence we’d actually<br />

make it, but I knew it was important<br />

to try.<br />

FUNDING AND LOGISTICS<br />

What is it costing to get NYGC off the<br />

ground?<br />

The total project cost is about $120<br />

million. It will come from a variety of<br />

sources—the Institutional Founding<br />

Members, Associate Members, private<br />

philanthropists, founding member companies,<br />

technology collaborators, the New<br />

York City Economic Development Corporation<br />

and the New York City Investment<br />

Fund. That’s not all-inclusive, but there<br />

are a number of pieces to the quilt that<br />

had to be knit together to be able to put<br />

this all together.<br />

What is NYGC’s operating structure?<br />

We’ve built up the business model with<br />

seven operational units. We start with<br />

the sequencing service center, which will<br />

serve not only the institutional founding<br />

members and pharma collaborators, but<br />

also hospitals on the clinical side… That<br />

will be coupled with a very robust bioinformatics<br />

presence.<br />

We’re looking to recruit Ph.D.s,<br />

mathematicians, and computational biologists<br />

to interpret the sequencing data<br />

and to assist researchers on a consulting<br />

basis.<br />

Third, the center will have its own<br />

internal research, led by a world-class scientific<br />

director. We’ll also make another<br />

senior investigator appointment in the<br />

first year.<br />

The Innovation Center is a cost center<br />

where we’re investing a lot of capital and<br />

operational resources to allow scientists<br />

from throughout New York and beyond<br />

to use these instruments, to publish<br />

early and establish thought leadership.<br />

Finally, we hope there will be commercial<br />

activities from these efforts to create new<br />

jobs and products to enhance medical<br />

treatments.<br />

Training will take place in conjunction<br />

with Cold Spring Harbor’s wonderful programs,<br />

and possibly with Stony Brook and<br />

CUNY, with whom we’ve talked about<br />

creating some degree programs to help<br />

train people in this area. And finally, we’ll<br />

have a small philanthropic unit.<br />

To start, we’ll have two research<br />

groups with about 15 people each, so<br />

the internal research will have about 30<br />

people.<br />

What about a clinical component?<br />

Is that a revolution that NYGC aims<br />

to be a part of?<br />

Absolutely! There will be a CLIA-certified<br />

portion of the facility and we’ll be interacting<br />

very closely with the hospitals<br />

like NewYork-Presbyterian Hospital and<br />

North Shore-LIJ, two of our institutional<br />

founding members, to create some innovative<br />

programs. Obviously there are a lot<br />

of regulatory and other issues that need to<br />

be overcome, but we do think the clinical<br />

component of this endeavor will be part<br />

of the future of medicine and health care.<br />

When do you expect to recruit the<br />

scientific director, and what will your<br />

role be after that?<br />

An international search effort is underway,<br />

led by a committee comprised of<br />

representatives of all the institutional<br />

founding members and myself, chaired<br />

by Tony Evnin. Obviously, this leadership<br />

group has a lot of established relationships<br />

in the industry, so we’re also seeking<br />

recommendations as to people who<br />

should be considered. I think it will take<br />

[several] months to recruit someone who<br />

has the right scientific, leadership and<br />

administrative attributes.<br />

Obviously there’s an enormous<br />

amount of work to be done to make this<br />

successful. The first step in creating the<br />

strategic plan and raising the money is<br />

only a small component of making this<br />

successful. There’s a huge ongoing financial,<br />

executive and operational role that<br />

has to be put together with a large organization,<br />

and I’d expect to play a key role<br />

in doing that as the Executive Director or<br />

Deputy Director.<br />

You have to be both proud and relieved<br />

to have come this far.<br />

In truth, when I first talked to Tom Maniatis<br />

and Tom Kelly about this, we were<br />

operating with a cell phone and a hotmail<br />

account! 12 months later, to have raised<br />

$120 million and brought this number<br />

of institutions together is really quite<br />

extraordinary. •


Open Data and Patient Modeling in Europe<br />

Open source debates abounded at the third annual Bio-IT World Europe conference.<br />

BY ALLISON PROFFITT<br />

HANNOVER, GERMANY—“The tools<br />

and library situation in bioinformatics is<br />

an open-source zoo,” said Misha Kapushesky,<br />

functional genomics team leader<br />

with the European Bioinformatics<br />

<strong>Institute</strong> (EBI), during his presentation<br />

on the Gene Expression Atlas platform.<br />

If that’s so, attendees got quite a tour of<br />

the menagerie at the third annual Bio-IT<br />

World Europe conference*, with much<br />

emphasis on open-source platforms and<br />

cloud deployments.<br />

The ArrayExpress Archive is an EBI<br />

database of functional genomics experiments<br />

where one can query and download<br />

data. Gene Expression Atlas contains<br />

a subset of curated and re-annotated<br />

archive data, which can be queried for<br />

individual gene expression results under<br />

different biological conditions across<br />

experiments. It’s meant to be a simple interface<br />

for identifying strong differential<br />

expression candidate genes in conditions<br />

of interest.<br />

Kapushesky and colleagues have deployed<br />

the ArrayExpress high-throughput<br />

sequencing pipeline inside the R Cloud at<br />

EBI. The R Cloud was originally a server<br />

farm and is EBI’s “experiment to increase<br />

the use of its infrastructure,” Kapushesky<br />

said. The cloud is free, but use is constrained.<br />

Currently R Cloud is linked to<br />

the European Nucleotide Archive (ENA),<br />

ArrayExpress, and quality control reports.<br />

Daniel White (Max Planck <strong>Institute</strong><br />

of Molecular Cell Biology and Genetics)<br />

presented Fiji, an extension of the<br />

ImageJ open platform for image analysis<br />

but with Java, Java 3D, and plugins organized<br />

in a menu structure. White said<br />

that Fiji now represents a philosophy<br />

and community as well as a platform,<br />

with users sharing code and developing<br />

plugins for image transformation, registration,<br />

segmentation and analysis.<br />

* Bio-IT World Europe 2011, Hannover, Germany, October 11-13, 2011<br />

Urban Liebel (Karlsruhe <strong>Institute</strong><br />

of Technology, KIT) presented the<br />

Harvester bioinformatics search portal<br />

(harvester.kit.edu). The portal scours all<br />

of the images and figures in PubMed and<br />

more than 30 other databases for images<br />

of interest. “I don’t know about you,” Liebel<br />

said, “but when I find a paper, I look<br />

at the figures first. If I can understand the<br />

figures, I read the paper.”<br />

Liebel also presented Sciety (sciety.org),<br />

an application that works like<br />

Digg for PubMed articles. Much of the<br />

post-publishing commentary on papers<br />

isn’t recorded, Liebel said. If a student<br />

brings a Nature paper to his/her supervisor,<br />

the group leader may know that the<br />

paper had been later invalidated, but the<br />

student doesn’t. Sciety lets researchers<br />

comment on published research. There<br />

is an option to comment anonymously,<br />

but comments from users who log in and<br />

identify themselves carry more weight.<br />

Programming the Patient<br />

Hans Lehrach, director of vertebrate genomics<br />

at the Max Planck <strong>Institute</strong> for<br />

Molecular Genetics in Berlin, presented<br />

“Systems Patientomics” in one of two<br />

keynote talks. We’ve been treating patients<br />

not as individuals, but as members<br />

of large homogeneous groups, Lehrach<br />

said. Now that sequencing costs are falling<br />

and speeds are increasing, we must<br />

develop virtual patient systems taking<br />

into account data from mutation databases,<br />

genomics, tumor sequencing, and<br />

more. Lehrach compared the models to<br />

crash models employed by the automobile<br />

industry. We don’t run crash simulations<br />

with real people, he pointed out, and<br />

called for similar models to find optimal<br />

treatments for patients.<br />

A trial is currently underway to test<br />

the models with the ITFoM (IT Future<br />

of Medicine) project. The goal, Lehrach<br />

said, is to define a reference model, then<br />

use many technologies to individualize<br />

those models. Within ten years, he believes<br />

such models will be able to lead to<br />

advances in cancer and metabolic disease.<br />

He acknowledges that the compute<br />

requirements for such models will be<br />

huge—he estimates a petaflop required<br />

for each patient—but the effort currently<br />

has support from IBM, Amazon, Siemens,<br />

Intel, COSBi, and other computing powerhouses<br />

involved in the ITFoM project<br />

(see, “Hans Lehrach’s Predictive Biology<br />

Philosophy,” Bio•IT World, May 2009).<br />

Connected to that project, Corrado<br />

Priami, CEO of Microsoft’s Center for<br />

Computational and Systems Biology<br />

(COSBi) in Trento, Italy, argued<br />

for the need for a new language to describe<br />

how biology works, and it just may<br />

be a programming language. We are trying<br />

to help biologists “program without<br />

knowing they are programming,” said<br />

Priami.<br />

Priami presented an option for modeling<br />

a biological pathway using an interface<br />

based on natural language to explain<br />

a biological system. The model is easier to<br />

write, change, and reuse than traditional<br />

mathematical approaches, though it may<br />

be slower than classical equations for<br />

smaller systems.<br />

Science as a Service<br />

In the cloud forecast, Folker Meyer (Argonne<br />

National Laboratory/University<br />

of Chicago) sees the cloud as part of a<br />

solution, but not the end. Meyer defines<br />

“cloud” as basically grid computing with<br />

virtual machines, though he acknowledges<br />

that “cloud” now includes infrastructure-as-a-service<br />

(IaaS), platform-as-a-service<br />

(PaaS), and software-as-a-service<br />

(SaaS). Meyer’s group applies the cloud as<br />

a solution to huge metagenomics problems<br />

(the Earth Microbiome Project plans<br />

to collect 200,000 samples). MG-RAST<br />

(http://metagenomics.anl.gov), Argonne<br />

National Lab’s open source metagenomics<br />

analysis server, was released for the<br />

cloud in March.<br />

But the cloud cannot be the only answer.<br />

When sequencing massive datasets,<br />

the compute cost is now surpassing the<br />

sequencing costs, Meyer said. Using a<br />

2009 example, Meyer reported that Illumina<br />

HiSeq 2000 data, running BLASTX<br />

on soil samples, could cost $45,000<br />

for the sequencing and $900,000 for<br />

the Amazon EC2 computing costs (not<br />

including data storage).<br />
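
To put the two figures quoted above side by side, here is a minimal Python sketch. Only the $45,000 and $900,000 amounts come from the talk as reported; the function and variable names are illustrative assumptions, not part of MG-RAST or any Argonne tool.<br />

```python
# Figures quoted in the talk (US dollars); all names below are illustrative.
SEQUENCING_COST = 45_000
COMPUTE_COST = 900_000  # Amazon EC2 estimate as reported, excluding storage


def compute_to_sequencing_ratio(compute: float, sequencing: float) -> float:
    """Return how many times larger the compute bill is than the sequencing bill."""
    if sequencing <= 0:
        raise ValueError("sequencing cost must be positive")
    return compute / sequencing


ratio = compute_to_sequencing_ratio(COMPUTE_COST, SEQUENCING_COST)
# Share of the combined (sequencing + compute) bill that goes to compute.
compute_share = COMPUTE_COST / (COMPUTE_COST + SEQUENCING_COST)

print(f"compute is {ratio:.0f}x sequencing, {compute_share:.0%} of the combined bill")
```

On these numbers the compute bill is twenty times the sequencing bill, which is the imbalance behind the call to share results rather than recompute them.<br />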

Meyer believes researchers should<br />

share results to minimize re-computing<br />

costs and raw data to minimize data access<br />

problems. The Open Source Data<br />

Framework (OSDF), which Meyer’s group<br />

announced in late September, could help<br />

fix the data analysis problem, he said.<br />

The Argonne Workflow Engine (AWE), a<br />

RESTful interface with Google’s V8 JavaScript<br />

engine, could also help. It has<br />

been running MG-RAST for the last 18<br />

months, and scaling up by 600x. AWE<br />

distributes work across a number of resources<br />

including HPC clusters, clouds,<br />

and systems with accelerators (GPUs or<br />

FPGAs).<br />

But what does a cloud solution look<br />

like for labs without HPC clusters or massive<br />

budgets? The concern voiced by University<br />

of Manchester’s Carole Goble<br />

(see, “Democratizing Informatics for<br />

the ‘Long Tail’ Scientist,” Bio•IT World,<br />

March 2011) is not big pharma’s use of<br />

the cloud, but how small academic labs<br />

can use the resource. Researchers with<br />

limited funding options or access to large<br />

computer clusters, already dependent on<br />

open-source software solutions, have the<br />

ability to sequence, but need help with<br />

annotation.<br />

And so Goble decided to do the “most<br />

naïve thing you could imagine”—she<br />

and her team moved her open-source<br />

workflow platform, Taverna, to the cloud.<br />

Annotation is perfect for the cloud, Goble<br />

said. It’s highly repetitive and deeply<br />

parallel. It works well on a pay-as-you-go<br />

model. Her goal was to provide annotation<br />

as a cloud service: science-as-a-service.<br />

The experiment took four days and<br />

cost $600 to set up a Web interface and<br />

test Taverna in the AWS environment.<br />

The cost per run? $5 and less than two<br />

hours. She and her team used datasets<br />

from researchers who are comparing<br />

why some African breeds of cattle seem<br />

much hardier than others. The researchers<br />

were able to use Taverna in the cloud<br />

to annotate the Boran and Cape Buffalo<br />

genomes in a couple of hours.<br />
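
The pay-as-you-go arithmetic can be sketched in a few lines of Python. This assumes, for illustration only, that the $600 is a one-time setup cost and that every run costs a flat $5; the article does not say how costs scale, and the function names are hypothetical.<br />

```python
SETUP_COST = 600   # quoted cost of building the web interface and testing on AWS
COST_PER_RUN = 5   # quoted cost of a single annotation run


def total_cost(runs: int) -> int:
    """Total spend after `runs` runs, assuming a one-time setup and a flat per-run price."""
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return SETUP_COST + COST_PER_RUN * runs


def average_cost_per_run(runs: int) -> float:
    """Setup cost amortized over all runs, plus the per-run price."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    return total_cost(runs) / runs


for n in (1, 10, 100):
    print(n, total_cost(n), round(average_cost_per_run(n), 2))
```

Under these assumptions the setup cost dominates for a handful of runs and fades as the number of runs grows, which is why a shared service suits many small labs better than one-off deployments.<br />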

Goble said the experiment did expose<br />

some challenges. First, Eucalyptus clouds<br />

(an open source platform for building<br />

private clouds) are not the same as AWS.<br />

Users cannot save development costs<br />

by testing in Eucalyptus clouds before<br />

moving to AWS—it just won’t work. But<br />

the bigger problem Goble uncovered was<br />

that the reference dataset the researchers<br />

wanted to use was located in the US Amazon<br />

space, while their work was being<br />

done in the EU Amazon space.<br />

“This is a social problem!” Goble<br />

said. Public databases must exist in the<br />

cloud, but “just the West coast zone is<br />

not enough.” The setup raises many data<br />

ownership issues. Goble’s next crusade<br />

is to convince major cloud providers<br />

including Amazon to host public data<br />

sets for little money so that researchers<br />

worldwide can easily access and work<br />

with the data. •<br />

Briefs<br />

2012 BEST PRACTICES CALL FOR<br />

ENTRIES ANNOUNCED<br />

The 2012 Bio•IT World Best Practices<br />

competition has released its call for<br />

entries. Since 2003, Bio•IT World’s<br />

Best Practices competition has been<br />

recognizing outstanding examples<br />

of technology and strategic innovation<br />

initiatives across the drug discovery<br />

enterprise. The deadline for<br />

entry in the 2012 program is January<br />

13, 2012, and the early bird deadline<br />

is December 16, 2011. Entries will be<br />

accepted in six categories: Clinical<br />

& Health-IT; IT infrastructure/HPC;<br />

Informatics; Knowledge Management;<br />

Research & Drug Discovery;<br />

and Personalized & Translational<br />

Medicine.<br />

ILLUMINA INVESTS IN<br />

GENOLOGICS<br />

Illumina has led an $8 million<br />

funding round in GenoLogics, the<br />

maker of LIMS (Laboratory Information<br />

Management System) software<br />

designed for next-generation<br />

genomics labs. The financing will<br />

be used to accelerate GenoLogics’<br />

product development to support<br />

future clinical applications and<br />

new desktop sequencing systems,<br />

and to expand sales and marketing<br />

functions.<br />

BGI AND UC DAVIS LAUNCH<br />

JOINT GENOME CENTER<br />

The University of California, Davis,<br />

and BGI have signed an agreement<br />

to establish a state-of-the-art BGI<br />

sequencing facility on the UC Davis<br />

Health System campus in Sacramento,<br />

and initiate planning for<br />

a permanent BGI@UC Davis Joint<br />

Genome Center. The facility will be<br />

used to support research initiatives<br />

in human and animal health, food<br />

safety and security, biology, and<br />

the environment. When complete,<br />

the permanent center will occupy<br />

about 10,000 square feet on the<br />

health system campus in Sacramento,<br />

and will increase UC Davis’<br />

sequencing capability approximately<br />

tenfold.


SGI® Accelerates Drug Development Pipeline<br />

at The <strong>Institute</strong> of Cancer Research<br />

Overview<br />

Founded in 1909, The <strong>Institute</strong> of Cancer<br />

Research (ICR) is now at the forefront of<br />

international cancer research with approximately<br />

1,100 internationally renowned<br />

staff. Requiring high-performance big<br />

memory supercomputers, superclusters,<br />

multi-tiered storage design, and applications<br />

expertise to process its growing<br />

research requirements, the ICR has<br />

chosen SGI® to provide comprehensive<br />

server and storage solutions.<br />

Business Challenge<br />

The ICR focuses on molecular pathology,<br />

therapeutic development, and genetic<br />

epidemiology, including studying basic<br />

molecular and cell biology to identify<br />

new strategies for cancer therapeutics.<br />

These studies are used to identify new<br />

targets for drug development. Increasing<br />

availability and advances in technology<br />

have created the challenge of processing,<br />

analyzing, and storing massive volumes<br />

of cancer-relevant data. The SGI solution<br />

addresses this comprehensive challenge<br />

in a manner that allows researchers to<br />

focus their full efforts on science, rather<br />

than on the supporting IT infrastructure.<br />

Portfolio of SGI Solutions<br />

Drives Productivity<br />

The ICR currently utilizes SGI UV shared<br />

memory supercomputers and several<br />

SGI ICE clusters on Intel® Xeon® processors,<br />

and a fully integrated, multi-tiered<br />

storage design.<br />

The SGI UV uniquely enables extremely<br />

large and heterogeneous data to be<br />

integrated swiftly, allowing researchers<br />

to correlate data from patient to cancer<br />

cells on an unprecedented scale. The<br />

genome assembly algorithms that build<br />

the final sequences are all memory-limited;<br />

thus, large shared memory is<br />

critical in accelerating the workflows in<br />

this area. Ultimately, this research is<br />

leading to models that can be used in<br />

drug development.<br />

The <strong>Institute</strong> relies on SGI ICE for<br />

high throughput image analysis (e.g.,<br />

image processing, recognition, and feature<br />

extractions) and implementation<br />

of machine-learning algorithms on very<br />

high dimensional datasets, as well as<br />

for developing database structures for<br />

data integration.<br />

BIO-IT WORLD ADVERTISING SPONSOR<br />

“The rapidly expanding data requirements within systems<br />

biology has made large, shared memory essential. Images and<br />

sequence data results are increasingly converging, indicating a<br />

growing need for cross-industry collaborations. The SGI solution<br />

facilitates this by drawing on large memory capacity, enabling us<br />

to manage data at a higher speed and with greater focus.”<br />

— Professor Chris Marshall, Director of Research, ICR<br />

As there are several systems stationed<br />

throughout the U.K., the ICR also needed<br />

a comprehensive, mirrored storage solution.<br />

The SGI CXFS shared file system<br />

and DMF tier virtualization products are<br />

used at headquarters to ensure that storage<br />

capacity utilization is optimized and all<br />

data is easily accessible. Dedicated Parallel<br />

Data Movers work at both ICR sites<br />

to and from tier 2 (SGI COPAN) and tier<br />

3 (LTO 5 tape) storage, depending upon<br />

the criticality and timeliness of the data<br />

being processed. Automatic redundancy<br />

is built in, providing both reliability and<br />

security. SGI DMF management system<br />

also enables the ICR to have the option<br />

of scaling its existing infrastructure in<br />

the future.<br />

Business Results<br />

The ICR’s server and data requirements<br />

highlight the growing demand for scalability<br />

and power within the scientific<br />

research field. Aside from the need to<br />

manage and analyze hundreds of terabytes<br />

of data, the ICR needs to be able<br />

to perform rapid calculations across a<br />

broad range of research in a way that<br />

is cost effective and requires as little<br />

human input and intervention as possible.<br />

The entire SGI solution enables<br />

the ICR to systematically process and<br />

correlate its growing data requirements,<br />

providing a system that is essential to<br />

the future of integrative biology and the<br />

subsequent development of human<br />

disease therapies.<br />

www.sgi.com/go/bio-it<br />

New Technologies for Patient-Physician<br />

Interaction at Medicine 2.0<br />

Social media options<br />

growing for health.<br />

BY MING GUO<br />

PALO ALTO—What if you had access<br />

to the advice of a real doctor, anytime,<br />

anywhere? Better yet, what if the advice<br />

was tailored to your personal situation?<br />

A web-based medical consultation company,<br />

HealthTap, is leading the way in<br />

bridging the communication gap between<br />

patients and physicians.<br />

At the recent Medicine 2.0 conference*<br />

held at Stanford University, Ron<br />

Gutman, founder and CEO of HealthTap,<br />

announced the enrollment of 5,000 physicians<br />

in his online medical consultation<br />

platform. Coming from a family of physicians,<br />

Gutman strongly advocates “interactive<br />

health”—helping patients reach<br />

physicians, which in turn helps physicians<br />

“do good in the world.”<br />

“Medicine is more than 50% art,” he<br />

said. “Better information is not better<br />

health. The idea is not to finish everything<br />

on the Internet, but to find information,<br />

reduce anxiety, and interact with more<br />

people in the process of health care.”<br />

*Medicine 2.0, Stanford University, September 16-18, 2011<br />

While the company strives for the most user-friendly<br />

interface, the key strength of the<br />

platform lies in the strong informatics<br />

technology operating behind the scenes.<br />

The company has developed robust ontologies<br />

to organize medical knowledge<br />

in a way that is both flexible and easily<br />

extended to different disease verticals.<br />

“This ontology at the back-end helps<br />

to direct the right question to the right<br />

doctor, learn about the patient, and make<br />

answers tailored to the patient,” said<br />

Gutman.<br />

Leveraging the network effect of social<br />

media, the platform allows physicians<br />

around the world to share their most<br />

up-to-date information. Those whose<br />

information is most agreed upon receive<br />

higher scores, which helps establish a better<br />

online reputation.<br />

Although the current platform does<br />

not allow 1-on-1 private consultations,<br />

HealthTap is in the process of creating<br />

a more secure HIPAA-compliant<br />

environment to allow physician-patient<br />

interactions.<br />

Around the Bay Area, companies<br />

integrating health information with<br />

social networks seem to be booming.<br />

Healthism.com is designed to “push the<br />

envelope even further by developing<br />

the world’s first effective social network<br />

custom-built to inspire lifestyle changes<br />

by transforming advice into action,” said<br />

CEO Damon Ramsey.<br />

Another online platform for patient-to-patient<br />

information sharing<br />

in the Crohn’s and Colitis community<br />

is Crohnology.com. Its founder, Sean<br />

Ahrens, talked about “demoing Crohnology.”<br />

After the conference, his blog post<br />

about it was viewed by more than 15,000<br />

people overnight. “There is clearly a pent-up<br />

need by patients around the globe for a<br />

site like Crohnology,” Ahrens said.<br />

In the “Medicine 2.0” era, information-empowered<br />

patients will take charge<br />

of their own health care within the ever<br />

more connected physician-patient network.<br />

While issues like reimbursement<br />

may take time to be sorted out, Medicine<br />

2.0 is nevertheless giving more attention<br />

to “care” in the health care equation, and<br />

putting the individual back into personalized<br />

health care. •<br />

And the Winner Is...<br />

The NGS Data Interpretation survey in<br />

the last issue of Bio•IT World received an<br />

overwhelming response.<br />

The winner of the Kindle is:<br />

Alan H. Christensen<br />

Associate Professor<br />

George Mason University<br />

Thank you for your time, and we invite<br />

you to participate in future surveys.<br />

Kevin Davies<br />

Editor-In-Chief


<strong>Up</strong> <strong>Front</strong> The Bush Doctrine<br />

Another<br />

Cloudy Day?<br />

ERNIE BUSH<br />

Cloud computing certainly deserves the prize for most<br />

talked about (most hyped?) computing development<br />

of the new decade. Even my 84-year-old mother recently<br />

made a joke about moving her work into the<br />

cloud in the foreseeable future, although I am sure<br />

she is clueless about any technical meaning behind<br />

the phrase. Unfortunately, as with all newly minted concepts<br />

and jargon, the exact interpretation or instantiation of “cloud<br />

computing” is still highly dependent on who is expressing the<br />

opinions. Still, the core idea and vision for cloud computing has<br />

definitely gelled to the point where all businesses can have intelligent<br />

discussions around if, and how, this new tool set could<br />

add value to their enterprise.<br />

Cloud Computing in a Regulated Environment<br />

Perhaps the key hallmark of pre-clinical safety assessment that<br />

differentiates it from all other forms of non-human research<br />

activities is that much of this work is conducted under the<br />

strict oversight mandated by Federal Good Laboratory Practice<br />

(GLP) regulations. These Code of Federal Regulations directives<br />

provide clear expectations for security, traceability, and validation/qualification<br />

of software products and therefore the natural<br />

question becomes: Is there a place for cloud computing in a<br />

GLP environment?<br />

Clearly others are much more qualified to discuss and pontificate<br />

over the technical details and merits of cloud computing<br />

in a GLP environment, so this column will focus on the<br />

perceived advantages/disadvantages such a tool set could offer<br />

from the perspective of someone who had the responsibility to make<br />

the business of GLP safety assessment as high quality and yet<br />

as cost effective as possible. There was a time in my career<br />

when pharmaceutical pre-clinical safety assessment had an air<br />

of “spare no expense”. While we certainly did not have a blank<br />

check, we did have a lot of resources and funding to do the job<br />

right. However, with all the cost-cutting pressures on pharma<br />

these days, this mandate for “quality at any cost” has certainly<br />

been muted. This pressure for cost containment has progressed<br />

to the point where many pharmas—even very large pharmas—<br />

have completely discontinued all GLP operations in-house and<br />

now conduct these studies at Contract Research Organizations<br />

(an unimaginable development when I entered the business<br />

nearly 30 years ago). Overall I do not see this migration of GLP<br />

into the out-sourcing and off-shoring arena as a significant<br />

loss to the integrity of early safety assessment, but I do wonder<br />

about what this commoditization of safety assessment means in<br />

terms of retaining in-house expertise.<br />

Therefore, given both the need to manage costs and the<br />

need to maintain fundamental or firsthand experience with a<br />

required element of new drug development, one could ask: what<br />

does cloud computing bring to the table? And the overwhelming<br />

answer is of course saving the incredible costs that GLP<br />

safety assessment organizations currently invest in purchasing,<br />

qualifying and maintaining extensive IT infrastructures to support<br />

their regulated studies. In my career I have often marveled<br />

at how much time and money we spend on these IT issues (actually<br />

just the time spent arguing about these IT issues) and the<br />

idea that I could simply contract this out to some cloud-based<br />

(cloudy?) service provider sounds nearly irresistible to me. On<br />

the other hand, when asked what would be my major concern<br />

about moving to such a cloud-based system it would have to be<br />

security. Again, the traditional view of preclinical safety data<br />

was that they were an essential element of the family jewels<br />

and as such needed to have a level of protection befitting that<br />

status. Therefore the question<br />

becomes: can cloud-based solutions<br />

be as secure as, or even more<br />

secure than, in-house data<br />

systems? I have heard both<br />

pro and con arguments on that<br />

position, but I must say the<br />

rather constant drum beat of<br />

security breaches at financial<br />

institutions gives me reason to<br />

worry.<br />

So it would appear there<br />

are strong and compelling business reasons to consider or avoid<br />

cloud-based services in a GLP environment. But what would<br />

keep me up at night would not be the question of whether we<br />

can achieve a certain level of cost savings or maintain appropriate<br />

levels of security and integrity. Rather it would be whether<br />

we had yet again fallen victim to the hyped-up rhetoric surrounding<br />

the introduction of new technology. In my experience<br />

only about 1 in 10 new technology introductions actually end<br />

up having a truly positive or game-changing impact over the<br />

long haul. Most new technologies (say 5 or 6 of 10) end up offering<br />

some advantages and some disadvantages and therefore<br />

sum up to a low or zero net impact. And some, on the order of<br />

2 of 10, actually end up costing us more than we recovered in<br />

benefit. Will “cloudy” solutions be that 1 in 10 big leap forward,<br />

or are they just another resource drain whose benefits never<br />

live up to their sunny promise?<br />

Ernie Bush is VP and Scientific Director of <strong>Cambridge</strong> <strong>Healthtech</strong><br />

Associates. Email: ebush@chacorporate.com<br />

www.bio-itworld.com NOVEMBER | DECEMBER 2011 BIO•IT WORLD [ 15 ]<br />


The Skeptical Outsider<br />

Science in Thrall<br />

to the FDA<br />

BILL FREZZA<br />

Risk aversion is the inherent enemy of progress. In a<br />

free society we can each seek our own balance, accepting<br />

the consequences. But when entrenched interests<br />

are allowed to thwart attempts by innovators<br />

and entrepreneurs to challenge the status quo, we<br />

all pay the price. As America slides into malaise and<br />

decline, nowhere is this more evident than in our passive acceptance<br />

of the absolute power of the U.S. Food and Drug Administration—even<br />

in the face of certain death. It need not be so.<br />

Forty years after President Richard Nixon declared War on<br />

Cancer—one of the most expensive and protracted wars in U.S.<br />

history—it’s hard not to be discouraged. If you should be unfortunate<br />

enough to be diagnosed with ovarian cancer, you are<br />

going to die. Same for pancreatic cancer. Or lung cancer. Not to<br />

mention a host of other metastatic sarcomas that medical science<br />

has not yet conquered.<br />

Who hasn’t lost a friend or family member to the Emperor<br />

of all Maladies? Who hasn’t watched victims struggle with false<br />

hopes raised by palliative remedies that a quick Google search<br />

will tell you can at best delay the inevitable? And who hasn’t<br />

imagined that if fate were to put them in a similar circumstance<br />

they would accept the risk of an experimental alternative on the<br />

chance of a cure, however slim?<br />

Yet standing between overburdened innovators and willing<br />

patients is an unaccountable agency determined to enforce<br />

a standard of safety completely disproportionate to the dire<br />

circumstances of the terminally ill. Exactly where in the Constitution<br />

did the people confer such power over life and death on<br />

unelected bureaucrats?<br />

Under the Pure Food and Drug Act of 1906,<br />

the U.S. Department of Agriculture was granted the power<br />

to seize adulterated or misbranded food as well as mislabeled<br />

drugs when shipped across state lines. Note the constitutional<br />

foot in the door. In 1911, attempts to expand the USDA’s authority<br />

to regulate drug efficacy were struck down by the Supreme<br />

Court, slowing but not stopping the steady accretion of power.<br />

By the time the New Deal finished eviscerating most constitutional<br />

limitations, the FDA was born.<br />

A report recently released by the Milken <strong>Institute</strong>—“The<br />

Global Biomedical Industry: Preserving US Leadership”—documents<br />

how the wheels of progress have been slowly grinding<br />

to a halt as the FDA raises the bar for drug approval. The length<br />

of time required to complete clinical trials over the past decade<br />

is up 70%. The median number of procedures required per trial<br />

is up 50%, as is the total work burden per protocol. Meanwhile,<br />

volunteer enrollment and retention have been driven down by<br />

21% and 30% respectively. And, of course, new drug approvals<br />

are down 50%. Keep this up and it won’t be long before clinical<br />

trials follow U.S. manufacturing to China, with other elements<br />

of the pharmaceutical industry trailing closely behind.<br />

Even as trials drag on with preliminary information kept<br />

blinded, new laboratory developments that could improve<br />

results are banned due to an insistence on maintaining rigid<br />

conformity to outdated protocols. In what other industry does<br />

this happen? The calls for “adaptive trials” that might be more<br />

suitable in an age of personalized medicine remain unheeded.<br />

And the shameful spectacle of denying Americans access to<br />

drugs that are approved and available in Europe and elsewhere<br />

defies justification.<br />

Is there an alternative?<br />

While wholesale FDA reform is needed, doing it right would<br />

entail a monumental effort that could take years. But there are<br />

modest steps we can take in the meantime. How about carving<br />

out one or more FDA-free Enterprise Zones where doctors,<br />

scientists, and volunteer patients can make their own decisions<br />

unfettered by the heavy hand of regulators? Imagine an experimental<br />

terminal-illness wing of the Cleveland Clinic where informed<br />

consent was the only law. How hard would it be to draft<br />

enabling legislation?<br />

Defenders of the FDA’s prerogatives would fight such proposals<br />

tooth and nail. But what kind of nanny state arguments<br />

can be made against conducting such a policy experiment when<br />

anyone who objects doesn’t have to be treated? Breakthroughs<br />

that emerge would still have to pass through the FDA gantlet<br />

before they would be generally available. The difference is that<br />

researchers could continue making improvements while treating<br />

volunteers during this long process.<br />

If we fail to arrest the regulatory assault on medical progress<br />

in the U.S., sooner or later out-of-the-box solutions like the one<br />

described above will pop up—elsewhere. And American patients<br />

will follow. The price of a plane ticket is trivial compared<br />

to the cost and consequences of cancer. Perhaps if enough patients<br />

start voting with their feet, pressure for change will build<br />

from those without the means to take care decisions into their<br />

own hands.<br />

But it should never come to that. There is no good reason<br />

we shouldn’t have access to the most advanced treatments at<br />

home. First, however, we need to exercise the political will to<br />

demand our right to choose. Are we up to the challenge?<br />

Bill Frezza is a fellow at the Competitive Enterprise <strong>Institute</strong><br />

and a Boston-based venture capitalist. He can be reached at<br />

waf@acm.com.


Insights | Outlook<br />

NGS by the<br />

Numbers<br />

LAURIE SULLIVAN<br />

Insight Pharma Reports conducted a user market survey<br />

as part of its latest report focused on next-generation<br />

sequencing (NGS), which describes innovations in this<br />

field and analyzes their impact on what can only be described<br />

as an extremely dynamic technology and market<br />

sector. Partial survey results are shared here.<br />

Respondents were asked to rate sequencer attributes on a<br />

scale of 1–5 (with 5 being most important; see Table). Average<br />

response ratings are proportional to the relative importance of<br />

an attribute. Results show a great deal of bunching<br />

in the 3–4-point range, which indicates that those attributes<br />

are considered well above average in importance.<br />

However, a few points stand out. “Instrument<br />

size” rates considerably lower in importance than any<br />

other factor within this survey population, which is<br />

weighted strongly toward modest-sized, academic<br />

core labs. Also noteworthy are raw data accuracy and<br />

consensus accuracy, both of which have the highest<br />

importance ratings.<br />

This observation is not surprising given the predominance<br />

of genetic variation studies among user<br />

applications. It is interesting in this respect that, in<br />

our PacBio interview (excerpted in the September/<br />

October issue of Bio•IT World), Dr. Steven Turner<br />

claims that long-read tails in repetitive SMRT sequencing<br />

of single-molecule fragments can compensate<br />

for relatively low raw read accuracy and generate<br />

very high consensus accuracy.<br />

Selling Points<br />

We also asked whether a respondent’s organization had purchased<br />

a low-cost NGS system or planned to do so by the end of<br />

2011. In fact, more than half (56%) answered positively.<br />

In terms of system selection among the three such systems<br />

available, Ion Torrent’s PGM came in first (59.2%), the new Illumina<br />

MiSeq second at 26.5%, and the 454 GS Junior at 14.3%.<br />

Next we asked whether organizations are currently doing<br />

single-molecule sequencing using one of the two commercially<br />

available systems. Only one-fifth of those surveyed responded<br />

to this question, and nearly three-quarters of those use Pacific<br />

Biosciences’ RS system compared to one-quarter using Helicos’.<br />

When asked whether their organization planned to purchase<br />

a Pacific Biosciences’ sequencer in the coming year,<br />

a surprisingly high 20% of the 71 respondents answered<br />

affirmatively.<br />

An additional query along this same line asked about degree<br />

of interest in performing single-molecule sequencing. Given<br />

five degrees of interest, 85% of the 68 responses fell in the top<br />

three categories, with roughly equivalent numbers for ‘very<br />

much,’ ‘quite a lot,’ and ‘moderately.’<br />

The next two queries dealt with nanopore-based single-molecule<br />

sequencing. The first asked respondents’ opinion on<br />

the likelihood that a nanopore system would hit the market<br />

by the end of 2012, given four levels of response covering ‘very<br />

likely,’ ‘somewhat likely,’ ‘possible,’ and ‘unlikely.’ The largest<br />

number of responses fell in the ‘possible’ category, although<br />

about one-quarter fell in each of the two more likely categories.<br />

Only 13% thought it unlikely. The result reflects somewhat<br />

more optimism than conclusions drawn by individuals whom<br />

we interviewed, who tended to believe that once feasibility for<br />

sequencing was made public, several more years would need to<br />

elapse before commercial introduction.<br />

Sequencer Attribute Rankings<br />

Chart shows the number of respondents that gave each category a 1-5 ranking, along with the<br />

average for each category and the total number of responses.<br />

Category 1 2 3 4 5 Avg. Count<br />

Instrument size 19 26 17 6 5 2.34 73<br />

Instrument cost 4 7 21 21 22 3.67 75<br />

Raw data accuracy 4 2 4 25 40 4.27 75<br />

Consensus accuracy 2 3 3 25 39 4.33 72<br />

Read length 2 4 11 35 24 3.99 76<br />

Output per time 3 10 19 27 16 3.57 75<br />

Overall run time 4 6 27 28 9 3.43 74<br />

Prep time 3 6 29 24 12 3.49 74<br />

Sequencing time 3 11 20 32 9 3.44 75<br />

Informatics time 0 10 13 32 19 3.81 74<br />

Source: Insight Pharma Reports<br />
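The "Avg." column in the table above can be reproduced from the response counts as a count-weighted mean of the 1-5 ratings. The following is a minimal sketch of that arithmetic, not the method Insight Pharma Reports necessarily used; the function name `weighted_average` and the dictionary layout are illustrative assumptions, while the counts and published averages are copied from the table.

```python
# Response counts per rating (1..5) and the published average, copied from the table.
# The data layout and function name are illustrative assumptions.
SURVEY = {
    "Instrument size":    ([19, 26, 17, 6, 5], 2.34),
    "Instrument cost":    ([4, 7, 21, 21, 22], 3.67),
    "Raw data accuracy":  ([4, 2, 4, 25, 40], 4.27),
    "Consensus accuracy": ([2, 3, 3, 25, 39], 4.33),
    "Read length":        ([2, 4, 11, 35, 24], 3.99),
    "Output per time":    ([3, 10, 19, 27, 16], 3.57),
    "Overall run time":   ([4, 6, 27, 28, 9], 3.43),
    "Prep time":          ([3, 6, 29, 24, 12], 3.49),
    "Sequencing time":    ([3, 11, 20, 32, 9], 3.44),
    "Informatics time":   ([0, 10, 13, 32, 19], 3.81),
}


def weighted_average(counts):
    """Return (mean rating, number of responses) for counts of ratings 1..5."""
    total = sum(counts)
    weighted_sum = sum(rating * n for rating, n in enumerate(counts, start=1))
    return weighted_sum / total, total


if __name__ == "__main__":
    # Print each attribute with its recomputed and published average.
    for attribute, (counts, published) in SURVEY.items():
        mean, total = weighted_average(counts)
        print(f"{attribute:20s} n={total:3d} computed={mean:.2f} published={published:.2f}")
```

Sorting the attributes by the computed mean gives the ordering discussed in the text, with the two accuracy attributes at the top and instrument size at the bottom.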

Respondents were then asked to rate attributes of a putative<br />

nanopore system on an attractiveness scale of 1–5, with 5<br />

reflecting most important. Again, the average score is proportional<br />

to attractiveness. Read length and accuracy both led with<br />

scores exceeding 4, followed closely by speed and consumables<br />

cost. Instrument cost and size fell last in the attractiveness<br />

scale, both with scores less than 3.4.<br />

For complete survey results, as well as transcripts of interviews conducted<br />

with individuals highly knowledgeable in diverse aspects of NGS, please refer to the<br />

full Insight Pharma Report: Next-Generation Sequencing Generates Momentum:<br />

Markets Respond to Technology and Innovation Advances. July 2011.<br />

www.InsightPharmaReports.com<br />



Concurrent Tracks:<br />

1 IT Infrastructure – Hardware<br />

2 IT Infrastructure – Software<br />

3 Cloud Computing<br />

4 Bioinformatics<br />

5 Next-Generation Sequencing<br />

Informatics<br />

6 Systems and Multiscale Biology<br />

7 eClinical Solutions<br />

8 eHealth and HIT Solutions for<br />

Personalized Medicine<br />

9 Drug Discovery Informatics<br />

Event Features:<br />

Access All 12 Tracks for One Price<br />

Network with 2,000+ Global Attendees<br />

Hear 125+ Technology and Scientific<br />

Presentations<br />

Connect with Attendees Using <strong>CHI</strong>’s Intro-Net<br />

Choose from 14 Pre-Conference Workshops<br />

Organized by:<br />

<strong>Cambridge</strong> <strong>Healthtech</strong> <strong>Institute</strong><br />

<strong>Cambridge</strong> <strong>Healthtech</strong> Institute’s Eleventh Annual<br />

10 Molecular Diagnostics Informatics NEW!<br />

11 Open Source Solutions NEW!<br />

12 Cancer Informatics NEW!<br />

CONFERENCE & EXPO ’12<br />

Enabling Technology. Leveraging Data. Transforming Medicine.<br />

Keynote Presentations By:<br />

Eric D. Perakslis, Ph.D.,<br />

Chief Information Officer and Chief<br />

Scientist of Informatics, U.S. Food<br />

and Drug Administration<br />

Martin Leach, Ph.D.,<br />

Chief Information Officer,<br />

Broad <strong>Institute</strong> of MIT<br />

and Harvard<br />

Jill P. Mesirov, Ph.D.,<br />

Associate Director, Chief Informatics<br />

Officer; Director, Computational<br />

Biology and Bioinformatics, Broad<br />

<strong>Institute</strong> of MIT and Harvard<br />

See the Winners of the following<br />

2012 Awards: Benjamin Franklin,<br />

Best of Show, and Best Practices<br />

View Novel Technologies and Solutions<br />

in the Expansive Exhibit Hall<br />

And Much More!<br />

Bio-ITWorldExpo.com<br />

Platinum Sponsors:<br />

Official Publication:<br />

register by<br />

December 9<br />

and save<br />

up to $600!


2012 Bio-IT World Conference and Expo speakers (partial list as of October 2011)<br />

Joe Acciavatti, Director, Operations and Communications,<br />

MONOC<br />

Michael J. Ackerman, Ph.D., Office for High Performance<br />

Computing and Communications, National Library<br />

of Medicine<br />

Peter Sejer Andersen, Director, Antibody Discovery,<br />

Symphogen<br />

Bjorn Andersson, Director of Solutions Marketing, BlueArc<br />

Mark Apgar, Senior Manager, Clinical Informatics, Allergan<br />

Sandy Aronson, Executive Director, IT, Partners HealthCare<br />

Center for Personalized Genetic Medicine<br />

Alex Bangs, CIO, Crescendo Bioscience<br />

William K. Barnett, Ph.D., Director, Science Community Tools,<br />

Research Technologies, Indiana University<br />

Toby Bloom, Ph.D., Director, Informatics, Genome<br />

Sequencing, Broad <strong>Institute</strong><br />

Jeffrey Brown, Ph.D., Assistant Professor, Population<br />

Medicine, Harvard Pilgrim Health Care <strong>Institute</strong>/ Harvard<br />

Medical School<br />

Zhaohui (John) Cai, Director, AstraZeneca<br />

Michael Cantor, M.D., Senior Director, Biomedical<br />

Informatics Services, Pfizer, Assistant Professor of Medicine,<br />

NYU Medical Center<br />

Leonard D’Avolio, Ph.D., Associate Center Director,<br />

Biomedical Informatics, MAVERIC, Department of<br />

Veterans Affairs<br />

Chinh Dang, Senior Director, Technology, Allen <strong>Institute</strong> for<br />

Brain Science<br />

Ramin Daron, IT Director, Information Technology, Johnson<br />

& Johnson<br />

Bernd Doetzkies, Director, Informatics, Daiichi Sankyo<br />

Pharma Development<br />

David J. Dooling, Ph.D., Assistant Director, The Genome<br />

Center, Washington University, St. Louis<br />

Chuanbin Du, Ph.D., Post-Doctoral Fellow, Bioinformatics and<br />

Genomics, University of North Carolina-Charlotte<br />

Yaniv Erlich, Ph.D., Principal Investigator, Whitehead Fellow,<br />

Whitehead <strong>Institute</strong> for Biomedical Research<br />

James Ewen, Business Analyst, Research Informatics &<br />

Automation, Bristol-Myers Squibb Company<br />

Christopher Farah, Bioinformatics Systems and Software<br />

Specialist, Maine <strong>Institute</strong> for Human Genetics and Health,<br />

Eastern Maine Healthcare Systems<br />

Jacob Farmer, Chief Technology Officer, <strong>Cambridge</strong> Computer<br />

Norbert Fritz, Development Leader Business Information<br />

Warehouse, Pharma Operations, PDGI, F. Hoffmann-La<br />

Roche Ltd.<br />

Rich Furr, Head, Global Regulatory Affairs and Chief<br />

Compliance Officer, Staff, SAFE-BioPharma Association<br />

Ray Fyhr, Project Lead, MRL-IT, Merck<br />

Christine Gibson, Associate Director, Clinical Business<br />

Systems, Biogen Idec<br />

Robert Gottlieb, Principal, RMG Associates, LLC<br />

Andrew Grygiel, Vice President, Product Management,<br />

ClearTrial<br />

Jian Han, Ph.D., Faculty Investigator, Hudson Alpha <strong>Institute</strong><br />

for Biotechnology<br />

Paul A. Harris, PhD, Director, Office of Research Informatics<br />

Operations, Associate Professor, Department of Biomedical<br />

Informatics, Department of Biomedical Engineering,<br />

Vanderbilt University<br />

Sifei He, Beijing Genome <strong>Institute</strong><br />

Joe Herring, CEO & Chairman of Covance<br />

Mark Hulse, R.N., Vice President, Information Technology,<br />

Chief Information Officer, Moffitt Cancer Center<br />

Xin Jin, Beijing Genome <strong>Institute</strong><br />

Jason Johnson, Executive Director, Informatics IT, Merck<br />

Khodaberdi Kalavi, Ph.D., Faculty, Lab. Medicine, Molecular<br />

Medicine, Golestan University of Medical Sciences<br />

Aaron Kamauu, M.D., CEO, Healthcare Data Analytics,<br />

Anolinx LLC; former Head, Healthcare Data Strategy, Roche<br />

and Genentech<br />

Randy King, Ph.D., M.D., Associate Professor, Department of<br />

Cell Biology, Harvard Medical School<br />

Isaac (Zak) Kohane, M.D., Ph.D., Director, Children’s Hospital<br />

Informatics Program; Henderson Professor of Pediatrics and<br />

Health Sciences and Technology, Harvard Medical School<br />

(HMS); Co-director, HMS Center for Biomedical Informatics<br />

and Director of the HMS Countway Library of Medicine<br />

Michael Kopach, Ph.D., Research Advisor, Chemical Product<br />

Research and Development, Eli Lilly and Company<br />

Serdar Kurtkaya, Software Engineer, Chemistry, Emory<br />

<strong>Institute</strong> for Drug Discovery<br />

Steve Labkoff, MD, FACP, Head of Strategic Programs,<br />

AstraZeneca Pharmaceuticals LP<br />

Janis E. Landry-Lane, Program Director, World-wide Deep<br />

Computing, Life Sciences/Higher Education Segments, IBM<br />

Jason M. Laramie, Ph.D., Principal Application Scientist,<br />

Complete Genomics, Inc.<br />

Anthony Leotta, Bioinformatics Manager, Cold Spring<br />

Harbor Laboratory<br />

Roger (Rangjiao) Liu, Ph.D., Research Manager of<br />

Bioinformatics, Life Science Development, Corning<br />

Victor Lobanov, Ph.D., Director, Informatics Center of<br />

Excellence, Johnson & Johnson<br />

Ravi Madduri, Fellow, Computation <strong>Institute</strong>, University of<br />

Chicago and Argonne National Lab<br />

Raymond Ng, Ph.D., CIO, Computer Science, PROOF Centre<br />

Chris Petersen, CIO and Founder, Assay Depot<br />

Tibor van Rooij, Faculty of Pharmacy and Pharmaceutical<br />

Sciences, University of Alberta; former Director,<br />

Bioinformatics, Génome Québec and Montreal Heart <strong>Institute</strong><br />

Pharmacogenomics Centre<br />

Vishal Rosha, Senior Scientist, Bioprocess R&D, Novartis<br />

Pharma AG<br />

Anthony Rowe, Ph.D., Principal Scientist - External<br />

Innovation, R&D IT, Janssen<br />

Jared Shockcor, Domain Architect, Research Informatics &<br />

Automation, Bristol-Myers Squibb Company<br />

Alex Sherman, Director, Systems, Department of Neurology,<br />

Massachusetts General Hospital<br />

Juswinder Singh, Ph.D., Founder & CSO, Avila<br />

Therapeutics Inc.<br />

Timothy Swaller, Director, Information Technology and<br />

Genomics, Ceres, Inc.<br />

Peter J. Tonellato, Ph.D., Visiting Professor, Senior Research<br />

Scientist, Pathology, BIDMC & Center for Biomedical<br />

Informatics, Harvard Medical School<br />

Laszlo Vasko, Director, R&D Information, AstraZeneca<br />

Vas Vasiliadis, Director, Products, Computation <strong>Institute</strong>,<br />

University of Chicago<br />

Xiaoming Wang, Ph.D., Fellow, Computation <strong>Institute</strong>,<br />

University of Chicago and Argonne National Laboratory<br />

Andrew Witty, CEO, GlaxoSmithKline<br />

Elizabeth Worthey, Ph.D., Assistant Professor, Human and<br />

Molecular Genetics Center, Department of Pediatrics,<br />

Bioinformatics Program, Medical College of Wisconsin<br />

Wenming Xiao, Ph.D., Staff Scientist, Division of<br />

Computational Biology, Center for Information Technology,<br />

National <strong>Institute</strong>s of Health<br />

Lixia Yao, Ph.D., Investigator, Computational Biology,<br />

Quantitative Sciences, GlaxoSmithKline<br />

Alexander Wait Zaranek, Ph.D., Director of Informatics,<br />

Harvard Personal Genome Project, Genetics, Harvard<br />

Medical School<br />

For more speaker additions, please visit:<br />

Bio-ITWorldExpo.com


Computational Biology<br />

Sustainable Science: Energy Efficient<br />

Computing for Biosciences<br />

Illustration of the RIVYERA reconfigurable computer for high performance analysis<br />

By Tim Pietruck<br />

Moore’s Law, the observation that<br />

the speed of computers tends<br />

to double every 18 months, is<br />

common knowledge. For scientists and IT-experts,<br />

it is also common knowledge that<br />

the increase in biological data generation<br />

outpaces Moore’s law by far. From next<br />

generation sequencing to high-throughput<br />

imaging, improvements in technology<br />

and widened analysis-scope put a high<br />

workload on researchers’ IT infrastructure.<br />

One might even say that it is turning<br />

life-sciences into information- rather than<br />

natural-sciences.<br />

It is no surprise that many professionals<br />

are searching for ways to manage<br />

this expanding gap between biological<br />

research possibilities — and the increase<br />

in data should be seen as a possibility<br />

rather than a burden — and realistically<br />

attainable processing resources. One<br />

approach has been the thoroughly hyped<br />

cloud-computing with all its benefits and<br />

drawbacks. Another trend, somewhat less<br />

ubiquitous but more efficient in solving<br />

infrastructural bottlenecks, is computation<br />

acceleration with dedicated hardware.<br />

Here, the options are many, ranging<br />

from consumer graphics<br />

cards to multimillion-dollar<br />

ASIC production.<br />

In between those extremes,<br />

Field-Programmable Gate<br />

Arrays (FPGAs) promise to<br />

provide the most favorable<br />

combination of reconfigurability<br />

for changing scientific<br />

requirements and processing<br />

power per dollar spent.<br />

Indeed, cost is increasingly<br />

important when it comes<br />

to strained research budgets,<br />

in terms of purchasing<br />

IT-equipment but also maintenance,<br />

electricity and cooling. Especially with<br />

increasing electricity prices, a total-cost-ofownership<br />

approach is needed for investment<br />

decisions. Energy- and therefore<br />

cost-efficient solutions are in increasingly<br />

high demand, and FPGAs, with a few watts<br />

of energy consumption per processor, fit<br />

exactly this description.<br />

BIO-IT WORLD ADVERTISING SPONSOR<br />

FPGA-computing is not a brand-new<br />

concept, but only recently have companies<br />

like Germany-based SciEngines<br />

made inroads into usability, scalability<br />

and affordability that promise significant<br />

practical benefit for a wide audience in<br />

the sciences.<br />

Key to this is a standardized platform<br />

— the RIVYERA computer, which provides<br />

a scalable architecture for up to 256 midsized<br />

FPGAs within one regular server.<br />

To put this into perspective, refer<br />

to the technology comparison below.<br />

Smith-Waterman Algorithm: Performance in GCUPS (billions of cell updates per second)<br />

Cell/BE (8 threads): 16<br />

NVIDIA GTX 295 (Dual-GPU): 30<br />

2.66 GHz Nehalem-EP (8 cores): 32<br />

RIVYERA 2S6-LX150 (16 FPGAs): 756<br />

SGI Altix UV 1000 (384 cores): 986<br />

RIVYERA S6-LX150 (128 FPGAs): 6046<br />

RIVYERA S6-LX150 (256 FPGAs): 12092<br />

Source: University of Kiel<br />

The referenced Smith-Waterman algorithm<br />

is a standard, processing-intensive<br />

computational task and the optimizable<br />

structure of FPGAs easily outclasses<br />

other technologies — even in a version<br />

that includes affine gap alignment and<br />

custom scoring matrices. In addition,<br />

memory-access-focused applications, like<br />

NGS short-read mapping, also run well<br />

on such a system. PC workstations,<br />

whether they contain single- or<br />

twelve-core CPUs, and hybrid computers<br />

are usually limited by access speed to<br />

a single memory. The RIVYERA provides<br />

users with access to a large number of<br />

distributed modules, equal to the number<br />

of FPGAs, and dedicated lanes from each<br />

FPGA to optional shared memory. Thus,<br />

for specific algorithms, the performance<br />

of a cluster or even a small data-center<br />

can be provided at the desk-side, and<br />

new approaches to analysis and science<br />

become available, for only<br />

a five-digit price tag.<br />
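For readers unfamiliar with the benchmark unit, a "cell update" is the computation of one entry in the Smith-Waterman dynamic-programming matrix, and GCUPS counts billions of such updates per second. The sketch below is a plain Python illustration of that idea and is not SciEngines' implementation: it uses a simple linear gap penalty rather than the affine gap model mentioned above, and the scoring values and function names are assumptions chosen for the example.

```python
import random
import time


def smith_waterman_score(a, b, match=3, mismatch=-3, gap=-2):
    """Return (best local alignment score, number of matrix cells updated).

    Linear gap penalty for brevity; scoring parameters are illustrative assumptions.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    cells = 0
    for ch_a in a:
        curr = [0]  # first column is always zero for local alignment
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            up = prev[j] + gap        # gap in sequence b
            left = curr[j - 1] + gap  # gap in sequence a
            score = max(0, diag, up, left)
            curr.append(score)
            best = max(best, score)
            cells += 1  # one cell update
        prev = curr
    return best, cells


def gcups(cells, seconds):
    """Convert a cell count and elapsed time into billions of cell updates per second."""
    return cells / seconds / 1e9


if __name__ == "__main__":
    rng = random.Random(0)
    query = "".join(rng.choice("ACGT") for _ in range(300))
    target = "".join(rng.choice("ACGT") for _ in range(300))
    start = time.perf_counter()
    score, cells = smith_waterman_score(query, target)
    elapsed = time.perf_counter() - start
    print(f"score={score} cells={cells} GCUPS={gcups(cells, elapsed):.6f}")
```

Because every cell depends only on its three neighbors, cells along an anti-diagonal can be computed independently, which is the kind of regular structure that parallel hardware can exploit.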

SciEngines is currently<br />

driving the development of<br />

bioscience-focused FPGA<br />

solutions and provides<br />

development resources to<br />

interested parties. Application<br />

programming interface<br />

specifications are freely available<br />

and the development<br />

community is supported with<br />

tools and affordable small-scale<br />

systems. Please feel<br />

free to contact the author if<br />

you would like to get involved.<br />

Tim Pietruck is Vice-President at SciEngines<br />

GmbH, a leading manufacturer of FPGA-<br />

computers for the HPC community. He can<br />

be reached at timpietruck@sciengines.com


Clinical Trials<br />

An Inside Perspective<br />

on PAREXEL<br />

One of the oldest clinical research organizations<br />

discusses technology and changes in the industry.<br />

Mark Goldberg, PAREXEL’s chief operating officer, has been with the company since<br />

1997 when he started the CRO’s medical imaging business. A radiologist by training,<br />

Goldberg ran a spin-off company from Massachusetts General Hospital in the<br />

tele-health space, which he helped to found. In 2000, PAREXEL created its technology<br />

business, Perceptive Informatics, as a majority-owned subsidiary. As Goldberg says,<br />

“we believed that it was an industry that grossly under-utilized technology and that<br />

this was a huge opportunity that would be important.” Believing that it could help<br />

differentiate their CRO service offerings, and importantly, help biopharmaceutical<br />

companies get products to market faster, in 2005, PAREXEL brought in Perceptive as<br />

a wholly-owned subsidiary, seeking greater efficiencies in the development process by<br />

combining an eClinical platform with clinical expertise.<br />

Goldberg recently sat down with Bio•IT World chief editor Kevin Davies to discuss<br />

the importance of convergence in Perceptive’s suite of eClinical offerings, and to share<br />

his insights on the future opportunities, trends, and challenges facing the industry.<br />

Bio•IT World: Mark, what were PAREXEL’s<br />

core services, and how did the company’s<br />

technologies integrate into the formation<br />

of Perceptive Informatics?<br />

GOLDBERG: PAREXEL was one of the first<br />

clinical research organizations (CROs),<br />

founded nearly 30 years ago... We were<br />

born initially as a consulting business to<br />

help companies outside the U.S. navigate<br />

the regulatory environment. Eventually<br />

the company expanded into the core clinical<br />

research services, including monitoring,<br />

data management, medical affairs,<br />

biostatistics, medical writing, and regulatory<br />

affairs—all the core components of<br />

running a clinical trial.<br />

We took the customer-facing technologies<br />

that existed within PAREXEL<br />

and rolled them into Perceptive, including<br />

the medical imaging business and interactive<br />

voice response systems (IVRS).<br />

We were also probably the first player<br />

to launch portal technology in the drug<br />

development industry. Portal technology<br />

was already used in other industries, but<br />

it was new to clinical trial management.<br />

Today, PAREXEL is a $1.2 billion company<br />

and most of our revenue generation<br />

is derived from the conduct of clinical trials<br />

around the world. Perceptive accounts<br />

for approximately 13% of PAREXEL’s<br />

total revenue.<br />

How has Perceptive evolved from its<br />

imaging roots over the past decade?<br />

[Imaging] created the basis of our technology<br />

business. IVRS was introduced<br />

not only because it made it simpler to randomize<br />

patients in clinical trials but also<br />

because it allowed better management of<br />

investigational drug inventory.<br />

Historically you would basically put<br />

a full supply of both the investigational<br />

product as well as placebo (or comparator)<br />

at all of the sites, with little ability to<br />

predict actual need. The amount of waste<br />

was enormous, because every site was<br />

treated like its own little repository. Using<br />

technology, and being able to predict and<br />

track how patients are randomized at the<br />

various sites, one could ensure that the<br />

inventory sent to sites was more appropriate<br />

to the number and mix of patients<br />

enrolled. In terms of other technologies,<br />

we have conducted EDC-based trials<br />

for many years, and built an industry-leading<br />

clinical trial management system<br />

(CTMS).<br />
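The site-level supply idea Goldberg describes (shipping inventory that matches the number and mix of patients actually randomized at each site) can be illustrated with a minimal sketch. Everything here is hypothetical and for illustration only: the `SiteStatus` record, the `resupply_order` function, the one-kit-per-patient default, and the safety buffer are assumptions, not a description of how Perceptive's IVRS works.

```python
from dataclasses import dataclass


@dataclass
class SiteStatus:
    """Hypothetical snapshot of one trial site (names are illustrative)."""
    site_id: str
    randomized: dict  # treatment arm -> patients randomized at this site
    on_hand: dict     # treatment arm -> kits currently in stock


def resupply_order(site, kits_per_patient=1, safety_buffer=2):
    """Return the kits to ship per arm so stock covers enrolled patients.

    Assumption: each randomized patient needs `kits_per_patient` kits for
    the next dispensing cycle, plus a small per-arm safety buffer.
    """
    order = {}
    for arm, patients in site.randomized.items():
        target = patients * kits_per_patient + safety_buffer
        shortfall = target - site.on_hand.get(arm, 0)
        if shortfall > 0:
            order[arm] = shortfall  # ship only what the site is missing
    return order


site = SiteStatus(
    site_id="site-001",
    randomized={"active": 6, "placebo": 3},
    on_hand={"active": 5, "placebo": 7},
)
print(resupply_order(site))
```

In this toy case only the active arm is topped up, because the site already holds more placebo kits than its enrolled patients need. That contrasts with the older practice described above of stocking a full supply of every product at every site.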

PAREXEL’s acquisition of ClinPhone<br />

in 2008 set a number of dominoes in<br />

motion. We were focused on bringing the<br />

first interoperable eClinical suite to the<br />

industry. Standalone systems were not<br />

well integrated and we saw an opportunity<br />

to really accelerate development...<br />

From a procurement standpoint, sponsors<br />

spent a lot of time trying to understand<br />

the different tools. The advantage<br />

was that by doing things electronically<br />

and eliminating paper, you could shorten<br />

cycle times by reducing query activity,<br />

locking databases faster, and so forth.<br />

But then, as technologies mature and you<br />

know that they work, the features and<br />

functionality among competing products<br />

become fairly similar...<br />

We use the term convergence when we<br />

talk about similar trends in the eClinical<br />

industry. If you enter our eClinical<br />

suite through one of our applications,<br />

we expose the functionality of other applications<br />

without it feeling like you’re<br />

working with a number of different tools.<br />

It becomes much easier for the user. You<br />

can also eliminate extra work on the back<br />

end that used to occur around reconciling<br />

databases across the different tools being<br />

used for a given trial. I don’t know that any<br />

eClinical suite does absolutely everything,<br />

but we believe that ours covers the broadest<br />

waterfront. However it is still important<br />

for an eClinical suite to be designed<br />

to link easily with other third-party tools.<br />

www.bio-itworld.com NOVEMBER | DECEMBER 2011 BIO•IT WORLD [ 21 ]<br />


You can never assume everything you have<br />

is everything you’ll ever need.<br />

What other services does PAREXEL offer<br />

and how does technology fit?<br />

When we are hired to run clinical trials,<br />

the purchaser is increasingly allowing us<br />

to make the decisions about what tools we<br />

should apply to do that work. Historically<br />

the client might have been very prescriptive<br />

about using certain technologies.<br />

Today, it is more likely that they tell us,<br />

‘We want you to run the program, and<br />

you have the flexibility to recommend<br />

the best tools for the job—the ones that<br />

will deliver the greatest efficiencies.’ In<br />

that scenario, we have a strategic advantage<br />

because we have a full technology<br />

capability in addition to our clinical trial<br />

execution capability, which is unique to<br />

the CRO industry.<br />

That said, we also sell our technology<br />

to other CROs. In fact, part of the reason<br />

that we preserve Perceptive as a subsidiary<br />

is because it makes it somewhat<br />

easier to support the full range of customers<br />

in the clinical trial business, including<br />

sponsors and other CROs.<br />

Roughly 75% of PAREXEL’s revenue<br />

is derived from what we call CRS (clinical<br />

research services). That includes phase<br />

I through IV development. We offer<br />

a broad range of services from first-in-human<br />

and patient studies in early development<br />

to core clinical development<br />

programs, through and including peri-approval<br />

and post-marketing work. As I<br />

mentioned earlier, Perceptive contributes<br />

approximately 13% of our total revenue.<br />

The remaining 12-13% comes from our<br />

consulting and medical communication<br />

business. We probably have one of the<br />

oldest and most respected regulatory consultancies.<br />

We consult in a broad range<br />

of areas including strategic compliance,<br />

product development strategy, and regulatory<br />

affairs. We also consult in commercialization,<br />

which incorporates a focus on<br />

reimbursement and market access.<br />

What areas do you see for big growth?<br />

I think there are a couple trends to mention...<br />

We’ve been fairly provocative in the<br />

market in terms of our focus on having a<br />

convergent eClinical solution or suite. It’s<br />

built upon a platform of technologies that<br />


allows the various tools to intercommunicate<br />

readily, and relieves the customer<br />

of the need to deal with both integration<br />

work and also the reconciliation of disparate<br />

databases.<br />

In our eClinical suite, we have the<br />

ability to leverage a broad set of solutions.<br />

This includes our IMPACT CTMS,<br />

which is still a market leader, as well as<br />

our DataLabs EDC solution. We have<br />

the ability to provide IVRS for patient<br />

enrollment and study<br />

drug management—an<br />

offering that we refer to<br />

as RTSM, or randomization and trial supply<br />

management—and our<br />

medical imaging services<br />

remain a core offering.<br />

Perceptive does<br />

a lot of work around<br />

Web-based reporting<br />

tools in order to surface<br />

data from the various<br />

systems to inform trial<br />

management and decision<br />

making.<br />

We also support<br />

electronic patient-reported<br />

outcomes (ePRO), although we<br />

have elected to avoid the hardware business<br />

for electronic diaries. Perceptive does<br />

provide consultation including help with<br />

design, instrument selection, and regulatory<br />

considerations. Where appropriate,<br />

we partner with technology vendors to<br />

provide whatever modality the client<br />

may need for data collection. We deliver<br />

voice-based diaries ourselves.<br />

As a $1.2 billion company, where do you<br />

go looking for new business?<br />

We’ve always been a global business... We<br />

were one of the first to grow into Eastern<br />

Europe; we have a large presence in Latin<br />

America and in Africa. And we have one<br />

of the largest presences in Asia-Pacific,<br />

partly the result of the acquisition of a<br />

company called APEX International,<br />

which was the largest CRO in Asia-<br />

Pacific, in 2007.<br />

Are there any downsides to global trials?<br />

There is certainly complexity. You have to<br />

be experienced, have the knowledge and<br />

expertise to know how to execute in these<br />

various countries. I would say that there<br />

are a relatively small number of companies,<br />

like PAREXEL, that are credibly capable<br />

and have the experience to execute<br />

on a global scale using standardized processes.<br />

You have to understand the regulatory<br />

environment, the import/export<br />

rules, how medical care is conducted, and<br />

so on. And you have to have knowledge<br />

of the investigator community so that<br />

you know who can deliver quality data<br />

based upon a track record<br />

of high performance. The<br />

sponsor looks to us to be<br />

able to do a great job of<br />

selecting top quality investigators<br />

around the world.<br />

You have to have experience<br />

in these geographies<br />

and local expertise—real<br />

feet on the street.<br />

Where are you finding<br />

new business<br />

opportunities?<br />

Strategic partnering is really<br />

a new trend over the<br />

last few years, and gaining<br />

momentum. We are now<br />

considered a strategic partner of choice<br />

for a broad range of leading sponsors—<br />

from large pharmaceutical companies to<br />

mid-sized and smaller biopharmaceutical<br />

companies. This industry was historically<br />

very tactical in the way that it utilized<br />

outsourcing. In the past, a CRO would be<br />

called in if the sponsor had a shortage of<br />

personnel in a certain location or around<br />

a certain function. However, over time it<br />

became clear that these tactical relationships<br />

were fairly inefficient.<br />

Strategic partnering is characterized<br />

by relationships where the sponsor will<br />

select one or two, typically large, global<br />

CROs, and will agree that their work is<br />

going to be split between these providers.<br />

Unlike a tactical relationship, in strategic<br />

deals ownership and accountability for<br />

delivery, and participation in design starts<br />

to become a shared event. It’s not just that<br />

a sponsor develops a protocol and asks<br />

the CRO to execute it for them. Now, we<br />

are involved in working together on the<br />

strategy, the cost, the trade-offs in terms of<br />

different countries we could go to, the in-<br />

(continued on page 25)


Each day security professionals are faced with new enemies, tactics and threats—<br />

it’s time to evolve your security strategy. RSA® Conference brings you five days<br />

of innovative sessions, inspiring keynotes, and collaborative strategy building.<br />

Gain insights on today’s hottest topics, learn how to leverage the latest trends and<br />

technologies, and get access to new best practices on the most critical technical<br />

and business issues facing you today.<br />

Join the community at RSA® Conference 2012 to empower yourself with new<br />

insights, solutions and allies to keep your organization secure.<br />

REFINE YOUR EXPERTISE<br />

Participate in over 220+<br />

expert-led sessions.<br />

SHARE YOUR INSIGHTS<br />

Collaborate with information<br />

security’s best and brightest.<br />

EXPAND YOUR OUTLOOK<br />

Attend world-class<br />

keynotes.<br />

HARNESS YOUR STRATEGIES<br />

Discover practical solutions<br />

to implement at the office.<br />

THE Great Cipher<br />

Mightier than the Sword<br />

Protect Your Kingdom.<br />

Learn How at RSA® Conference<br />

Register Now!<br />

www.rsaconference.com/bioit<br />

Save<br />

$700<br />

Before Friday,<br />

November<br />

18, 2011<br />

©2012 EMC Corporation. All rights reserved. EMC, RSA, the RSA logo and the RSA Conference logo are registered trademarks of EMC Corporation in the United States and/or other countries.<br />

All other marks are trademarks of their respective companies.<br />




Clinical Trials<br />

Tooling up for Smarter Studies<br />

Two industry studies found<br />

need for appropriate trial<br />

planning and tools.<br />

BY DEBORAH BORFITZ<br />

Separate industry surveys by information<br />

technology research and advisory firm<br />

Gartner and global software company<br />

ClearTrial make a strong case for clinical<br />

resource management tools in the cost-cutting<br />

arsenal of biopharmaceutical and<br />

medical device companies.<br />

The Gartner survey of industry sponsors<br />

and their technology partners late<br />

last year identified several clinical study<br />

best practices.<br />

For starters, study-related activities<br />

need to get appropriately planned upfront<br />

to the right level of depth and detail, says<br />

Gartner analyst Steven Lefebure. Companies<br />

don’t always devote sufficient time<br />

to planning, creating costly mid-study<br />

surprises. Second, project managers need a<br />

firm understanding of the study schedule<br />

and required resources so that information<br />

can be used to better plan studies and<br />

keep them on track.<br />

Third, companies need to better<br />

match protocol design and study. Many<br />

protocols get “very ambitious” in terms of<br />

clinical measures and endpoints without<br />

an understanding of the downstream consequences,<br />

Lefebure says. “It’s essential to<br />

balance out cost and complexity with…the<br />

scientific factors.”<br />

Studies require technology-based support<br />

at every stage, from data capture to<br />

regulatory document management and<br />

submission, continues Lefebure. Productivity<br />

improvements and cost reductions<br />

have already been seen in this arena, but<br />

the technology itself has introduced issues.<br />

“We find seams between systems,<br />

handoff complications between EDC<br />

[electronic data capture] systems and<br />

data management tools at the sponsor<br />

organization, access problems, and…issues<br />

in terms of how medical information<br />

gets coded across a study.” The multitude<br />

of systems employed to do discrete study<br />

tasks only adds to the complexity.<br />

Photo caption: ClearTrial CEO Mike Soenen says the platform excels at planning, forecasting, and tracking. (Photo: Jim Newberry)<br />

Study information with real-time visibility<br />

is yet another industry-identified<br />

best practice, says Lefebure, although this<br />

is still not the case at many organizations.<br />

As suggested by earlier Gartner research,<br />

cross-organizational information sharing<br />

by the sponsor and its clinical research<br />

organization (CRO) partners is another<br />

favored practice, especially since their<br />

operations and activities can be highly<br />

interwoven. Ideally, sponsors are collaboratively<br />

engaged with CRO partners in<br />

“honest and real communication about<br />

study status.”<br />

Finally, the industry-best planning<br />

process involves both the sponsor and<br />

its CRO partners in protocol design and<br />

budgeting. This gets all collaborating<br />

organizations “on the same page” at study<br />

launch, says Lefebure.<br />

Purpose-Built Tools Needed<br />

These best practices are “necessary but<br />

not sufficient” to improve study performance,<br />

says Lefebure. Also vital are clinical<br />

resource management tools “tailored<br />

to the job at hand” that can turn information<br />

into actionable insights. “Business<br />

intelligence tools are not enough.”<br />

EDC technology will “fade a bit into the<br />

background…because it’s not necessarily<br />

vital to achieving the next breakthrough<br />

in future operational efficiency. A lot of<br />

gains have already been achieved in that<br />

category.”<br />

Lefebure advises companies to seek<br />

out clinical resource management tools<br />

whose planning and budgeting features<br />

“reveal all assumptions” from protocol<br />

development through study execution.<br />

The software should allow a collaborative<br />

environment for transparent information<br />

sharing to get “more eyeballs” on problems,<br />

he adds.<br />

Linked study planning and execution<br />

tools will ensure study plans aren’t<br />

“obsolete” once trials get underway,<br />

Lefebure continues. Gaps between plan<br />

and practice can be immediately identified,<br />

along with needed actions to ensure<br />

the best possible outcomes in terms of<br />

resource use.<br />

Although industry opinion is mixed<br />

on whether or not systems integration<br />

helps with study execution, Gartner research<br />

indicates integrated systems and<br />

approaches more effectively “move the<br />

needle” on performance, Lefebure says.<br />

Industry is likewise split on the value of<br />

simulation and optimization tools. Software<br />

that enables what-if analysis and<br />

scenario planning, in Lefebure’s view, is<br />

“essential to the future” because it allows<br />

assumptions and plans to be tested before<br />

real resources are expended – notably, on<br />

costly protocol amendments once a study<br />

is underway.<br />

Lefebure further discusses Gartner<br />

research and industry best practices in<br />

a new video posted on the ClearTrial<br />

website.


ClearTrial’s Supportive Technology<br />

ClearTrial has the only fully integrated<br />

system for clinical planning, forecasting,<br />

outsourcing, and project and financial<br />

tracking, notes CEO Mike Soenen. The<br />

clinical study operational plan and detailed<br />

study budget also get built from the<br />

“bottom up” based on assumptions from<br />

clinical operations professionals.<br />

Multiple operational scenarios can<br />

be quickly developed based on timeline,<br />

cost, or resource objectives, says Soenen.<br />

Once a study is underway, heads of clinical<br />

development and operations can “easily<br />

view project status and trends,” and<br />

project managers can “quickly drill down<br />

to pinpoint underlying problems putting<br />

the study at risk.” New assumptions can<br />

be entered into the system “based on the<br />

problems identified to determine the best<br />

operational plan for getting the study<br />

back on track, in a matter of hours.”<br />

The technology further supports<br />

the kind of strategic, long-range planning<br />

advocated by Lefebure.<br />

In a separate survey of industry professionals<br />

with clinical study budget<br />

responsibilities, ClearTrial found the use<br />

of “old tools” and “wrong tools” hampering<br />

the efforts of life science companies<br />

to improve operational efficiency, says<br />

Soenen. Well over half of respondents rely<br />

on Excel as their primary tool for study<br />

budgeting.<br />

The same survey revealed serious<br />

operational inefficiencies. Only one in<br />

ten of those surveyed is able to keep cost<br />

variances, from forecast to actual, at 5%<br />

or below. “What was surprising is that this<br />

pattern applies across the board, regardless<br />

of company size.”<br />

Among other disturbing survey<br />

findings:<br />

• 65% of respondents take five weeks or<br />

more to complete the review and revision<br />

cycle for a single study.<br />

• 62% of respondents require three weeks<br />

or more to roll up individual study budgets<br />

into a budget portfolio.<br />

• 79% of respondents are only “somewhat<br />

confident” or “not confident” in their<br />

budget forecasts.<br />

• 88% of respondents have a typical budget<br />

variance of at least 6%, with more<br />

than half of this number experiencing a<br />

variance of 11% or more.<br />

At least some ClearTrial customers are<br />

having a decidedly different experience,<br />

says Soenen. One top-20 sponsor using<br />

the software reports that annual cost<br />

variance is less than 1% across its clinical<br />

portfolio.<br />

A small biotech customer successfully<br />

negotiated cost savings of $2 million with<br />

its CRO for a single phase II oncology<br />

study. And a mid-size customer reduced<br />

its contract closure time from 2.5 months<br />

to 2.5 weeks for outsourced studies using<br />

ClearTrial as the common platform for<br />

negotiations with its CRO. “This proves<br />

that operational efficiency gains are indeed<br />

possible,” observes Soenen. •<br />


PAREXEL<br />

(continued from page 22)<br />

clusion/exclusion criteria for patients, etc.<br />

What about new and emerging eClinical<br />

technologies?<br />

The R&D focus has shifted more toward<br />

the convergence of the tools and providing<br />

a comprehensive eClinical suite,<br />

and somewhat less on new features and<br />

functionality. Of course, there is always<br />

development work going on, whether it’s<br />

for new medical imaging analyses or making<br />

sure we can support the latest imaging<br />

modalities. But I think the biggest trend<br />

has been investing in a platform that<br />

allows the various technologies to work<br />

seamlessly together.<br />

Another trend would be the ability<br />

to leverage other kinds of datasets like<br />

electronic health records (EHRs), which<br />

have the potential to be helpful in a number<br />

of ways including identifying eligible<br />

patients. With the right search criteria,<br />

a database could be analyzed in an anonymized<br />

way to determine if there is an<br />

attractive population at a given site for a<br />

study. I don’t think that the technology<br />

and data standards are ready to support<br />

the capture of all of the data needed for<br />

a clinical trial from an EHR in the near<br />

term. I do, however, believe that we are<br />

going to see more and more use of EHRs<br />

for safety surveillance, and probably to<br />

support some types of observational research<br />

in the future.<br />

That’s another one of the big trends<br />

we see: observational research... used<br />

to look for safety signals or to evaluate<br />

alternative treatments for purposes of<br />

comparative effectiveness.<br />

From a reimbursement/safety standpoint,<br />

REMS (risk evaluation and mitigation<br />

strategies) that are mandated by the FDA<br />

are increasingly relying on non-interventional<br />

or observational research. Our late<br />

phase technologies are being used in these<br />

settings to enable more efficiency.<br />

What are among the biggest headaches<br />

or challenges for you?<br />

I don’t know if I would call it a challenge,<br />

but we must always be vigilant to ensure<br />

that we are executing on time and on<br />

budget to appropriate quality standards.<br />

One real challenge relates to the<br />

conduct of clinical development in some<br />

emerging markets. In some countries,<br />

one is entering markets where the investigators<br />

are less experienced and where<br />

you lack a pool of experienced talent. As<br />

an organization, to deliver reliably, we<br />

must take on an increased responsibility<br />

for training investigators, for training<br />

employees, and essentially building an<br />

industry where it did not exist before.<br />

In this regard, CROs like PAREXEL<br />

have worked with regulators and government<br />

officials and the industry to<br />

advance the adoption of ICH-GCP in<br />

emerging markets for biopharmaceutical<br />

development.<br />

Ultimately, you can only be as good as the<br />

quality of the drugs that are going into<br />

man, right?<br />

Our job is to provide high-quality research.<br />

We hope our clients are successful,<br />

but our responsibility is to conduct clinical<br />

trials that are adherent to all of the<br />

applicable regulations. Our job is to make<br />

sure that we do good science.<br />

We believe that the CRO industry has<br />

an important role to play in innovating<br />

around how to conduct clinical research<br />

more effectively and efficiently. •<br />


Clinical Trials<br />

NuMedii’s New Way to ‘De-Risk’<br />

Drug Repositioning Work<br />

Stanford University spin-off relies on database to identify new drug uses.<br />

BY DEBORAH BORFITZ<br />

A newly minted biotechnology company<br />

is offering to match the molecular genomic<br />

activity of previously approved medicines<br />

to that of known diseases to help<br />

drug makers find new uses for therapies.<br />

“We have two published studies demonstrating<br />

[our technology platform] can be<br />

used that way, and a third pending publication,”<br />

says Gini Deshpande, co-founder<br />

of Menlo Park, CA-based NuMedii.<br />

The company’s matchmaking capability,<br />

which has been likened to an “online<br />

dating service,” is driven by a strong computational<br />

engine that reduces the search<br />

for new indications for existing drugs,<br />

says Deshpande. The company is based<br />

on technology developed in the Stanford<br />

lab of Atul Butte, who is a co-founder of<br />

NuMedii (along with Joel Dudley).<br />

A multitude of other companies are<br />

trying to reposition approved drugs<br />

using more costly and time-consuming<br />

approaches, including cell-based assays<br />

for screening an entire library of drugs,<br />

says Deshpande. NuMedii is less a fishing<br />

expedition than “educated fishing,” as<br />

one National Institutes of Health (NIH)<br />

scientist described it.<br />

What distinguishes NuMedii is an<br />

annotated and normalized database developed<br />

at Stanford that holds genome-wide<br />

molecular profiles for more than<br />

300 diseases, as well as the informatics<br />

know-how to use it to identify new uses,<br />

including those still in clinical development,<br />

says Deshpande. It is also the first<br />

computational company to test its hypotheses<br />

in a pre-clinical animal model<br />

context and publish the results, which<br />

could help drug makers “de-risk” their<br />

medicine repurposing work.<br />

As reported in Science Translational<br />

Medicine, two predictions made by the<br />

technology—now licensed to NuMedii—<br />

have demonstrated preclinical activity<br />

in mouse studies (Dudley et al., 3(96):<br />

96ra76 and Sirota et al., 3(96): 96ra77).<br />

One of the predictions helped reposition a<br />

generic anticonvulsant medication for the<br />

possible treatment of inflammatory bowel<br />

disease. The other aimed a generic antiulcer<br />

drug at lung cancer. A third prediction,<br />

also for a gastrointestinal indication,<br />

will soon be written up in the literature.<br />

“The platform essentially marries the<br />

Stanford database with other drug, data,<br />

and knowledge bases,” says Deshpande. It<br />

can be tapped to find a disease whose molecular<br />

profile either counters or mimics<br />

that of a commercially available medicine.<br />

The NuMedii platform has potentially<br />

broad applications, she adds, but repositioning<br />

drugs with known safety profiles<br />

was the easiest starting point.<br />
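To make the idea of a disease profile that "counters or mimics" a drug profile concrete, here is a minimal sketch. The article does not describe NuMedii's scoring method, so the Pearson correlation used here is a stand-in chosen for simplicity, and the gene names, expression values, and function names (`signature_score`, `rank_drugs`) are all hypothetical. A strongly negative score suggests the drug's expression changes oppose the disease's (a repositioning candidate); a strongly positive score suggests the drug mimics it.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is flat)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def signature_score(disease, drug):
    """Correlate expression changes over the genes both profiles share."""
    genes = sorted(set(disease) & set(drug))
    return pearson([disease[g] for g in genes], [drug[g] for g in genes])


def rank_drugs(disease, drugs):
    """Sort drugs from most opposing (negative) to most mimicking (positive)."""
    scored = [(name, signature_score(disease, prof)) for name, prof in drugs.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Toy profiles: gene -> log fold change (illustrative numbers only).
disease = {"G1": 2.0, "G2": -1.0, "G3": 1.5, "G4": -0.5}
drugs = {
    "drug_a": {"G1": -2.0, "G2": 1.0, "G3": -1.5, "G4": 0.5},  # reverses the disease pattern
    "drug_b": {"G1": 1.8, "G2": -0.9, "G3": 1.2, "G4": -0.4},  # resembles the disease pattern
}
ranked = rank_drugs(disease, drugs)
for name, score in ranked:
    print(name, round(score, 3))
```

In this toy data the drug whose profile reverses the disease pattern ranks first. A real platform would work with genome-wide profiles and a more robust statistic, and the article's point about preclinical testing still applies: a computational match is a hypothesis to be checked in the lab.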

Platform Efficiencies<br />

The timing couldn’t be better, given that<br />

the NIH has been actively pushing drug<br />

repositioning. Novartis, GSK, and Pfizer<br />

already have internal repositioning efforts.<br />

NuMedii’s approach would also be<br />

an efficient way for such companies to<br />

find first-ever treatments for a host of rare<br />

diseases, notes Deshpande.<br />

NuMedii is in the midst of establishing<br />

partnerships with several pharmaceutical<br />

companies to demonstrate the breadth<br />

of the platform and enhance its value,<br />


says Deshpande. She’s hopeful the methodology<br />

will get used early in the drug<br />

development process to help companies<br />

“maximize the potential” of approved<br />

compounds in development by testing<br />

them for additional, multiple indications<br />

in parallel. NuMedii’s platform could also<br />

be used to detect potential blockbuster<br />

indications for current, low-volume niche<br />

products, she adds.<br />

NuMedii faces more of an uphill battle<br />

in the clinical development realm, where<br />

computational technology is a novelty,<br />

says Deshpande. But the platform could<br />

prove valuable in guiding and prioritizing<br />

new drug development programs, together<br />

with physician input on real-world prescribing<br />

and dosing patterns. The technology<br />

can’t definitively predict a drug’s<br />

failure, given the volume of confounding<br />

factors (such as patient compliance), but<br />

it could capture its efficacy odds overall<br />

and in certain subsets of patients.<br />

Ultimately, the platform could be<br />

used to identify potential biomarkers of<br />

disease and genetic receptivity to certain<br />

drugs, says Deshpande. “Preliminary<br />

publications have shown that these disease<br />

genomic databases can also be used<br />

for biomarker discovery and we hope to<br />

pair this approach with drug indication<br />

discovery over the next year.” •


Computational Biology<br />

PerkinElmer<br />

Targets Holistic<br />

Data Solutions<br />

A series of acquisitions is expanding and refining the<br />

company’s focus, explains CSO Dan Marshak.<br />

With its recent acquisitions of CambridgeSoft and Geospiza to name but two,<br />

PerkinElmer is signaling a new strategy that highlights data analysis and software as<br />

much as its traditional strengths in hardware and chemical analysis. Spearheading<br />

the new strategy is Dan Marshak, an accomplished cell biologist who has served as<br />

PerkinElmer’s chief scientific officer for five years.<br />

Marshak spent a decade at Cold Spring Harbor Laboratory, studying cell cycle<br />

control in cancer before helping to set up one of the first stem cell companies, Osiris<br />

Therapeutics. “We went from two guys eating sushi to 117 employees raising over $100<br />

million,” says Marshak. At PerkinElmer, he reports to CEO/chairman Rob Friel and<br />

has a key role in formulating new technology strategies and acquisitions. He spent two<br />

years as general manager of operations in China—PerkinElmer is now one of the leading<br />

makers of hepatitis and HIV testing systems for high sensitivity testing in China.<br />

Marshak sat down with Bio•IT World chief editor Kevin Davies to discuss<br />

PerkinElmer’s evolving strategy from both business and scientific perspectives.<br />

Bio•IT World: PerkinElmer has made a<br />

number of interesting acquisitions lately.<br />

Why all the recent activity?<br />

Dan Marshak: We’ve been doing a lot of<br />

acquisitions and divestitures for the last<br />

10 years. We divested our fluid sciences<br />

business in 2006, and then last year the<br />

Illumination and Detection Systems<br />

(IDS) business, which was quite different<br />

than many of our other business-to-business<br />

health science-related products.<br />

We were able to take that cash and<br />

reinvest it in a number of different areas.<br />

We’ve decided to focus on human health<br />

and environmental health. Our legacy<br />

business in analytical chemistry is now<br />

focused on specific end markets, particularly<br />

in environmental laboratories and<br />

environmental testing—industries that<br />

need to have strong analytical chemistry<br />

to preserve the environment. We also are<br />

the leader in inorganic analysis—heavy<br />

metal analysis in water samples. We’ve<br />

been very active in food testing as well.<br />

Environmental health is about half our<br />

business, including instruments, reagents<br />

and software.<br />

About 25% of our business is in services—instrument<br />

repair or maintenance<br />

services, or validating for FDA purposes.<br />

Typically a scientist isn’t qualified to strip<br />

[an instrument] down, move it to another<br />

building or continent and reassemble it,<br />

get it tested and working and revalidate it.<br />

Your most recent acquisition is Caliper<br />

Life Sciences. What is the rationale for<br />

buying Caliper and how will it complement<br />

the existing technology portfolio?<br />

Caliper Life Sciences’ focus on next-gen<br />

sequencing workflow is an ideal fit with<br />

PerkinElmer’s NGS service as well as<br />

our Geospiza software. The LabChip is<br />

essential to ensuring the quality of the<br />

samples prior to sequencing. The combined<br />

offering will supply unmatched<br />

value in providing high-quality samples at<br />

the rapid speed required by sequencers as<br />

well as the informatics needed to analyze<br />

the massive amounts of data generated.<br />

The LabChip also allows the separation<br />

and purification of small quantities of<br />

biological contaminants within food and<br />

water testing…<br />

Caliper’s differentiated platforms for<br />

animal and tissue imaging will enable<br />

PerkinElmer to offer scientists in translational<br />

medicine the ability to understand<br />

at molecular, cellular, tissue and animal<br />

levels what a biomarker or drug target is<br />

doing. We are very excited about being<br />

able to offer our fluorescent biomarkers<br />

to Caliper’s IVIS customers. By combining<br />

Caliper’s and PerkinElmer’s expertise,<br />

researchers will be able to multiplex fluorescent<br />

and bioluminescent imaging in the same<br />

experiment as well as at different times<br />

in the same animal to better understand<br />

disease mechanisms.<br />

Where is your focus in the human health<br />

side of the business?<br />

About two-thirds of that is diagnostics,<br />

and one-third is in the research products area,<br />

which supports both the pharmaceutical<br />

and biotech industries as well as academics<br />

funneling new discoveries into drug<br />

discovery.<br />

In diagnostics, we manufacture some<br />

of the world’s fastest, highest contrast<br />

digital X-ray detectors. We’re one of the<br />

leaders in amorphous silicon flat panel<br />

X-ray detectors, sold for medical purposes<br />

to GE. In oncology, we make very<br />

fast detectors that are used in therapeutic<br />

oncology devices… We recently acquired<br />

Dexela, a UK company that does CMOS<br />

X-ray detection. So we do both CMOS<br />

and our classic amorphous silicon X-ray<br />

detectors in diagnostics.<br />

The central part of our diagnostics<br />

business is around pregnancy and birth—<br />

newborn screening, cord blood banking,<br />

prenatal/maternal health. We’re a leader<br />

in prenatal testing for Down’s syndrome<br />

and other disorders. Probably 80% of<br />

all the newborn Guthrie heel-stick blood<br />

spot cards are done through PerkinElmer<br />

systems. We recently acquired a company<br />

that produces these filter cards, all the<br />

way to the puncher and the instruments<br />

to do multiple testing of different types<br />

of analytes. Some of our most advanced<br />

areas are in the software for doing statistical<br />

analysis of the data. It’s easy to<br />

measure something and produce a number.<br />

What’s difficult is to know whether<br />

that number is significant, so a physician<br />

can say whether a child has a disorder or<br />

needs further testing.<br />

On the DNA side, we acquired Signature<br />

Genomics, which has proprietary<br />

content on Roche NimbleGen array kits.<br />

They do a service of prenatal and newborn<br />

arrays for further diagnosis of genetic<br />

disorders. They have a database they<br />

provide through unique software for genetic<br />

disorders called Genoglyphix. This<br />

year, we launched the first foray into using<br />

those chips in oncology for hematological<br />

malignancies, called Oncoglyphix…<br />

We’ve expanded our diagnostics<br />

through acquisition of SYM-BIO Life Sciences<br />

two years ago in China. SYM-BIO<br />

is a leader in the high sensitivity test for<br />

hepatitis, HIV, and some STDs. There are<br />

100 million carriers of Hep-B in China.<br />

We’ve been able to fill that niche and we’re<br />

expanding that business significantly.<br />

What is the over-arching goal or strategy<br />

to these acquisitions?<br />

Up to the 1970s, PerkinElmer was the<br />

leader in introducing new instruments—<br />

we introduced the first infrared spectrometer,<br />

the first GC mass spec, a lot of firsts<br />

in analytical chemistry. In those days, the<br />

learning curve, the technology curve was<br />

very much in the hardware. The field has<br />

matured in the last 30 years and a lot of<br />

companies have entered the field. Ours<br />

is still outstanding technology, but the<br />

differentiator for the customer is not as<br />

much in the box as in other parts.<br />

Also, the users have changed... [Customers]<br />

need help on sample preparation<br />

and getting it to the instrument; on<br />

the other side, it’s about understanding<br />

the data coming out, turning that data<br />

into information and then using other<br />

resources to turn the information into<br />

knowledge. They want the complete<br />

solution from one vendor: so samples,<br />

instrument, reagents, consumables, and<br />

the software, and the analysis, and training<br />

and service support from one source.<br />

We’re trying to provide that.<br />

So it’s about providing a more holistic<br />

data-to-knowledge solution?<br />

Moving from data acquisition to knowledge<br />

is not trivial and it’s not simply<br />

laboratory information management. It’s<br />

really using the new tools of the Web 3.0<br />

and cloud-based tools to understand what<br />

else is out there to collaborate or interact<br />

within or outside the facility.<br />

Some of the software no longer needs<br />

to be tied to the instrument. It’s really<br />

software-as-a-service in some cases—<br />

software that enables the customer to<br />

take their information and turn it into<br />

solutions.<br />

Some years ago, we bought Evotec<br />

Technologies, which had cell-based<br />

screening for high content screening. The<br />

accompanying software actually runs<br />

on its own—it’s very useful for sample<br />

analysis and image analysis either from<br />

our instrument or from open sources, and<br />

in general for analyzing cellular imaging.<br />

Our high-end instrument is called Opera,<br />

and there’s a lower-end instrument called<br />

Operetta.<br />

The software (with a graphical user<br />

interface) is called Harmony, but the central<br />

software that’s used for analyzing the<br />

data and moving to a knowledge-based<br />

architecture is called Acapella—because<br />

it is without an instrument! We’ve now<br />

launched Acapella on our Columbus Conductor<br />

Workstation, which allows you to<br />

analyze image files and understand what<br />

is going on to build models…<br />

CambridgeSoft is the leader in electronic<br />

lab notebooks, enterprise-wide<br />

but also for the individual researcher.<br />

They also have ChemDraw and other<br />

applications on that platform… There<br />

are tools linked to instruments but there<br />

are also tools that scientists will use independently<br />

and then these systems like<br />

electronic lab notebooks (ELNs) or workstations<br />

that pull data together and allow<br />

knowledge creation. That goes along with<br />

Signature Genomics and the Genoglyphix<br />

and Oncoglyphix software. Whether<br />

we do the service, use our array chips or<br />

our Luminex system, or clinical sequencing,<br />

we’re still developing an information<br />

and knowledge solution for the customer<br />

using the software.<br />

How will this help researchers<br />

down the road?<br />

We’d like the researcher of the future to be<br />

able to sit at a terminal or on an iPad and<br />

access all these things, and understand<br />

better what (s)he needs to do next, to call<br />

on service support or collaborators as<br />

needed, to reorder reagents and consumables<br />

as needed, to preserve data for FDA<br />

purposes and make sure things stay in<br />

the right place. It’s more than laboratory<br />

information management—it is knowledge<br />

creation and solving problems for<br />

customers.<br />

There are some highly data-intensive<br />

systems now. Mass spectrometry,<br />

imaging and DNA sequencing generate<br />

very large bodies of data. So you do the<br />

experiment quickly and then you spend<br />

your time analyzing. This has happened<br />

in physics, too. There is the ability to<br />

analyze that data more quickly in very<br />

large data sets. We have customers with<br />

petabyte storage devices because they<br />

have so much data. Even our Conductor<br />

workstation is sold with 24 terabyte<br />

storage. So being able to manipulate<br />

large data sets, and share them with your<br />

collaborators in Europe or China, maintaining<br />

security, and to manipulate the<br />

information to garner the desired results,<br />

is very important.


What does the acquisition of Geospiza<br />

say about your interest in next-gen<br />

sequencing?<br />

Our DNA strategy is still a work in<br />

progress. I mentioned Signature Genomics<br />

for array analysis and Luminex<br />

as a platform to develop bead-based assays.<br />

This year we acquired Chemagen<br />

Biopolymer Technologies in Germany,<br />

which is a magnetic bead-based nucleic<br />

acid purification system, and competes<br />

with the likes of Qiagen on methods for<br />

isolating DNA.<br />

Downstream, the Geospiza deal was<br />

terrific for us because they had some of<br />

the best technology for DNA sequence<br />

analysis and managing sequence information.<br />

We’ve set up a DNA sequencing<br />

service. Our goal was not to enter the<br />

field of DNA sequencing technology per<br />

se. We’re providing sequencing services<br />

at a very high level of automation, accuracy,<br />

service and security, as well as the<br />

Geospiza software for analysis. There are<br />

still some gaps from sample acquisition<br />

and processing all the way down through<br />

analysis of the results. We hope to fill<br />

those gaps and have a more complete<br />

DNA strategy, both in the service and<br />

product offering.<br />

Why did you decide to get into the<br />

sequencing service business?<br />

We heard from a lot of customers that<br />

there was a demand, both in pharma and<br />

in academia. One of the leading providers<br />

of these services is BGI in China, but we<br />

heard a lot of interest in finding a domestic<br />

supplier of high quality services, and<br />

we thought there was an opportunity for<br />

us to enter this market.<br />

We typically use Illumina instruments,<br />

but we are methodology agnostic and<br />

will go where the field needs in terms of<br />

technology. We want to provide—from<br />

sample to knowledge—that complete<br />

continuum to the customer to try and<br />

solve their problems, and allow them to<br />

back away from having to be experts in<br />

every technology.<br />

Dick Begley, previously the president<br />

of 454 and at Agilent before that, is running<br />

the business. We’re trying to provide<br />

very high quality service, very good data<br />

capabilities, but we’re starting slowly.<br />

Rome wasn’t built in a day and neither is<br />

our DNA sequencing.<br />

Are you planning more<br />

acquisitions?<br />

Well, a) you’re never done and, b) we still<br />

have plenty of money and debt capacity<br />

to do more investment. So no, we’re<br />

not done! We’re still very interested in<br />

acquisitions of significant size that may<br />

be somewhat transformational but we<br />

continue to consider ‘bolt-ons’ for the<br />

existing businesses.<br />

Some of the areas of overlap between<br />

human health and environmental health<br />

include things like pathogen detection,<br />

not only for infectious disease but also<br />

in food and water. So you can see where<br />

technology is sort of agnostic to what you<br />

use it for, and we could use nucleic acid<br />

testing technology as well as immunoassay,<br />

imaging and other things to help us<br />

cross that frontier. •<br />

Final Print Issue of Bio-IT World<br />

Starting in January 2012, Bio-IT World will publish<br />

exclusively as a more frequent, digital magazine.<br />

Print subscribers will automatically receive the<br />

digital edition beginning in<br />

January.<br />

Earlier this year,<br />

we launched a<br />

free iPad/iPhone<br />

app that not only<br />

shows off the printed content but<br />

also features video and other<br />

enhancements that will only grow<br />

in 2012. If you haven’t checked<br />

out our digital magazine, go to:<br />

http://bit.ly/bioitnew<br />


gratifying genetics, employeesstory in genome interpretation this year is the levels. work For example, Our digital to annotate edition carries regions two of bonus relevance, articles a in this report:<br />

Insights genomics thatOutlook sponsors andseveral computational consortia. Their first contacted collective me, I thought sociates they at bmurphy@chacorporate.com.<br />

had the manner,” he says. “I’d love to explore that.”<br />

of analytics. missions Elizabeth Worthey and colleagues at the Medical College<br />

17 Single are I hesitate to Molecule improve to call the Sequencing economics it computaand<br />

and effectiveness wrong the Future person! comprehensive of I’m not a mouse story Finally, genome about geneticist, if you VAAST annotation have Interpretation<br />

(page seen other 48), database a examples promising must ofalgorithm stigmergy be for in the<br />

of tional safety Wisconsin evaluation biology—that (page (especially 19). often Worthey, means in thewhom areas someof<br />

I had discovery but the Contact that’s pleasure toxicology USA: not available the of issue. to detection pharma They allow want correlations R&D, of some- pathogenic we would with Push Contact mutations very multiple Pull much Germany: amidst like types to hear an of ocean genomic about of them. exome<br />

interviewing thing and<br />

Genomatix Software Inc.<br />

Genomatix Software GmbH<br />

10 non-clinical Briefs else—but this the development). summer concept in Copenhagen, of One analytics of DSEC’s played one primary toa pivotal takegoals 3025 Boardwalk, elements them role into<br />

Suite such or Send genomics, whole-genome me<br />

160 as transcripts, a note human pointing data, transcription There which out or has is<br />

Bayerstr. Survey<br />

demonstrating available been<br />

85a start much to sites academic attention your and/ examples users to the<br />

in engages is to the promote much IT, publicized informatics, the introduction, case computational<br />

of Nicholas qualification, Volker, genetics anda sick Wiscon-<br />

Ann implementa-<br />

Arbor, andor MI translational regulatory now<br />

48108 andmedicine. regions Iand will will report (promoters/enhancers). be They part on them of state a commercial package<br />

80335 inof future Munich scientific columns. Drilling affairs in by deeper Singapore, San Fran- but<br />

sin biology, tionboy of new whose simulations—the technologies genome was that whole sequenced improve gamut. the (almost prediction needed Complete the survey for a<br />

Clinical Trials<br />

USA as a last to offill resort), clinical into a cancer the biology, center cisco-based May director’s one the stigmergy can<br />

Omicia pothen<br />

shortly. be assess despite Germany with And you! the what we<br />

function<br />

talk Liuto calls Laurie<br />

of “naivety these<br />

Goodman, among<br />

resulting That’s adverse www.genomatix.com<br />

what events in you’ll a positive before needthe diagnosis to make initiation all and aspects of therapy. humansition The studies Phone Volker and +1-877-436-6628<br />

thus they case the editorial director chance of Phone GigaScience, to +49-89-599-766-0<br />

win an a new Amazon peer-reviewed Kindle.<br />

Fax +1-734-622-0477 elements,<br />

dewant<br />

globalization.<br />

e.g. proteins<br />

I was<br />

encoded<br />

bureaucrats<br />

Fax by +49-89-599-766-55<br />

a specific<br />

in dealing<br />

transcript<br />

with scientists,”<br />

highlights of creasing 18 genomics Trials the the at failure work—not vast the potential rate Point-of-Care<br />

of just drugs of sequencing clinical in clinical genome trials onesequencing, due person to unex- that fit the open-access Ernie entireBush bill.” is journal VP and published scientific the situation director by See BGI, inpage Singapore devoted of <strong>Cambridge</strong> 37 to is big still Health data pretty in<br />

E-Mail sales@genomatix.com<br />

variant or transcription factor patterns E-Mail info@genomatix.com<br />

in a regulatory region.<br />

even the pected genome, though—as safetybut findings. the Worthey transcriptome, readily admits—the epig- gratifying Trading inout the Singapore genomics Associates. heat and He forcan life thesciences<br />

be reached (page at: ebush@chacorporate.com.<br />

46). (continued on page 11)<br />

www.bio-itworld.com NOVEMBER | DECEMBER 2011 BIO•IT WORLD [ 29 ]<br />


Computational Biology [GUEST COMMENTARY]<br />

What Should We Call ‘Systems Biology’?<br />

Systems biology has been around for years. Is it time for a makeover?<br />

BY DESMOND SMITH<br />

Systems biology lies at the very heart of<br />

modern biomedical research and a number<br />

of review articles and conferences<br />

have caught the spirit of this movement.<br />

The discipline encompasses an assortment<br />

of related but distinct concepts,<br />

including the idea that detailed and<br />

comprehensive data from living systems<br />

will enable robust prediction of their<br />

trajectory and also the paradoxical notion<br />

that some biological behaviors occur<br />

in a relatively surprising and unexpected<br />

fashion, so-called emergent properties.<br />

However, the ideas underlying systems<br />

biology are not new. First articulated by a<br />

number of researchers in the 1960s, some<br />

of the principal notions<br />

were also articulated<br />

by Francis Crick in a visionary<br />

Nature review<br />

article in 1970 that anticipated<br />

the state of<br />

molecular biology in<br />

the year 2000. In his<br />

article (Nature 228,<br />

613-615 (Nov 1970)),<br />

Crick stated that “problems<br />

involving complex<br />

interactions can hardly<br />

be avoided… a simple<br />

example would be the<br />

‘total’ behavior of a microorganism such<br />

as Escherichia coli, including all of its<br />

regulatory mechanisms.”<br />

More recently, the systems biology<br />

agenda has been resurrected by Leroy<br />

Hood with his founding of the <strong>Institute</strong><br />

for Systems Biology in Seattle (see “All<br />

Systems Go at ISB,” Bio·IT World, June<br />

2002). This signal event has been followed<br />

by a growing body of work from the<br />

institute as well as several other groups.<br />

The renaissance of systems biology<br />

has depended upon the dramatic advent<br />

of new and interrelated genetic and genomic<br />

technologies that gather biological<br />

data with ever increasing speed. The<br />

midwives of systems biology are thus<br />

the accompanying “omics” technologies,<br />

such as transcriptomics, proteomics and<br />

interactomics. At first, the large amount<br />

of data obtained by these omics technologies<br />

was impressive in its own right.<br />

More recently, however, there has been a<br />

tangible sense that mindless accumulation<br />

of data is insufficient, and that new<br />

insights are required if the discipline is to<br />

thrive. This disquiet has been exemplified<br />

by criticisms likening systems biology to a<br />

Stalinist five-year plan that will ultimately<br />

fall flat on its face.<br />

Personally, I am not so sure and would<br />

like to throw back at the critics a quotation<br />

attributed to the Generalissimo:<br />

“Quantity has a quality all its own.”<br />

[Figure: “Systems biology” references in PubMed by year, 2000 to 2011 (2011 is a projected total). References jumped from 8 articles in 2000 to 2,422 in 2010, but 2011 appears to be leveling off.]<br />

Nevertheless, it does seem that the<br />

term systems biology is (already) becoming<br />

a bit tired despite the importance<br />

of the subject matter. The half-life of<br />

excitement for new disciplines does seem<br />

to get shorter with each passing year.<br />

After all, molecular biology was a good<br />

enough term for a topic that thrived for<br />

a couple of decades at least, while the<br />

term functional genomics seems like a<br />

distant memory, even though only a few<br />

years old.<br />

Bring on the New<br />

So, what would be a fresh and evocative<br />

new term for systems biology? The<br />

difficulty of developing an acceptable<br />

name reflects the broad landscape of this<br />

discipline. One recent suggestion from<br />

Eric Schadt is multiscale biology, referring<br />

to the diverse nature of the data used<br />

to construct models in systems biology.<br />

(Schadt is the new director of an institute<br />

dedicated to genomics and multiscale biology<br />

at Mt Sinai School of Medicine; see,<br />

“Partnering on Multiscale Biology,” Bio•IT<br />

World, July 2011.)<br />

However, a few other alternatives do<br />

present themselves.<br />

• Biosignal processing is an attractive<br />

possibility. This name forges an analogy<br />

between the concepts underlying<br />

systems biology and electronic signal<br />

processing, except that the signals are<br />

biological (transcripts, proteins, nerve<br />

impulses etc.) instead of electrical.<br />

• Biowireomics would emphasize the<br />

notion of integrating biological wiring<br />

diagrams from many diverse domains.<br />

• In the same vein, bitomics emphasizes<br />

the now commonplace observation that<br />

biology is an information science.<br />

• Another possibility is Tukomics, in<br />

honor of John Tukey, coiner of the term<br />

“bit,” while Shannomics would honor<br />

the father of information theory, Claude<br />

Shannon.<br />

• Other instances that share the same sentiment<br />

are dataomics (because systems<br />

biology is concerned with integrating<br />

all relevant data types), totalomics<br />

(same sentiment) and kitchensinkomics<br />

(same sentiment, tongue-in-cheek).<br />

And of course, understandomics,<br />

because surely that is the final goal of<br />

systems biology.<br />

But other fields are also overdue for<br />

a name modification. One instance that<br />

immediately springs to mind is nanotechnology.<br />

However, those scientists with<br />

young children know that finding affordable,<br />

scalable childcare represents a much<br />

more fundamental challenge for modern<br />

biomedical research.<br />

“Nannytechnology” anyone?<br />

Desmond J. Smith is a professor in the Dept. of<br />

Molecular and Medical Pharmacology at<br />

UCLA. Email: DSmith@mednet.ucla.edu


BIO-IT WORLD ADVERTISING SPONSOR<br />

Eliminating Cost As a Factor in Applying NGS Analytics to Medicine<br />

Appistry’s ® Ayrris / BIO is making NGS analytics clinically actionable and cost-effective<br />

As we near the close of 2011, the<br />

applications for Next Generation<br />

Sequencing have never been as<br />

mainstream as they are now. Over the<br />

past year, we have continued to witness<br />

the evolution of this capability and its<br />

associated methodologies—from a mere<br />

set of disparate research projects to a<br />

clinically actionable set of solutions that<br />

can readily help fight real world diseases.<br />

Appistry, developer of the open, scalable<br />

Ayrris™/BIO technology for advanced<br />

data management in genomic analysis,<br />

exists at the fulcrum of this powerful<br />

evolution—building the applications that<br />

make the promise of NGS a reality across<br />

our healthcare ecosystem. After all, the<br />

core value of NGS will not be realized until<br />

it is utilized not just in extreme disease<br />

management situations like critical care,<br />

but also within the entire lifecycle of a<br />

patient and integrated completely into<br />

our medical systems.<br />

So where do we stand now? Previously,<br />

NGS’s potential permeated only<br />

throughout research, but it is now turning<br />

the corner and rapidly moving into the<br />

clinical space. One example is prenatal<br />

scans, which will begin in earnest across<br />

thousands of newborns in 2015. This<br />

data will provide a genetic roadmap for<br />

all stages of life.<br />

While this application of NGS provides<br />

invaluable patient data, the data<br />

is complex—exponentially increasing a<br />

hospital’s data management and analytic<br />

burden. It also requires significant<br />

standardization, commoditization and<br />

automation of NGS analytics as applied<br />

to medicine.<br />

Unfortunately, with multiple sequences,<br />

multiple analyses and a holistic<br />

perspective of the applicability of genomics<br />

into medicine now on tap, the technologies<br />

currently available in the clinics<br />

simply cannot support the influx of data<br />

required for analysis. Couple this with the<br />

increasing consumerization of medicine<br />

and we are left with existing legacy technologies<br />

that are just too brittle and too<br />

cost-ineffective.<br />

So where does the solution lie? Not<br />

likely in million dollar systems that will<br />

never scale and reach the necessary<br />

levels of adoption. Appistry’s Ayrris/BIO<br />

provides a focused solution that provides<br />

the exact answer to the questions commonly<br />

asked by both the clinician and the<br />

patient, turning the myriad of available<br />

data into actionable intelligence.<br />

Ayrris/BIO will enable the application<br />

of NGS as value-added analysis<br />

built into existing medical processing<br />

and the targeting of drugs relative to<br />

disease subtyping identified via genetic<br />

testing. Ayrris/BIO will also speed the<br />

ability to target drugs via synthesis. Over<br />

time, NGS will scale to meet the needs<br />

and expertise of the hospital, and this<br />

includes custom-built appliances such<br />

as Ayrris/BIO that will run in parallel with<br />

existing EMR systems. Costs for NGS will<br />

decrease to less than $5 per month per<br />

patient, with an all-in cost of less than<br />

$1000 (including analysis) per year.<br />

The vision is clear: NGS applied to<br />

medicine is no longer a research project,<br />

but is an evolving clinically actionable<br />

activity. The market for this is turning into<br />

a $100bn industry, with many multiples<br />

of growth per year. Appistry’s Ayrris/BIO<br />

technology provides full-spectrum<br />

analyses of genetic data, from basic<br />

Exome to Whole Genome to RNA analysis<br />

at class-leading quality and best-in-the-world<br />

prices, starting at $99 per patient<br />

analysis. These medically actionable<br />

analyses are available as a service or<br />

via an appliance today.<br />

Appistry: Discovery at the pace of human thought<br />

For more information:<br />

www.Appistry.com<br />

314.336.5080<br />

info@appistry.com<br />


Feature<br />

THE UTILITY OF CLOUD COMPUTING<br />

By Kevin Davies<br />

• A next-generation sequencing company<br />

introduces a new Cloud portal for its new<br />

benchtop sequencer<br />

• A cloud computing firm hosts a key<br />

bioscience database in the Google cloud<br />

• An Ivy League school launches an on-demand<br />

research computing service running an<br />

open-source cloud platform<br />

• A software provider offers $10,000 in free<br />

compute time to the most imaginative<br />

biomedical question<br />

• A big pharma company spins up a 30,000-core<br />

supercomputer in the cloud<br />

IN MANY WAYS, CLOUD COMPUTING has become<br />

an ever-present commodity since Bio•IT World published<br />

a special issue on the subject (Nov 2009). In<br />

September 2011, we held our first standalone conference<br />

on the topic. Experts, users and vendors<br />

alike, gathered for two days of sharing insights and<br />

progress. The takeaway was that more and more users were comfortable<br />

with the flexibility, cost, and even security afforded by the cloud. And while<br />

Amazon’s omniscient cloud capabilities were a recurring theme, what was<br />

even more impressive was the growing ecosystem of commercial and open-source<br />

initiatives offering a host of cloud-based services and applications.<br />


Amylin Pharmaceuticals’ Steve Philpott<br />

was one of the first biotech CIOs to<br />

enthusiastically embrace “the big switch”<br />

to the cloud. “Our IT cost infrastructure<br />

is 50% less than when we started [in<br />

2008]. My CFO really likes us!” Philpott<br />

said. “The box has disappeared… Do you<br />

care where your last Google search came<br />

from? No.”<br />

But the cloud still raises doubts and<br />

concerns. A major issue, discussed below,<br />

surrounds security and regulation. Another<br />

is whether the cloud can truly<br />

handle the masses of next-gen data that<br />

are being generated. “Can the cloud satisfy<br />

the requirement or can the requirement<br />

satisfy reality?” asks Eagle Genomics<br />

CEO Richard Holland. “Sometimes it’s<br />

not the cloud at all but the Internet—the<br />

whole concept of the network and trying<br />

to transfer that amount of data. Do you<br />

really need to be transferring that much<br />

data in the first place? I think people are<br />

generating so much data now, and they’re<br />

expecting to do with it what they always<br />

could, and that just might not be possible.<br />

You might have to rethink the whole paradigm<br />

of how this works.”<br />

From A to Z<br />

When it comes to the cloud, it boils down<br />

to infrastructure-as-a-service, and that<br />

means Amazon, “pretty much by default,”<br />

says Holland (see, “Eagle Eye on<br />

the Cloud”). That said, there are many<br />

other players, including Rackspace, Penguin,<br />

GoGrid, Nimbus, and Eucalyptus<br />

(open-source).<br />

Johnson & Johnson’s John Bowles said<br />

that the Amazon environment was “an<br />

eye-opener in terms of infrastructure-as-a-service…<br />

Seldom mentioned in big<br />

pharma is the ‘opportunity cost.’ If it<br />

takes six months to get a machine through<br />

CapX, there’s no cost for that time.” J&J’s<br />

tranSMART knowledge base (see, “Running<br />

tranSMART for the Drug Development<br />

Marathon,” Bio•IT World, Jan 2010)<br />

went live two years ago. “Without the<br />

cloud environment, we’d still be arguing!”<br />

said Bowles.<br />

[Figure: Amazon’s EC2 architecture remains a widely popular Infrastructure-as-a-Service. Components shown: Region, Availability Zone, Amazon Machine Image (AMI), EC2 Instance, Security Group(s), Elastic IP Address, Load Balancing, Auto Scaling, CloudWatch, Elastic Block Storage, EBS Snapshot, Ephemeral Storage, Amazon S3.]<br />

Eagle Eye on the Cloud<br />

Richard Holland, CEO of the fast-growing<br />

UK consultancy Eagle Genomics,<br />

can attest to the growing use of cloud<br />

computing. From a strength working<br />

with pharma and biotech in pipelines<br />

and data analysis, Holland says business<br />

is growing and diversifying into areas<br />

such as plant science and animal health.<br />

“We’re open-source, very transparent.<br />

We use the best tool for the job.<br />

Many customers want pilots or proof-of-concepts,<br />

eager to give the cloud a go,<br />

let us demonstrate how it can work for<br />

them. When those succeed, which they<br />

usually do, we make them into production<br />

systems. Other customers are more<br />

forward thinking: they’ve already made<br />

a commitment and have more detailed<br />

projects.”<br />

Eagle has collaborated with Taverna,<br />

the open-source workflow tool (see,<br />

“Democratizing Informatics for the Long-Tail<br />

Scientist,” Bio•IT World, March<br />

2011), in a partnership with the University<br />

of Manchester, handling genetic data<br />

for diagnostics. Another big customer<br />

specializes in microbial genome analysis,<br />

and is seeking to graduate from a<br />

“creaking cluster” to the cloud. Eagle is<br />

also working with the Pistoia Alliance,<br />

where Eagle will help run a competition<br />

to compress next-gen sequencing data.<br />

Holland says the major stumbling<br />

block to wider adoption of cloud computing<br />

is not so much getting data up<br />

and down (“It’s not a solved problem, but<br />

there are some decent solutions.”) but<br />

“implementing additional security layers<br />

on top of Amazon to convince people<br />

it is secure. Amazon’s infrastructure is<br />

perfectly fine, but they don’t make it any<br />

easier to implement additional layers<br />

above the operating system [than anyone<br />

else].”<br />

Eagle hasn’t had the bandwidth to<br />

tackle any clinical data projects thus<br />

far. “It’s straightforward to set up the IT<br />

system,” says Holland, “but it’s the paper<br />

trail that goes with it. We’re too small to<br />

handle that at this stage.” That would appear<br />

to be the province of the global<br />

consulting firms, “but they don’t have the<br />

cloud expertise.” Hopefully the two will<br />

come together at some point.<br />

Holland said he was struck by how<br />

popular open-source software is in<br />

the cloud. Many commercial licensing<br />

models haven’t adapted, he said. “Open-source<br />

is the only thing that can cope at<br />

the moment.” K.D.<br />

According to Amazon Web Services<br />

(AWS) senior evangelist Jeff Barr, AWS<br />

is like electricity: a utility that you pay<br />

for as you use it. “On demand, run by experts,”<br />

he said. With its roots tracing back to the<br />

1960s (commodity computing, mass-produced<br />

computers), Barr said: “We’re past<br />

the innovators and early adopters; we’re<br />

at the early majority point.”<br />

The advantages of the cloud are well<br />

known by now: no capital expenditure,<br />

pay-as-you-go, elastic capacity, and (in<br />

principle) improved time to market. “You<br />

can iterate and cycle more quickly. People<br />

love this elasticity,” said Barr. Trying to<br />

predict demand using a terrestrial data<br />

center is notoriously tricky, and inevitably<br />

leads to either an “opportunity cost”<br />

(compute power lying idle) or an inability<br />

to serve customers (demand exceeds<br />

supply). “The cloud can scale up or down,<br />

so the infrastructure matches the actual<br />

demand,” he said.<br />
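The trade-off Barr describes can be put in numbers with a small sketch. The demand figures and the fixed capacity of 70 units below are hypothetical, chosen only to show how a fixed-size data center produces both idle capacity and unmet demand, while capacity that tracks demand produces neither.

```python
def idle_and_unmet(demand, capacity):
    """Total idle capacity and unmet demand over a series of periods."""
    idle = 0
    unmet = 0
    for d, c in zip(demand, capacity):
        idle += max(c - d, 0)    # capacity paid for but not used
        unmet += max(d - c, 0)   # demand that could not be served
    return idle, unmet


# Hypothetical demand per period, in arbitrary compute units.
demand = [20, 35, 80, 120, 60, 25]

fixed = [70] * len(demand)   # data center sized once, up front
elastic = list(demand)       # capacity that follows demand exactly

print("fixed:  ", idle_and_unmet(demand, fixed))
print("elastic:", idle_and_unmet(demand, elastic))
```

Real elastic capacity lags demand and is billed in whole instance-hours, so the elastic case here is an idealized upper bound rather than a prediction.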

The Amazon cloud is spread across<br />

six geographic regions: the US (East and<br />

West Coast), Singapore, Tokyo, Europe,<br />

and one reserved for the U.S. federal<br />

government. Users have full control over<br />

where their data are stored and processed.<br />

“If you have regulatory issues and<br />

your data must remain in Europe, that’s<br />

(continued on page 36)<br />

Genome Analysis<br />

in the Cloud<br />

According to Harvard Medical School’s<br />

Dennis Wall, whole-genome analysis<br />

in the cloud is poised to have the same<br />

impact as the development of the autoanalyzer<br />

in the 1950s, a now ubiquitous<br />

device for blood analysis.<br />

Earlier this year, Wall, Peter Tonellato,<br />

and colleagues built a model to estimate<br />

the cloud runtime based on the size and<br />

complexity of human genomes being<br />

compared, to pre-determine the optimal<br />

order of the jobs being submitted. For<br />

example, an experiment requiring nearly<br />

250,000 genome-to-genome comparisons<br />

on Amazon EC2 required some 200<br />

hours, costing $8,000—about 40% less<br />

than expected. This highly adaptable<br />

model is “potentially of significant benefit<br />

to labs seeking to take advantage of the<br />

cloud as an alternative to local computing<br />

infrastructure,” Wall blogged earlier this year.<br />

Last summer, the same group published a report in PLoS Computational Biology outlining the steps required to perform a whole-genome analysis in the cloud. A full-scale alignment of a human genome took about 10 hours on 36 instances for a modest cost of $320. K.D.<br />

[Figure: whole-genome analysis workflow on Amazon Web Services, in three stages. Instance shown: 7 GB memory, 1.7 TB disk space, $0.68/CPU hr. Users connect to instances from a local computer using secure shell (ssh) and transfer data using secure copy (scp).<br />

A. Prototyping: truncated test set of NGS reads, 2 files with 10,000 reads per file [3 GB ref. genome + 2.2 MB read files]; align reads and determine SNP calls using MAQ [5 hours]; final alignment output file [1 MB].<br />

B. Developing Scalable Application: test set of NGS reads, 32 files with 1 million reads per file [3.34 GB read files]; align reads and determine SNP calls using MAQ [12 hours]; final alignment output file [1.3 GB].<br />

C. Scaled Application: whole genome set of NGS reads, 606 files with ~7 million reads per file [370 GB read files]; align reads and determine SNP calls using MAQ [10 hours * 38 instances]; final alignment output file [142 GB].<br />

Other labels: cluster management software; development and testing. Costs shown: $3.85, $49.60, $320.10.]<br />

The cloud is being deployed for clinical whole genome analysis at Harvard Medical School. (Fusaro VA et al. PLoS Comput Biol 7(8): e1002147)<br />
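The figures quoted in the "Genome Analysis in the Cloud" sidebar can be turned into unit costs with a few lines of arithmetic. This is a back-of-the-envelope sketch, not the group's runtime model. It assumes that "about 40% less than expected" means the actual cost was 60% of the estimate, and it uses the 36-instance, 10-hour, $320 alignment as quoted in the text.

```python
# Genome-to-genome comparison experiment, as quoted in the sidebar.
comparisons = 250_000
total_cost = 8_000.0          # dollars
savings_fraction = 0.40       # assumption: actual = expected * (1 - 0.40)

cost_per_comparison = total_cost / comparisons
expected_cost = total_cost / (1 - savings_fraction)

# Full-scale alignment of one human genome, as quoted in the sidebar.
instances = 36
hours = 10
alignment_cost = 320.0        # dollars

instance_hours = instances * hours
cost_per_instance_hour = alignment_cost / instance_hours

print(f"cost per comparison:     ${cost_per_comparison:.3f}")
print(f"implied expected cost:   ${expected_cost:,.0f}")
print(f"instance-hours used:     {instance_hours}")
print(f"cost per instance-hour:  ${cost_per_instance_hour:.2f}")
```

The per-instance-hour figure is an average over the whole run and includes whatever storage and transfer charges were part of the $320, so it need not match any published instance price.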


The 2012 Bio•IT World Best Practices<br />

competition has released its call for entries.<br />

Since 2003, Bio•IT World’s Best Practices<br />

competition has been recognizing outstanding<br />

examples of technology and strategic innovation<br />

initiatives across the drug discovery enterprise.<br />

The awards attract an elite group of<br />

life science professionals: executives,<br />

entrepreneurs, innovators, researchers<br />

and clinicians responsible for developing<br />

and implementing innovative solutions for<br />

streamlining the drug development and clinical<br />

trial process. Bio•IT World’s distinguished peer-review<br />

panel of judges has reviewed more than<br />

400 entries in the program’s history.<br />

Entries will be accepted in six categories:<br />

Clinical & Health-IT; IT Infrastructure/HPC;<br />

Informatics; Knowledge Management;<br />

Research & Drug Discovery; and Personalized<br />

& Translational Medicine. The 2012 winners will<br />

receive a unique crystal award to be presented<br />

at the Bio-IT World Conference & Expo in<br />

Boston, April 24-26, 2012. Winners and entrants<br />

will also be featured in Bio•IT World.<br />

For more information on the program and to download the entry form,<br />

please visit www.bio-itworld.com/bestpractices.<br />

Deadline for Entry: January 13, 2012 | Early Bird Deadline: December 16, 2011



(continued from page 34)<br />

fine,” he said. In addition to paying on demand,<br />

users can buy spot price instances,<br />

the price changing minute to minute.<br />

“This enables you to bid on unused Amazon<br />

EC2 capacity,” said Barr. “You can use<br />

this to obtain capacity very economically.”<br />
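A simplified sketch of the bidding idea Barr describes: an instance runs only in the hours when the spot price is at or below the user's bid, and the user is charged the spot price for those hours. The hourly prices and the bid below are hypothetical, and the rule is a deliberately reduced model that ignores start-up time and interruption handling.

```python
def simulate_spot(prices, bid):
    """Return (hours_run, total_cost) under a simple bid rule."""
    hours_run = 0
    total_cost = 0.0
    for price in prices:
        if price <= bid:          # bid covers the current spot price
            hours_run += 1
            total_cost += price   # charged the spot price, not the bid
    return hours_run, total_cost


# Hypothetical spot prices in dollars per instance-hour.
hourly_prices = [0.25, 0.30, 0.75, 0.40, 0.20]

hours, cost = simulate_spot(hourly_prices, bid=0.40)
print(f"{hours} hours for ${cost:.2f}")
```

Under this model the job simply pauses during the hour priced at $0.75, so spot capacity suits work that can tolerate interruption.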

A recently published example comes<br />

from Peter Tonellato’s group at Harvard<br />

Medical School, building a pipeline for<br />

clinical genome analysis in the cloud (see,<br />

“Genome Analysis in the Cloud”).<br />

Barr was excited about Amazon’s new<br />

relational database service, which allows<br />

users to launch a new database in a matter<br />

of seconds. Compare that to a MySQL or<br />

Oracle database, Barr said, which might<br />

take a year to get up and running. This<br />

could allow users to offload common<br />

administration tasks—OS and database<br />

upgrades, backups etc.—to AWS. Other<br />

new initiatives include AWS CloudFormation<br />

and AWS Elastic Beanstalk (a simple way<br />

to deploy/manage applications). “We’ve<br />

moved from individual resources to entire<br />

apps, entire stacks, being programmable<br />

and scriptable,” said Barr.<br />

A recent announcement by DNAnexus<br />

(see p. 41) revealed that a mirror of<br />

the NIH Short Read Archive of DNA<br />

sequence data will be hosted on the<br />

Google cloud. BioTeam principal Chris<br />

Dagdigian says the platform differs from<br />

Amazon’s. “Google has a more integrated<br />

platform that you run on top of, while<br />

AWS offers infrastructure elements that<br />

can be assembled and combined in many<br />

different ways,” said Dagdigian. “If the<br />

Google/DNAnexus collaboration delivers<br />

an easy-to-use compute platform with<br />

integrated ‘big data’ support, it could be<br />

quite interesting.”<br />

Dagdigian remains a fervent backer<br />

of Amazon’s cloud infrastructure. “VMware—that’s<br />

not a cloud,” he told the<br />

crowd in La Jolla. “If you don’t have an<br />

API, or self-service, or [you have to send] email to humans,<br />

it’s not a cloud. If you have a 50% failure<br />

rate, it’s a stupid cloud.”<br />

Dagdigian sees “a whole new world”<br />

when it comes to moving high-performance<br />

computing (HPC) into the cloud.<br />

Instead of building a generic system<br />

accessible by a few groups, one can<br />

now stand up dedicated, individually<br />

optimized resources for each HPC use case.<br />

When it comes to hybrid clouds and “cloud bursting,” Dagdigian recommended a buy-don’t-build strategy. Data movement is a pain. “I’m a fan of open source, but if you’re doing it, buy rather than build,” he said. Companies such as Cycle Computing (see, “Cycle Time”) and Univa UD have happy customers, he said.<br />

“You can’t rewrite everything,” said Dagdigian. “Life sciences informatics has hundreds of codes that will never be rewritten. They’ll never change and will be needed for years to come.” The future of Big Data, he said, lies with tools such as Hadoop and MapReduce. “Small groups will write such apps, publish, open-source, and we’ll all plagiarize from them,” he said.<br />

Cycle Time<br />

Earlier this year, Cycle Computing spun up 10,000 cores for Genentech for $1,060/hour. The company trumped that with a 30,000-core cluster spanning three regions for $1,280/hr for a leading pharma customer. “Why use 30,000 cores? It’s no longer about utilization,” said CEO Jason Stowe. Take data reanalysis: the cloud makes it possible to test new algorithms on historical data in a way never possible before.<br />

Cycle recently announced that it will partner with Pacific Biosciences to optimize the cloud-based version of PacBio’s SMRT Analysis software, supporting a workflow that includes sample preparation, sequencing and completed data analysis in less than one day. A beta version of the solution is expected by the end of 2011.<br />

In terms of managing data, Stowe spoke enthusiastically about Opscode’s Chef software for automating the cloud. “BioTeam discovered Chef a couple of years ago, and we’re equally big fans. It’s very developer friendly,” said Stowe. “It would be much more difficult to run at our scales without the thought that’s been put into Chef from the start,” added Cycle’s Andrew Kaczorek. In keeping with the culinary theme, Cycle uses Grill as a monitoring solution for any Chef infrastructure in the cloud.<br />

Stowe also announced plans to offer $10,000 in free computing time in a competition “to help researchers answer questions that will help humanity.” All the attention on how many cores could potentially be spun up in the cloud left Stowe a little puzzled. “I worried that in all this glitter, we would miss what is truly gold,” he said. “Researchers are in the long-term habit of sizing their questions to the compute cluster they have, rather than the other way around. This isn’t the way we should work. We should provision compute at the scale the questions need.”<br />

Stowe wants “to wreck the status quo of HPC clusters and computational science. We will enable those crazy questions from the misfit geniuses, the ones so big that you would never even ask them… This is about to get truly exciting… because someone is going to take these clusters and cure cancer, or Alzheimer’s, or my personal affliction, type 1 diabetes. And hopefully cure them faster because they have better tools.” K.D.<br />
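The two cluster prices quoted in the "Cycle Time" sidebar are easier to compare per core. The short calculation below uses only the sidebar's figures; it says nothing about instance types or how much of each price was spot capacity, which the article does not state.

```python
def cost_per_core_hour(cluster_cost_per_hour, cores):
    """Average hourly cost of one core in a cluster."""
    return cluster_cost_per_hour / cores


genentech = cost_per_core_hour(1_060, 10_000)   # 10,000-core run
pharma = cost_per_core_hour(1_280, 30_000)      # 30,000-core run

print(f"10,000 cores: ${genentech:.4f} per core-hour")
print(f"30,000 cores: ${pharma:.4f} per core-hour")
print(f"ratio: {genentech / pharma:.2f}x")
```

On these numbers the larger cluster cost well under half as much per core-hour as the smaller one.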

While many users want high availability<br />

and resiliency, Dagdigian said that<br />

“HPC nerds” want speed. “I’d pay Amazon<br />

extra if they’d guarantee servers in the<br />

same rack,” he said. “HPC is an edge case<br />

for obscene-scale IaaS clouds. We need to<br />

engineer around this. We have to know<br />

where the bottlenecks are.”


Assay Depot’s Cloud Services<br />

“Our goal is to empower the researcher.<br />

There’s no way for the average<br />

scientist to explore the world of services<br />

available to them.” So says Chris<br />

Petersen, co-founder and CIO of Assay<br />

Depot, which currently boasts more<br />

than 600 vendors and several thousand<br />

services on its website. The goal<br />

is to offer researchers a comprehensive<br />

list of services for pharma, such as<br />

animal models, antibodies, compound<br />

synthesis, and allow them to communicate<br />

and rate vendors.<br />

Assay Depot also builds white-label<br />

or private versions of the marketplace<br />

for individual pharma clients. Pfizer<br />

and two other big pharmas have already<br />

signed up, with other pilot projects<br />

underway. “They can customize it<br />

and integrate with finance or sample<br />

management systems,” says Petersen.<br />

“And they can create private reviews,<br />

to learn from your colleagues’ past<br />

experiences.”<br />

“It’s all about capturing institutional<br />

knowledge. Some researchers might<br />

know a lot about a particular vendor in<br />

China that other colleagues might not<br />

know. In massive companies, many<br />

groups don’t know what other groups are doing.” Vendors join Assay Depot for free, and can respond to requests for information. Assay Depot charges a small annual fee if they want to list their services, and pharmas pay a hosting fee if they want a private version.<br />

Assay Depot’s pharma prospects were initially very suspicious about the cloud, says Petersen. “There was some pushback! We got them to use it by telling them we’d give them their own server in the cloud behind the firewall. We host [private] marketplaces on Amazon EC2, each hosted on its own set of servers in a security group. Often in the past, when we used to say EC2 they’d get worried; now they say, check, we’ve already audited them.”<br />

Petersen enthusiastically endorsed Opscode’s Chef. “It is how we do the automated provisioning of our server. I can’t stress enough how important it’s been to automate the infrastructure… The cloud has blurred the line between operations and software development, which really helps a startup. Since Amazon has taken that hardware and turned it into software, that line has really blurred.” K.D.<br />

Dagdigian couldn’t stop raving about MIT’s StarCluster (“It’s magical,” he said), Opscode Chef, and GlusterFS (now part of Red Hat), particularly for scale-out NAS on the cloud. And he said CODONiS was a promising start-up for storage and security. As for future applications of cloud computing, Dagdigian insisted that Amazon itself was “no bottleneck: it’s always the server or the Internet.” “Direct-to-S3” file transfer solutions from Aspera (see, “Aspera’s fasp Track for High-Speed Data Delivery,” Bio•IT World, Nov 2010) also looked very promising.<br />

Leading the Charge<br />

With the need to maximize IT efficiency, Amylin’s Steve Philpott led the charge to rethink IT under considerable financial pressure (see, “Amylin, Amazon, and the Cloud,” Bio•IT World, Nov 2009).<br />

“We have access to tremendous capabilities without having to build capabilities: 10,000 cores, new apps cheap,” said Amylin’s Todd Stewart. “Some tools (manufacturing, ERP, etc.) will always be in the data center. But let’s find processes that will work in the cloud.”<br />

With both campus data centers full, Amylin turned to pilot projects, including more than a dozen software-as-a-service processes. Stewart noted that there was only one validated app in the cloud. “In general, that’s still something we’re struggling with,” he said. CRM and call center apps have moved over. And Amylin has used Nirvanix cloud storage for two years, “a cubby hole on the Internet,” said Stewart. “We hope to go tapeless on backups shortly.”<br />

“Chef is something we’ve had a look<br />

at,” says Eagle Genomics’ Holland. “As you<br />

scale up, it becomes less and less practical<br />

to use anything else, to be<br />

honest. You could write it<br />

yourself with a bunch of<br />

Python scripts, but someone<br />

else has done it, so why<br />

bother?!”<br />

“We launched a 10,000-node<br />

cluster with one click<br />

[using Chef],” said Cycle<br />

Computing’s Andrew<br />

Kaczorek.<br />

Chris Brown, Opscode’s<br />

chief technical officer (and co-developer<br />

of Amazon EC2) said: “We’re software<br />

architects and system administrators.<br />

We’ve run Amazon.com and Xbox Live.<br />


[We’re good at] automating infrastructure<br />

at scale.” The cloud, Brown said, is<br />

not necessarily cheaper than standard<br />

hosting. “Do you have the money, time,<br />

experience? What are you willing to pay<br />

for?” he asked. “We take the experience<br />

and plug it in for you. You want to manage<br />

1,000 machines instantly. Google,<br />

Amazon—they have 100 people. Where<br />

can you find a team?”<br />

Chef is many things: a library for<br />

configuration management, a systems<br />

integration platform, and an API for the<br />

entire infrastructure. “Our mantra is to<br />

enable you to construct or reconstruct<br />

your business from nothing but a source<br />

code repository, an application data<br />

backup, and machines.”<br />

“Big pharma is concerned about auditing:<br />

now we can show a snapshot of<br />

cookbooks running on a node. We can<br />

reproduce and launch a cluster that mirrors<br />

what it was at that date and time. It is<br />

very powerful,” said Kaczorek.<br />

More Clouds<br />

Cloud applications and offerings are now<br />

too numerous to list. Appistry’s<br />

new Ayrris Bio offering is poised to make<br />

a big impact on life sciences organizations<br />

(see, “Appistry’s Fabric Computing”).<br />

Assay Depot hosts fully audited, private<br />

marketplaces for pharma clients (see,<br />

“Assay Depot’s Cloud Services”). Complete<br />

Genomics selected the Bionimbus open-source<br />

community cloud as a mirror for<br />

a major genome dataset. The University<br />

of Maryland’s CloVR is a portable virtual<br />

machine launched on a desktop that can<br />

manage additional resources on the cloud<br />

(EC2, academic clouds) for large-scale<br />

sequence analysis.<br />

The San Diego Supercomputer Center<br />

is rolling out a cloud, “a private data<br />

storage cloud to enable the presentation<br />

and sharing of scientific data, with rental<br />

and owner pricing,” said SDSC’s Richard<br />

Moore. It will have an elastic design with<br />

an initial capacity of 2 petabytes, although<br />

the emphasis will be more on access and<br />

sharing. ChemAxon’s David Deng presented a collaboration with GlaxoSmithKline (GSK) on cloud computing, in which 13,500 potential anti-malarial drugs have been made freely available, hosted on EC2. Users access the data using ChemAxon’s Instant JChem database management tool, requiring no local software installation. Deng admitted that security was not a huge issue in this particular case, but added his colleagues are eager to set up other collaborations.<br />

Former Microsoft executive and BioIT Alliance founder Don Rule wrapped up the La Jolla conference. “Cloud computing is a very powerful enabler despite the hype,” said Rule. “It’s an important enabler for personalized medicine.” Ironically, Rule is experimenting with EC2 to run an encrypted HIPAA-compliant database. •<br />

Appistry’s Fabric Computing<br />

Appistry (the name derives from the phrase “Application Artistry”) offers a variety of cloud computing offerings in life sciences and bioinformatics. Michael Groner, co-founder and chief architect, said the company’s mission came from the belief that engineers “spend too much time building in scale management adaptability into our software, more than the value add we want to build in. That seemed wrong.”<br />

Groner says his co-founder, Bob Lozano, recognized the value of linking cheap computers via software back in 2000. “He had insight that predicted cloud computing seven years before it took off!” says Groner. He called it “fabric computing,” the idea being that Appistry hides hardware beneath the fabric so that it looks like a single computational entity. “We recognize that our favorite phrase lost,” says Groner, not sounding too worried.<br />

That sounds a bit like Al Gore inventing the Internet, but Appistry offers a software tier that ties machines together. “It makes multiple machines act as one, and hides the failures/details from you, so you can have your machines adapt, scale, and modify to your environment, and you can focus on the value of your application,” says Groner.<br />

Appistry is harnessing its experience in clouds, high-performance computing and analytics (FedEx is one of its biggest customers) with its new life sciences offering called Ayrris Bio. Appistry offers its own public service based on its cloud on a per usage basis, including exome, RNA-seq and whole-genome analysis.<br />

“We have everything set up and ready to go right now,” says Groner. “Give us your files; we’re ready.” As for data delivery, Appistry is getting hooked up on Internet2, but hard drive delivery works well. Groner says the company is offering “head-snapping prices” as low as $250/run and alignments for less than $1,000.<br />

Groner ticks off several advantages of Appistry’s architecture. “We don’t rely on virtualization; our software is as close to the bare metal as possible (unlike Amazon and other public clouds).” But the one he stresses is the usage of computational storage, or as Groner puts it, “where we move work to the data.” Minimizing data movement is essential to solving Big Data analytics in less time, he says. “Our execution environment applies our HPC patterns on to existing, unmodified tools to execute subsets of the analytics on the ‘best’ machine possible,” which is often the machine holding the data to be processed.<br />

In addition to running data in its own environment, Appistry also resells the appliance (up to a single rack) and cloud architecture, so users with very high data volumes or security issues can set up the same system in their own environment. “We realized that for some solutions, instead of just taking a tool perspective, we could sell the complete solution,” says Groner. “We hired some bioinformaticians and put together this complete solution.” K.D.
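Groner's phrase "we move work to the data," from the Appistry sidebar, describes a general scheduling idea that a few lines can illustrate. The sketch below is a generic data-locality example, not Appistry's implementation. The task names, node names, and dataset sizes are all hypothetical.

```python
def place_tasks(tasks, data_location, default_node):
    """Assign each task to the node that already holds its input data."""
    return {
        task: data_location.get(dataset, default_node)
        for task, dataset in tasks.items()
    }


def gigabytes_moved(tasks, placement, data_location, dataset_size_gb):
    """Sum the data that must be copied because a task runs elsewhere."""
    moved = 0
    for task, dataset in tasks.items():
        if data_location.get(dataset) != placement[task]:
            moved += dataset_size_gb[dataset]
    return moved


# Hypothetical workload: task -> input dataset.
tasks = {"align_a": "reads_a", "align_b": "reads_b", "call_snps": "reads_a"}
data_location = {"reads_a": "node1", "reads_b": "node2"}
dataset_size_gb = {"reads_a": 120, "reads_b": 250}

local = place_tasks(tasks, data_location, default_node="node3")
central = {task: "node3" for task in tasks}   # everything on one machine

print("data-local placement moves",
      gigabytes_moved(tasks, local, data_location, dataset_size_gb), "GB")
print("central placement moves",
      gigabytes_moved(tasks, central, data_location, dataset_size_gb), "GB")
```

A production scheduler would also weigh load and node health, which is presumably why the quote says the data-holding machine is "often," not always, the best choice.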


Next-Gen Data<br />

New Rules for Archon<br />

Genomics X Prize<br />

Competition kicks off in<br />

2013 with Medco as<br />

presenting sponsor.<br />

BY KEVIN DAVIES<br />

The Archon Genomics X Prize,<br />

which offers a $10 million purse<br />

for a significant breakthrough in<br />

the whole genome sequencing of<br />

100 genomes, has announced adjustments<br />

to its rules, criteria, and timing, as<br />

well as officially welcoming a major new<br />

sponsor—Medco Health Solutions.<br />

The new competition will kick off on<br />

January 3, 2013, and end 30 days later. If<br />

there is no grand prize winner, the purse<br />

will be split between category winners for<br />

accuracy, completeness and haplotype<br />

phasing. “Although most races can only<br />

have one winner, we believe that after this<br />

race, the competition will benefit an entire<br />

industry,” wrote X PRIZE Foundation<br />

executives Larry Kedes and Grant Campany<br />

in a Nature Genetics commentary<br />

detailing the new rules.<br />

Launched in 2006 to great fanfare,<br />

the original competition rules stated that<br />

entrants had to sequence 100 human<br />

genomes in ten days or less at $10,000/<br />

genome to very high accuracy. The prize<br />

was initially funded by a Canadian diamond<br />

hunter named Stewart Blusson and<br />

his wife Marilyn. Further funding was<br />

supplied by genome sequencing pioneer<br />

J. Craig Venter, who in 2003 pledged<br />

$500,000 from his foundation for a<br />

major breakthrough in next-generation<br />

sequencing.<br />

But only eight organizations registered<br />

to compete, including 454 Life Sciences,<br />

ZS Genetics, and an offshoot of George<br />

Church’s Personal Genome Project. Some<br />

observers suggested that the stringency<br />

of the rules might have deterred groups<br />

from entering.<br />

“I hope this is the beginning of a new<br />

race for the human genome,” said Ryan<br />

Phelan, the president and founder of<br />

DNA Direct by Medco. “Getting a medical<br />

grade genome is essential in translating<br />

the human genome into clinical practice.”<br />

Making the Grade<br />

In the announcement made last month,<br />

the X Prize Foundation announced several<br />

key changes to the structure of the<br />

competition.<br />

First, the competition rules have been relaxed<br />

in some cases and tightened in others.<br />

Teams will be judged on accuracy, cost,<br />

speed and completeness of genome sequencing.<br />

The organizers believe that “the<br />

competition’s audacious target criteria for<br />

accuracy and completeness<br />

of sequencing could<br />

define for the first time a<br />

‘medical grade’ genome.”<br />

Teams are now given<br />

30 days to sequence the<br />

genomes of the 100 subjects,<br />

but the cost has<br />

been cut to $1,000 or less<br />

per genome, a reflection<br />

that mainstream NGS<br />

platforms are already<br />

well under the original<br />

$10,000 X PRIZE<br />

threshold. The genomes<br />

must be sequenced to an overall accuracy<br />

of 99.9999%, or no more than one error<br />

per million base pairs.<br />

Second, the pharmacy benefits manager<br />

Medco Health Solutions has joined<br />

as the competition’s presenting sponsor.<br />

The prize will now be known as the<br />

Archon Genomics X PRIZE presented<br />

by Medco. Medco Research Institute<br />

president Felix Frueh said that the new<br />

prize “is the game changer that’s needed<br />

to reach that final mile and transform<br />

the full promise of genomic research into<br />

practical clinical care.”<br />

Third, the 100 genomes to be sequenced<br />

will be those of volunteering<br />

(and consented) centenarians, rather<br />

than the eclectic group of celebrities,<br />

scientists and other volunteers originally<br />

envisioned. These centenarians, or the<br />

‘Medco 100 Over 100,’ are, in the words<br />

of the X PRIZE press release, “donating<br />

their genome to this competition in order<br />

to further medical science.”<br />
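The accuracy target in the new rules, 99.9999% or no more than one error per million base pairs, can be checked with exact arithmetic. The sketch below also scales it to a whole genome, assuming a round figure of 3 billion base pairs. That length is an illustrative assumption and not a number from the competition rules.

```python
from fractions import Fraction


def error_rate(accuracy_percent):
    """Exact per-base error rate implied by an accuracy percentage."""
    return 1 - Fraction(accuracy_percent) / 100


def max_errors(accuracy_percent, length_bp):
    """Largest number of errors allowed over a sequence of given length."""
    return error_rate(accuracy_percent) * length_bp


errors_per_million = max_errors("99.9999", 1_000_000)

# Assumed genome length of 3 billion base pairs (illustrative round figure).
errors_per_genome = max_errors("99.9999", 3_000_000_000)

print("errors per million base pairs:", errors_per_million)
print("errors per assumed genome:", errors_per_genome)
```

Fractions are used instead of floats so that 99.9999% converts to exactly one in a million with no rounding noise.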

“Our goal is to assemble an ethnically<br />

diverse group of ‘genomic pioneers’ who<br />

are at least 105 years of age,” said Campany,<br />

senior director of the genomics X<br />

prize. “Such individuals who have evaded<br />

all of the common diseases associated<br />

with aging are effectively supercontrols<br />

whose genomes deserve to be scrutinized<br />

in contrast to the genotypes of the many<br />

disease cohorts currently under investigation,”<br />

commented the editors of Nature<br />

Genetics.<br />

The sequencing results will be compiled<br />

into a public database (such as<br />

dbGaP), raising the notion that the<br />

resulting data might help elucidate the<br />

presence of so-called ‘wellness’ genes. It<br />

should be noted, however,<br />

that several other<br />

initiatives to parse the<br />

genomes of centenarians<br />

are already underway.<br />

For example, Complete<br />

Genomics and Scripps<br />

Health announced earlier<br />

this month that they<br />

will collaborate on a<br />

project to sequence the<br />

genomes of 1,000 ‘wellderly’<br />

individuals.<br />

Medco’s Phelan said<br />

the company has been<br />

“driving innovation around personalized<br />

medicine for at least five years,” particularly<br />

in the area of genetic testing and<br />

pharmacogenomics. “Medco was one of<br />

the first, and is still one of the only, Fortune<br />

500 companies, to really embrace<br />

personalized medicine to this degree,”<br />

says Phelan.<br />

With the rapid developments in genome<br />

sequencing and interpretation, Phelan says,<br />

“our sense was that we could help move<br />

the needle in this area. How could we not<br />

help usher in this new era of personalized<br />

medicine?...<br />

“Hopefully we’ll see a big explosion in<br />

the bioinformatics, getting this information<br />

then how do we interpret it? That’s<br />

the biggest white space.” •


DNAnexus to Mirror SRA<br />

Database in Google Cloud<br />

Mirror site for NCBI<br />

sequence repository<br />

facing funding challenges.<br />

BY KEVIN DAVIES<br />

DNAnexus will offer free hosting<br />

and access to the Short Read<br />

Archive (SRA), the funding-challenged<br />

trove of NGS read data<br />

hosted by the National Center for Biotechnology<br />

Information (NCBI).<br />

Earlier this year, NCBI announced<br />

that it would be phasing<br />

out funding for the<br />

SRA. “We realized this<br />

would be a huge loss for<br />

the community, but a<br />

great opportunity for us<br />

to step up and preserve<br />

it,” said DNAnexus cofounder<br />

and CEO, Andreas<br />

Sundquist.<br />

DNAnexus will host<br />

a publicly available<br />

copy of the SRA repository<br />

on Google’s Cloud<br />

Storage infrastructure<br />

(see sra.dnanexus.com).<br />

According to NCBI’s<br />

Jim Ostell, NCBI remains<br />

the primary archive for SRA, but<br />

he welcomed the offer from a reliable<br />

commercial source to provide an alternative<br />

hosting environment. “We agreed<br />

with them that there was certainly a need<br />

for nice packaged tool sets for people<br />

working with high-throughput sequence,”<br />

said Ostell. “If anyone finds it useful,<br />

either to explore and analyze the public<br />

data, or to work on pre-release data of<br />

their own, then that’s good.”<br />

However, Ostell stressed that DNAnexus<br />

is not taking over SRA. “They are not<br />

an archive, they don’t issue accession<br />

numbers, and are not part of any official<br />

NIH data publishing process… It’s been a<br />

strictly technical issue of transferring data,<br />

working with Google, and getting their<br />

platform in place.”<br />

While central NIH funding for the<br />

SRA is ending this month, NCBI will<br />

still accept certain classes of SRA data<br />

that don’t necessarily generate massive<br />

amounts of data, but are important for<br />

the scientific record. Some individual<br />

NIH institutes have agreed to fund NCBI<br />

directly to keep SRA going for their studies,<br />

says Ostell.<br />

“We’re doing this to help NCBI with<br />

their mission,” explained Sundquist. “Our<br />

hope is that the hosted version of the<br />

SRA will provide a complementary way<br />

for researchers to access these data.” He<br />

projects that the SRA repository will grow<br />

tenfold each year. “The SRA has done a<br />

tremendous service to the research community<br />

by capturing these data and we<br />

want to help preserve it.”<br />

Sundquist says DNAnexus has cleaned<br />

up the SRA interface as “it’s been a little<br />

cumbersome to use.” Eventually, researchers<br />

might be able to submit data direct to<br />

DNAnexus to host in the SRA. “There is<br />

no sign up required for anyone who wants<br />

to use SRA.”<br />

“No-one thinks the government<br />

should be providing access in perpetuity,”<br />

says Sundquist. “When the SRA was<br />

originally built, it was a different era,<br />

a different volume of data.” Sundquist<br />

expects the archive to swell to hundreds,<br />

possibly thousands of times its present<br />

size in the years ahead. “[SRA] will be<br />

a tiny bit of data compared to five years<br />

from now. Think what it will be like when<br />

we’re sequencing millions or tens of millions<br />

of genomes!”<br />

Google Backing<br />

DNAnexus also announced it had received<br />

funding from Google Ventures (see<br />

below) in a $15-million round, a relationship<br />

that Sundquist says shows Google’s<br />

commitment to help “democratize DNA<br />

data.” A copy of all the data in the public<br />

SRA will be hosted in Google Cloud Storage.<br />

“The DNAnexus SRA website is an<br />

example of a ‘big data’ initiative that benefits<br />

from rethinking<br />

the interface in a 100%<br />

web-enabled world,”<br />

said Eric Morse, head of business development,<br />

Google Cloud Storage.<br />

[Figure: DNAnexus is offering free access and easy navigation to the SRA database. Callouts: Single Search Box; Ranked List of Search Results.]<br />

The Google tie-in<br />

is interesting, as most<br />

of the DNAnexus infrastructure<br />

is built on<br />

Amazon’s EC2 Cloud.<br />

“Now we’re working<br />

with both Amazon<br />

and Google on providing<br />

access to large genomic<br />

datasets,” notes<br />

Sundquist.<br />

Sundquist also announced<br />

a significant cut in pricing for<br />

academic customers. “In some ways, the<br />

academic community is the key to driving<br />

this space forward. Because of the great<br />

response, we’ve slashed our prices substantially<br />

by half for academia, effective<br />

immediately.”<br />

Sundquist says DNAnexus is “absolutely<br />

focused” on genome interpretation, recognizing<br />

a huge opportunity for growth.<br />

“For us, it’s not just about the ‘$1 million<br />

interpretation’ for one genome, you have<br />

to think about this interpretation and<br />

scale it up to thousands of genomes. That’s<br />

a whole different domain, a huge space<br />

that no-one has built anything around.” •<br />

www.bio-itworld.com NOVEMBER | DECEMBER 2011 BIO•IT WORLD [ 41 ]<br />

CONTENTS


Next-Gen Data<br />

Inova/Complete Partnership<br />

Signals ‘Next-Gen Medicine’<br />

Initial agreement focuses on pre-term babies.<br />

BY KEVIN DAVIES<br />

The CEO of the Inova Translational<br />

Medicine Institute (ITMI), which recently<br />

signed a major deal to sequence the<br />

genomes of hundreds of pre-term babies,<br />

has shed light on the motivation and goals<br />

of the partnership.<br />

The Complete Genomics deal with<br />

ITMI calls for the sequencing of 1,500 genomes—including<br />

250 pre-term babies,<br />

their parents and an equal number of<br />

controls. Complete Genomics will begin<br />

delivering sequencing results to ITMI in<br />

the next 2-3 months and expects to finish<br />

most of the 1,500 genomes in the first<br />

quarter of 2012.<br />

ITMI CEO John Niederhuber believes<br />

that the project will yield clues as to the<br />

genetic basis of premature births, and<br />

potentially even therapeutic targets for<br />

various obstetrical abnormalities.<br />

ITMI is a not-for-profit research institute<br />

within the Inova Health System in<br />

Northern Virginia. Niederhuber joined<br />

ITMI one year ago following his tenure as<br />

director of the National Cancer Institute<br />

(NCI). In that role, Niederhuber had<br />

already immersed himself in the potential<br />

medical benefits of next-generation<br />

sequencing (NGS). He launched a pilot<br />

program of The Cancer Genome Atlas<br />

(TCGA) with Francis Collins, director at<br />

the National Human Genome Research<br />

Institute at the time.<br />

Next-Gen Medicine<br />

“When I left the government last July<br />

[2010], I was wondering what could I<br />

do next?” Niederhuber says. “If we all<br />

believe the future of medicine is going to<br />

be transformed by this type of technology<br />

and others including proteomics, then<br />

maybe I could have some impact by working<br />

out how one is going to use this? Not<br />

just for research but making a difference<br />

at the point of care.”<br />

Niederhuber’s initial focus was on cancer<br />

and other adult diseases, but he concluded<br />

that he would be echoing efforts<br />

going on elsewhere. On the other hand, he<br />

realized that Inova served a huge population<br />

of patients, with dozens of babies<br />

delivered every day, not to mention one<br />

of the largest regional neonatal intensive care<br />

units (NICUs) at Inova Fairfax Hospital.<br />

“Maybe the future of medicine is about<br />

being able to do this care characterization<br />

before birth or at birth and build from<br />

there, understanding the potential targets<br />

for managing risk and having a reference<br />

point if disease develops down the road.<br />

That’s really next-gen medicine! We know<br />

very little about the causes of obstetrical<br />

diseases. We have no clue at all about<br />

what causes preterm delivery,” he says.<br />

Niederhuber received backing from<br />

his colleagues, including Inova CEO Knox<br />

Singleton. “They said, ‘No, you’re not<br />

crazy!’” Although he wondered if physicians<br />

and patients would be receptive,<br />

the ITMI proposal sailed through IRB<br />

approval.<br />

ITMI plans to sequence 250 pre-term<br />

babies with no known congenital abnormalities<br />

and 250 full-term newborns,<br />

along with their parents, for a total of<br />

1,500 genomes. ITMI has been recruiting<br />

families since last June, with more than<br />

150 entered to date. The first group of<br />

samples has already been sent to Complete<br />

Genomics for sequencing.<br />

Niederhuber says he had a lot<br />

of confidence in Complete Genomics<br />

“because of our past work<br />

at NCI.” He considered BGI as a<br />

potential outsourcing partner, but<br />

didn’t talk to the Chinese institute<br />

about this specific project. “That<br />

was because of its pilot nature, and<br />

I suppose I wanted more control<br />

and accessibility,” he says.<br />

The initial goal for Niederhuber<br />

and his collaborators is “to<br />

mine that data and find markers<br />

for pre-term delivery.” But that will raise<br />

many more questions: “How are we going<br />

to integrate this information with the<br />

phenotypic information of patient care?<br />

How are we going to make that useful<br />

information at the point-of-care with the<br />

physician, who may not know much about<br />

genetics?”<br />

A huge challenge will be the medical<br />

interpretation of the sequences. Niederhuber<br />

says that “conversations with a<br />

variety of entities” are taking place, with<br />

the goal of developing new interpretation<br />

tools, either in-house or in conjunction<br />

with others. “We will work with our own<br />

people, but we’ll also contract with others<br />

in partnership to apply the tools that<br />

exist. We’re very committed to trying not<br />

only to work on the data itself but also<br />

develop the tools to fill that space on the<br />

analytics side for patient care.”<br />

The next phase of the program will<br />

include a longitudinal cohort of 2,000<br />

offspring and is being finalized currently.<br />

Niederhuber outlined some of the likely<br />

areas of investigation: “What if we go to<br />

the obstetrics arena in the first trimester,<br />

and we consent the mothers, and look at<br />

proteins, genetics and so on, and follow<br />

through pregnancy? What if we followed<br />

after birth? What if we expand beyond<br />

parents into grandparents? That would<br />

be interesting.” •


Genia’s Nanopore/Microchip Technology<br />

Gains Life Technologies’ Support<br />

4th-generation sequencing<br />

platform taking shape in<br />

Silicon Valley.<br />

BY KEVIN DAVIES<br />

SAN FRANCISCO—The genesis of<br />

Genia, a promising Silicon Valley nanopore<br />

sequencing start-up, took place—not<br />

for the first time—with a serendipitous<br />

meeting at a popular branch of Starbucks<br />

in Menlo Park, California.<br />

Roger Chen was reading a book on the<br />

origins of life. Stefan Roever was buying<br />

a cappuccino. The two men struck up a<br />

conversation. Chen said he was developing<br />

a DNA sequencing machine in his<br />

garage, to which Roever replied, “What’s<br />

that?!”<br />

But it wasn’t long before Roever, a serial<br />

entrepreneur, was offering to help assemble<br />

a team and write a business plan.<br />

He recruited a couple of additional angel<br />

investors and signed on as CEO.<br />

“If Ion Torrent—electrical detection<br />

but requiring amplification—and Pacific<br />

Biosciences—single-molecule but optical—are<br />

3rd generation [sequencing<br />

technologies], then we’re 4th generation:<br />

single molecule, electrical detection,”<br />

says Roever. “That’s the holy<br />

grail, because it combines<br />

low-cost instruments with<br />

simple sample prep. So we’d<br />

like to think of it as last-gen!”<br />

Genia has since caught<br />

the attention of the big guns.<br />

In April 2011, Genia closed<br />

a strategic investment with<br />

Life Technologies that Dietrich<br />

Stephan, who sits on<br />

Genia’s scientific advisory<br />

board, calls “a double digit”<br />

[millions of dollars] investment.<br />

“Life is putting significant<br />

additional in-house resources behind<br />

the product,” he says.<br />

A Life Technologies spokesperson reserved<br />

comment on the investment.<br />

Genia is building an integrated circuit and says sequencing is just one potential application.<br />

“Genia is an amazing technology,”<br />

Stephan told Bio•IT World. “They have<br />

developed key proprietary innovations<br />

around the pore that allow single molecules<br />

of single-stranded DNA to move<br />

through the pore slowly, so the sequence<br />

can be measured accurately. Key innovations<br />

around the array-based sensor<br />

allow the pores to be electronically self-assembled<br />

into lipid films, and the DNA<br />

molecules to be moved bidirectionally<br />

under tight control.”<br />

Stephan continued: “The sensor itself<br />

is truly transformative and allows very<br />

small signals to be seen high above the<br />

noise floor, which is one<br />

of the issues all the other<br />

nanopore companies are<br />

struggling with, as well as<br />

allowing massively parallel<br />

measurements to be made<br />

directly on an integrated<br />

circuit.”<br />

Founding Four<br />

Roever moved to California<br />

in 2000 after building an<br />

Internet banking company<br />

in his native Germany in the<br />

late ’90s, which he took public<br />

and grew to 2,000 people. He has since<br />

been involved with various technology<br />

start-ups, although he admits he’s neither<br />

a biology guy nor an electrical engineer.<br />

“Right now, we’re more focused on building<br />

a product. Technology development is<br />

what I’ve done all my life,” he says.<br />

The name “Genia” was actually one of<br />

dozens of unused candidates that Roever<br />

received in an earlier naming study for a<br />

previous company that cost him $50,000.<br />

“I dug out that list and one of them stuck.<br />

Sounds like a good name for a sequencing<br />

company!”<br />

Stephan believes that one of Genia’s<br />

key differentiators is that most of the<br />

firm’s senior management has roots<br />

in the semiconductor industry. Chen<br />

(chief technology officer) and the other<br />

co-founders, Pratima Rao (VP marketing)<br />

and Randy Davis (VP R&D), are all<br />

alums of Maxim Integrated Products,<br />

one of the leading analog-to-digital chip<br />

companies, who gravitated independently<br />

to biochemistry. Chen worked with David<br />

Deamer in nanopore sequencing at the<br />

University of California Santa Cruz. Davis<br />

managed the UCSF cancer center core<br />

lab, and Rao previously headed product<br />

marketing for Affymetrix.<br />

In addition to Stephan, the scientific<br />

advisory board includes George Church<br />

(Harvard Medical School), and two of<br />

Stephan’s former Navigenics colleagues,<br />

Sean George (CEO Locus Development)<br />

and computational biologist Eran Halperin.<br />

They are joined by Bob Dobkin,<br />

founder of Linear Technology.<br />

Analog Advantage<br />

A number of companies are vying to<br />

commercialize nanopore sequencing<br />

technology, including Oxford Nanopore<br />

in the UK, NABsys in Providence, RI, and<br />

NobleGen. So what does Roever consider<br />

to be Genia’s advantage?<br />

“At our core, we’re an electronics<br />

platform,” says Roever. “We’ve got some<br />

developments on biochemistry, but we’re<br />

developing a massively parallel analog<br />

sensor to do nanopore-based sequencing,<br />

supporting a variety of chemistries on the<br />

[chip] surface. But at the core, we’re a<br />

chip company.”<br />

“Most chips today are purely digital—<br />

digital input, digital output, with some<br />

processing in the middle,” Roever continues.<br />

“A few chips take analog inputs, e.g.<br />

temperature sensors or a mobile phone<br />

sensor. It’s a bit of a black art.”<br />

Chen previously showed that by using<br />

custom electronics—state-of-the-art<br />

analog-to-digital processing with very<br />

small currents (picoamp range) and signal<br />

(femtoamp range)—it is possible to<br />

count “literally hundreds of electrons,”<br />

potentially a much finer resolution than<br />

competing technologies.<br />

A popular model in a nanopore sequencing<br />

scheme is to measure a single<br />

molecule of DNA traversing the pore.<br />

“The single base difference you’re looking<br />

for is effectively about 10 atoms,” says<br />

Roever. Those 10 atoms can be detected<br />

by measuring the electronic stream. “The<br />

signal is in the hundreds of electrons.<br />

You’re looking at the very edge of what’s<br />

electrically detectable.”<br />

While other companies are engineering<br />

genetically modified pores, Genia is<br />

building an integrated circuit—essentially<br />

a checkerboard array of analog-to-digital<br />

sensors or electrodes exposed on the chip<br />

surface. Each electrode is a few microns<br />

in diameter, potentially enabling as many<br />

as 1 million electrodes to be packed onto a<br />

chip. The analog-to-digital signal conversion<br />

occurs on the chip.<br />

The process sounds deceptively simple:<br />

put the DNA in solution above the array of<br />

nanopores and sensors. “We can make it<br />

all electronically—setting up the bilayer<br />

and the pore, plug the chip in, power it<br />

on to make the bilayers and pores,” says<br />

Roever. Next, apply the potential such that<br />

current flows through the pores. “We detect<br />

the operating cells and start sucking<br />

in the DNA and measuring it… Sequencing<br />

is just one application. For example,<br />

you could put transporter proteins in the<br />

bilayer for drug discovery applications.”<br />

There are two advantages to this system,<br />

says Roever. “First, we can resolve<br />

signal much better than if you have to take<br />

signal off and process it outside. It’s easier<br />

for us to see single base differences. We<br />

can react to any changes by moving the<br />

DNA back and forth across the pore.”<br />

“Second, we can dynamically build<br />

these lipid bilayer/nanopore complexes on<br />

the chip surface. Instead of doing that mechanically,<br />

we’ve got software to make the<br />

bilayer and start the nanopore... Instead<br />

of a single sensor or a few hundred or a<br />

thousand, we have hundreds of thousands<br />

or 1 million.”<br />

Slow Motion<br />

The knock on nanopores has been the<br />

need to hinder the passage of single-stranded<br />

DNA through the pore. “We’ve<br />

got a way to slow it down, by a combination<br />

of electronics and some biochemistry,”<br />

says Roever. The current platform<br />

requires less than 2 seconds to read a base,<br />

but Roever expects to push that down to<br />

less than 1 second/base. “There could be<br />

1 million sensors at 1 base/second,” says<br />

Roever.<br />
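To put the "1 million sensors at 1 base/second" figure in perspective, here is a rough back-of-envelope sketch. The sensor count and read rate come from Roever's remarks; the genome size of about 3.2 billion bases, the assumption that every sensor works continuously, and the function name `hours_to_sequence` are illustrative assumptions, not Genia specifications.

```python
# Back-of-envelope throughput for a nanopore sensor array.
# From the article: up to 1 million sensors, about 1 base/second each.
# Assumptions (not from the article): a ~3.2-billion-base human genome,
# every sensor active all the time, and no loading or failure overhead.

SENSORS = 1_000_000
BASES_PER_SECOND_PER_SENSOR = 1.0
GENOME_BASES = 3_200_000_000  # assumed haploid human genome size


def hours_to_sequence(coverage: float) -> float:
    """Return idealized hours needed to read the genome `coverage` times over."""
    array_rate = SENSORS * BASES_PER_SECOND_PER_SENSOR  # bases per second
    seconds = GENOME_BASES * coverage / array_rate
    return seconds / 3600


if __name__ == "__main__":
    for coverage in (1, 10, 20):
        print(f"{coverage:>2}x coverage: {hours_to_sequence(coverage):.1f} hours")
```

Under these idealized assumptions a single pass over the genome takes a little under an hour, and 20-fold coverage takes a little under 18 hours. Real instruments would be slower once sample loading, inactive pores, and read overlap are accounted for.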

Roever says Genia can actively control<br />

the DNA template, moving it back-and-forth<br />

through the nanopore if required<br />

in a kind of flossing action. “We can<br />

oversample, rewind and read again. You<br />

change the applied voltage and the DNA<br />

goes backward. If you capture the DNA in<br />

the pore, you can ‘dental floss’ it—you can<br />

read it 10-20 times.” Roever would not<br />

detail the read-out mechanism, other than<br />

to say, “Our approach relies on some IT to<br />

reassemble those sequences.”<br />
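Roever does not describe how the repeated reads are combined, so the following is only a generic illustration of why reading the same molecule 10-20 times can help: a per-position majority vote across reads. It assumes the reads are already aligned and of equal length, which real data would not guarantee, and the function name `consensus` is hypothetical rather than anything from Genia.

```python
from collections import Counter


def consensus(reads: list[str]) -> str:
    """Majority-vote each position across equal-length, pre-aligned reads."""
    if not reads:
        return ""
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("reads must be pre-aligned to the same length")
    called = []
    for column in zip(*reads):
        # Pick the most frequent base seen at this position.
        base, _count = Counter(column).most_common(1)[0]
        called.append(base)
    return "".join(called)


if __name__ == "__main__":
    # Five noisy passes over the same hypothetical 6-base molecule.
    passes = ["ACGTAC", "ACGTTC", "ACCTAC", "ACGTAC", "AGGTAC"]
    print(consensus(passes))
```

Each of the three erroneous passes above has a single wrong base at a different position, so the vote recovers the underlying sequence. Independent errors tend to cancel out in this way as the number of passes grows.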

The chips themselves, he says, will<br />

be extremely affordable and have a cost<br />

structure similar to a standard semiconductor<br />

chip today. Patents have been filed,<br />

says Roever, who adds: “What we have<br />

to license will depend on what chemistry<br />

we decide to use, what pore we use, what<br />

technique we wind up doing... But for the<br />

core approach, we think we have freedom<br />

to operate.”<br />

One of the next goals for Genia is to<br />

build an instrument, which Roever says<br />

shouldn’t be any more elaborate than Ion<br />

Torrent’s Personal Genome Machine. “Ultimately,<br />

it could be a chip in a handheld<br />

reader connected to a PC,” he says. A milestone<br />

early next year will be receipt of the<br />

first working chip, containing hundreds of<br />

sensors, from a Taiwanese foundry.<br />

Roever says Genia is essentially “a platform<br />

to let nature do its thing.” He plays<br />

down talk of competition. “I don’t see Oxford<br />

Nanopore as a competitor—there’s no<br />

reason their chemistry couldn’t run on our<br />

platform, right? If they develop a working<br />

chemistry, get a better mutated pore, why<br />

couldn’t they run on our platform?”<br />

The Genia website dares to proclaim<br />

“the $100 genome.” Roever checks himself,<br />

but will say this: “Within the next<br />

decade, you’ll go to the doctor, the nurse<br />

will take a saliva sample and sequence<br />

everything in there. By the time you see<br />

the doctor, he’ll say it’s just a rhinovirus or<br />

Epstein-Barr virus or what have you. Instead<br />

of the guessing game today, people<br />

will look back to this time as medieval<br />

medicine!” •


IT/Workflow<br />

Making Semantic Sense of<br />

Unstructured Data<br />

Sophia’s Digital Librarian software relates documents without taxonomies.<br />

BY KEVIN DAVIES<br />

A somewhat stifling aspect of many<br />

sophisticated semantic search<br />

tools is the need to build ontologies<br />

and taxonomies to organize<br />

the data. Nonsense, says David Patterson,<br />

the co-founder and CEO of Sophia<br />

Search, who believes in freedom from<br />

taxonomies or ontologies.<br />

“People are so attuned to building<br />

taxonomies and ontologies because they<br />

think they need to,” says the Northern<br />

Irishman. “Our message is, that just isn’t<br />

true! We understand the purpose they<br />

serve, but one of our long-term goals has<br />

been to build a search tool that doesn’t<br />

rely on background knowledge structures.<br />

Let’s free people up from the overheads<br />

and expense of these knowledge<br />

structures.”<br />

The Sophia Digital Librarian search<br />

tool thrives on finding relationships between<br />

documents without the need for<br />

any kind of taxonomy. Just set the software<br />

loose on a collection of unstructured<br />

data, and it does the rest.<br />

Patterson co-founded Sophia with<br />

Vladimir Dobrynin, a professor at St.<br />

Petersburg State University and Sophia’s<br />

chief science officer. Patterson’s background<br />

is in artificial intelligence. He was<br />

director of an artificial intelligence R&D<br />

lab at the University of Ulster, working on<br />

data mining, machine learning, and information<br />

retrieval. Patterson has collaborated<br />

with Dobrynin since 2003. After<br />

some five years of R&D, the researchers<br />

came up with a prototype version of Sophia.<br />

The Sophia product engineering<br />

team is based in Belfast, while the R&D<br />

team is based in St. Petersburg, Russia,<br />

and sales in Silicon Valley, California.<br />

Semantic Searching<br />

“The Sophia Digital Librarian is all about<br />

content enrichment within organizations<br />

or repositories,” says Patterson. “It helps<br />

organizations improve the findability of<br />

the content they have in their organization<br />

or external repositories.” Interest<br />

is high among pharma companies, life<br />

science organizations and the scientific<br />

publishing community.<br />

“The Digital Librarian is built on Sophia’s<br />

patented contextual discovery engine,”<br />

says Patterson. “We’re empowering<br />

organizations to become more innovative<br />

and creative through discovery. Search<br />

should not just be about retrieving information<br />

you expect to find but also about<br />

uncovering and discovering new things<br />

you weren’t aware of.”<br />

Taxonomies can stifle<br />

innovation, says Patterson,<br />

by constraining<br />

staff to all think<br />

conventionally and<br />

uniformly. “You’re not<br />

allowing your employees<br />

to think freely… If<br />

we are constrained by<br />

the boundaries of conventional<br />

thinking, we<br />

limit creativity and our<br />

ability to discover new<br />

things. Sophia removes<br />

these constraints and<br />

enables users to discover unknown relationships<br />

and knowledge from within<br />

their content—that’s the real power behind<br />

what we do.”<br />

After reading through documents in<br />

an organization’s repository, the Sophia<br />

Librarian gets to work. “It extracts the<br />

meaning, organizes them into topics,<br />

understands what they’re about, captures<br />

metadata describing their topic and<br />

subtopics, and attaches tags to the document<br />

to make them more findable,” says<br />

Patterson. Tags can be extracted from<br />

the document, but the tool also creates<br />

semantic tags—relevant words or phrases<br />

that do not actually crop up in the source<br />

document.<br />

“We also identify the most similar<br />

documents among all the documents<br />

in the corpus,” says Patterson. “Then we<br />

rank the “nearest neighbors”—the most<br />

semantically similar documents. This<br />

quickly brings to the user a list of related<br />

documents and helps users focus and find<br />

information relevant to their needs.”<br />
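Sophia's contextual discovery engine is patented and the article does not describe how it works, so the sketch below substitutes a common textbook technique, TF-IDF weighting with cosine similarity, simply to illustrate what ranking "nearest neighbors" in a corpus can look like. The sample documents and all function names are hypothetical. Unlike Sophia as described, this approach only matches on shared words and cannot suggest tags that are absent from the source text.

```python
import math
from collections import Counter

# Hypothetical three-document corpus for illustration only.
DOCS = {
    "alz": "amyloid protein plaques in dementia patients",
    "neuro": "dementia and amyloid protein in neurodegenerative disease",
    "cloud": "cloud storage pricing for genomic datasets",
}


def tokenize(text: str) -> list[str]:
    """Lowercase and split on whitespace, keeping alphabetic tokens only."""
    return [word for word in text.lower().split() if word.isalpha()]


def tfidf_vectors(docs: dict[str, str]) -> dict[str, dict[str, float]]:
    """Build a sparse TF-IDF vector per document (smoothed IDF)."""
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(counts)
    doc_freq = Counter(term for c in counts.values() for term in c)
    return {
        name: {
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, tf in c.items()
        }
        for name, c in counts.items()
    }


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def nearest_neighbors(query: str, vectors: dict[str, dict[str, float]], k: int = 5):
    """Rank every other document by similarity to `query`, best first."""
    scores = [
        (name, cosine(vectors[query], vec))
        for name, vec in vectors.items()
        if name != query
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


if __name__ == "__main__":
    for name, score in nearest_neighbors("alz", tfidf_vectors(DOCS)):
        print(f"{name}: {score:.3f}")
```

Here a higher score means a closer match, so the neurodegenerative-disease document ranks above the unrelated cloud-storage one, which shares no terms with the query and scores zero.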

The metadata—Sophia calls it the<br />

“semantic profile”—are exposed as XML,<br />

through a series of web services to make<br />

the information available to any search or<br />

content management tool. The goal, says<br />

Patterson, is to augment<br />

current search tools and<br />

make them smart.<br />

In a quick demo, Patterson<br />

uses Sophia to automatically<br />

create a semantic<br />

profile for a document<br />

taken from the Web using<br />

knowledge extracted<br />

from a corpus of 1.8 million<br />

news stories spanning<br />

20 years. The articles<br />

have been automatically<br />

indexed by Sophia and<br />

semantic profiles created<br />

for each document. “Sophia<br />

uses knowledge extracted from these<br />

documents to come up with a semantic<br />

profile—topic, tags and nearest neighbors—for<br />

the new document,” he says. It<br />

uses knowledge extracted from the news<br />

corpus to intelligently assign metadata to<br />

the new document.<br />

A similar search can be done using an<br />

abstract from PubMed on Alzheimer’s<br />

disease, for example, retrieving tags related<br />

to dementia, amyloid protein, and<br />

neurodegenerative diseases among others.<br />

The numerical score offers a gauge<br />

of the “distance” between the source<br />

document and any retrieved article. “Our<br />

capabilities relate topics and documents<br />

together that wouldn’t have been otherwise<br />

known,” says Patterson. “We’re able<br />

to make connections years ahead of [traditional<br />

means of] discovery.”<br />

In time, it will be possible to filter<br />

those results by source or date, for example<br />

selecting just PDFs or documents<br />

within a specific date range. The functionality<br />

is there, says Patterson, it just requires<br />

using a web services user interface.<br />

Customer Stories<br />

Sophia’s sales director Jeff Bierach says<br />

the company has a number of customers<br />

in the U.S. already, and is targeting many<br />

more in the life sciences community and<br />

publishing sectors. As discussed in a<br />

recent Bio•IT World guest commentary<br />

(see, “Reevaluating the Role of Research<br />

Librarian in Pharma R&D,” Bio•IT World,<br />

Sept 2011), many pharma companies have<br />

been downsizing their librarian staff for<br />

some time.<br />

“Look at what’s been happening in<br />

pharma companies over the last 3-5<br />

years,” says Bierach. “Genentech used<br />

to have a staff of over 20 librarians—researchers<br />

finding information. They have<br />

zero now. That’s really a big play for us, to<br />

automate a lot of this capability around<br />

discovery and document classification.<br />

Those are really time consuming tasks<br />

that we can automate.”<br />

One big pharma company had spent<br />

three man-years building custom queries<br />

for PubMed to extract information<br />

on over 30 different topics they track.<br />

“Based on the last two years of MED-<br />

LINE abstracts which we have indexed,<br />

our client was able to reproduce those<br />

queries in Sophia in less than one week,”<br />

says Bierach. “We essentially solved his<br />

entire job that took three years to get to<br />

where he is now. There’s a huge amount<br />

of value there.”<br />

“We’re starting to create semantic profiles<br />

for content of a pharma company in<br />

San Francisco,” adds Patterson. “They use<br />

Microsoft FAST [search tool] and want<br />

to add semantic profiles to the content.”<br />

With stagnant drug discovery pipelines,<br />

Patterson also sees an opportunity in drug<br />

repositioning, helping to find correlations<br />

at significant cost savings.<br />

A search done using a PubMed abstract on Alzheimer’s disease retrieves related tags and<br />

estimates the “distance” between the source document and any retrieved article.<br />

While most content search applications<br />

focus on bespoke internal datasets<br />

rather than the web, Patterson says<br />

Sophia is looking to work with partners,<br />

including a Google reseller in Chicago.<br />

“We’ll supply a layer between the Google<br />

search appliance and Sophia.”<br />

Another promising target segment is<br />

the publishing sector, which could leverage<br />

Sophia to ‘up-sell’ related journal articles<br />

to customers. Another early project<br />

involved helping a publishing company<br />

sift through 20 years of legacy content.<br />

“Within three days, we were able to see<br />

what information was evergreen. The client<br />

said that saved him 9 man-months of<br />

effort,” says Patterson.<br />

Version 1.2 of the Sophia Digital Librarian<br />

is now in full commercial release,<br />

but Patterson stresses that his team is still<br />

building functionality. Because the company<br />

is relatively small, “we can move very<br />

quickly,” says Bierach. •




CONTENTS<br />

IT/Workflow<br />

The British Are Coming,<br />

Connecting People and Patients<br />

BT Global Services’ Yury Rozenman discusses the<br />

changing landscape of pharma services.<br />

BY KEVIN DAVIES<br />

Why would a chemist who has worked<br />

for some of the leading bio-IT companies<br />

of the past 15-20 years, including Celera Genomics,<br />

Applied Biosystems, D.E. Shaw<br />

Ventures and IBM, find himself working<br />

for the company formerly known as British<br />

Telecom?<br />

“I was just as surprised,” said<br />

Yury Rozenman, who is BT Global<br />

Services’ head of marketing strategy<br />

and solution development for<br />

life sciences (see, “A Long Road to<br />

BT”). “Why would BT Global Services<br />

be interested in developing<br />

solutions for pharma?”<br />

Why, indeed. At the time, he<br />

explained, BT began to realize<br />

that, even as companies such as<br />

IBM present solutions for data<br />

integration and connectivity, “the<br />

missing element was connecting<br />

people, making sure they could<br />

make sense of the information<br />

and bring greater value to pharma<br />

companies.”<br />

One of the most exciting examples<br />

of enterprise communications<br />

services offered by BT Global Services<br />

is the management of all patient<br />

records for the UK’s National<br />

Health Service (NHS), including<br />

solutions such as e-prescriptions<br />

and pharmacovigilance. “We knew<br />

we could take those assets and<br />

apply them globally to the health<br />

care and life sciences ecosystem—<br />

pharma, biotech, CROs, and potentially<br />

even payors and providers,” says<br />

Rozenman.<br />

BT’s bread and butter is in what<br />

Rozenman calls “transformational deals,”<br />

where pharma companies acknowledge<br />

that tasks such as managing a global network<br />

are not their core expertise. BT wins<br />

if it can demonstrate it can bring better<br />

return on investment and implement useful<br />

services on top of the network, and do<br />

it all, of course, in accordance with tight<br />

regulatory requirements (e.g., GxP and FDA<br />

21 CFR Part 11).<br />

In 2005, BT signed a large US pharma<br />

to manage its enterprise communications<br />

services, as well as Swiss and Japanese<br />

companies. “But it’s always difficult to get<br />

public release,” Rozenman sighs, unable<br />

to divulge the client’s name.<br />

Yury Rozenman heads BT’s market strategy for life sciences. (Photo: Mark Gabrenya/File)<br />

“Most of our clients are multinational<br />

corporations,” says Rozenman. “We connect<br />

people around the world with data/<br />

voice applications. For the most part<br />

we try to focus on companies that need<br />

global reach to support their internal and<br />

external operations. We work with the<br />

top 25-30 pharma companies, as well as<br />

CROs and medium-sized biotechs.”<br />

A large percentage of BT’s core services<br />

fall around strategic outsourcing—<br />

largely LAN/WAN security and unified<br />

communications. “That’s how we start<br />

usually. That’s our way in,” says Rozenman.<br />

“We need to prove we can do that<br />

well. We work with HP, Dell, EDS,<br />

IBM, it depends—the pharma<br />

company will carve up their IT<br />

outsourcing operations into towers,<br />

so servers/storage might go<br />

to IBM, communications to BT,<br />

laptops to HP, [consulting] to Accenture…<br />

We have to work across<br />

all the vendors.”<br />

More Than Core<br />

In recent years, however, BT has<br />

been hearing from customers<br />

that supplying core services isn’t<br />

necessarily sufficient. Paraphrasing,<br />

Rozenman hears things like:<br />

“You have expertise, but this is<br />

not enough. For us to continue to<br />

work with BT, you need to provide<br />

solutions not just for cost containment,<br />

but start creating more<br />

value, a balance between cost and<br />

payment.”<br />

“We work with pharma to<br />

identify what they want us to support.<br />

That usually differentiates us<br />

in terms of price and capabilities<br />

from other telecoms,” says Rozenman.<br />

Those needs increasingly<br />

include the ability to handle large<br />

data volumes and ensure they can<br />

travel around the world securely. And<br />

there are specific areas around R&D collaboration,<br />

chronic disease management,<br />

patient compliance, and the supply chain.<br />

One of those projects, with Liverpool<br />

City Council, is a “home of the future”<br />

pilot to study the management of chronic<br />

diseases for the elderly population. This


pilot focused on supporting patients in<br />

their home environment, using sensors<br />

and a self-learning system to learn about<br />

the patient’s behavior and detect signs of<br />

trouble such as a fall in the home, lack of<br />

movement, etc.<br />

On the supply chain side, Rozenman’s<br />

team is helping track, trace and authenticate<br />

drug supplies, in some cases down<br />

to a specific bottle, in an effort to combat<br />

rampant counterfeiting of approved<br />

drugs, identifying “grey markets” and<br />

potentially initiating drug recalls.<br />

Support of patient compliance, persistence<br />

and adherence to dosing during<br />

clinical trials (and for approved drugs) is<br />

another key area for BT solution development.<br />

Even highly successful and effective<br />

drugs such as Gleevec show non-compliance<br />

rates around 30 percent. “If there is a<br />

drop in compliance, the pharma sponsor<br />

will allocate more drug to drive compliance.<br />

We need an intervention mechanism,”<br />

says Rozenman. “It’s not about<br />

reminders. It’s about the ability to connect<br />

the patient to the doctor, providing<br />

the necessary support mechanism, and in<br />

cases of adverse effects, providing patient<br />

training and options to help patients alleviate<br />

symptoms.”<br />

Cloud Sourcing<br />

BT has great interest in cloud computing<br />

applications, recently running a “hothouse”<br />

event in London to bring stakeholders<br />

together. “There is significant<br />

interest from pharma around private<br />

clouds or hybrid clouds, not public,”<br />

says Rozenman. “Having said that, we’re<br />

running a very large cloud service for financial<br />

firms, called Radianz. It connects<br />

trading floor operations, content services,<br />

and transactional operations. We’re in the<br />

process of developing something similar<br />

for pharma.”<br />

“We have not launched the platform in<br />

the full sense yet,” says Rozenman, but the<br />

company has run pilots for a UK pharma<br />

and seen some ideas incorporated into the<br />

Pistoia Alliance. It is also working with<br />

the MIT Media Lab and the BioTeam, as<br />

well as a large ISV “to become a foundational<br />

partner in the ecosystem… running<br />

apps in the BT Cloud environment.”<br />

A Long Road to BT<br />

Yury Rozenman has spent most of his career in the life sciences and pharma since<br />

graduating from the University of Illinois 25 years ago. A chemist by training, he<br />

worked for GD Searle in Chicago (now part of Pfizer), running analytical development<br />

before switching to licensing and acquisitions, searching for new compounds<br />

to bring into the pipeline.<br />

In 1993, he joined Applied Biosystems, helping develop DNA sequencing and<br />

genotyping technologies. When ABI launched a sister company, Celera Genomics,<br />

in 1998, Rozenman worked for Craig Venter and colleagues in field operations,<br />

interfacing with pharma companies seeking access to gene databases, advising on<br />

IT infrastructure.<br />

Rozenman later went to work for D.E. Shaw Ventures, which started among<br />

others Amazon, Schrodinger, and an online service called Juno (which merged with<br />

NetZero in 2001) for computational analysis. Following stints with Platform Computing<br />

and IBM, where he worked with Carol Kovac, again focusing on industry<br />

solutions for pharma, he joined BT Global Services. K.D.<br />

The needs of financial services and life<br />

sciences are very different, however. The<br />

key attribute for financial firms is very<br />

low latency. “It’s not necessarily compute<br />

intensive, but very network centric. They<br />

have to move data, provide security, and<br />

encryption.”<br />

Pharma, by contrast, typically doesn’t<br />

run a single app but data pipelines, and<br />

thus requires many data sources around<br />

the world to be connected. “It’s not Windows<br />

but Linux or UNIX, so you need<br />

multiple platforms. And if you’re running<br />

a pipeline, you are running different programs.<br />

If it is clinical data, you may not be<br />

able to move data due to EU regulations<br />

(privacy protection laws). Everything we<br />

can do in the financial cloud we’ll take to<br />

apply [to pharma].”<br />

Rozenman says BT Global Services<br />

hopes to help ISVs deliver on-demand<br />

services. By 2014, pharma says it wants to<br />

externalize its R&D or clinical services.<br />

“Within six months, we’ll have some proof<br />

of concepts completed and additional<br />

requirements to launch Phase 2. We want<br />

an app store, but we’re discussing with<br />

Schrodinger, Eagle Genomics and others…<br />

how to increase the use of in silico<br />

biology and chemistry through cloud-based<br />

services.”<br />

BT is in the business of “business enabling,”<br />

says Rozenman, and prefers to<br />

build solutions around the clients’ needs<br />

rather than create its own software. “We<br />

need to calculate: is it cheaper to move<br />

data to the app or stage the app in a<br />

virtual data center around the world?<br />

Depending on the requirement, data<br />

size, and resources, we can make the decision<br />

automatically whether we stage or move.<br />

There’s enough intelligence in our cloud<br />

that we can stage very quickly.”<br />
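The stage-or-move trade-off Rozenman describes can be made concrete with a minimal sketch in Python. It is an illustration only, not BT's implementation: the `Job` fields, the `choose_placement` function, and the use of transfer time over a single link as the only cost are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical description of one analysis job (not a BT data model)."""
    data_gb: float                 # size of the dataset the app must read
    app_image_gb: float            # size of the application image to stage
    data_may_leave_region: bool    # False if privacy rules pin the data in place


def transfer_hours(size_gb: float, link_gbps: float) -> float:
    """Hours needed to push size_gb over a link of link_gbps gigabits per second."""
    return size_gb * 8 / link_gbps / 3600


def choose_placement(job: Job, link_gbps: float) -> str:
    """Return 'stage_app' or 'move_data' for a single job.

    Assumed rule: regulated data never moves; otherwise ship whichever
    of the two (dataset or application image) is quicker to transfer.
    """
    if not job.data_may_leave_region:
        return "stage_app"
    move_cost = transfer_hours(job.data_gb, link_gbps)
    stage_cost = transfer_hours(job.app_image_gb, link_gbps)
    return "move_data" if move_cost < stage_cost else "stage_app"
```

Under these assumptions, a 5 TB dataset paired with a 2 GB application image favors staging the app next to the data, while any dataset that regulations pin to its region is never moved, whatever its size.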

BT Global Services has its own security<br />

group that develops solutions for<br />

pharma and other industries. “We look at<br />

regulations, how infrastructure needs to<br />

be rolled out depending on the vertical.<br />

HIPAA rules to anonymize data if you<br />

need to.”<br />

“Perception is important,” says Rozenman.<br />

“We can enforce rules, privatize data<br />

and IP, or share with single clients. We<br />

can create rules to ensure that is in place.<br />

We’ll also do ethical hacking if they want<br />

us… to try to break the security.”<br />

While clinical endpoints are classical<br />

data points, Rozenman says BT’s system<br />

is capable of taking genotype/phenotype<br />

information. “As we move forward in<br />

diagnostics, with complete genomes, you<br />

can reliably use the information to pick<br />

the right people for clinical trials, and<br />

quickly do genotyping, identify the right<br />

cohorts… Instead of 3,000 patients, you<br />

can get away with smaller, more focused<br />

clinical trials. We’d like to use our R&D<br />

platform to support early research informatics<br />

and simulated clinical trials.<br />

We’re in discussions in software around<br />

simulating clinical trials, to structure protocols<br />

and run more efficient trials. This is<br />

a very interesting area for our health care<br />

practice.” •<br />


Free Trials & Downloads<br />

Company: Brainloop<br />

Product: Brainloop Secure Dataroom<br />

Description: Brainloop offers a highly<br />

secure online workspace for drug partnering<br />

plus collaboration on clinical trials,<br />

regulatory filings, manufacturing.<br />

Fast setup; prevents printing, forwarding,<br />

and saving.<br />

Try it for free: http://bit.ly/FreeBrainloop<br />

Company: Real Time Genomics<br />

Product: RTG Investigator<br />

Description: Real Time Genomics offers<br />

RTG Investigator sequence analysis software<br />

to approved biological investigators<br />

with a renewable individual license good<br />

for 12 months.<br />

Try it for free: http://bit.ly/FreeRTG<br />

Company: BIOBASE<br />

Products: TRANSFAC and ExPlain<br />

Description: TRANSFAC contains data on<br />

eukaryotic transcription factors, their<br />

binding sites, and regulated genes.<br />

ExPlain is a companion data analysis<br />

system that combines promoter and<br />

pathway analysis tools.<br />

Try it for free: http://bit.ly/FreeBiobase<br />


Company: Enlis Genomics<br />

Product: Enlis Genome Research Edition<br />

Description: Enlis offers a genomic analysis<br />

tool for academic and industry<br />

researchers. The software enables analysis<br />

and visualization of genomic<br />

sequence data with unparalleled clarity.<br />

Try it for free: http://bit.ly/FreeEnlis<br />

Company: goBalto<br />

Product: goBalto Tracker<br />

Description: The easiest way to start your<br />

clinical trial on the web: Introducing<br />

goBalto Tracker, the fastest way to get<br />

your investigators up and running.<br />

Try it for free: http://bit.ly/FreeGoBalto<br />

Company: CLC bio<br />

Products: CLC Genomics Workbench<br />

Description: CLC Genomics Workbench,<br />

for analyzing and visualizing Next Generation<br />

Sequencing data, incorporates cutting-edge<br />

technology and algorithms,<br />

while also supporting and integrating<br />

with the rest of your typical NGS workflow.<br />

Try it for free: http://bit.ly/FreeCLCbio<br />

Feature Your Free Download<br />

To have your free trial or download listed on this page, contact<br />

Allison Proffitt, Managing Editor at aproffitt@healthtech.com<br />


Company: Praxeon<br />

Product: DocumentLens<br />

Description: See firsthand how DocumentLens<br />

can help you organize your<br />

scientific articles, focus on the knowledge<br />

contained within, and enhance<br />

collaboration between groups.<br />

Try it for free: http://bit.ly/FreePraxeon<br />

Company: Omixon<br />

Product: Omixon Letter Space Toolkit<br />

Description: Find insertions and deletions<br />



New Products<br />


Illumina’s Benchtop Sequencer<br />

Illumina’s much anticipated<br />

benchtop next-generation sequencing<br />

(NGS) instrument, the MiSeq,<br />

is shipping now and is available<br />

with a series of enhancements<br />

including streamlined sample processing<br />

and a new Cloud platform<br />

called BaseSpace that supports the<br />

platform (see “Illumina Launches<br />

BaseSpace Cloud Platform for<br />

MiSeq”). The goal of the new BaseSpace<br />

Cloud platform is to provide<br />

a seamless pathway to get genomics<br />

data to the Cloud, expand on<br />

the space, and provide different<br />

types of analysis tools. The Cloud<br />

platform is deployed using Amazon<br />

EC2. “The addition of BaseSpace<br />

eliminates the need for expensive<br />

IT infrastructure, simplifying the<br />

process of adopting a personal<br />

sequencer for labs of any size and<br />

experience,” commented Illumina<br />

CEO Jay Flatley.<br />

The MiSeq runtime is 27 hours<br />

for a 2x150-basepair run, for a total<br />

throughput of 1.7 Gigabases. The<br />

instrument retails for $125,000.<br />

Product: MiSeq<br />

Company: Illumina<br />

For more information:<br />

www.illumina.com<br />

Updated Data Integration<br />

IO Informatics has released Knowledge Explorer,<br />

version 3.6, a tool that makes it possible for informaticians,<br />

data analysts and integration experts to integrate<br />

data with truly unmatched ease and efficiency.<br />

Enhancements and improvements make it easier<br />

for researchers to discover models, signatures and<br />

profiles by uncovering actionable meaning from their<br />

data and by contextualizing complex relationships for<br />

concise visual comprehension.<br />

Product: Knowledge Explorer v3.6<br />

Company: IO Informatics<br />

For more information: www.io-informatics.com<br />

Data Collection on the Web<br />

Oracle has announced the release of OutcomeLogix<br />

On Demand 3.0, a Web-based,<br />

scalable data collection system that enables<br />

life science companies and contract research<br />

organizations (CROs) to capture therapy<br />

outcome information from health care<br />

providers and patients in late-stage studies.<br />

The new version incorporates several features<br />

to help meet increasing requirements<br />

to improve product safety and demonstrate<br />

long-term treatment efficacy: increased<br />

scalability, improved user interface, updated<br />

prompting tools, enhanced configurability,<br />

and expanded multi-lingual support.<br />

Product: OutcomeLogix On Demand 3.0<br />

Company: Oracle<br />

For more information: www.oracle.com<br />

Image Analysis for<br />

Biomarker Discovery<br />

Definiens has released the company’s new<br />

Quantitative Digital Pathology portfolio,<br />

designed to advance translational research<br />

in pathology, specifically in biomarker<br />

development. The new portfolio will offer<br />

updated versions of Definiens Tissue Studio,<br />

Definiens Developer XD as well as the introduction<br />

of a novel component, Definiens<br />

Image Miner, which closely integrates data<br />

mining with image analysis to accelerate<br />

biomarker research.<br />

Product: Quantitative Digital Pathology<br />

portfolio<br />

Company: Definiens<br />

For more information: quantitative-digitalpathology.definiens.com<br />

Speedy Bioinformatics<br />

TimeLogic has announced the release of the next generation<br />

in Field Programmable Gate Array (FPGA) based DeCypher<br />

systems for biocomputing. The upcoming system, codenamed<br />

J1, will be released early next year with the equivalent performance<br />

of several thousand 3 GHz processor cores in a single 1U<br />

server. Performance gains with the new system are expected to<br />

be tremendous. For example, Tera-BLAST, TimeLogic’s implementation<br />

of the popular BLAST algorithm, will run hundreds<br />

of times faster than the current generation of SeqCruncher bioinformatics<br />

accelerators and will allow a single 1U server, with<br />

accelerated BLAST, to replace a full rack of multi-core servers.<br />

Product: FPGA system codenamed J1<br />

Company: Active Motif<br />

For more information: www.timelogic.com<br />



Educational Opportunities<br />

Keep abreast of the variety of educational events in the life science industry<br />

that will help you with your business and professional needs. To preview<br />

a more in-depth listing of educational offerings, visit the “Events” section<br />

of bio-itworld.com.<br />

To list an educational event, email lscimemi@healthtech.com.<br />

Featured Events<br />

CHI Events<br />

For more information on these<br />

conferences and other CHI events,<br />

visit healthtech.com.<br />

Biobanking: Maximizing Your Investment<br />

December 6-8, 2010 | Providence, RI<br />

The Compound Management Forum<br />

December 7-8, 2010 | Providence, RI<br />

Clinical Project Management<br />

December 8-9, 2010 | Philadelphia, PA<br />

January 9-13, 2012 | Coronado, CA<br />

High-Content Analysis<br />

January 10-13, 2012 | San Francisco, CA<br />

Live-Cell Imaging<br />

January 12-13, 2012 | San Francisco, CA<br />

Biomarker Assay Development<br />

January 31 - February 1, 2012 | San Diego, CA<br />

Barnett Educational Services<br />

Barnett Live Seminars<br />

Data Management in the Electronic Data<br />

Capture Arena<br />

December 1-2, 2011 | San Diego, CA<br />

Monitoring Clinical Drug Studies:<br />

Advanced<br />

December 1-2, 2011 | San Diego, CA<br />

Patient Registry Programs<br />

December 1-2, 2011 | San Diego, CA<br />

Drug Safety and Pharmacovigilance<br />

December 5-6, 2011 | Boston, MA<br />

Regulatory Intelligence<br />

December 7, 2011 | Philadelphia, PA<br />

Executive Summit West: Drug-Diagnostic<br />

Co-Development<br />

January 31 - February 1, 2012 | San Diego, CA<br />

Summit for Clinical Operations<br />

Executives (SCOPE)<br />

February 7-9, 2012 | Miami, FL<br />

Cloud Computing<br />

February 19-20, 2012 | San Francisco, CA<br />

Next Generation Pathology<br />

February 19-20, 2012 | San Francisco, CA<br />

February 21-23, 2012 | San Francisco, CA<br />

Integrated R&D Informatics for<br />

Knowledge Management<br />

February 21-23, 2012 | San Francisco, CA<br />

Bioinformatics & Cancerinformatics<br />

2012<br />

February 21-23, 2012 | San Francisco, CA<br />

March 5-8, 2012 | San Diego, CA<br />

microRNA in Human Disease and<br />

Development<br />

March 12-13, 2012 | Cambridge, MA<br />


Visit BarnettInternational.com for detailed information on Barnett’s live<br />

seminars, interactive web seminars, on-site training programs, customized eLearning<br />

development services and publications.<br />

Conducting Clinical Trials in Resource-<br />

Limited Settings<br />

December 8-9, 2011 | Boston, MA<br />

Monitoring Clinical Drug Studies:<br />

Intermediate<br />

December 8-9, 2011 | Boston, MA<br />

Medical Device GCP Overview<br />

December 8-9, 2011 | Boston, MA<br />


Statistical Concepts for Non-Statisticians<br />

December 8-9, 2011 | Boston, MA<br />

Working with CROs<br />

December 5-6, 2011 | Philadelphia, PA<br />

Pharmacovigilance Audit<br />

December 13, 2011 | San Diego, CA<br />

April 24-26, 2012 | Boston, MA<br />

PEGS: the essential protein engineering<br />

summit<br />

April 30 - May 4, 2012 | Boston, MA<br />

Strategic Alliance Management Congress<br />

May 7-9, 2012 | Philadelphia, PA<br />

Biomarker World Congress<br />

May 21-23, 2012 | Philadelphia, PA<br />

World Pharmaceutical Congress<br />

June 5-7, 2012 | Philadelphia, PA<br />

June 6-8, 2012 | Singapore<br />

Effectively Writing the Investigator’s<br />

Brochure<br />

December 14, 2011 | Philadelphia, PA<br />

Global IND Submissions<br />

December 15-16, 2011 | Philadelphia, PA<br />

Institutional Review Boards (IRBs)<br />

December 15-16, 2011 | Philadelphia, PA<br />

Barnett Web Seminars<br />

Introduction to Data Management<br />

December 5, 2011<br />

NEW! Implications of the New FDA<br />

Guideline for a Risk-Based Approach to<br />

Monitoring<br />

December 5, 2011<br />

NEW! Disqualification of Clinical<br />

Investigators: Proposed Rule and FDA<br />

Transparency Initiative<br />

December 6, 2011<br />

NEW! Electronic Source Documentation:<br />

Navigating the New FDA Draft Guidance<br />

December 6, 2011<br />

NEW! FDA’s Bioresearch Monitoring (BIMO)<br />

Program: New Guidance for Inspection of<br />

Sponsors, CROs, and Monitors<br />

December 7, 2011<br />

NEW! Financial Disclosure: New FDA<br />

Draft Guidance for Clinical Investigator<br />

Reporting<br />

December 7, 2011<br />

Drug Safety and Pharmacovigilance<br />

December 8, 2011<br />

Site Management and the Art of<br />

Assertiveness<br />

December 8, 2011<br />

Monitoring Phase I Clinical Trials<br />

December 9, 2011<br />

NEW! De-Risk Your Protocol by Developing<br />

an Operational Advisory Board<br />

December 9, 2011<br />


Webcasts, White Papers, and Podcasts<br />

Visit bio-itworld.com to browse our<br />

extensive list of complimentary Life<br />

Science white papers, podcasts and<br />

webcasts.<br />

To learn more about developing a<br />

multimedia lead generating<br />

solution, contact Lisa Scimemi at<br />

lscimemi@healthtech.com<br />

Whitepapers<br />

A BPM-Approach to Adverse Event<br />

Management<br />

Sponsored by Pegasystems<br />

[White paper cover: A BPM-Approach to Safety Management, by John Russell. Produced by Cambridge Healthtech Media Group Custom Publishing. www.pega.com]<br />

Safety management is one of the most difficult requirements imposed<br />

on the life sciences industry. Companies confront a tangle of safety monitoring<br />

requirements that span both pre- and postmarket approval activities<br />

and vary by IRB/IEC governance policies,<br />

product type, and different global regulatory<br />

agencies. Learn how Pegasystems BPM and its<br />

Adverse Event Case Processing Solution (AECP)<br />

can help companies:<br />

• Transform adverse event management systems<br />

• Lower cost with increased productivity<br />

One client successfully utilized Pega BPM to<br />

establish paperless adverse event reporting<br />

across 23 countries, resulting in upwards of a<br />

100% increase in productivity with near 50%<br />

cost reduction.<br />

Visit: www.bio-itworld.com to download<br />

Accelerating Decision Making Processes in<br />

Life Sciences R&D<br />

Sponsored by HP and Microsoft<br />

Life sciences organizations need an easy way to<br />

access and share information to make timely<br />

R&D decisions. Read this white paper to learn:<br />

• How IT and business challenges complicate<br />

decision-making efforts<br />

• How to avoid missed opportunities, duplicate<br />

work, and inefficient use of time<br />

• How a jointly developed HP and Microsoft<br />

business intelligence appliance provides a<br />

way for life sciences researchers and managers<br />

to accelerate their analysis and decisionmaking<br />

work.<br />

Visit: www.bio-itworld.com to download<br />

Embedding Oracle Technologies in<br />

Life Sciences Solutions<br />

Sponsored by Oracle<br />

Data drives science, but can slow it down. If<br />

data, structured or unstructured, is made<br />

more easily accessible, more easily shared,<br />

and more readily analyzed it will produce better<br />

results. In life science and healthcare that<br />

can mean the difference between a new drug<br />

finding its way to the healthcare market and<br />

patient, or simply getting lost in the data glut.<br />

Oracle’s Embedded technologies can lower<br />

problematic data hurdles while producing<br />

innovation and productive outcomes. This<br />

paper discusses how Independent Software<br />

Vendors, Original Equipment Manufacturers<br />

and, of course, users can quickly and easily<br />

implement solutions that leverage Oracle<br />

embedded technologies.<br />

The paper covers benefits for both vendors<br />

and users including:<br />

• Vendor advantages: Faster time to market,<br />

greater differentiation, customer satisfaction,<br />

enterprise class support and security.<br />

• User advantages: Improved Data handling,<br />

faster clinical trials, improved business<br />

reporting, improved ROI.<br />

Visit: www.bio-itworld.com to download<br />

Podcast<br />

Metrics that Matter: How Actionable Data<br />

Can Drive Better Decisions<br />

With the widespread<br />

adoption of eClinical<br />

technology, clinical operations<br />

departments have<br />

access to unprecedented<br />

amounts of data. This<br />

podcast will discuss how<br />

clinical business analytics<br />

can help sponsors effectively mine that data to<br />

make more informed decisions.<br />

Industry executives will address the following<br />

questions:<br />

• What are the key technical and organizational<br />

challenges to efficiency in clinical<br />

operations?<br />

• How is the definition of actionable data<br />

evolving?<br />

• How can clinical business analytics be an<br />

agent of change for an organization?<br />

• What technical and organizational issues<br />

should sponsors consider when seeking a<br />

business analytics solution?<br />

Speakers: Stephen Young, Senior Product<br />

Director, Medidata Solutions and Laurie<br />

Halloran, CEO & President, Halloran Consulting<br />

Group<br />

Listen Now — Visit: www.bio-itworld.com to<br />

download<br />

For more details on this newly<br />

published report, visit<br />

InsightPharmaReports.com<br />

Web Symposia Series covers a broad array<br />

of topics within the life sciences and drug<br />

development enterprise.<br />

• Register for upcoming web symposia<br />

• Listen to recorded web events<br />

• Purchase a DVD or Electronic Version<br />

• Sponsor a symposium on a topic of<br />

your choice<br />

For details on the Web Symposia<br />

Series, visit www.bio-itworldsymposia.com<br />

or email lscimemi@healthtech.com<br />


The Russell Transcript<br />

Crescendo Bioscience’s Aspirations<br />

JOHN RUSSELL<br />

The rise of personalized medicine—however broadly<br />

or narrowly we define it—has been stymied in part<br />

by a lack of effective biomarkers. Cancer is perhaps<br />

an early and growing exception. The trend to profile<br />

a specific patient’s tumor for markers to help physicians<br />

pick the best therapeutic regime is a good example<br />

of the growing sophistication of biomarkers and their long-term<br />

potential as the cost of performing such tests declines.<br />

Of course many diseases could (and eventually will) benefit<br />

from the kind of biomarkers becoming common in cancer. Biomarkers<br />

are necessary linchpins to enable modern medicine to<br />

translate advances in understanding disease biology into practical<br />

diagnosis, prognosis, and treatment. One young company,<br />

Crescendo Bioscience, has set its sights on developing and commercializing<br />

markers for autoimmune diseases. It’s a ripe area<br />

and the investing community seems to agree. In September,<br />

Crescendo completed a $31 million Series C round and also<br />

struck a strategic investment deal with Myriad Genetics for<br />

$25 million.<br />

Crescendo’s first target is rheumatoid arthritis (RA). CEO<br />

William Hagstrom says, “It was interesting. You could walk<br />

around trade exhibit floors and see all manner of descriptions<br />

of RA biology and pathways and mechanisms of how drugs<br />

affected them. There wasn’t a single booth or company talking<br />

about how they were measuring those things so physicians<br />

could better understand what to do with these drugs.”<br />

RA is a solid target on two critical grounds. First, like a great<br />

many biomarkers, current RA markers are both coarse and almost<br />

entirely subjective.<br />

“You don’t cure RA and most autoimmune patients generally.<br />

You seek to put them into clinical remission and the standard<br />

measures for assessing inflammatory disease activity all had to<br />

do with external manipulation of patients’ joints or digits and<br />

doing calculations which were qualitative and subjective as opposed<br />

to being based on the biology of the disease,” Hagstrom says.<br />

Second, there are around 3,500 office-based rheumatologists<br />

in the U.S. alone who treat patients with many different autoimmune<br />

diseases, and the typical RA patient has four visits each<br />

year. Marketing to that physician community and their patients<br />

is a doable task, and the market is sizeable. The chronic nature of the<br />

disease has also created an active patient community seeking to<br />

self-manage their disease and receptive to new approaches.<br />

Markers for Disease Biology<br />

The cornerstone of Crescendo’s approach is to develop quantitative<br />

markers based on detailed models of disease biology.<br />

Its first test, Vectra DA, launched last year at ACR, is a blood<br />

test that measures 12 key proteins consistently associated with<br />

RA disease activity and integrates them into a single objective<br />

score for easy interpretation.<br />

The test is intended to provide physicians with a quantitative<br />

glimpse into RA activity and was developed much the way<br />

modern drugs are. Drawing from the literature and using computational<br />

tools such as Ingenuity Systems’ pathway database<br />

and analysis software and Entelos’ RA Physiolab platform, Crescendo<br />

built a detailed picture of RA biology.<br />

“We were technology agnostic at the front end and chose to<br />

screen very large numbers of biomarkers on several different<br />

technology platforms (e.g. gene expression) to determine which<br />

markers correlated [with] quantitative disease activity,” says<br />

Hagstrom. Of an initial list of 390 markers, Crescendo identified<br />

a small subset to take into a formal development program.<br />

Listen to the audio podcast<br />

“Entrepreneurial Journeys in<br />

Healthcare” with William Hagstrom.<br />

“We optimized the biomarkers,<br />

put them on a single technology<br />

platform, and built a cohort of 25<br />

centers across North America called<br />

INFORM, which was used as the<br />

primary sample set for efforts to develop<br />

an algorithm able to deliver a quantitative score on a scale<br />

of 1 to 100 to discern a patient’s level of disease activity,” he says.<br />

Validation studies in Europe and the U.S. followed. Crescendo<br />

has a CLIA lab in San Francisco and is a member of the Batter<br />

Up consortium (see “Batter Up: A Stratified Approach to<br />

Rheumatoid Arthritis,” Bio•IT World, March 2011) working on<br />

multiple approaches to RA treatment.<br />

Currently, Vectra DA is not linked to any specific therapeutic.<br />

“A physician can see in the face of various drug therapies<br />

whether the disease activity is declining, stable, or increasing.<br />

But we are not measuring drug effect, nor are we seeking to<br />

predict a drug response or a drug selection at this point in time.<br />

Those are targets of future product development activities,”<br />

Hagstrom says. Crescendo already reports the levels of the test’s<br />

constituent proteins.<br />

It will be interesting to track Crescendo’s progress and its<br />

impact on broader industry efforts to develop and commercialize<br />

biomarkers required to deliver personalized medicine.


10th Annual BioPartnering North America<br />

Growing Life Science<br />

Business in the Pacific Rim<br />

www.techvision.com/bpn<br />

Vancouver Convention Centre | Vancouver, BC, Canada<br />

26-28 February 2012<br />

Now in its 10th year of success, BioPartnering North America will build on<br />

strong relationships focusing more clearly on the Pacific Rim. Vancouver is<br />

well placed to provide these crossroads, where companies from around the<br />

world can access new business opportunities. With strong support from trade<br />

associations in Asia, Europe, North America, and elsewhere, together with<br />

creative input from our Advisory Boards, we are planning to take BPN to a<br />

higher level, and this year’s theme indicates this shift in emphasis:<br />

Growing Life Science Business in the Pacific Rim.<br />



Life Sciences<br />

Innovation Forum<br />

Harness Innovation in a Shifting Global Business Climate<br />

January 26-27, 2012 Marriott Forrestal Village Conference Center, Princeton, NJ<br />

Featuring Insights into the Latest Innovations in Life Sciences, including:<br />

• How to Foster Meaningful and Compliant Relationships with Patients and<br />

Physicians<br />

• Maximize Resource Allocation, Avoid Bottlenecks, and Bring Increased<br />

Efficiency to Your Clinical and Regulatory Processes<br />

• Determine the Impact of Electronic Medical Records on Your Impact Today<br />

and Innovate for the Future<br />

• Leverage the Impact of Mobile Devices to Improve Internal<br />

Communications and External Collaboration<br />

• Gain New Efficiencies and Time Savings through Effective Clinical Trial<br />

Management Technologies<br />

• Increase Operational Effectiveness through the Development of Advanced<br />

Translational Medicine Models<br />

• Enable Global Collaboration and Compliance with Regulatory Bodies<br />

• Hear the Latest Trends and Innovations around Global Submissions<br />

• Utilize Virtual and Collaborative Technologies to Improve Speed and<br />

Decision Making Ability<br />

• And Much More!<br />

Premier Sponsor:<br />

Bio-IT World<br />

Subscribers receive<br />

10% off standard<br />

registration fees!<br />

Use code C210BIO<br />

when registering!<br />

Official Media Partner:<br />

To Register Now, Call 866-207-6528 or Visit Our Website www.LSInnovation.com
