12.07.2015 Views

View - ResearchGate


38 Wang and Ochs

1. Download the LS-NMFrun.zip file and place it in a directory (i.e., folder) LS-NMF. Unzip this file (double-click on LS-NMFrun.zip on a Macintosh [Apple Inc., Cupertino, CA] or Windows [Microsoft Inc., Redmond, WA] computer, or give the command gunzip LS-NMFrun.zip on a Linux or Solaris [Sun Microsystems Inc., Santa Clara, CA] computer).
2. Download the LS-NMF_DATA.zip file and place it in a directory SampleData. Unzip this file as well. The reduced data is made up of ratio values for 169 genes across 20 time-points.
3. Download the ClutrFree tool. This can be placed in any directory, as it is an executable jar file.
4. If one wishes to set up ASAP, download the system and follow the installation instructions.

3.2. Preprocessing the Data

An important advantage of the LS-NMF algorithm is its nonnegativity constraints, which match the biology of mRNA levels (no negative quantities) while also reducing the mathematical space that must be searched to identify A and P matrices that can explain the observed data. As many researchers provide log-transformed data, it is necessary to transform such data back into ratios. The transformation to use depends on the original log-transform, but most typically it is 2^logratio, as log2 ratios are most commonly used. Such transformations can be done using a spreadsheet program. For original data, it is merely necessary to generate ratios for two-color arrays or to use expression estimates from Affymetrix (Santa Clara, CA) arrays, such as those provided by robust multichip analysis (12).

The input data format used by the downloaded LS-NMF package is the same as that of most commonly used microarray analysis tools, such as the Multiexperiment Viewer (MEV) tool (13), i.e., the matrices D and σ are stored in tab-delimited files. Robust multichip analysis provides this format as a standard output. Matrix D should be stored in a file named FILENAME.txt, and matrix σ in FILENAME.unc.
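
As an alternative to a spreadsheet, the back-transformation from log ratios to ratios described in this section could be sketched in Python as follows. This is a minimal illustration that assumes the data were log-transformed with a known base (2 by default); the function names and the example values are hypothetical and are not part of the LS-NMF package. It does not address how the uncertainties in σ should be transformed.

```python
def log_ratio_to_ratio(log_ratio, base=2.0):
    """Invert a log-transform: ratio = base ** log_ratio."""
    return base ** log_ratio


def delog_matrix(rows, base=2.0):
    """Apply the inverse log-transform to every entry of a matrix
    given as a list of rows (one row per gene)."""
    return [[log_ratio_to_ratio(v, base) for v in row] for row in rows]


# Hypothetical log2 ratios for two genes across three conditions.
log2_rows = [
    [0.0, 1.0, -1.0],
    [0.5, -0.5, 2.0],
]

ratios = delog_matrix(log2_rows)

# Every back-transformed value is strictly positive, which is what
# the nonnegativity constraints of the algorithm require.
assert all(v > 0 for row in ratios for v in row)
```

If the original data used a different base (for example, natural logarithms), only the base argument needs to change.
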
The format has the first row in FILENAME.txt and FILENAME.unc as a header, which labels the conditions, one column for each condition. The first column of each file provides the gene ID (e.g., probeset ID, gene name, and so on) (see Note 3).

The sample data set is already in this format, with the upper left portion of the tab-delimited cdc25-sep1.txt file appearing as

             time0      time1      time2       time3      time4
SPAC222.09   1.1886973  1.4043094  1.2545819   0.9742178  0.8308763
SPAC977.10   1.4588974  1.4858006  1.6240937   1.5055444  1.7027408
SPAC821.06   1.0014285  1.0467643  1.0772699   1.0653352  1.1393371
SPAC821.09   1.3300012  1.6415249  2.7272928   1.8456596  1.8935258
SPAC821.11   1.0598825  0.9647703  0.93459004  0.9713597  1.0335108
SPAC23C4.13  1.0806911  1.2341665  1.4772284   1.5164454  1.3917305
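
Before running the analysis, it can be useful to check that a data file follows the layout described above. The following Python sketch reads a tab-delimited matrix with a header row of condition labels and a first column of gene IDs, and rejects negative values. The function name read_matrix is hypothetical and not part of the LS-NMF package. The sketch also assumes that the header row may or may not contain a leading cell above the gene ID column, as the text does not specify this. The sample string reproduces only a corner of the excerpt shown above.

```python
import csv
import io


def read_matrix(handle):
    """Read a tab-delimited matrix: first row labels the conditions,
    first column holds the gene IDs, remaining cells are numbers."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    gene_ids, values = [], []
    for row in reader:
        if not row:  # skip blank lines
            continue
        gene_ids.append(row[0])
        values.append([float(x) for x in row[1:]])

    n_cols = len(values[0]) if values else 0
    if any(len(v) != n_cols for v in values):
        raise ValueError("rows have differing numbers of columns")
    if any(x < 0 for v in values for x in v):
        raise ValueError("negative value found; data must be nonnegative")

    # Assumption: keep the last n_cols header cells, so a header with or
    # without a leading cell over the gene ID column is accepted.
    conditions = header[-n_cols:] if n_cols else []
    return conditions, gene_ids, values


# A corner of the sample excerpt, written as tab-delimited text.
sample = (
    "\ttime0\ttime1\ttime2\n"
    "SPAC222.09\t1.1886973\t1.4043094\t1.2545819\n"
    "SPAC977.10\t1.4588974\t1.4858006\t1.6240937\n"
)

conditions, gene_ids, D = read_matrix(io.StringIO(sample))
```

The same function could be applied to the corresponding .unc file, after which the two results can be compared to confirm that the condition labels and gene IDs agree.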
