3
Estimating Gene Function With Least Squares Nonnegative Matrix Factorization

Guoli Wang and Michael F. Ochs

From: Methods in Molecular Biology, vol. 408: Gene Function Analysis. Edited by: M. Ochs © Humana Press Inc., Totowa, NJ

Summary
Nonnegative matrix factorization is a machine learning algorithm that has extracted information from data in a number of fields, including imaging and spectral analysis, text mining, and microarray data analysis. One limitation with the method for linking genes through microarray data in order to estimate gene function is the high variance observed in transcription levels between different genes. Least squares nonnegative matrix factorization uses estimates of the uncertainties on the mRNA levels for each gene in each condition to guide the algorithm to a local minimum in normalized χ², rather than a Euclidean distance or divergence between the reconstructed data and the data itself. Herein, application of this method to microarray data is demonstrated in order to predict gene function.

Key Words: Clustering; least squares; microarray data analysis; nonnegative matrix factorization (NMF); pattern recognition; machine learning.

1. Introduction
Nonnegative matrix factorization (NMF) was introduced by Lee and Seung for image decomposition (1). Because of benefits in both interpretation and implementation, NMF was soon adopted in other research, including text mining (2), spectral decomposition (3), multiple sequence alignment (4), and neurophysiology (5). The application of NMF to microarray data analysis showed that it could be superior to clustering techniques for prediction of gene function (6,7). One issue that has limited application of NMF in many areas is that the patterns found within the data are diffuse, leading to attempts to limit the distributions through sparse matrix methods (e.g., see ref. 8). In addition, because measurements on mRNA levels of different genes show large differences in variance, a method that utilizes variance estimates was recently introduced to improve predictions of gene function (9).
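The idea of steering the factorization by a normalized χ² rather than a plain Euclidean distance can be illustrated with a small Python sketch. It is not the authors' software. It assumes the standard uncertainty-weighted form of the Lee and Seung multiplicative updates as a stand-in, and it assumes that χ² is normalized by the number of data entries. All names (D, sigma, A, P, normalized_chi2, weighted_nmf) and the synthetic data are hypothetical and chosen for illustration only.

```python
import numpy as np

def normalized_chi2(D, sigma, A, P):
    """Mean squared residual, each residual scaled by its uncertainty.

    Assumption: 'normalized' here means divided by the number of entries.
    """
    return float(np.sum(((D - A @ P) / sigma) ** 2) / D.size)

def weighted_nmf(D, sigma, k, n_iter=200, seed=0, eps=1e-12):
    """Factor D (genes x conditions) into nonnegative A (genes x k) and
    P (k x conditions) with uncertainty-weighted multiplicative updates.

    Assumption: this is the generic weighted variant of the Lee-Seung
    update, used as a stand-in for the method described in the chapter.
    """
    rng = np.random.default_rng(seed)
    W = 1.0 / sigma ** 2                      # weights: inverse variances
    A = rng.random((D.shape[0], k)) + 0.1     # strictly positive start
    P = rng.random((k, D.shape[1])) + 0.1
    history = []
    for _ in range(n_iter):
        # Each update rescales the current factor by a nonnegative ratio,
        # so A and P stay nonnegative as long as D is nonnegative.
        P *= (A.T @ (W * D)) / (A.T @ (W * (A @ P)) + eps)
        A *= ((W * D) @ P.T) / ((W * (A @ P)) @ P.T + eps)
        history.append(normalized_chi2(D, sigma, A, P))
    return A, P, history

# Synthetic example: 30 "genes", 8 "conditions", 3 hidden patterns.
rng = np.random.default_rng(1)
clean = rng.random((30, 3)) @ rng.random((3, 8))
sigma = 0.05 + 0.1 * clean                    # noisier where signal is higher
D = np.clip(clean + rng.normal(0.0, sigma), 0.0, None)

A, P, history = weighted_nmf(D, sigma, k=3)
print(f"normalized chi2 after first update: {history[0]:.3f}")
print(f"normalized chi2 after last update:  {history[-1]:.3f}")
```

Because every residual is divided by its own uncertainty, entries measured with high variance pull less on the fit than precisely measured ones, which is the motivation given in the Summary for using χ² instead of an unweighted distance.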