View - ResearchGate
View - ResearchGate View - ResearchGate
Gene Function Inference in Deletion Mutants 3The components necessary for the analysis are the following:• The Sun Java virtual machine runtime environment, available from Sun Microsystems(Palo Alto, CA) (http://www.sun.com), which allows to run the toolsused in this chapter. This is often installed by default on new computers.• The BD program (BDrun), part of the BDtools package available from the FoxChase Cancer Center Bioinformatics website (8), under the form of an archiveBDtools.tar.gz. This package is extracted with the following command line, whichcreates a directory “BDtools” containing binaries and documentation:$ tar xzf BDtools.tar.gz.• The ClutrFree program is also available from the Fox Chase Cancer Center Bioinformaticswebsite (9). This program is available as an executable jar file-clutrfree.jar.To install, download it and save it in the desired location (e.g., /home/ghbidaut/clutrfree/clutrfree.jar). Documentation is available as a pdf file from the same website.2.2. Data Set and Annotations2.2.1. Filtered Data SetThe filtered reduced data set is available as a tar archive from the supportingwebsite of this chapter (see ref. 10).To extract the archive, the following steps must be executed:1. The archive “filtered_rosetta_dataset.tar.gz” must be downloaded from the supportingwebsite to the hard drive.2. The archive is extracted by the following command line:$ tar xvf filtered_rosetta_dataset.tar.gz.3. This creates two file: Fr764_228_ratio.txt and Fr764_228_ratio.unc.The .txt file contains the gene-expression ratios of experiment over control (thisis not a log ratio). The .unc file (at the same format) contains the correspondinguncertainties derived from the gene-specific error model. This is a reduced versionof the original data set based on gene variation across experiment: genes showinga variation of at least threefold across experiments, and experiments characterizedby a variation of twofold across at least two of the remaining genes were retained,leaving a total of 764 genes across 228 conditions. The file format respects thestandard American Standard Code for Information Interchange (ASCII) tabdelimitedformat used in many analysis packages. Each row represents a gene transcriptionalprofile across mutants. The format is detailed in Table 1.2.2.2. Annotation of Genes and Conditions• For gene annotation, the MIPS ontologies (Munich Information Center for ProteinSequences, Munich, Germany) are being used. They are accessible through theirwebsite, and a Perl script (automips.pl) to retrieve and format them is provided.The script is downloadable from the supporting website and is invoked using thefollowing command line:$ auto_mips list_of_genes.txt -o annotation.txt.
- Page 2: Gene Function Analysis
- Page 6: METHODS IN MOLECULAR BIOLOGYGene Fu
- Page 12: PrefaceThis volume of Methods in Mo
- Page 16: Prefaceixcolleagues demonstrate how
- Page 20: xiiContentsPART III EXPERIMENTAL ME
- Page 26: ICOMPUTATIONAL METHODS I
- Page 36: Gene Function Inference in Deletion
- Page 40: Gene Function Inference in Deletion
- Page 44: Gene Function Inference in Deletion
- Page 48: Gene Function Inference in Deletion
- Page 52: Gene Function Inference in Deletion
- Page 56: [10.03]:Cell 2.6 [30]:Cellular 2.57
- Page 60: Gene Function Inference in Deletion
- Page 64: 2Association Analysis for Large-Sca
- Page 68: Table 1Tools That Can Perform Gene
- Page 72: Association Analysis for Large-Scal
- Page 76: Association Analysis for Large-Scal
Gene Function Inference in Deletion Mutants 3The components necessary for the analysis are the following:• The Sun Java virtual machine runtime environment, available from Sun Microsystems(Palo Alto, CA) (http://www.sun.com), which allows to run the toolsused in this chapter. This is often installed by default on new computers.• The BD program (BDrun), part of the BDtools package available from the FoxChase Cancer Center Bioinformatics website (8), under the form of an archiveBDtools.tar.gz. This package is extracted with the following command line, whichcreates a directory “BDtools” containing binaries and documentation:$ tar xzf BDtools.tar.gz.• The ClutrFree program is also available from the Fox Chase Cancer Center Bioinformaticswebsite (9). This program is available as an executable jar file-clutrfree.jar.To install, download it and save it in the desired location (e.g., /home/ghbidaut/clutrfree/clutrfree.jar). Documentation is available as a pdf file from the same website.2.2. Data Set and Annotations2.2.1. Filtered Data SetThe filtered reduced data set is available as a tar archive from the supportingwebsite of this chapter (see ref. 10).To extract the archive, the following steps must be executed:1. The archive “filtered_rosetta_dataset.tar.gz” must be downloaded from the supportingwebsite to the hard drive.2. The archive is extracted by the following command line:$ tar xvf filtered_rosetta_dataset.tar.gz.3. This creates two file: Fr764_228_ratio.txt and Fr764_228_ratio.unc.The .txt file contains the gene-expression ratios of experiment over control (thisis not a log ratio). The .unc file (at the same format) contains the correspondinguncertainties derived from the gene-specific error model. This is a reduced versionof the original data set based on gene variation across experiment: genes showinga variation of at least threefold across experiments, and experiments characterizedby a variation of twofold across at least two of the remaining genes were retained,leaving a total of 764 genes across 228 conditions. The file format respects thestandard American Standard Code for Information Interchange (ASCII) tabdelimitedformat used in many analysis packages. Each row represents a gene transcriptionalprofile across mutants. The format is detailed in Table 1.2.2.2. Annotation of Genes and Conditions• For gene annotation, the MIPS ontologies (Munich Information Center for ProteinSequences, Munich, Germany) are being used. They are accessible through theirwebsite, and a Perl script (automips.pl) to retrieve and format them is provided.The script is downloadable from the supporting website and is invoked using thefollowing command line:$ auto_mips list_of_genes.txt -o annotation.txt.