11.07.2015 Views

Bioinformatics for DNA Sequence Analysis.pdf - Index of


48 Katoh, Asimenos, and Toh

assuming case a if the type of alignment is unclear, because the assumption of case a is the safest of the three. Once an initial alignment is obtained, then trimming the alignment to include only the relevant homologous parts can be done manually, and then the methods designed for case c can be applied. In addition, consider the tradeoff between computational costs and accuracy: highly accurate methods tend to be time- and space-consuming. The three methods (G-INS-i, L-INS-i, and E-INS-i) explained here can process only < 500 sequences on a standard desktop computer.

Case a – E-INS-i. When long internal gaps are expected. This case corresponds to an SSU rRNA alignment composed of distantly related organisms, such as from the three different domains (Eukarya, Bacteria, and Archaea), or a cDNA alignment with splicing variants, both of which need long gaps to be inserted. E-INS-i is suitable for such data. It employs a generalized affine gap cost (33) in the pairwise alignment stage, in which the un-alignable regions are left unaligned.

% mafft --genafpair --maxiterate 1000 input_file > output_file

An alias is available:

% mafft-einsi input_file > output_file

Case b – L-INS-i. Locally alignable. When only one alignable block surrounded by long terminal gaps is expected, L-INS-i is recommended.

% mafft --localpair --maxiterate 1000 input_file > output_file

or

% mafft-linsi input_file > output_file

Case c – G-INS-i. Globally alignable. If the entire region of input sequences is expected to be aligned, we recommend G-INS-i that assumes global homology.

% mafft --globalpair --maxiterate 1000 input_file > output_file

or

% mafft-ginsi input_file > output_file

To obtain a high-quality MSA from the biological point of view, we recommend trying multiple independent methods (see Note 5), different parameter sets (see Note 4), and comparing various alignments by eye (see Note 6).

2.4. Adding New Sequence(s) to an Existing Alignment

In this section, we explain how to align a group of aligned sequences with another group of aligned sequences or with unaligned sequences. The tools explained here can be useful for a semi-automatic alignment or for combining "eye-ball" alignments and automated alignments, although they are not needed for fully automated sequence analyses.
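
As a recap of the three cases a, b, and c described earlier (E-INS-i, L-INS-i, and G-INS-i), the following is a minimal shell sketch that maps a case label to the corresponding command line. The function names `mafft_command` and `count_sequences` are hypothetical helpers introduced here for illustration and are not part of MAFFT; only the `mafft` options quoted in the text are used. The sketch prints the command rather than running it, and it assumes file names without spaces.

```shell
#!/bin/sh
# Hypothetical helper (not part of MAFFT): print the mafft command line
# for case a, b, or c as described in the text.
# Usage: mafft_command CASE INPUT OUTPUT
mafft_command() {
    case "$1" in
        a) pair=--genafpair ;;   # E-INS-i: long internal gaps expected
        b) pair=--localpair ;;   # L-INS-i: one alignable block, long terminal gaps
        c) pair=--globalpair ;;  # G-INS-i: entire region expected to align
        *) echo "unknown case: $1 (expected a, b, or c)" >&2
           return 1 ;;
    esac
    printf 'mafft %s --maxiterate 1000 %s > %s\n' "$pair" "$2" "$3"
}

# Hypothetical helper: count FASTA records by counting header lines.
# Assumes the input is in FASTA format, where each record starts with '>'.
# Useful because the text notes these methods handle only < 500 sequences
# on a standard desktop computer.
count_sequences() {
    grep -c '^>' "$1"
}
```

For example, `mafft_command a input_file output_file` prints the E-INS-i command line given in the text. Following the text's advice, case a is the safest choice when the type of alignment is unclear.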
