mafft | Align multiple amino acid or nucleotide sequences
kandi X-RAY | mafft Summary
Align multiple amino acid or nucleotide sequences.
Community Discussions
Trending Discussions on mafft
QUESTION
I have three text files:
...ANSWER
Answered 2019-Nov-29 at 00:10
If you don't mind using the re module for parsing the files, you can use this example:
QUESTION
I have three files, they look like this:
...ANSWER
Answered 2019-Nov-28 at 19:05
Try this:
QUESTION
I have some data that describes an ordered set of discrete events (or states). There are 34 possible states, which may occur in any order and may repeat. Each sequence of events can contain any number of events, and crucially there are more than 2 sequences of events. My eventual aim is to cluster these sequences into similar subsets, but my hunch is that this cannot be meaningful unless these sequences are aligned such that equivalent events occupy the same position in all sequences.
I'm very familiar with multiple alignment of biological sequences, but all the software I've come across for this (MUSCLE, MAFFT, T-COFFEE, Clustal*, etc) require DNA, RNA or AA sequences, and I have more states than any of these, so I can't get them to work.
I've found various implementations of the pairwise alignment algorithms such as Needleman-Wunsch in R, but so far haven't come across any generic (non-biological) implementations of any multiple sequence alignment algorithms.
For example, say my data looks like this:
...ANSWER
Answered 2019-Apr-20 at 18:04
Assuming that we need to match with LETTERS, one option is str_match, then change the NA to -, paste
QUESTION
I've been attempting to use the MAFFT command line tool as a means to identify coding regions within a genome. My general process is to align the amino acid consensus sequence of a gene to a translated reading frame of a target sequence. My method has been largely successful. However, I've noticed some peculiar alignments which will unfortunately impede my annotation method. The following is one such example (note: I've also included a pairwise alignment from the Pairwise2 Biopython module to demonstrate my desired output. Unfortunately, Pairwise2 is nearly 20 times slower than the MAFFT command line):
...ANSWER
Answered 2017-Aug-01 at 22:21
I managed to figure out a possible solution. Aligning the example sequences provided results in a long terminal (end) gap which should not be present. Changing the MAFFT alignment algorithm using localpair, lexp, and lop had no effect (causing me a good deal of confusion). However, I have noticed differences in the alignment output when each input sequence is reversed. Oddly, the only way I was able to remove the terminal gap was to set lop (gap opening penalty) to a lesser amount relative to lexp (gap extension penalty). I suspect my solution is niche and may not be applicable to other similar occurrences of terminal gaps. Changing the alignment settings also likely makes the resulting alignment less optimal.
Going forward, I plan to use an automated process to run alignments of consensus sequences to raw sequences. In the event I detect irregularities with the alignment output (specifically terminal gaps), I'll attempt to reverse the input sequences and apply custom alignment settings. I suppose if that isn't a consistent solution, I'll figure out a way to refine the alignment output directly.
For anyone curious, I used a lexp value of -1.5 and a lop value of 0.5 (now included in a commented-out line in my example code).
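The automated check described in this answer could look roughly like the sketch below. It is only an illustration under stated assumptions: the function names, the max_terminal_gap threshold and the FASTA parsing are hypothetical and not taken from the answerer's code, and the step of reversing the input sequences is left out. The --localpair, --lop and --lexp options are MAFFT command line flags; the values 0.5 and -1.5 are the ones quoted above.

```python
import shutil
import subprocess


def parse_fasta(text):
    """Parse FASTA-formatted text into a {name: sequence} dict."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None and line:
            records[name] += line
    return records


def terminal_gap_lengths(aligned_seq, gap="-"):
    """Return (leading, trailing) gap run lengths of one aligned sequence."""
    leading = len(aligned_seq) - len(aligned_seq.lstrip(gap))
    trailing = len(aligned_seq) - len(aligned_seq.rstrip(gap))
    return leading, trailing


def build_localpair_command(fasta_path, lop=None, lexp=None):
    """Build a mafft call using --localpair, optionally with custom penalties."""
    cmd = ["mafft", "--localpair"]
    if lop is not None:
        cmd += ["--lop", str(lop)]
    if lexp is not None:
        cmd += ["--lexp", str(lexp)]
    cmd.append(fasta_path)
    return cmd


def align_with_retry(fasta_path, max_terminal_gap=10):
    """Align with defaults; retry with the quoted penalties if a long end gap appears.

    max_terminal_gap is a hypothetical threshold, chosen only for illustration.
    """
    if shutil.which("mafft") is None:
        return None  # mafft is not installed, so there is nothing to run

    def run(cmd):
        return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

    output = run(build_localpair_command(fasta_path))
    gaps = [max(terminal_gap_lengths(seq)) for seq in parse_fasta(output).values()]
    if gaps and max(gaps) > max_terminal_gap:
        # Values reported in the answer: lop 0.5, lexp -1.5.
        output = run(build_localpair_command(fasta_path, lop=0.5, lexp=-1.5))
    return output
```

Whether a given terminal gap counts as an irregularity depends on the data, so the threshold would need tuning for real consensus-to-frame alignments.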
QUESTION
I've been trying to use the Mafft alignment tool from Bio.Align.Applications. Currently, I've had success writing my sequence information out to temporary text files that are then read by MafftCommandline(). However, I'd like to avoid redundant steps as much as possible, so I've been trying to write to a memory file instead using io.StringIO(). This is where I've been having problems. I can't get MafftCommandline() to read internal files made by io.StringIO(). I've confirmed that the internal files are compatible with functions such as AlignIO.read(). The following is my test code:
...ANSWER
Answered 2017-Jul-06 at 05:36
"I can't get MafftCommandline() to read internal files made by io.StringIO()."
This is not surprising for a couple of reasons:
1. As you're aware, Biopython doesn't implement Mafft, it simply provides a convenient interface to set up a call to mafft in /usr/local/bin. The mafft executable runs as a separate process that does not have access to your Python program's internal memory, including your StringIO file.
2. The mafft program only works with an input file, it doesn't even allow stdin as a data source. (Though it does allow stdout as a data sink.) So ultimately, there must be a file in the file system for mafft to open. Thus the need for your temporary file.
Perhaps tempfile.NamedTemporaryFile() or tempfile.mkstemp() might be a reasonable compromise.
QUESTION
I am using mafft from the biopython package to align my sequences:
...ANSWER
Answered 2017-Jun-29 at 08:33
You would need to set the parameter via the object property, e.g.
QUESTION
I am interested in running mafft from biopython. It works fine (code below).
...ANSWER
Answered 2017-May-23 at 12:49
You do this by setting the adjustdirection property of mafft_cline. By default, it is False.
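As a rough illustration of what that property controls, the sketch below builds the equivalent command line by hand. With the Biopython wrapper, the property would be set as mafft_cline.adjustdirection = True before running it; on the command line this corresponds to MAFFT's --adjustdirection flag. The function name and the reads.fasta file name are hypothetical.

```python
import shlex


def build_mafft_command(fasta_path, adjust_direction=False):
    """Build a mafft command line, optionally adding --adjustdirection.

    Mirrors the Biopython property mentioned above: the flag is off by
    default and is added only when adjust_direction is True.
    """
    cmd = ["mafft"]
    if adjust_direction:
        cmd.append("--adjustdirection")
    cmd.append(fasta_path)
    return cmd


if __name__ == "__main__":
    # "reads.fasta" is a placeholder input file name.
    print(shlex.join(build_mafft_command("reads.fasta", adjust_direction=True)))
```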
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported