melody -blast -t PAS.tem -plus PAS.map
fugueali -seq input.fa -prf PAS.fug
should produce a sequence-structure alignment identical to that from the alignment server.
You can also download the default list of profiles from ftp://mizuguchilab.org/software/fugue/data/allprf.lst. Each line in this file specifies the location of a structural profile.
If PAS.fug and the other profiles are stored locally as listed in allprf.lst, and if you have PSI-BLAST and seals installed locally, the following command will produce results more or less identical to those obtained from the web interface:
run_fugue -seq input.fa -list allprf.lst
This will create the output file fugue.html. See "How to interpret the output page" for how to read this file.
If you want to create your own profiles, go through the following
steps. (You could then add your own profiles to
allprf.lst; see below.)
If your structure is in the file mystructure.pdb, simply type
joy mystructure.pdb
If you want to use a particular chain (e.g., A) of a structure that is already in the PDB (e.g., 1abc), you don't have to prepare a PDB file. You can simply type
joy 1abcA
(For this to work, you need a local copy of the PDB and must set the environment variable JOY_PDBDIR.)
If you have prepared a structural alignment, save it in myfamily.ali and type
joy myfamily.ali
You need the PDB files for all the structures in the alignment (see here for more information). At this stage, do not include sequence-only entries in the .ali file. Make sure JOY has finished normally and created the file myfamily.tem. As an example, look at the PAS.ali file in the above ftp directory, from which PAS.tem can be created. (You need to download all the .psa, .hdb, .sst and .cof files.)
homblast -seq myfamily.ali
This will create the file myfamily.map. Note that this command may take a long time if your .ali file contains many structures and they are large proteins and/or have many homologues.
(Note) This command will create a directory named blast, in which psi-blast is run (up to five iterations with the default inclusion cut-off). Once the .map file has been created properly, you can delete the whole directory. In fact, homblast will not run if a directory named blast exists, so to rerun the command you need to delete this directory first.
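The rerun requirement in the note above can be handled by a small wrapper. The following is a minimal Python sketch and is not part of the FUGUE distribution: the function names clear_stale_blast_dir and rerun_homblast are hypothetical, and the sketch assumes that homblast is on your PATH and is invoked exactly as shown in this document. Only remove the directory once you no longer need its contents.

```python
import shutil
import subprocess
from pathlib import Path


def clear_stale_blast_dir(workdir="."):
    """Remove a leftover 'blast' directory in workdir, if there is one.

    Returns True if a directory was removed, False otherwise.
    """
    blast_dir = Path(workdir) / "blast"
    if blast_dir.is_dir():
        shutil.rmtree(blast_dir)
        return True
    return False


def rerun_homblast(ali_file, workdir="."):
    """Hypothetical helper: clear the old 'blast' directory, then run homblast.

    Assumes the 'homblast' executable is available on PATH.
    """
    clear_stale_blast_dir(workdir)
    # Same command line as in the text: homblast -seq myfamily.ali
    subprocess.run(["homblast", "-seq", str(ali_file)], cwd=workdir, check=True)
```

For example, rerun_homblast("myfamily.ali") would delete ./blast if it exists and then start homblast again in the current directory.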
melody -t myfamily.tem [ -blast -plus myfamily.map ]
This will create the file myfamily.fug, which is a structural profile that FUGUE requires.
To enrich the profile with sequence information, use the option -plus myfamily.map, where the .map file was produced in step 3 above. If this file was created by homblast (as above), you should also add the option -blast. This option assumes that the quality of the alignment in the .map file is not very high, and several filters are applied accordingly.
Alternatively, you can specify your own multiple sequence alignment saved in a PIR file. In this case, you may omit the -blast option.
run_blast -seq myseq.fa
This will create a multiple alignment in the file
fugueali -seq myseq.inp -prf myfamily.fug [ -joy -blast ]
For more information, type
/my_directory/myprofileA.fug
A default list of profiles can be downloaded from ftp://mizuguchilab.org/software/fugue/data/allprf.lst. You can modify this and add your own profiles.
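Maintaining a private profile list can be scripted. The following is a minimal Python sketch, assuming (as stated earlier in this document) that each line of the list file gives the location of one structural profile and nothing else. The function names add_profile and missing_profiles are hypothetical and are not part of FUGUE.

```python
from pathlib import Path


def add_profile(list_file, profile_path):
    """Append profile_path to list_file unless it is already listed.

    Assumes one profile location per line. Returns True if a line was added.
    """
    list_file = Path(list_file)
    entry = str(profile_path)
    text = list_file.read_text() if list_file.exists() else ""
    existing = [line.strip() for line in text.splitlines()]
    if entry in existing:
        return False
    with list_file.open("a") as handle:
        # Make sure the new entry starts on its own line.
        if text and not text.endswith("\n"):
            handle.write("\n")
        handle.write(entry + "\n")
    return True


def missing_profiles(list_file):
    """Return the listed profile paths that do not exist on disk."""
    lines = Path(list_file).read_text().splitlines()
    return [p.strip() for p in lines if p.strip() and not Path(p.strip()).exists()]
```

Calling missing_profiles("myprofiles.lst") before a long run is a cheap way to catch a mistyped path in the list.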
run_fugue -seq myseq.fa -list myprofiles.lst
or if you already have an alignment, type
fugueseq -seq myali.aln -list myprofiles.lst
fugueseq -seq candidate1.fa -prf mystr.fug
fugueseq -seq candidate2.fa -prf mystr.fug
Examine the Z-scores and alignment quality. Alternatively, you can save all these candidate sequences in a single fasta file (candidates.fa) and type
fugueprf -prf mystr.fug -seq candidates.fa
This will be a slow operation and may cause problems when some of the candidate sequences include domains that do not belong to this family. The program will report a list of Z-scores and save the top hits in the file hit.seq. Either way, save the selected homologous sequences (unaligned) in a fasta file (homseq.fa).
chorus -prf mystr.fug -seq homseq.fa -keeporder [ -plus ] -o homseq_mystr.ali
This will add the sequences in the file homseq.fa one by one to the original structure-based alignment. If the -plus option is specified, the combined structure-sequence profile is updated every time a new sequence is added. Without this option, each sequence is compared against the original structural profile.
fugueali -prf mystr.fug -seq aligned_homseq.alin -o homseq_mystr.ali
fugueali is a program for producing sequence-structure alignments; given a sequence (or alignment) and a structural profile, it produces an optimal alignment between the two.
Homology recognition and sequence-structure alignment are related but distinct operations, and we have decided that they are best carried out by two separate programs. In fact, recognition and alignment are performed separately even within fugueseq (see the note "Automatic selection of alignment algorithms").
The main reason for using different alignment modes for recognition and alignment is that parameters and algorithms that are good for recognition may not be suitable for producing a good alignment, and vice versa. When recognizing homology between a sequence and a structure, the matching of one or more short fragments may produce a significant signal even though other regions may be completely misaligned (FUGUE never uses a local alignment algorithm). This is not a problem for homology detection.
However, when we produce an alignment, we want to choose the parameters and algorithms that maximize the number of correctly aligned residue pairs, and the optimal choice there could be different. In general, what we have observed in past experiments is that when there is a significant difference in length between sequence and structure, a global-local algorithm (2 or 3) often produces a better alignment than the global algorithm (0). However, homology detection performance does not follow the same pattern.
Thus, by default, fugueseq uses different rules to decide which alignment mode to use for homology detection (z-score) and alignment.
fugueseq includes other features not present in fugueali, and thus fugueali may not produce the same alignment as fugueseq does. For example, fugueali keeps all the input sequences intact, while fugueseq, unless run with the '-keepseq' option, may remove some sequences according to PID, as well as some gap-rich columns in the input sequence alignment.
Another important difference is that fugueseq modifies gap penalties to emphasise the structural core, where structural conservation is more likely to be detected, while fugueali does not. Because of this, the overall alignment quality of fugueseq is usually not as good as that of fugueali, although the difference may sometimes be small.
In summary, the alignment produced by fugueseq (with default options) is a compromise between homology detection performance and alignment quality. In order to optimize the alignment when interesting homology is detected, fugueali should be used to re-do the alignment.
0 Global     -- Modified Needleman & Wunsch
1 Local      -- Smith & Waterman
2 GloLocSeq  -- Global algorithm but hangout in the sequence NOT penalized
3 GloLocPrf  -- Global algorithm but hangout in the profile NOT penalized
4 LocLoc     -- Global algorithm but all hangout NOT penalized
9 AUTOMATIC  -- select method automatically
The automatic selection uses the following definition:
seq_len/prf_len >= 1.5      Method = 2 (GloLocSeq)
                <  0.6667   Method = 3 (GloLocPrf)
Otherwise                   Method = 0 (Global)
See the further note on the alignment algorithm below.
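The length-ratio rule for automatic selection can be written out as a short function. This is a minimal Python sketch of the rule as tabulated above, not FUGUE's own code: the function name select_method is hypothetical, and how FUGUE measures seq_len and prf_len internally (for example, when the input is an alignment) is not stated here and is left to the caller.

```python
# Method codes as listed in the table of alignment algorithms.
GLOBAL = 0
GLOLOCSEQ = 2
GLOLOCPRF = 3


def select_method(seq_len, prf_len):
    """Pick an alignment method from the sequence/profile length ratio.

    ratio >= 1.5     -> 2 (GloLocSeq)
    ratio <  0.6667  -> 3 (GloLocPrf)
    otherwise        -> 0 (Global)
    """
    if seq_len <= 0 or prf_len <= 0:
        raise ValueError("lengths must be positive")
    ratio = seq_len / prf_len
    if ratio >= 1.5:
        return GLOLOCSEQ
    if ratio < 0.6667:
        return GLOLOCPRF
    return GLOBAL
```

For example, a 300-residue sequence against a 100-position profile gives a ratio of 3.0 and selects method 2, while two inputs of equal length select the global method 0.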
4-1) If the initial choice is global (default)
4-2) If the initial choice is a non-global algorithm
This step is independent of the Z-score calculation. By default, the program uses an automatically selected algorithm for output alignments. However, if a particular algorithm is specified by the -A option, the automatic selection is disabled and the specified algorithm is used instead.