Friday, July 29, 2011

ClustalW2 Command line Arguments

CLUSTAL 2.0.12 Multiple Sequence Alignments


                DATA (sequences)

-INFILE=file.ext                             :input sequences.
-PROFILE1=file.ext  and  -PROFILE2=file.ext  :profiles (old alignment).


                VERBS (do things)

-OPTIONS            :list the command line parameters
-HELP  or -CHECK    :outline the command line params.
-FULLHELP           :output full help content.
-ALIGN              :do full multiple alignment.
-TREE               :calculate NJ tree.
-PIM                :output percent identity matrix (while calculating the tree)
-BOOTSTRAP(=n)      :bootstrap a NJ tree (n= number of bootstraps; def. = 1000).
-CONVERT            :output the input sequences in a different file format.


                PARAMETERS (set things)

***General settings:***
-INTERACTIVE :read command line, then enter normal interactive menus
-QUICKTREE   :use FAST algorithm for the alignment guide tree
-TYPE=       :PROTEIN or DNA sequences
-NEGATIVE    :protein alignment with negative values in matrix
-OUTFILE=    :sequence alignment file name
-OUTPUT=     :GCG, GDE, PHYLIP, PIR or NEXUS
-OUTORDER=   :INPUT or ALIGNED
-CASE        :LOWER or UPPER (for GDE output only)
-SEQNOS=     :OFF or ON (for Clustal output only)
-SEQNO_RANGE=:OFF or ON (NEW: for all output formats)
-RANGE=m,n   :sequence range to write starting m to m+n
-MAXSEQLEN=n :maximum allowed input sequence length
-QUIET       :Reduce console output to minimum
-STATS=      :Log some alignment statistics to file
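
As a rough illustration of how the data, verb and general-setting flags above fit together, here is a minimal Python sketch that assembles an argument list for a full multiple alignment. The executable name "clustalw2", the helper name build_clustalw_args and the file names are assumptions made for this example; only the flags themselves come from the help text.

```python
def build_clustalw_args(infile, outfile, seq_type="PROTEIN", output=None,
                        outorder=None, quiet=False, exe="clustalw2"):
    """Return an argument list for a full multiple alignment.

    `exe` is an assumed executable name; adjust it to your installation.
    """
    if seq_type not in ("PROTEIN", "DNA"):
        raise ValueError("seq_type must be PROTEIN or DNA")
    # One data flag, one verb, then parameters.
    args = [exe, f"-INFILE={infile}", "-ALIGN",
            f"-TYPE={seq_type}", f"-OUTFILE={outfile}"]
    if output is not None:
        # Values listed for -OUTPUT= in the help text above.
        if output not in ("GCG", "GDE", "PHYLIP", "PIR", "NEXUS"):
            raise ValueError("unsupported output format: " + output)
        args.append(f"-OUTPUT={output}")
    if outorder is not None:
        if outorder not in ("INPUT", "ALIGNED"):
            raise ValueError("outorder must be INPUT or ALIGNED")
        args.append(f"-OUTORDER={outorder}")
    if quiet:
        args.append("-QUIET")
    return args


# Hypothetical file names, for illustration only.
args = build_clustalw_args("seqs.fasta", "seqs.phy",
                           output="PHYLIP", quiet=True)
print(" ".join(args))
```

The list can be handed to subprocess.run(args, check=True) if the program is installed and on the PATH.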

***Fast Pairwise Alignments:***
-KTUPLE=n    :word size
-TOPDIAGS=n  :number of best diags.
-WINDOW=n    :window around best diags.
-PAIRGAP=n   :gap penalty
-SCORE       :PERCENT or ABSOLUTE


***Slow Pairwise Alignments:***
-PWMATRIX=    :Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename
-PWDNAMATRIX= :DNA weight matrix=IUB, CLUSTALW or filename
-PWGAPOPEN=f  :gap opening penalty      
-PWGAPEXT=f   :gap extension penalty


***Multiple Alignments:***
-NEWTREE=      :file for new guide tree
-USETREE=      :file for old guide tree
-MATRIX=       :Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename
-DNAMATRIX=    :DNA weight matrix=IUB, CLUSTALW or filename
-GAPOPEN=f     :gap opening penalty      
-GAPEXT=f      :gap extension penalty
-ENDGAPS       :no end gap separation pen.
-GAPDIST=n     :gap separation pen. range
-NOPGAP        :residue-specific gaps off
-NOHGAP        :hydrophilic gaps off
-HGAPRESIDUES= :list hydrophilic res.  
-MAXDIV=n      :% ident. for delay
-TYPE=         :PROTEIN or DNA
-TRANSWEIGHT=f :transitions weighting
-ITERATION=    :NONE or TREE or ALIGNMENT
-NUMITER=n     :maximum number of iterations to perform
-NOWEIGHTS     :disable sequence weighting


***Profile Alignments:***
-PROFILE      :Merge two alignments by profile alignment
-NEWTREE1=    :file for new guide tree for profile1
-NEWTREE2=    :file for new guide tree for profile2
-USETREE1=    :file for old guide tree for profile1
-USETREE2=    :file for old guide tree for profile2


***Sequence to Profile Alignments:***
-SEQUENCES   :Sequentially add profile2 sequences to profile1 alignment
-NEWTREE=    :file for new guide tree
-USETREE=    :file for old guide tree


***Structure Alignments:***
-NOSECSTR1     :do not use secondary structure-gap penalty mask for profile 1
-NOSECSTR2     :do not use secondary structure-gap penalty mask for profile 2
-SECSTROUT=STRUCTURE or MASK or BOTH or NONE   :output in alignment file
-HELIXGAP=n    :gap penalty for helix core residues
-STRANDGAP=n   :gap penalty for strand core residues
-LOOPGAP=n     :gap penalty for loop regions
-TERMINALGAP=n :gap penalty for structure termini
-HELIXENDIN=n  :number of residues inside helix to be treated as terminal
-HELIXENDOUT=n :number of residues outside helix to be treated as terminal
-STRANDENDIN=n :number of residues inside strand to be treated as terminal
-STRANDENDOUT=n:number of residues outside strand to be treated as terminal


***Trees:***
-OUTPUTTREE=nj OR phylip OR dist OR nexus
-SEED=n        :seed number for bootstraps.
-KIMURA        :use Kimura's correction.
-TOSSGAPS      :ignore positions with gaps.
-BOOTLABELS=node OR branch :position of bootstrap values in tree display
-CLUSTERING=   :NJ or UPGMA
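
The tree options can be combined in the same way. The sketch below builds an argument list for bootstrapping an NJ tree, using only flags from the listing. It assumes the executable is called "clustalw2" and that the file passed to -INFILE= is an existing alignment; the helper name build_bootstrap_args and the file name are made up for the example.

```python
def build_bootstrap_args(alignment, n=1000, seed=None, tree_format="phylip",
                         bootlabels="branch", kimura=False, tossgaps=False,
                         exe="clustalw2"):
    """Return an argument list for bootstrapping an NJ tree.

    `exe` is an assumed executable name, not something the help text states.
    """
    if tree_format not in ("nj", "phylip", "dist", "nexus"):
        raise ValueError("tree_format must be nj, phylip, dist or nexus")
    if bootlabels not in ("node", "branch"):
        raise ValueError("bootlabels must be node or branch")
    args = [exe, f"-INFILE={alignment}", f"-BOOTSTRAP={n}",
            f"-OUTPUTTREE={tree_format}", f"-BOOTLABELS={bootlabels}"]
    if seed is not None:
        # A fixed seed makes the bootstrap resampling repeatable.
        args.append(f"-SEED={seed}")
    if kimura:
        args.append("-KIMURA")
    if tossgaps:
        args.append("-TOSSGAPS")
    return args


# Hypothetical alignment file name, for illustration only.
boot_args = build_bootstrap_args("seqs.aln", n=500, seed=42, tossgaps=True)
print(" ".join(boot_args))
```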


>> HELP 0 <<             Help for tree output format options
Four output formats are offered: 1) Clustal, 2) Phylip, 3) Just the distances,
4) Nexus.
None of these formats displays the results graphically. Many packages can
display trees in the PHYLIP format 2) below. It can also be imported into
the PHYLIP programs RETREE, DRAWTREE and DRAWGRAM for graphical display.
NEXUS format trees can be read by PAUP and MacClade.
1) Clustal format output.
This format is verbose and lists all of the distances between the sequences and
the number of alignment positions used for each. The tree is described at the
end of the file. It lists the sequences that are joined at each alignment step
and the branch lengths. After two sequences are joined, the pair is referred to
later as a NODE. The number of a NODE is the number of the lowest sequence in
that NODE.
2) Phylip format output.
This format is the New Hampshire format, used by many phylogenetic analysis
packages. It consists of a series of nested parentheses, describing the
branching order, with the sequence names and branch lengths. It can be used by
the RETREE, DRAWGRAM and DRAWTREE programs of the PHYLIP package to see the
trees graphically. This is the same format used during multiple alignment for
the guide trees.
Use this format with NJplot (Manolo Gouy), supplied with Clustal W. Some other
packages that can read and display New Hampshire format are TreeView (Mac/PC),
TreeTool (UNIX), and Phylowin.
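
To make the nested-parentheses description concrete, here is a minimal Python sketch of a reader for simple New Hampshire strings. It is an illustration under stated assumptions, not the parser any of the packages above use: it handles only unquoted names, optional branch lengths and a terminating semicolon, and it ignores comments and quoted labels. The function names parse_newick and leaf_names are invented for this example.

```python
def parse_newick(text):
    """Parse a simple New Hampshire string into nested dicts."""
    # Drop all whitespace; tree files are often wrapped over several lines.
    text = "".join(text.split())
    if not text.endswith(";"):
        raise ValueError("tree must end with ';'")
    pos = 0

    def parse_node():
        nonlocal pos
        node = {"name": "", "length": None, "children": []}
        if text[pos] == "(":
            pos += 1  # consume "("
            while True:
                node["children"].append(parse_node())
                if text[pos] == ",":
                    pos += 1  # another sibling follows
                elif text[pos] == ")":
                    pos += 1  # end of this group
                    break
                else:
                    raise ValueError(f"unexpected {text[pos]!r} at {pos}")
        # Optional name (leaf name, or a label on an internal node).
        start = pos
        while text[pos] not in ",():;":
            pos += 1
        node["name"] = text[start:pos]
        # Optional branch length after ":".
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            node["length"] = float(text[start:pos])
        return node

    root = parse_node()
    if text[pos] != ";":
        raise ValueError(f"unexpected {text[pos]!r} at {pos}")
    return root


def leaf_names(node):
    """Collect sequence names from the leaves, left to right."""
    if not node["children"]:
        return [node["name"]]
    return [name for child in node["children"] for name in leaf_names(child)]


tree = parse_newick("(A:0.1,(B:0.2,C:0.3):0.05);")
print(leaf_names(tree))
```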
3) The distances only.
This format just outputs a matrix of all the pairwise distances in a format
that can be used by the Phylip package. It used to be useful when one could not
produce distances from protein sequences in the Phylip package but is now
redundant (Protdist of Phylip 3.5 now does this).
4) NEXUS format tree.
This format is used by several popular phylogeny programs, including PAUP and
MacClade. The format is described fully in:
Maddison, D. R., D. L. Swofford and W. P. Maddison.  1997.
NEXUS: an extensible file format for systematic information.
Systematic Biology 46:590-621.
5) TOGGLE PHYLIP BOOTSTRAP POSITIONS
By default, the bootstrap values are placed on the nodes of the phylip format
output tree. This is inaccurate as the bootstrap values should be associated
with the tree branches and not the nodes. However, this format can be read and
displayed by TreeTool, TreeView and Phylowin. An option is available to
correctly place the bootstrap values on the branches with which they are
associated.

Setting up X Windows on Mac

For Snow Leopard.

First check /usr/etc/sshd_config and make sure that "X11Forwarding yes" has been set.
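
The check can also be scripted. The following Python sketch scans the text of an sshd_config file for the X11Forwarding keyword, which OpenSSH spells as one word. It is a simplification: it takes the first occurrence of the keyword, skips comment lines, and ignores Match blocks. The function name x11_forwarding_enabled and the sample text are made up for this example.

```python
def x11_forwarding_enabled(config_text):
    """Return True if the first X11Forwarding line in the text says "yes"."""
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        # Keywords are matched case-insensitively here.
        if len(parts) == 2 and parts[0].lower() == "x11forwarding":
            return parts[1].strip() == "yes"
    # Keyword absent: treat forwarding as not enabled.
    return False


# Made-up sample text; in practice, read the contents of the real file.
sample = """
# X11Forwarding no
Port 22
X11Forwarding yes
"""
print(x11_forwarding_enabled(sample))
```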

Then log in to the remote server with ssh -X user@remote.server

Start the remote desktop (e.g. GNOME) with gnome-session