The programs in the RAMP suite are described one by one. The programs are grouped into six categories.
contacts
Usage: contacts [conformation-file|conformation-file-list] [output-file] [distance-cutoff]
Computes the contact score for protein conformations.
contacts
simply evaluates the number of contacts within a
given distance cutoff and uses it as a contact score.
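The counting described above can be illustrated with a minimal Python sketch. It is not taken from the RAMP source: the function name, the use of one coordinate per residue, and the cutoff values in the example are all assumptions made for illustration.

```python
import math

def count_contacts(coords, cutoff):
    """Count unordered pairs of points separated by at most `cutoff`."""
    contacts = 0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                contacts += 1
    return contacts

# Hypothetical coordinates (one point per residue), in angstroms.
example = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
print(count_contacts(example, 6.0))
```

Whether the real program excludes sequence neighbours or uses all atoms is not stated here, so the sketch counts every pair.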
contact_order
Usage: contact_order [conformation-file|conformation-file-list] [output-file] [distance-cutoff]
Computes the (relative) contact order for protein conformations.
contact_order computes the (relative) contact order, which is the average sequence separation between contacting residues divided by the size of the protein. It is a measure of the average context-sensitivity of a residue in a protein and has been shown to correlate well with folding rates (in other words, the distribution of the percent contact orders of native structures will be different from the distribution based on random conformations).
The reference for this program is: Plaxco KW, Simons KT, Baker D. Contact order, transition state placement and the refolding rates of single domain proteins. Journal of Molecular Biology 277:985-994, 1998.
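As a sketch of the definition given above (average sequence separation of contacting residues, divided by the protein length), the following Python function may help. The name, the one-point-per-residue input, and the min_separation parameter are assumptions for illustration, not part of RAMP.

```python
import math

def relative_contact_order(coords, cutoff, min_separation=1):
    """Mean sequence separation of contacting pairs, divided by chain length."""
    contacts = 0
    total_separation = 0
    for i in range(len(coords)):
        for j in range(i + min_separation, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                contacts += 1
                total_separation += j - i
    if contacts == 0:
        return 0.0  # assumption: no contacts gives a contact order of zero
    return total_separation / (contacts * len(coords))

# Hypothetical four-residue "hairpin": residues 0 and 3 are close in space.
square = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
print(relative_contact_order(square, 4.0))
```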
density_score
Usage: density_score [conformation-file-list] [output-file] [dimension]
Does an all-against-all RMSD calculation between a set of conformations and calculates a score based on the distances.
density_score takes a list of conformation (PDB) files, does an all-against-all RMSD calculation between the conformations, and calculates a score based on the density of conformations. If the dimension argument is not specified, then the formula used is sum_ij log(rmsd_ij) for a given conformation i. If the dimension is specified and is greater than zero, then the formula sum_ij (1/rmsd_ij)^D, where D is the dimension argument, is used for each conformation i.
Since the program only uses CA atoms, and does not store a matrix, a much larger number of conformations can be handled.
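The two formulas can be sketched in Python as follows. This is an illustration only: it assumes that the sum for conformation i runs over all other conformations j, and it takes the RMSD as a caller-supplied function (here a hypothetical lookup table) rather than computing it from CA coordinates.

```python
import math

def density_scores(rmsd, n, dimension=None):
    """Return one score per conformation; rmsd(i, j) gives the pairwise RMSD."""
    scores = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if j == i:
                continue  # assumption: self-comparison (RMSD 0) is skipped
            d = rmsd(i, j)
            if dimension is not None and dimension > 0:
                total += (1.0 / d) ** dimension
            else:
                total += math.log(d)
        scores.append(total)
    return scores

# Hypothetical pairwise RMSDs for three conformations.
pairs = {(0, 1): 1.0, (0, 2): 2.0, (1, 2): 4.0}

def lookup(i, j):
    return pairs[(min(i, j), max(i, j))]

print(density_scores(lookup, 3))
print(density_scores(lookup, 3, dimension=2))
```

Because each RMSD is requested on demand and only the running sum is kept, no n-by-n matrix needs to be held in memory, which matches the remark above.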
See also tdensity_score.
electrostatics
Usage: electrostatics [conformation-file|conformation-file-list] [output-file]
Computes the electrostatics energy for protein conformations.
electrostatics
computes the total electrostatics energy
between the atoms in a protein molecule.
hcf
Usage: hcf [conformation-file|conformation-file-list] [output-file]
Computes the hydrophobic compactness for protein conformations.
hcf
computes the hydrophobic compactness (or
"moment") between the atoms in a protein molecule. This is a measure
of the square of the radius of gyration of the carbon atoms.
The reference for this program is: Samudrala R, Xia Y, Levitt M, Huang ES. A combined approach for ab initio construction of low resolution protein tertiary structures from sequence. In Altman R, Dunker K, Hunter L, Klein T, Lauderdale K, eds. Proceedings of the Pacific Symposium on Biocomputing, 505-516, 1999.
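A minimal Python sketch of a squared radius of gyration is given below. It assumes an unweighted mean over whatever points the caller passes in (for hcf these would be the carbon atom coordinates); the function name and example data are hypothetical.

```python
def squared_radius_of_gyration(points):
    """Mean squared distance of the points from their centroid."""
    n = len(points)
    if n == 0:
        raise ValueError("at least one point is required")
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    total = 0.0
    for p in points:
        total += sum((p[k] - centroid[k]) ** 2 for k in range(3))
    return total / n

# Two hypothetical carbon atoms 2 angstroms apart.
print(squared_radius_of_gyration([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]))
```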
potential, potential_war, potential_rapdf, potential_rapdf_war, potential_rvpdf, potential_nvpdf
Usage: potential [-cl] conformation-file|conformation-file-list|loop-data-file scoring-function-file [output-file]
Generates conditional probability scores for protein conformations.
potential takes a conformation-file or conformation-file-list (default, or with -c) and a scoring-function-file and generates the scores for the conformations given the parameters in scoring-function-file. If the conformation-file argument has the PDB filename extension .pdb, then it is assumed to be a single file. Otherwise the assumption is that it is a conformation-file-list. The -l option has the same behaviour, but instead of a conformation-file or conformation-file-list, a file containing a loop description, in this format:
205 212
1vfa.pdb
1vfa_205-212.loops.pdb 78
is used to generate scores for different loops. In the above example,
205 and 212 are the starting and ending residues of the loop,
1vfa.pdb
is the name of the experimental structure,
1vfa_205-212.loops.pdb
is the name of the file containing all
the loops, one after the other in PDB format, and 78
is the
number of lines per loop conformation. The template file must have a
loop in place (even if it's garbage---I recommend inserting the first
loop in the loops file in the template structure) so an appropriate
amount of space can be allocated in advance. This design rationale is
because normally you have the experimental structure and you're
comparing scores of loops to the score of the experimental structure
loop, and also because it makes things slightly faster (no memory
allocation/freeing for each loop considered).
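The three-line loop description and the fixed number of lines per loop can be handled as in the following Python sketch. The function names and the dictionary keys are hypothetical, and dropping a trailing incomplete chunk is an assumption rather than documented behaviour.

```python
def parse_loop_description(text):
    """Parse the three-line loop description used with the -l option."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    return {
        "start": int(rows[0][0]),
        "end": int(rows[0][1]),
        "template": rows[1][0],
        "loops_file": rows[2][0],
        "lines_per_loop": int(rows[2][1]),
    }

def split_loops(lines, lines_per_loop):
    """Cut the concatenated loops file into consecutive fixed-size chunks."""
    if lines_per_loop <= 0:
        raise ValueError("lines_per_loop must be positive")
    last_start = len(lines) - lines_per_loop
    return [lines[i:i + lines_per_loop] for i in range(0, last_start + 1, lines_per_loop)]

description = parse_loop_description("205 212\n1vfa.pdb\n1vfa_205-212.loops.pdb 78\n")
print(description)
```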
potential is just a link to one of potential_rapdf, potential_rvpdf, or potential_nvpdf. potential_rapdf is a residue-specific all-atom conditional probability discriminatory function. potential_rvpdf is a residue-specific virtual-atom conditional probability discriminatory function. potential_nvpdf is a non-residue-specific virtual-atom probability discriminatory function.
potential_war behaves in the same manner as potential (i.e., one of rapdf, rvpdf or nvpdf) but calculates the scores of interactions between atoms within a residue.
The reference for these programs is: Samudrala R, Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. Journal of Molecular Biology 275:893-914, 1998.
potential_interactions, potential_rapdf_interactions
Usage: potential_interactions conformation-pair-file|conformation-pair-file-list scoring-function-file [output-file]
Generates conditional probability scores for interacting protein conformations.
potential_interactions behaves in a similar manner to potential but calculates the scores of interactions between atoms in a pair of conformations, specified by a conformation-file-pair (which is essentially two conformation file names separated by a "|" character; for example: filename1.pdb|filename2.pdb).
The reference for this program is: Samudrala R, Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. Journal of Molecular Biology 275:893-914, 1998.
potential_ss, potential_rapdf_ss
Usage: potential_ss [-ehms] conformation-file|conformation-file-list ss-file scoring-function-file [output-file]
Generates conditional probability scores for protein conformations.
potential_ss
behaves in a similar manner as
potential
(i.e., one of rapdf, rvpdf or
nvpdf) but calculates the scores of interactions between atoms in
different secondary structure elements, as given by ss-file
(in DSSP format). The options are as follows:
e - calculates scores between atoms involved in nonlocal strand-strand interactions
h - calculates scores between atoms involved in nonlocal helix-helix interactions
m - calculates scores between atoms involved in nonlocal helix-strand interactions
s - calculates scores between atoms involved in all secondary structure interactions
The reference for this program is: Samudrala R, Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. Journal of Molecular Biology 275:893-914, 1998.
score_profile
Usage: score_profile conformation-file scoring-function-file [output-file]
Breaks down scores on a per-residue contribution basis.
score_profile
decomposes the contribution of single
residues based on their interactions with the surrounding environment
for a given conformation. The program uses the RAPDF all-atom pairwise
function (supplied as the second argument) to perform this analysis,
with the implicit assumption that contributions can be divided in a
pairwise manner.
The residue number and score contribution are output to the optional output file argument, or to stdout if no output file is specified. All the pairwise side chain contributions are output to stderr.
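One way to picture a pairwise decomposition is the Python sketch below, which assigns half of each pair score to each of the two residues involved. The equal split, the function name, and the example scores are assumptions for illustration; the actual apportioning used by score_profile is not described here.

```python
from collections import defaultdict

def per_residue_profile(pair_scores):
    """pair_scores maps (residue_i, residue_j) to a pairwise score."""
    profile = defaultdict(float)
    for (res_i, res_j), score in pair_scores.items():
        # Assumption: each residue receives half of the pair's score.
        profile[res_i] += score / 2.0
        profile[res_j] += score / 2.0
    return dict(profile)

# Hypothetical pairwise scores between three residues.
pair_scores = {(1, 2): -1.0, (1, 3): 0.5, (2, 3): -2.0}
profile = per_residue_profile(pair_scores)
for residue in sorted(profile):
    print(residue, profile[residue])
```

With this split the per-residue contributions add up to the total pairwise score.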
sequence_score, sequence_similarity_score, sequence_identity_score
Usage: sequence_score [alignment-file|alignment-file-list] [output-file]
Computes a sequence score (similarity and/or identity) given an alignment.
sequence_score computes a sequence score (similarity, when invoked as sequence_similarity_score, and identity, when invoked as sequence_identity_score; both when invoked as sequence_score) given an alignment (which is specified as two lines in a file with the same line length).
The similarity score is computed using a matrix which is specified by SEQUENCE_SIMILARITY_MATRIX_FILE in src/common/defines.h.
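For the identity case, a two-line alignment can be scored as in this Python sketch. The gap character "-", the choice to ignore gapped columns, and the normalisation by the number of ungapped columns are assumptions; the normalisation used by sequence_identity_score is not stated here.

```python
def sequence_identity(seq_a, seq_b, gap="-"):
    """Fraction of ungapped alignment columns holding identical residues."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not counted
        aligned += 1
        if a == b:
            identical += 1
    return identical / aligned if aligned else 0.0

# Hypothetical alignment: two lines of equal length.
print(sequence_identity("ACD-EF", "ACEGEF"))
```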
solvation
Usage: solvation [conformation-file|conformation-file-list] [output-file]
Computes the solvation energy for protein conformations.
solvation computes the total solvation energy between the atoms in a protein molecule.
ss_score
Usage: ss_score ss-file|ss-file-list ss-file [output-file]
Compares the secondary structure between two conformations.
ss_score compares the secondary structure between two conformations, specified using the DSSP format. The first argument can be specified as a single file or as a list of files, and in the latter case the comparison occurs between all the files in the ss-file-list. The program is useful for comparing the secondary structure of a conformation constructed using modelling techniques to a predicted secondary structure, to check that the final conformation is collinear with the predicted secondary structure. Normally this would work best in the case of predicting sheets since it requires more than just local geometry to be predicted accurately.
tdensity_score
Usage: tdensity_score [conformation-file-list] [output-file]
Does an all-against-all torsion RMSD calculation between a set of conformations and calculates a score based on the distances.
tdensity_score
is similar to
density_score
except that phi/psi values are used instead
of Cartesian coordinates.
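A torsion-space RMSD has to respect the periodicity of angles, as the Python sketch below shows. It assumes angles in degrees and wraps each difference into the range 0 to 180; the function name and the wrapping convention are illustrative assumptions, not the RAMP implementation.

```python
import math

def torsion_rmsd(angles_a, angles_b):
    """RMSD between two equal-length lists of torsion angles in degrees."""
    if len(angles_a) != len(angles_b) or not angles_a:
        raise ValueError("angle lists must be non-empty and of equal length")
    total = 0.0
    for a, b in zip(angles_a, angles_b):
        diff = abs(a - b) % 360.0
        diff = min(diff, 360.0 - diff)  # 350 and 10 degrees differ by 20, not 340
        total += diff * diff
    return math.sqrt(total / len(angles_a))

print(torsion_rmsd([350.0, 10.0], [10.0, 350.0]))
```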
See also density_score.
vdw
Usage: vdw [conformation-file|conformation-file-list] [output-file]
Computes the VdW energy for protein conformations.
vdw computes the total Van der Waals energy between the atoms in a protein molecule.
potential_filter
Usage: potential_filter conformation-file|conformation-file-list|loop-data-file scoring-function-file [output-file]
Filters protein conformations using different scoring functions.
potential_filter filters protein conformations using a combination of many of the other scoring functions described above, including but not limited to electrostatics, vdw, hcf, potential_rapdf, and other terms such as volume, radius of gyration, and so on. The exact combination and the scoring functions to be used for filtering can be controlled via potential_filter.c and potential_filter.h.
compile_raw_counts
Usage: compile_raw_counts [-aehmsw] [conformation-file[-pair]|conformation-file[-pair]-list] [raw-counts-file]
Compiles raw counts from a database of proteins.
compile_raw_counts
takes a conformation file or a list of
conformation files and compiles raw counts for all pairs of atoms
(167x167) for 18 distance bins.
The flags control the behaviour of how the raw counts are compiled:
a - compiles raw counts for all pairs of atoms (default)
e - compiles raw counts between atoms involved in nonlocal strand-strand interactions
h - compiles raw counts between atoms involved in nonlocal helix-helix interactions
m - compiles raw counts between atoms involved in nonlocal helix-strand interactions
s - compiles raw counts between atoms involved in nonlocal secondary structure interactions
w - compiles raw counts for pairs of atoms within a residue
A DSSP-format file with the extension dssp
must be present
for the secondary structure interaction calculation.
If instead of a single conformation file, a conformation-file-pair specification is given (which is essentially two conformation file names separated by a "|" character; for example: filename1.pdb|filename2.pdb), then the program operates on only the interatomic contacts between the two files (all the other behaviour is identical). This is useful to compile raw counts for interface/interacting regions between different chains or domains.
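The idea of compiling raw counts per atom-type pair and distance bin can be sketched in Python as follows. The bin boundaries used here (one bin for 0 to 3 angstroms, then 1-angstrom bins up to 20 angstroms, giving 18 bins) are an assumption, as are the function names and the atom-type labels; the sketch also ignores the flags and counts every pair of atoms.

```python
import math
from collections import defaultdict

def distance_bin(distance):
    """Assumed binning: bin 0 for [0, 3), bins 1..17 for [3, 20), else None."""
    if distance >= 20.0:
        return None
    if distance < 3.0:
        return 0
    return int(distance) - 2

def compile_counts(atoms):
    """atoms is a list of (atom_type, (x, y, z)); returns counts keyed by (type, type, bin)."""
    counts = defaultdict(int)
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            bin_index = distance_bin(math.dist(atoms[i][1], atoms[j][1]))
            if bin_index is not None:
                pair = tuple(sorted((atoms[i][0], atoms[j][0])))
                counts[pair + (bin_index,)] += 1
    return dict(counts)

# Hypothetical atoms labelled in the residue-plus-atom style (e.g. ACA, WN).
atoms = [("ACA", (0.0, 0.0, 0.0)), ("WN", (4.5, 0.0, 0.0)), ("GCA", (25.0, 0.0, 0.0))]
print(compile_counts(atoms))
```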
The reference for this program is: Samudrala R, Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. Journal of Molecular Biology 275:893-914, 1998.
See also compile_scores, potential, remove_raw_counts.
compile_scores
Usage: compile_scores raw-counts-file [output-file]
Calculates conditional probability scores given a set of counts.
compile_scores
takes a set of raw counts for all pairs
of atoms (167x167) for 18 distance bins and generates the negative log
conditional probability scores for the conformations.
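A negative log conditional probability score can be sketched in Python as below. The sketch assumes the common form in which the frequency of a distance bin for one atom-type pair is divided by the frequency of that bin over all pairs; the function name, the data layout, and the handling of zero counts are assumptions, not the compile_scores implementation.

```python
import math

def negative_log_scores(counts):
    """counts maps (pair, bin) to a raw count; returns (pair, bin) to a score."""
    pair_totals = {}
    bin_totals = {}
    grand_total = 0
    for (pair, bin_index), n in counts.items():
        pair_totals[pair] = pair_totals.get(pair, 0) + n
        bin_totals[bin_index] = bin_totals.get(bin_index, 0) + n
        grand_total += n
    scores = {}
    for (pair, bin_index), n in counts.items():
        if n == 0:
            continue  # assumption: unobserved combinations are left unscored
        p_given_pair = n / pair_totals[pair]
        p_any_pair = bin_totals[bin_index] / grand_total
        scores[(pair, bin_index)] = -math.log(p_given_pair / p_any_pair)
    return scores

# Hypothetical raw counts for two atom-type pairs and two distance bins.
raw = {("ACA-WN", 0): 3, ("ACA-WN", 1): 1, ("GCA-GN", 0): 1, ("GCA-GN", 1): 3}
scores = negative_log_scores(raw)
print(scores[("ACA-WN", 0)], scores[("ACA-WN", 1)])
```

A pair seen more often than average in a bin gets a negative (favourable) score, and one seen less often gets a positive score.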
The reference for this program is: Samudrala R, Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. Journal of Molecular Biology 275:893-914, 1998.
See also compile_raw_counts, plot_scores, potential.
plot_scores
Usage: plot_scores groups-list|group-pair scoring-function-file [output-file]
Plots the conditional probability scores for a set of atom/group pairs.
plot_scores
plots the negative log conditional
probability scores for a set of atom/group pairs from the given
scoring function file. This is useful for viewing the distribution in
a graphical manner (the output can easily be piped into a program such
as gnuplot).
A single pair of groups can be specified as an argument in the form ACA-WN (using the one-letter amino acid code to indicate an alanine alpha-carbon and a tryptophan nitrogen), or as a file containing a list of such group pairs (with the groups separated by '-' or a space (' ')).
Features: Obviously a file containing a list of groups cannot have a '-' in its name.
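The group-pair notation can be parsed as in this Python sketch, which splits on '-' or whitespace and treats the first character of each group as the one-letter residue code and the rest as the atom name. The function names are hypothetical.

```python
import re

def parse_group(group):
    """Split e.g. 'ACA' into residue code 'A' and atom name 'CA'."""
    if len(group) < 2:
        raise ValueError("group needs a residue code and an atom name: %r" % group)
    return group[0], group[1:]

def parse_group_pair(text):
    """Parse 'ACA-WN' or 'ACA WN' into two (residue, atom) tuples."""
    parts = re.split(r"[-\s]+", text.strip())
    if len(parts) != 2:
        raise ValueError("expected exactly two groups: %r" % text)
    return parse_group(parts[0]), parse_group(parts[1])

print(parse_group_pair("ACA-WN"))
```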
The reference for this program is: Samudrala R, Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. Journal of Molecular Biology 275:893-914, 1998.
See also compile_scores, potential.
remove_raw_counts
Usage: remove_raw_counts [-ahsw] conformation-file|conformation-file-list raw-counts-file
Removes raw counts in a database of proteins from a set of raw counts.
remove_raw_counts takes a conformation file or a list of conformation files and subtracts the raw counts for all pairs of atoms (167x167) for 18 distance bins in those conformations from a previously compiled set of raw counts. This is useful for jack-knifing.
The flags behave in the same manner as in compile_raw_counts.
The reference for this program is: Samudrala R, Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. Journal of Molecular Biology 275:893-914, 1998.
See also compile_raw_counts, compile_scores, potential.
mcgen_exhaustive
Usage: mcgen_exhaustive [-discrete-state-model] sequence-file scoring-function-file [output-file-prefix] [num-conformations]
Exhaustively enumerates a sequence using an n-state phi/psi model.
mcgen_exhaustive exhaustively enumerates a sequence using an n-state phi/psi model. The default n-state phi/psi model is the one in the file ramp/lib/phi_psi_allowed_angles.default. See ramp/lib/phi_psi_allowed_angles.* for a list of files containing the available n-state models. The string representing discrete-state-model is suffixed to ramp/lib/phi_psi_allowed_angles (with a "." between them) to produce the filename containing the corresponding model.
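Exhaustive enumeration with an n-state model means trying every combination of one phi/psi state per residue, so the number of conformations grows as n to the power of the sequence length. The Python sketch below only enumerates the combinations; the two-state model shown is hypothetical, and the real program also builds and scores each conformation.

```python
from itertools import product

def enumerate_conformations(num_residues, states):
    """Yield every assignment of one (phi, psi) state to each residue."""
    return product(states, repeat=num_residues)

# Hypothetical 2-state model; real models are read from ramp/lib/phi_psi_allowed_angles.*
states = [(-60.0, -45.0), (-120.0, 130.0)]
all_conformations = list(enumerate_conformations(3, states))
print(len(all_conformations))
```

This growth is one reason for the size limitations on the sequences that can be handled in practice.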
The sequence is specified using one-letter amino acid code (generally in a single line) in sequence-file.
The output-file-prefix, if specified, will be the prefix
used for the output file, suffixed with .best.pdb
. If
num-conformations is specified, then many files of the form
output-file-prefix.best
n.pdb
(up
to num-conformations) will be generated.
Features: There are various limitations on the number of residues/atoms that can be handled, which you will find out about as you use the program.
See also mcgen_exhaustive_loop, mcgen_exhaustive_loop_ss, mcgen_exhaustive_ss.
mcgen_exhaustive_loop
Usage: mcgen_exhaustive_loop [-discrete-state-model] loop-data-file scoring-function-file [output-file-prefix] [num-conformations]
Generates loop (missing) regions given a framework.
mcgen_exhaustive_loop
generates loops for a given region (as
specified in the loop-data-file) by exhaustively enumerating
all possible main chain conformations for that loop using an
n-state phi/psi model and selecting the best ones using the
RAPDF scoring function.
The default n-state phi/psi model is the one in the file
ramp/lib/phi_psi_allowed_angles.loop_search
. See
ramp/lib/phi_psi_allowed_angles.*
for a list of files
containing the available n-state models. The string
representing discrete-state-model is suffixed to
ramp/lib/phi_psi_allowed_angles
(with a "." between them)
to produce the filename containing the corresponding model. An
example invocation would be:
mcgen_exhaustive_loop -syssearch.14 loop-data-file ramp/lib/scores
where ramp/lib/phi_psi_allowed_angles.syssearch.14
will
represent the discrete state model, and the scoring will be specified
by the file ramp/lib/scores
.
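The mapping from the command-line option to the model file can be written as a short Python sketch. The function name and its parameters are hypothetical; only the naming rule (base name, a ".", then the model string) comes from the description above.

```python
def model_filename(option=None, default="default",
                   base="ramp/lib/phi_psi_allowed_angles"):
    """Map an option such as '-syssearch.14' to the model file path."""
    if option is None:
        model = default
    elif option.startswith("-") and len(option) > 1:
        model = option[1:]
    else:
        raise ValueError("model option must look like -name: %r" % option)
    return base + "." + model

print(model_filename("-syssearch.14"))
print(model_filename(default="loop_search"))
```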
The format of the loop-data-file is as follows:
205 212 RERDYRLD
1vfa.pdb
loops.pdb 10
constraint 205 212 5.50 1.0
constraint 212 213 3.80 1.0
In the above example, 205-212 is the residue range of the loop, RERDYRLD is the sequence, 1vfa.pdb is the framework upon which the loop will be built, loops.pdb is the file which will contain the 10 loop conformations (residues 205-212 only) that will be output, followed by a set of CA loop constraints. The string "none 0" can be used to suppress the output of this file. The format of the constraints is similar to that used in constraints_filter, essentially representing CA distances and (+/-) tolerance for pairs of residues.
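The loop-data-file layout can be read with a small parser such as the Python sketch below. The function name, the returned dictionary keys, and the check that the sequence length matches the residue range are assumptions for illustration.

```python
def parse_loop_data(text):
    """Parse a loop-data-file: range and sequence, framework, loops file, constraints."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    start, end, sequence = int(rows[0][0]), int(rows[0][1]), rows[0][2]
    if len(sequence) != end - start + 1:
        raise ValueError("sequence length does not match the residue range")
    loops_file, num_loops = rows[2][0], int(rows[2][1])
    constraints = []
    for row in rows[3:]:
        if row[0] != "constraint":
            raise ValueError("unexpected line: %r" % " ".join(row))
        # residue i, residue j, CA distance, tolerance
        constraints.append((int(row[1]), int(row[2]), float(row[3]), float(row[4])))
    return {
        "start": start, "end": end, "sequence": sequence,
        "framework": rows[1][0],
        "loops_file": None if loops_file == "none" else loops_file,
        "num_loops": num_loops,
        "constraints": constraints,
    }

example = """205 212 RERDYRLD
1vfa.pdb
loops.pdb 10
constraint 205 212 5.50 1.0
constraint 212 213 3.80 1.0
"""
loop = parse_loop_data(example)
print(loop["sequence"], loop["constraints"])
```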
The output-file-prefix, if specified, will be the prefix
used for the output file, suffixed with .best.pdb
. If
num-conformations is specified, then many files of the form
output-file-prefix.best
n.pdb
(up
to num-conformations) containing the coordinates for all the
residues will be generated. The maximum number of conformations output will be capped by MAX_MAINCHAINS in ramp/src/common/defines.h.
Features: since the loops are generated in torsion space, the psi value for the residue before the loop region begins must be known. But this can be known only if the CA atom is known for the first residue in the loop. Thus an extra residue must always be present (and therefore be built) on the framework to provide the psi angle for the first residue of the loop. If the loop is an N-terminal loop, then the phi angle for the last residue in the loop must be calculable from the coordinates, since the chain will be built in the reverse direction.
Constraints must be chosen wisely. The CA distance between the end of the loop and the next residue in the framework must also be used to tether the loop to the other end.
See also constraints_filter, mcgen_exhaustive_loop_ss, mcgen_semfold_loop, mcgen_semfold_loop_ss.
mcgen_exhaustive_loop_ss
Usage: mcgen_exhaustive_loop_ss [-discrete-state-model] loop-data-file scoring-function-file ss-file [output-file-prefix] [num-conformations]
Generates loop (missing) regions given a framework and a secondary structure specification.
mcgen_exhaustive_loop_ss
generates loops for a given
region (as specified in the loop-data-file) by exhaustively
enumerating all possible main chain conformations for that loop using
an n-state phi/psi model and selecting the best ones using
the RAPDF scoring function, taking into account the secondary
structure specified in ss-file (DSSP format). In all other
ways, its behaviour is identical to
mcgen_exhaustive_loop
.
See also constraints_filter, mcgen_exhaustive_loop, mcgen_semfold_loop, mcgen_semfold_loop_ss.
mcgen_exhaustive_ss
Usage: mcgen_exhaustive_ss [-discrete-state-model] sequence-file scoring-function-file ss-file [output-file-prefix] [num-conformations]
Exhaustively enumerates a sequence using an n-state phi/psi model keeping secondary structure fixed.
mcgen_exhaustive_ss exhaustively enumerates a sequence using an n-state phi/psi model while assigning idealised phi/psi values to helix and sheet residues (see ramp/src/common/defines.h). In all other ways, its behaviour is similar to mcgen_exhaustive.
See also mcgen_exhaustive, mcgen_exhaustive_loop, mcgen_ficsa_ss, mcgen_semfold_ss, phd_to_dssp.
mcgen_ficsa_ss
Usage: mcgen_ficsa_ss sequence-file scoring-function-file ss-file ntuplet-database-file [output-file] [random-seed]
Apply the fragment insertion conformational space annealing (FICSA) technique for predicting protein structure between secondary structure units.
mcgen_ficsa_ss
applies the FICSA method for predicting
protein structure, inserting fragments only for residues between
secondary structure units such as helices or sheets or both (depending
on how it is compiled). The secondary structure units are assigned
default phi/psi angles (see ramp/src/common/defines.h
).
The FICSA method is based on randomly inserting small (usually three-residue) fragments, either from a database or generated using discrete phi/psi models, and using a conformational space annealing procedure to find combinations of these fragments that have the lowest score. The ntuplet-database-file is generated using a combination of mcgen_ntuplets and prune_ntuplet_db.
See also mcgen_exhaustive_ss, mcgen_semfold_ss, mcgen_ntuplets, prune_ntuplet_db.
mcgen_fit
Usage: mcgen_fit [-discrete-state-model] conformation-file [output-file-prefix] [num-conformations]
Fits an n-state phi/psi model to a given conformation.
mcgen_fit fits an n-state phi/psi model to a given conformation, minimising the CA RMSD between the model and the conformation.
The default n-state phi/psi model is the one in the file
ramp/lib/phi_psi_allowed_angles.default
. See
ramp/lib/phi_psi_allowed_angles.*
for a list of files
containing the available n-state models. The string
representing discrete-state-model is suffixed to
ramp/lib/phi_psi_allowed_angles
(with a "." between them) to
produce the filename containing the corresponding model.
The output-file-prefix, if specified, will be the prefix
used for the output file, suffixed with .best.pdb
. If
num-conformations is specified, then many files of the form
output-file-prefix.best
n.pdb
(up
to num-conformations) will be generated.
Features: There are various limitations on the number of residues/atoms that can be handled, which you will find out about as you use the program.
See also mcgen_fit_ss, phd_to_dssp.
mcgen_fit_ss
Usage: mcgen_fit_ss [-discrete-state-model] conformation-file ss-file [output-file-prefix] [num-conformations]
Fits secondary structures to main chains using an n-state phi/psi model.
mcgen_fit_ss fits an n-state phi/psi model to a given conformation, minimising the CA RMSD but using idealised secondary structure values for helix and sheet residues.
The default n-state phi/psi model is the one in the file
ramp/lib/phi_psi_allowed_angles.default
. See
ramp/lib/phi_psi_allowed_angles.*
for a list of files
containing the available n-state models. The string
representing discrete-state-model is suffixed to
ramp/lib/phi_psi_allowed_angles
(with a "." between them)
to produce the filename containing the corresponding model.
The first thing you need is a ss-file (DSSP format)
containing the secondary structure. This can be created by running
dssp on the conformation-file (PDB format) and then modifying the
secondary structure assignment to your liking. Alternately, if you
have a PHD prediction, then you can save the prediction to a file, and
run
phd_to_dssp
on it.
Once you have created the ss-file, and have the conformation-file to which you want to fit the secondary structure, just type in the command with the conformation-file and the ss-file as the arguments.
The output-file-prefix, if specified, will be the prefix
used for the output file, suffixed with .best.pdb
. If
num-conformations is specified, then many files of the form
output-file-prefix.best
n.pdb
(up
to num-conformations) will be generated.
Features: There are various limitations on the number of residues/atoms that can be handled, which you will find out about as you use the program.
See also mcgen_fit, phd_to_dssp.
mcgen_idealise
Usage: mcgen_idealise conformation-file num-iterations|rmsd-limit [output-file]
Converts an experimental structure to have ideal geometry.
mcgen_idealise
represents an experimental structure using
ideal bond lengths and bond angles by randomly perturbing the torsion
angles and using a steepest descent minimisation to lower the RMSD (as
specified by rmsd-limit or DEFAULT_RMSD_LIMIT
in
ramp/src/mcgen/mcgen_idealise.c
) for a given number of
iterations (as specified by num-iterations or
DEFAULT_NUM_ITERATIONS
in
ramp/src/mcgen/mcgen_idealise.c
).
Features: Since mcgen_idealise
uses steepest descent
minimisation, the optimisation process is not very efficient.
mcgen_list
Usage: mcgen_list [torsion-angle-list] [output-file]
Generates a conformation using specified torsion values.
mcgen_list takes as input a list of torsion values, specified using the format output by list_torsions:
C - 1 - psi is 230.022
C - 1 - chi1 is 299.959
R - 2 - phi is 295.965
R - 2 - psi is 144.819
R - 2 - chi1 is 311.778
R - 2 - chi2 is 172.907
R - 2 - chi3 is 187.699
R - 2 - chi4 is 100.294
and generates a new conformation. Since experimental structures don't use idealised bond lengths and bond angles, reproducing the experimental structure using only a list of torsion values is not possible.
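The torsion list format shown above is simple to parse, as in this Python sketch. The function names and the nested dictionary layout are hypothetical, and the parser assumes every line has exactly the seven whitespace-separated tokens seen in the example.

```python
def parse_torsion_line(line):
    """Parse a line such as 'R - 2 - chi1 is 311.778'."""
    tokens = line.split()
    if len(tokens) != 7 or tokens[1] != "-" or tokens[3] != "-" or tokens[5] != "is":
        raise ValueError("unrecognised torsion line: %r" % line)
    return tokens[0], int(tokens[2]), tokens[4], float(tokens[6])

def parse_torsions(text):
    """Return {(residue_number, residue_code): {torsion_name: degrees}}."""
    torsions = {}
    for line in text.splitlines():
        if line.strip():
            code, number, name, value = parse_torsion_line(line)
            torsions.setdefault((number, code), {})[name] = value
    return torsions

sample = """C - 1 - psi is 230.022
C - 1 - chi1 is 299.959
R - 2 - phi is 295.965
R - 2 - psi is 144.819
"""
torsions = parse_torsions(sample)
print(torsions[(2, "R")])
```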
See also mcgen_list_ss.
mcgen_list_ss
Usage: mcgen_list_ss conformation-file ss-file [output-file]
Generates a conformation using the phi/psi values from a secondary structure (DSSP) file.
mcgen_list_ss takes as input a conformation (PDB) file and a secondary structure (DSSP) file and generates a new conformation using the torsion angles specified in the DSSP file. Idealised values are used for helix (H) and sheet (E) residues, and the secondary structure specification field in the DSSP file (using F, for "fixed") can also be used to ask the program to use the existing phi/psi values for a given residue in conformation-file.
Features: A PDB file is used to specify the sequence (this is bad
design in some ways, but a program like
mcgen_list
can be used to create the PDB file, and this
enables the use of phi/psi values already present in the
conformation).
See also mcgen_list.
mcgen_ntuplets
Usage: mcgen_ntuplets [conformation-file-list] [n-tuple-size] [output-file]
Generate a database of phi/psi angles for fragments.
mcgen_ntuplets takes as input a set of conformation files and creates a database of fragments (with size specified by n-tuple-size) and the list of phi/psi angles that are found in the fragments. For each conformation in conformation-file-list, the fragments are generated from the N-terminus until the end of the chain is reached. Fragments that do not possess chain connectivity are discarded.
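The sliding-window extraction described above can be sketched in Python as follows. The input layout and function name are hypothetical, and chain connectivity is judged here by consecutive residue numbers, which is an assumption; the real program may test connectivity differently.

```python
def extract_ntuplets(residues, n):
    """residues: list of (residue_number, one_letter_code, phi, psi) in chain order."""
    fragments = []
    for start in range(len(residues) - n + 1):
        window = residues[start:start + n]
        numbers = [r[0] for r in window]
        # Assumption: a gap in residue numbering marks a chain break.
        if any(b - a != 1 for a, b in zip(numbers, numbers[1:])):
            continue
        sequence = "".join(r[1] for r in window)
        angles = [(r[2], r[3]) for r in window]
        fragments.append((sequence, angles))
    return fragments

# Hypothetical chain with a break between residues 3 and 5.
chain = [(1, "A", -60.0, -45.0), (2, "C", -60.0, -45.0), (3, "D", -120.0, 130.0),
         (5, "E", -120.0, 130.0), (6, "F", -60.0, -45.0), (7, "G", -60.0, -45.0)]
fragments = extract_ntuplets(chain, 3)
print([sequence for sequence, _ in fragments])
```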
See also prune_ntuplet_db.
mcgen_semfold_fit_ss
Usage: mcgen_semfold_fit_ss sequence-file scoring-function-file ss-file ntuplet-database-file [output-file] [num-conformations] [num-iterations] [random-seed]
Apply the segment matching and folding (SEMFOLD) technique to generate a fold that fits a particular structure.
mcgen_semfold_fit_ss is identical to mcgen_semfold_ss, except that it tries to mimic a known experimental conformation by minimising the RMSDs of the decoys generated to the known structure (which is specified as a file in PDB format called exp.pdb in the directory where this program is called).
This program is experimental.
See also mcgen_semfold_ss.
mcgen_semfold_loop
Usage: mcgen_semfold_loop loop-data-file scoring-function-file ntuplet-database-file [output-file-prefix] [num-conformations] [num-iterations] [random-seed]
Apply the segment matching and folding (SEMFOLD) technique for predicting protein structure in a specified loop region.
mcgen_semfold_loop
applies the SEMFOLD method for predicting
protein structure, inserting fragments only for residues in a
specified loop region.
The SEMFOLD method is based on randomly inserting small (usually three-residue) fragments and using a Monte Carlo/simulated annealing procedure to find combinations of these fragments that have the lowest score. The ntuplet-database-file is generated using a combination of mcgen_ntuplets and prune_ntuplet_db. See mcgen_exhaustive_loop for the format of the loop-data-file and more information on features and usage.
See also constraints_filter, mcgen_exhaustive_loop, mcgen_exhaustive_loop_ss, mcgen_semfold_loop_ss.
mcgen_semfold_loop_ss
Usage: mcgen_semfold_loop_ss loop-data-file scoring-function-file ss-file ntuplet-database-file [output-file-prefix] [num-conformations] [num-iterations] [random-seed]
Apply the segment matching and folding (SEMFOLD) technique for predicting protein structure in a specified loop region taking secondary structures into account.
mcgen_semfold_loop_ss
applies the SEMFOLD method for predicting
protein structure, inserting fragments only for residues in a
specified loop region, taking secondary structures into account as
specified by the secondary structure (DSSP) file. In all other
respects, its behaviour is identical to
mcgen_semfold_loop
.
See also constraints_filter, mcgen_exhaustive_loop, mcgen_exhaustive_loop_ss, mcgen_semfold_loop.
mcgen_semfold_region_ss
Usage: mcgen_semfold_region_ss sequence-file scoring-function-file ss-file ntuplet-database-file region [output-file] [num-conformations] [num-iterations] [random-seed]
Apply the segment matching and folding (SEMFOLD) technique for predicting protein structure between secondary structure units for a particular region.
mcgen_semfold_region_ss
is identical to
mcgen_semfold_ss
, except that only
a particular region is allowed to move (the rest of the protein is
held fixed according to the torsion values specified in
ss-file).
region is specified by two residue numbers separated by -, e.g., 22-40.
See also mcgen_semfold_loop_ss, mcgen_semfold_ss.
mcgen_semfold_ss
Usage: mcgen_semfold_ss sequence-file scoring-function-file ss-file ntuplet-database-file [output-file] [num-conformations] [num-iterations] [random-seed]
Apply the segment matching and folding (SEMFOLD) technique for predicting protein structure between secondary structure units.
mcgen_semfold_ss
applies the SEMFOLD method for predicting
protein structure, inserting fragments only for residues between
secondary structure units such as helices or sheets or both (depending
on how it is compiled). The secondary structure units are assigned
default phi/psi angles (see ramp/src/common/defines.h
).
The SEMFOLD method is based on randomly inserting small (usually three-residue) fragments, either from a database or generated using discrete phi/psi models, and using a Monte Carlo/simulated annealing procedure to find combinations of these fragments that have the lowest score (on a cluster of machines, with the appropriate shell scripts, a genetic algorithm can be performed between trajectories to further optimise the scores). The ntuplet-database-file is generated using a combination of mcgen_ntuplets and prune_ntuplet_db in cases where fragments are obtained from a database; the file can also be generated by exhaustively enumerating the discrete state phi/psi models included with the distribution.
The reference for these programs is: Samudrala R, Levitt M. A comprehensive analysis of 40 blind protein structure predictions. BMC Structural Biology 2: 3-18 (2002).
See also mcgen_exhaustive_loop, mcgen_semfold_loop, mcgen_ntuplets, prune_ntuplet_db.
prune_ntuplet_db
Usage: prune_ntuplet_db sequence-file ntuplet-database-file ntuplet-size [output-file]
Filters a database containing phi/psi angles for identical fragments.
prune_ntuplet_db
takes as input an ntuplet database of
phi/psi angles, as created by
mcgen_ntuplets
, and filters it such that only identical
tuplets that are present in the sequence of interest are output. This
reduces the database size considerably and saves on memory and enables
storage of large sized fragments for a given sequence. This program
is ideal for fragments of length <= 3 where exact matches are
available.
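The exact-match filtering idea can be sketched as below. The on-disk database format is not described here, so this sketch assumes a hypothetical in-memory layout (a dict keyed by the ntuplet's residue string); the function name is also illustrative.

```python
def prune_ntuplets(sequence, database, ntuplet_size):
    """Keep only database entries whose residue string occurs in sequence.

    database: dict mapping an ntuplet residue string (e.g. "ACD") to a
    list of phi/psi records (hypothetical layout for illustration).
    """
    # Every window of length ntuplet_size present in the target sequence.
    wanted = {sequence[i:i + ntuplet_size]
              for i in range(len(sequence) - ntuplet_size + 1)}
    return {frag: angles for frag, angles in database.items()
            if len(frag) == ntuplet_size and frag in wanted}
```

Because only windows that occur in the sequence of interest are retained, the pruned database grows with the sequence length rather than with the size of the source database.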
Additionally, by default, terminal ntuplets (for which the phi and
psi angles are not defined, respectively) are discarded. This latter
behaviour can be modified by commenting out the
ELIMINATE_TERMINAL_TUPLETS
variable in
ramp/src/mcgen/prune_ntuplet_db.c
.
See also
mcgen_ntuplets
,
prune_ntuplet_db_ss
.
prune_ntuplet_db_ss
Usage:
prune_ntuplet_db_ss
sequence-file
scoring-function-file
ss-file
ntuplet-database-file
ntuplet-size
[
output-file]
Filters a database containing phi/psi angles for similar fragments.
prune_ntuplet_db_ss
takes as input an ntuplet database of
phi/psi angles, as created by
mcgen_ntuplets
, and filters it such that only similar
tuplets that are present in the sequence of interest are output. This
reduces the database size considerably and saves on memory and enables
storage of large sized fragments for a given sequence.
"Similarity" of sequences is determined by
scoring-function-file
which is expected to be a matrix in the same
format as the one used by BLOSUM62
, together with the secondary
structure compatibility. This program is ideal for fragments of
length >= 4 where exact matches are unlikely.
Additionally, by default, terminal ntuplets (for which the phi and
psi angles are not defined, respectively) are discarded. This latter
behaviour can be modified by commenting out the
ELIMINATE_TERMINAL_TUPLETS
variable in
ramp/src/mcgen/prune_ntuplet_db.c
.
See also
mcgen_ntuplets
,
prune_ntuplet_db
.
scgen
, scgen_single
, scgen_double
Usage:
scgen
conformation-file
scoring-function-file
[
output-file]
[
num-sidechains]
Generates low scoring side chain conformations.
scgen
(which is just a link to scgen_single
or
scgen_double
) takes an existing conformation (PDB), a scoring
function, and generates the top scoring side chains (the number is
determined by num-sidechains) as ranked by the scoring
function.
The reference for these programs is: Samudrala R, Moult J. Determinants of side chain conformational preferences in protein structures. Protein Engineering 11: 991-997, 1998.
Features: scgen
will not generate any new
coordinates---it will only explore/twirl existing side chains. Use
scgen_build
to generate a set of
built side chains if no side chains are present.
scgen_build
Usage:
scgen_build
conformation-file
chi-angle-list
[
output-file]
Builds side chains given a list of chi angles.
scgen_build
takes a list of chi angles, in the format
generated by
list_torsions
:
C - 1 - chi1 is 299.959
R - 2 - chi1 is 311.778
R - 2 - chi2 is 172.907
R - 2 - chi3 is 187.699
R - 2 - chi4 is 100.294
and builds the side chains in a given conformation to match the
values in the list (for those values that are specified). The side
chains are built from scratch using standard geometry, i.e., the side
chain coordinates in the conformation file are discarded. This has the
advantage that side chains that don't exist can be built with this
program, but if the side chains do exist, then it is better to
transform (twirl) the existing coordinates using
scgen_list
.
See also
scgen_build_missing
,
scgen_library
,
scgen_list
.
scgen_build_missing
Usage:
scgen_build_missing
[
conformation-file]
[
output-file]
Builds side chains for those residues with missing side chain atoms.
scgen_build_missing
builds side chains for those residues
with missing side chain atoms (using a standard library defined by
CHI_ALLOWED_ANGLES_FILE
). The side chains are built from
scratch using standard geometry, i.e., the side chain coordinates in
the conformation file are discarded. This has the advantage that side
chains that don't exist can be built with this program, but if the
side chains do exist, then it is better to transform (twirl) the
existing coordinates using
scgen_list
.
The reference for this program is: Samudrala R, Huang ES, Koehl P, Levitt M. Constructing side chains on near-native main chains for ab initio protein structure prediction. Protein Engineering, 7: 453-457, 2000.
See also
scgen_build
,
scgen_library
,
scgen_list
.
scgen_library
Usage:
scgen_library
conformation-file
[
output-file]
Generates side chain rotamer library values closest to experimental structure values.
scgen_library
takes a conformation file with all the
atoms and twirls the side chains so the rotamers represent the closest
library values to the corresponding experimental structure
values.
The reference for this program is: Samudrala R, Moult J. Determinants of side chain conformational preferences in protein structures. Protein Engineering 11: 991-997, 1998.
See also
scgen_build
,
scgen_list
.
scgen_list
Usage:
scgen_list
conformation-file
chi-angle-list
[
output-file]
Generate side chains given a list of chi angles.
scgen_list
takes a list of chi angles, in the format
generated by
list_torsions
:
C - 1 - chi1 is 299.959
R - 2 - chi1 is 311.778
R - 2 - chi2 is 172.907
R - 2 - chi3 is 187.699
R - 2 - chi4 is 100.294
and twirls the side chains in a given conformation to match the values in the list.
The reference for this program is: Samudrala R, Moult J. Determinants of side chain conformational preferences in protein structures. Protein Engineering 11: 991-997, 1998.
See also
scgen_build
,
scgen_library
.
scgen_mutate
Usage:
scgen_mutate
conformation-file
alignment
[
output-file]
Mutates (changes) amino acids in a parent/template structure based on an alignment.
scgen_mutate
takes an alignment between a parent/template
sequence and a target sequence, the template conformation file, and
creates a minimum perturbation model of the target conformation using
the information in the template structure.
The alignment is specified in a file containing two lines in one-letter amino acid code, the first being the template sequence and the second being the target. Insertions and deletions are specified by dashes (and are not constructed). The main chain coordinates are copied over as is, as are all coordinates for residues that are identical between the template and target sequences. For nonidentical residues, a chi-angle equivalence matrix specifies which chi angle values from the parent (template) structure are used to set the values in the target structure. Chi angle values that are not taken (or available) from the parent structure are obtained from a library of the most frequently observed chi angles, as used by Randy Read's program MUTATE, which performs an identical task. For example, if a V is mutated to a K, then the chi 1 of V is used as the chi 1 for K, and chi 2, chi 3, and chi 4 for K are set from a library of standard values.
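As a rough illustration of how the two-line alignment drives the procedure, the sketch below labels each aligned column. The function name and the three labels are hypothetical; the chi-angle equivalence matrix and the chi-angle library are not modelled.

```python
def classify_alignment(template_line, target_line):
    """Label each aligned column as 'identical', 'mutated', or 'gap'."""
    if len(template_line) != len(target_line):
        raise ValueError("aligned sequences must have equal length")
    labels = []
    for tmpl, targ in zip(template_line, target_line):
        if tmpl == "-" or targ == "-":
            labels.append("gap")        # insertion/deletion: not constructed
        elif tmpl == targ:
            labels.append("identical")  # all coordinates copied as is
        else:
            labels.append("mutated")    # main chain copied, side chain rebuilt
    return labels
```

For a template line `AV-K` and a target line `AKGK`, the columns are labelled identical, mutated, gap, and identical, respectively.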
The reference for this program is: Samudrala R, Huang ES, Koehl P, Levitt M. Constructing side chains on near-native main chains for ab initio protein structure prediction. Protein Engineering, 7: 453-457, 2000.
This program is useful for producing an initial model for comparative/homology modelling.
cf
, cf_single
Usage:
cf
possibility-file-list
scoring-function-file
[
output-file-prefix]
[
num-conformations]
Selects the best scoring conformation mixing and matching between a set of possible conformations.
cf
selects the best scoring conformation mixing and
matching between a set of possible conformations (specified as nodes
in a graph). The PDB file format is used to specify a
possibility-file. Each possibility-file can contain
one main chain conformation for all the residues in the file.
Different (multiple) side chain conformations for a given residue can
be specified in the file also (the last atom record should have the
largest residue number). The possibility-file-list is a list
of different possibility files, where each file represents one main
conformation for a range of residues and one or more side chain
conformations per residue. cf
will find the optimal
arrangement (as evaluated by the all-atom distance-dependent pairwise
scoring function,
potential_rapdf
)
of the different possibilities by representing each possibility as a
node in a graph; each interaction between two possible conformations
(nodes) as a weighted edge; and finding the maximal completely
connected subgraph (clique) with the lowest total score.
The format of a possibility-file-list is as follows:
file1.pdb
file2.pdb
.
.
.
fileN.pdb
Each PDB file (up to MAX_MAINCHAINS
) defines a single main
chain. Each main chain position can have many different side chains
(stored in alt_atoms
in the residue
data structure).
To make nodes, the program obtains the score of every conformation
with respect to the local main chain region. Since each file contains
a single main chain, this is already given. However, the score of a
node is ignored in the final summation for a clique score to avoid
double counting. A zero-weighted edge means no edge, and any nonzero
value indicates the strength of the interaction between the two
possible residue conformations represented by the node pair. Thus
weighted edges handle interconnectedness and nodes are measures of
local goodness. In abstract, a node is specified by a residue number,
a possible conformation of side chain, and a possible conformation of
a main chain. An edge is specified by a pair of nodes.
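The node-and-edge representation can be illustrated with a deliberately naive sketch that enumerates every combination of one option per residue and sums the pairwise edge scores. This is brute force, not the clique-finding algorithm used by cf, and the function and parameter names are hypothetical.

```python
from itertools import product

def best_combination(choices, edge_score):
    """Pick one option per residue so the summed pairwise score is lowest.

    choices: list (one entry per residue) of lists of option labels.
    edge_score: callable (i, option_i, j, option_j) -> float.
    Returns (best_score, best_pick).
    """
    best_score, best_pick = None, None
    for pick in product(*choices):
        # Sum the edge weights over every pair of chosen nodes; node
        # scores themselves are left out, as in the description above.
        total = sum(edge_score(i, pick[i], j, pick[j])
                    for i in range(len(pick))
                    for j in range(i + 1, len(pick)))
        if best_score is None or total < best_score:
            best_score, best_pick = total, pick
    return best_score, best_pick
```

The enumeration grows exponentially with the number of residues, which is why a graph-theoretic search is needed for anything beyond toy inputs.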
For the purposes of evaluating covalent consistency, a set of
crossover regions is required. These regions are specified as residue
ranges (pairs of residue numbers separated by whitespace) in the file
possibility-file-list.crossover_regions
. For each
pair of crossover regions a and b, if residue
a and residue b for nodes i and
j are within a crossover region, then the main chain for node
i must be the same as the main chain for node
j.
Using a crossover region of "1 N
" where N is
the size of the protein means that no crossovers will be allowed.
Using a crossover region of "0 0
" will allow all positions to
crossover.
Possible side chain conformations can be generated using
scgen
. Main chain conformations for short regions can be
generated using
mcgen_exhaustive_loop
or
mcgen_semfold_loop
or any other loop generation program.
The default single residue per node representation does not exclude
the idea of twirling sidechains in a pairwise manner using
scgen_double
.
The lowest scoring cliques are stored in a data structure in a queue fashion. For each list of cliques, the highest scoring clique found thus far is tracked, along with its index. When a clique with a lower score is found, it replaces the existing highest scoring clique, and the highest scoring clique's index is recalculated.
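A bounded list of this kind, which keeps the lowest-scoring entries by always evicting the current highest-scoring one, might look like the following. The class and attribute names are hypothetical and the sketch makes no claim about the actual data structure in the source.

```python
class LowestScores:
    """Keep the `capacity` lowest-scoring items seen so far."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []          # list of (score, item) pairs
        self.worst_index = None  # index of the highest score in self.items

    def _recompute_worst(self):
        self.worst_index = max(range(len(self.items)),
                               key=lambda i: self.items[i][0])

    def offer(self, score, item):
        if len(self.items) < self.capacity:
            # Still room: store the item and refresh the worst index.
            self.items.append((score, item))
            self._recompute_worst()
        elif score < self.items[self.worst_index][0]:
            # Better than the current worst: overwrite it in place.
            self.items[self.worst_index] = (score, item)
            self._recompute_worst()
```

Tracking the index of the worst entry means most candidates are rejected with a single comparison; the linear rescan happens only when an entry is actually replaced.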
To speed up evaluation, only cliques matching a certain size are evaluated. Editing the environment of the protein to remove extraneous bits also results in a significant speedup.
Features: Since the graph uses one byte of storage space for its edges, the score of an edge is rounded so it can fit in a single byte. As a result, a small amount of precision is lost.
The reference for this program is: Samudrala R, Moult J. A graph-theoretic algorithm for comparative modelling of protein structure. Journal of Molecular Biology 279:287-302 (1998).
See also
scgen
,
mcgen_exhaustive_loop
,
mcgen_semfold_loop
.
1line_to_fasta
Usage:
1line_to_fasta
[
input-file]
[
identifier]
[
output-file]
Converts a 1line file to a FASTA file.
1line_to_fasta
takes a sequence in a 1line formatted file
(sequence on a single line, or just the single-letter amino acid
characters only) and converts it into a FASTA formatted file.
See also
fasta_to_1line
.
add_to_resno
Usage:
add_to_resno
conformation-file
value
[
output-file]
Changes the residue numbering in a conformation (PDB) file.
add_to_resno
changes the residue numbering in a
conformation (PDB) file by adding the number specified by
value to the existing residue numbers. This is useful if
you're working with certain programs that require residue numbers to
start from 1, and you need to go back and forth between different
numberings.
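The renumbering amounts to rewriting the residue sequence number field of each coordinate record. The sketch below relies on the fixed-column PDB layout (residue number in columns 23-26); handling both ATOM and HETATM records is an assumption of this sketch, and the function name is hypothetical.

```python
def shift_residue_numbers(pdb_lines, value):
    """Add `value` to the residue number of every ATOM/HETATM record."""
    shifted = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 26:
            # Columns 23-26 (0-based slice 22:26) hold the residue number.
            resno = int(line[22:26]) + value
            # Note: numbers needing more than four characters would
            # overflow the field; this sketch does not guard against that.
            line = line[:22] + f"{resno:4d}" + line[26:]
        shifted.append(line)
    return shifted
```

Passing a negative value shifts the numbering back, which covers the round trip between two numbering schemes.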
See also
clean_pdb
,
insert_chain_id
.
average_xyz
Usage:
average_xyz
conformation-file-list
[
output-file]
Averages the XYZ coordinates in a set of conformation (PDB) files.
average_xyz
averages the XYZ Cartesian coordinates in a
set of conformation (PDB) files specified by
conformation-file-list. Averaging the Cartesian coordinates
of a set of conformations around a native structure in RMSD space has
been known to result in a conformation that is closer to the native
structure than any of the individual conformations. However, since
geometry information is not taken into account during the averaging
process, a minimisation step may be required to make the final
conformation look decent.
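The averaging itself is a per-atom arithmetic mean. A minimal sketch, assuming the conformations have already been read into lists of (x, y, z) tuples with atoms in the same order (the function name is hypothetical):

```python
def average_coordinates(conformations):
    """conformations: list of equal-length lists of (x, y, z) tuples."""
    n = len(conformations)
    n_atoms = len(conformations[0])
    if any(len(conf) != n_atoms for conf in conformations):
        raise ValueError("all conformations must have the same atoms")
    # Mean of each coordinate component, atom by atom.
    return [tuple(sum(conf[a][k] for conf in conformations) / n
                  for k in range(3))
            for a in range(n_atoms)]
```

Because each atom is averaged independently, bond lengths and angles in the result are not preserved, which is why a minimisation step may be needed afterwards.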
ca_only
, mc_only
, sc_only
Usage:
ca_only
[
conformation-file]
[
output-file]
Extracts the appropriate atoms from a conformation file.
ca_only
extracts the CA atoms from a conformation (PDB)
file. mc_only
extracts the main chain atoms from a
conformation (PDB) file. sc_only
extracts the side chain
atoms from a conformation (PDB) file.
clean_pdb
, renumber_pdb
, striphyd
Usage:
clean_pdb
[
conformation-file]
[
output-file]
Cleans up a conformation (PDB) file.
clean_pdb
removes extraneous information from a
conformation (PDB) file and cleans it up so it is usable by many of
the programs in the RAMP suite. This includes stripping the
conformation of any hydrogen atoms (only action performed when
invoked as striphyd
), removing alternate atom conformations,
and renumbering the residues (only action performed when invoked as
renumber_pdb
).
Features: Since the PDB files can be so varied in how information
is recorded, this program might not always work. Also, you probably
should use
extract_pdb_chain
to extract the chain of interest before
applying this program.
See also
add_to_resno
,
extract_pdb_chain
.
compare_structures
Usage:
compare_structures
[-aclmors]
conformation-file
conformation-file|conformation-file-list
[
output-file]
Compares two structures with different sequences.
compare_structures
is similar to
fit
except that it can compare two structures that are
not identical in sequence. The options are also similar (though
they're not all guaranteed to work), except for the -s
option
which outputs the structure based sequence alignment (which is ignored
if conformation-file-list is specified).
This program is experimental. It is based on the ALIGN program of Gerson Cohen from the NIH.
See also
fit
.
compare_torsions
Usage:
compare_torsions
torsions-list
torsions-list
[
angle]
[
output-file]
Compares two sets of torsions.
compare_torsions
takes two files containing lists of
torsions, as generated by
list_torsions
and outputs the percentage of torsions that
match within a particular cutoff (the default is 30 degrees, set by
MIN_TORSION_CUTOFF
in
ramp/src/misc/compare_torsions.c
).
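The comparison can be sketched as follows, assuming the two torsion lists have been read into dicts keyed by (residue number, angle name). Treating angles as periodic (so that 350 and 10 degrees differ by 20) and counting a difference equal to the cutoff as a match are assumptions of this sketch, not documented behaviour; the function names are hypothetical.

```python
def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)

def percent_matching(torsions_a, torsions_b, cutoff=30.0):
    """torsions_*: dicts mapping (residue_number, angle_name) -> degrees."""
    # Only torsions present in both lists can be compared.
    shared = torsions_a.keys() & torsions_b.keys()
    if not shared:
        return 0.0
    hits = sum(1 for key in shared
               if angular_difference(torsions_a[key], torsions_b[key]) <= cutoff)
    return 100.0 * hits / len(shared)
```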
See also
list_torsions
.
consensus_distances
Usage:
consensus_distances
[
conformation-file-list]
[
output-file]
Generates the consensus distances from a set of conformations.
consensus_distances
takes a set of conformations and
calculates the most frequently observed, i.e., "consensus", distances
between atoms within that set.
Features: Currently only CA distances are calculated.
constraints_filter
Usage:
constraints_filter
conformation-file-list
constraints-file
[
output-file]
Filters a set of structures using distance constraints.
constraints_filter
takes a set of constraints specified
in constraints-file and filters a set of conformations
specified in conformation-file-list, outputting only those
conformation file names that satisfy all the specified
constraints.
The format of the constraints-file is as follows:
constraint 10 22 6.0 1.0
constraint 58 73 6.0 1.0
.
.
.
constraint 98 143 6.0 1.0
where each line starts with the string constraint
followed by
the residues involved in the constraint, the distance between them,
and a (+/-) tolerance value for that distance. A negative tolerance
value indicates that the constraint used is an absolute distance
(i.e., all atom pairs within (less than) the distance specified will
be considered to satisfy the constraint).
Features: Currently only CA constraints are used.
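A minimal sketch of reading the constraints file and testing one conformation against it is given below. It assumes the CA coordinates are already available as a dict keyed by residue number; the function names are hypothetical, and the inclusive treatment of the tolerance bound is an assumption.

```python
import math

def parse_constraints(text):
    """Parse lines of the form: constraint RES1 RES2 DISTANCE TOLERANCE."""
    constraints = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[0] == "constraint":
            constraints.append((int(fields[1]), int(fields[2]),
                                float(fields[3]), float(fields[4])))
    return constraints

def satisfies(ca_coords, constraints):
    """ca_coords: dict mapping residue number -> (x, y, z) of its CA atom."""
    for res_a, res_b, distance, tolerance in constraints:
        actual = math.dist(ca_coords[res_a], ca_coords[res_b])
        if tolerance < 0:
            # Negative tolerance: the distance acts as an upper bound.
            if actual >= distance:
                return False
        elif abs(actual - distance) > tolerance:
            return False
    return True
```

Filtering a list of conformations then reduces to keeping the file names for which `satisfies` returns true.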
See also potential_filter
.
dssp_consensus
Usage:
dssp_consensus
[-n]
[
ss-file-list]
[
output-file]
Generates the consensus of a set of secondary structure assignments.
dssp_consensus
takes a list of secondary structure
assignments in DSSP format and generates the consensus. It is useful
for combining different secondary structure prediction methods. The
probabilities output are based on the frequency of occurrences of the
particular type of secondary structure.
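A frequency-based consensus can be sketched as below. The sketch assumes the assignments have already been reduced to equal-length strings of one-letter codes (it does not parse DSSP files), and the function name is hypothetical.

```python
from collections import Counter

def consensus(assignments):
    """assignments: equal-length strings of one-letter SS codes.

    Returns one (most_common_code, frequency) pair per position.
    Ties are broken in favour of the code seen first at that position.
    """
    result = []
    for column in zip(*assignments):
        code, count = Counter(column).most_common(1)[0]
        result.append((code, count / len(column)))
    return result
```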
dssp_to_ss
Usage:
dssp_to_ss
[-n]
[
ss-file]
[
output-file]
Reformats secondary structure assignments.
dssp_to_ss
takes the secondary structure assignments in
DSSP format and converts them to a single line. It is useful for
comparison with various secondary structure assignments. The
-n
option will output 1
for helix, 2
for
sheet, and 3
for other.
extract_nmr_model
Usage:
extract_nmr_model
conformation-file
model-number
[
output-file]
Extracts a particular model from an NMR conformation (PDB) file.
extract_nmr_model
extracts a model from a specified NMR
conformation (PDB) file. The model numbers are integers, usually from
1-20.
See also
extract_pdb
,
extract_pdb_chain
.
extract_pdb
Usage:
extract_pdb
conformation-file
[
extract-regions-file]
[
output-file]
Extracts regions from a conformation (PDB) file.
extract_pdb
extracts regions specified via standard input
or extract-regions-file from a specified conformation (PDB)
file. The regions must be specified as pairs of numbers which
represent residue ranges (inclusive).
See also
extract_nmr_model
,
extract_pdb_chain
.
extract_pdb_chain
Usage:
extract_pdb_chain
conformation-file
chain
[
output-file]
Extracts chains from a conformation (PDB) file.
extract_pdb_chain
extracts chains from a specified
conformation (PDB) file. The chain is specified in the form of a one
character string, e.g., "A" or "B".
See also
extract_nmr_model
,
extract_pdb
.
fasta_to_1line
Usage:
fasta_to_1line
[
input-file]
[
output-file]
Converts a FASTA file into a 1line file.
fasta_to_1line
takes a standard FASTA file and converts
it such that each line contains a single sequence (i.e., the delimiter
between sequences is a newline as opposed to a line beginning with
>).
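The conversion can be sketched in a few lines. The function name is hypothetical, and the sketch drops the header lines entirely, returning one string per sequence.

```python
def fasta_to_1line(text):
    """Return one string per FASTA record, with header lines dropped."""
    sequences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            # A header starts a new record; flush the previous one.
            if current:
                sequences.append("".join(current))
            current = []
        elif line:
            current.append(line)
    if current:
        sequences.append("".join(current))
    return sequences
```

Writing the returned strings one per line gives the 1line layout.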
See also
1line_to_fasta
.
fit
Usage:
fit
[-aclmor]
conformation-file
conformation-file|conformation-file-list
[
output-file]
Superimposes pairs of conformations.
fit
superimposes pairs of conformations. If the second
argument is a conformation file (determined by the extension), then
fit
will superpose the second conformation onto the first one
and output the superposed conformation (the RMSD will be output to
stderr
). If the second argument is a list (determined by the
extension), then fit
will superpose all the conformations
specified in the list onto the first one and output the RMSDs. The
following options are applicable:
-a all atom fit/RMSD
-c CA atom fit/RMSD
-m main chain atom fit/RMSD
-l arguments are only conformation-files
-o overwrite with rotated conformation
-r output only the RMSD after fitting
See also
rmsd
.
get_distance
Usage:
get_distance
conformation-file
[
atom-pair-list|
atom-pair]
[
output-file]
Outputs specific distance or distances in a conformation (PDB) file.
get_distance
outputs the specific distances between the
two atoms in atom-pair or a list of such atom pairs.
atom-pair is specified as N-A M-B, where N, M are residue
numbers and A, B are atom types.
See also
get_distances
.
get_distances
Usage:
get_distances
conformation-file
distance-cutoff
[
output-file]
Outputs distances in a conformation (PDB) file.
get_distances
outputs distances within
distance-cutoff in a specified conformation (PDB) file.
Distances within CLASH_CUTOFF
(see
ramp/src/common/defines.h
) are marked with
****
. Neighbouring residues are marked with
####
.
See also
get_distance
.
insert_chain_id
Usage:
insert_chain_id
conformation-file
chain-id
[
output-file]
Inserts a chain ID in a conformation (PDB) file.
insert_chain_id
inserts a single character chain ID to
the ATOM
records in the specified conformation (PDB) file
containing a single chain.
See also
add_to_resno
.
list_torsions
Usage:
list_torsions
[-ms]
[
conformation-file]
[
output-file]
List the torsion angles in a conformation (PDB) file.
list_torsions
calculates the phi, psi, and chi angles in
a conformation and outputs them (by default) in this format:
C - 1 - psi is 230.022
C - 1 - chi1 is 299.959
R - 2 - phi is 295.965
R - 2 - psi is 144.819
R - 2 - chi1 is 311.778
R - 2 - chi2 is 172.907
R - 2 - chi3 is 187.699
R - 2 - chi4 is 100.294
-m
outputs only the main chain torsion (phi/psi) angles,
and -s
outputs only the side chain torsion (chi) angles.
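Since several other programs consume this output format, a small parser for it may be useful. The following is a sketch based only on the example lines shown above; the function name and the returned layout are hypothetical.

```python
import re

# Matches lines such as "R - 2 - chi1 is 311.778".
_TORSION_LINE = re.compile(
    r"^\s*(\S+)\s+-\s+(-?\d+)\s+-\s+(\w+)\s+is\s+(-?\d+(?:\.\d+)?)\s*$")

def parse_torsions(text):
    """Return {(residue_number, angle_name): (residue_code, value)}."""
    torsions = {}
    for line in text.splitlines():
        match = _TORSION_LINE.match(line)
        if match is None:
            continue  # skip anything that is not a torsion line
        code, resno, name, value = match.groups()
        torsions[(int(resno), name)] = (code, float(value))
    return torsions
```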
See also
compare_torsions
.
logodds
Usage:
logodds
input-file
value
[
output-file]
Calculates the log probability of a particular value occurring by chance.
logodds
takes a value as input and calculates the
probability of it occurring by chance based on the set of values that
are observed (this is simply the negative log of the number of values
that are less than the input value divided by the total number of
values).
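The calculation in parentheses can be written out directly. The base of the logarithm is not stated, so the natural log is assumed here; returning infinity when no observed value is smaller is also a choice made for this sketch.

```python
import math

def logodds(values, value):
    """Negative log of the fraction of observed values below `value`."""
    below = sum(1 for v in values if v < value)
    if below == 0:
        return math.inf  # no observed value is lower (sketch's convention)
    return -math.log(below / len(values))
```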
pdb_to_casp
Usage:
pdb_to_casp
[
conformation-file]
[
output-file]
Converts a conformation (PDB) file into the CASP format.
pdb_to_casp
takes a PDB file and rearranges the atoms so
that it matches the
CASP specification. The occupancy fields are all assigned a
value of "1.00" and the error estimate fields (last column) are
assigned a value of "9.00".
See also
pdb_to_sequence
,
pdb_to_vdb
.
pdb_to_sequence
Usage:
pdb_to_sequence
[-p1]
[
conformation-file]
[
output-file]
Converts a conformation (PDB) file into different sequence formats.
pdb_to_sequence
takes a conformation (PDB) file and
outputs the amino acid sequence in different formats:
-p output the sequence in PIR format
-1 output the sequence as a single line
The default output is a two-column format (residue number, one letter amino acid code).
See also
pdb_to_casp
,
pdb_to_vdb
.
pdb_to_vdb
Usage:
pdb_to_vdb
[
conformation-file]
[
output-file]
Converts a conformation file with "normal" atoms (PDB) into a conformation file with virtual atoms (VDB).
pdb_to_vdb
takes a conformation (PDB) file with the
"normal" heavy atoms and collapses the atoms into "virtual" atoms as
defined by Head-Gordon and Brooks, Biopolymers: 77-100,
1991.
See also
pdb_to_casp
,
pdb_to_sequence
.
phd_to_dssp
Usage:
phd_to_dssp
[-cfm]
[
phd-generated-file]
[
output-file]
Converts a PHD prediction into DSSP format.
phd_to_dssp
converts a PHD prediction of secondary
structure for a given sequence into DSSP format, with confidence
information. When the -f
option is specified, only the high
confidence predictions are output (defaults are specified in the
source file). The -c
option checks the number of residues
between secondary structures and prints a warning message if it's
below the number defined in DEFAULT_INTER_SS_GAP_SIZE
in the
source file. The -m
option forces the default-sized gap by
eliminating neighbouring helix or sheet elements.
properties
Usage:
properties
[
sequence-file]
[
output-file]
Calculates a variety of properties for a protein.
properties
calculates a variety of sequence properties
for a protein, such as mass.
psipred_to_dssp
Usage:
psipred_to_dssp
[-cfm]
[
psipred-generated-file]
[
output-file]
Converts a PSIPRED prediction into DSSP format.
psipred_to_dssp
converts a PSIPRED prediction of secondary
structure for a given sequence into DSSP format, with confidence
information. When the -f
option is specified, only the high
confidence predictions are output (defaults are specified in the
source file). The -c
option checks the number of residues
between secondary structures and prints a warning message if it's
below the number defined in DEFAULT_INTER_SS_GAP_SIZE
in the
source file. The -m
option forces the default-sized gap by
eliminating neighbouring helix or sheet elements.
sam_to_dssp
Usage:
sam_to_dssp
[
sam-generated-file]
[
output-file]
Converts a SAM prediction into DSSP format.
sam_to_dssp
converts a SAM (HMM/NN based prediction from
the UCSC Bioinformatics group) prediction of secondary structure for a
given sequence into DSSP format, with confidence information.
rmsd
Usage:
rmsd
[-acmstd]
[
conformation-file]
[
conformation-file]
[
output-file]
Calculates the RMSD between two conformations.
rmsd
calculates the root mean square deviation (RMSD)
between two conformations at various levels:
-a is the all atom RMSD
-c is the CA atom RMSD
-m is the main chain atom RMSD
-s is the side chain atom RMSD
-t is the phi/psi torsion RMSD
-d is the detail flag (gives the deviation on a CA level)
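For reference, the RMSD between two equally sized, equivalently ordered sets of coordinates is the square root of the mean squared distance between corresponding atoms. A minimal sketch, with a hypothetical function name and no superposition step:

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate sets of equal size")
    # Sum of squared distances between corresponding atoms.
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))
```

Which atoms go into the two lists (all atoms, CA atoms, main chain, or side chain) corresponds to the choice of flag above.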
See also
fit
,
rmsd_matrix
.
rmsd_matrix
Usage:
rmsd_matrix
[-acm]
[
conformation-file-list]
[
output-file]
Performs all-against-all RMSD calculations between a set of conformations.
rmsd_matrix
takes a list of conformation (PDB) files and
performs all-against-all RMSD calculations between the set of
conformations and outputs the resulting matrix. The following options
are used:
-a is the all atom RMSD
-c is the CA atom RMSD
-m is the main chain atom RMSD
ss_consensus
Usage:
ss_consensus
[
ss-alignment]
[
output-file]
Generates the consensus of a set of secondary structure assignments.
ss_consensus
takes a set of one-letter secondary
structure assignments and produces an "average" secondary structure
assignment in DSSP format. The first line of the ss-alignment
file must contain the amino acid string.
sw_to_2line
, clustalw_to_2line
Usage:
sw_to_2line
sequence-identifier
sequence-identifier
[
alignment-file]
[
output-file]
Generates a 2line style alignment.
sw_to_2line
and clustalw_to_2line
use two
sequence identifiers, specified as the arguments, to parse an
alignment file (in the format defined by the calling argument) and
concatenate all the strings appropriately to generate a two line
alignment of the sequences.
transform
Usage:
transform
conformation-file
transformation-matrix
[
output-file]
Transforms the coordinates in a conformation file using a given transformation matrix.
transform
takes a conformation (PDB) file and applies the
transformation matrix to the x, y, and z coordinates of all the
atoms. The 4x3 transformation matrix consists of a rotation matrix
(3x3) followed by the translation vector (3x1).
volume
, rg
Usage:
volume
[-am]
conformation-file
[
output-file]
Calculates the volume of a conformation.
volume
takes a conformation (PDB) file and calculates the
volume (or the radius of gyration, if called as rg
). The
-a
flag computes the volume (or the radius of gyration) for
all the atoms, and the -m
flag computes the volume (or the
radius of gyration) for only the main chain atoms.
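The radius of gyration is the root mean square distance of the atoms from their centroid. The sketch below weights every atom equally, which is an assumption (a mass-weighted definition is also common), and the function name is hypothetical.

```python
import math

def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) tuples."""
    n = len(coords)
    # Centroid of the selected atoms.
    center = [sum(p[k] for p in coords) / n for k in range(3)]
    # Mean squared distance from the centroid.
    mean_sq = sum(sum((p[k] - center[k]) ** 2 for k in range(3))
                  for p in coords) / n
    return math.sqrt(mean_sq)
```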
zscore
Usage:
zscore
input-file
cutoff
Calculates the average Z-score of a set of values within a specified cutoff.
zscore
takes a list of values as input and returns the
average Z-score (the number of standard deviations above the mean)
over all the values within (less than) the specified cutoff.
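One plausible reading of this calculation is sketched below: the mean and standard deviation are taken over the full list, and the Z-scores of the values below the cutoff are then averaged. Both that reading and the use of the population standard deviation are assumptions, and the function name is hypothetical.

```python
import statistics

def average_zscore(values, cutoff):
    """Mean Z-score of the values below `cutoff`, using the mean and
    population standard deviation of the whole list (assumed reading)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    selected = [v for v in values if v < cutoff]
    if not selected or sd == 0:
        return 0.0  # nothing to average, or no spread in the data
    return sum((v - mean) / sd for v in selected) / len(selected)
```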
structure_to_music
Usage:
structure_to_music
[ -gp
parameter-file]
conformation-file
ss-file
[
output-file]
Converts a 3D structure into a midge file that can be converted into a MIDI file.
structure_to_music
takes as input a protein structure
specified by the coordinates and the secondary structure and converts
it to a musical notation used by the program midge to generate MIDI
files that can be run on your favourite sequencer. A more detailed
description of how this is done, along with compositions by yours truly,
can be found on my personal
Proteomusic history page. An optional file can be specified
using the -p
option that can be used to modify various
parameters such as instrument patches and scales. The current
parameter file can be generated using the -g
option and
used as a starting point. In other words, there's no need to modify
the source code unless absolutely necessary.
This program is under constant development, so I might have a later version of the executable/source than what is on the public web site. The uses of this program are best illustrated by my Proteomusic project and also the Protinfo Proteomusic web server module.