
4. Programs

The programs in the RAMP suite are described one by one. The programs are grouped into six categories.

4.1 Scoring functions

contacts

Usage: contacts [conformation-file|conformation-file-list] [output-file] [distance-cutoff]

Computes the contact score for protein conformations.

contacts simply evaluates the number of contacts within a given distance cutoff and uses it as a contact score.

contact_order

Usage: contact_order [conformation-file|conformation-file-list] [output-file] [distance-cutoff]

Computes the (relative) contact order for protein conformations.

contact_order computes the (relative) contact order, which is the average sequence separation between contacting residues divided by the size of the protein. It is a measure of the average context-sensitivity of a residue in a protein and has been shown to correlate well with folding rates (in other words, the distribution of the percent contact orders of native structures will be different from the distribution based on random conformations).

The reference for this program is: Plaxco KW, Simons KT, Baker D. Contact order, transition state placement and the refolding rates of single domain proteins. Journal of Molecular Biology 277:985-994, 1998.
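As an illustration of the quantity described above, here is a minimal sketch in Python; it is not the RAMP implementation. It assumes one representative point per residue (for example the CA atom), treats every residue pair within the cutoff as a contact, and divides the mean sequence separation by the number of residues. The function name relative_contact_order and these choices are assumptions made for the example.

```python
import math


def relative_contact_order(coords, cutoff):
    """Mean sequence separation of contacting residues divided by chain length.

    coords: one (x, y, z) point per residue, in sequence order (assumption).
    cutoff: distance below which two residues count as being in contact.
    """
    n = len(coords)
    total_separation = 0
    num_contacts = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                total_separation += j - i
                num_contacts += 1
    if num_contacts == 0:
        return 0.0  # no contacts: the sketch reports zero rather than failing
    return total_separation / (num_contacts * n)
```

For a straight four-residue chain with unit spacing and a cutoff of 1.5, only neighbouring residues are in contact, so the result is 1/4.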

density_score

Usage: density_score [conformation-file-list] [output-file] [dimension]

Does an all-against-all RMSD calculation between a set of conformations and calculates a score based on the distances.

density_score takes a list of conformation (PDB) files, does an all-against-all RMSD calculation between the conformations, and calculates a score based on the density of conformations. If the dimension argument is not specified, then the formula used is sum_j log(rmsd_ij) for a given conformation i. If the dimension is specified and is greater than zero, then the formula sum_j (1/rmsd_ij)^D, where D is the dimension argument, is used for each conformation i. Since the program only uses CA atoms, and does not store a matrix, a much larger number of conformations can be handled.

See also tdensity_score.
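The two formulas used by density_score can be sketched as follows. This is only an illustration under stated assumptions: the pairwise RMSD is supplied by a hypothetical callable rmsd(i, j), the conformation itself (j = i) is skipped, and a dimension that is not greater than zero is rejected because the behaviour in that case is not described here.

```python
import math


def density_scores(num_conformations, rmsd, dimension=None):
    """Return one score per conformation i.

    rmsd: callable giving the RMSD between conformations i and j (hypothetical).
    dimension: None selects the log formula; a positive value D selects (1/rmsd)^D.
    """
    if dimension is not None and dimension <= 0:
        raise ValueError("this sketch only covers a positive dimension")
    scores = []
    for i in range(num_conformations):
        total = 0.0
        for j in range(num_conformations):
            if j == i:
                continue  # assumption: a conformation is not compared with itself
            d = rmsd(i, j)
            if dimension is None:
                total += math.log(d)
            else:
                total += (1.0 / d) ** dimension
        scores.append(total)
    return scores
```

Nothing is cached between calls to rmsd, which mirrors the remark that no matrix is stored. Two identical conformations (an RMSD of zero) would make either formula fail in this sketch.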

electrostatics

Usage: electrostatics [conformation-file|conformation-file-list] [output-file]

Computes the electrostatics energy for protein conformations.

electrostatics computes the total electrostatics energy between the atoms in a protein molecule.

hcf

Usage: hcf [conformation-file|conformation-file-list] [output-file]

Computes the hydrophobic compactness for protein conformations.

hcf computes the hydrophobic compactness (or "moment") between the atoms in a protein molecule. This is a measure of the square of the radius of gyration of the carbon atoms.

The reference for this program is: Samudrala R, Xia Y, Levitt M, Huang ES. A combined approach for ab initio construction of low resolution protein tertiary structures from sequence. In Altman R, Dunker K, Hunter L, Klein T, Lauderdale K, eds. Proceedings of the Pacific Symposium on Biocomputing, 505-516, 1999.
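A minimal sketch of a squared radius of gyration, for readers who want to see the quantity spelled out. It computes the unweighted mean squared distance of a set of points from their centroid. Which carbon atoms hcf selects and whether it applies any weighting are not stated here, so the caller passes in whichever coordinates are wanted; the function name is an assumption.

```python
def squared_radius_of_gyration(points):
    """Unweighted mean squared distance of (x, y, z) points from their centroid."""
    if not points:
        raise ValueError("at least one point is required")
    n = len(points)
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    return sum(
        sum((p[k] - centroid[k]) ** 2 for k in range(3)) for p in points
    ) / n
```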

potential, potential_war, potential_rapdf, potential_rapdf_war, potential_rvpdf, potential_nvpdf

Usage: potential [-cl] conformation-file|conformation-file-list|loop-data-file scoring-function-file [output-file]

Generates conditional probability scores for protein conformations.

potential takes a conformation-file or conformation-file-list (default, or with -c) and a scoring-function-file and generates the scores for the conformations given the parameters in scoring-function-file. If the conformation-file argument has the PDB filename extension .pdb, then it is assumed to be a single file. Otherwise the assumption is that it is a conformation-file-list. The -l option has the same behaviour but instead of a conformation-file or conformation-file-list, a file containing a loop description, in this format:

205 212
1vfa.pdb
1vfa_205-212.loops.pdb 78

is used to generate scores for different loops. In the above example, 205 and 212 are the starting and ending residues of the loop, 1vfa.pdb is the name of the experimental structure, 1vfa_205-212.loops.pdb is the name of the file containing all the loops, one after the other in PDB format, and 78 is the number of lines per loop conformation. The template file must have a loop in place (even if it's garbage---I recommend inserting the first loop in the loops file in the template structure) so an appropriate amount of space can be allocated in advance. The rationale for this design is that normally you have the experimental structure and you're comparing scores of loops to the score of the experimental structure loop; it also makes things slightly faster (no memory allocation/freeing for each loop considered).
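Because the loops file holds a fixed number of lines per loop conformation, it can be cut into individual loops by simple chunking. The sketch below shows the idea in Python; it assumes the file contains nothing but whole loop records, and the name split_loops is made up for the example.

```python
def split_loops(lines, lines_per_loop):
    """Cut the lines of a loops file into one list of lines per loop conformation."""
    if lines_per_loop <= 0:
        raise ValueError("lines_per_loop must be positive")
    if len(lines) % lines_per_loop != 0:
        # assumption: a partial trailing record indicates a malformed file
        raise ValueError("file length is not a multiple of lines_per_loop")
    return [
        lines[start:start + lines_per_loop]
        for start in range(0, len(lines), lines_per_loop)
    ]
```

With the example above, the lines read from 1vfa_205-212.loops.pdb would be passed in together with 78.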

potential is just a link to one of potential_rapdf, potential_rvpdf, potential_nvpdf. potential_rapdf is a residue-specific all-atom conditional probability discriminatory function. potential_rvpdf is a residue-specific virtual-atom conditional probability discriminatory function. potential_nvpdf is a non-residue-specific virtual-atom probability discriminatory function.

potential_war behaves in the same manner as potential (i.e., one of rapdf, rvpdf or nvpdf) but calculates the scores of interactions between atoms within a residue.

The reference for these programs is: Samudrala R, Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. Journal of Molecular Biology 275:893-914, 1998.

potential_interactions, potential_rapdf_interactions

Usage: potential_interactions conformation-pair-file|conformation-pair-file-list scoring-function-file [output-file]

Generates conditional probability scores for interacting protein conformations.

potential_interactions behaves in a similar manner to potential but calculates the scores of interactions between atoms in a pair of conformations, specified by a conformation-file-pair (essentially two conformation file names separated by a "|" character; for example: filename1.pdb|filename2.pdb).

The reference for this program is: Samudrala R, Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. Journal of Molecular Biology 275:893-914, 1998.

potential_ss, potential_rapdf_ss

Usage: potential_ss [-ehms] conformation-file|conformation-file-list ss-file scoring-function-file [output-file]

Generates conditional probability scores for protein conformations.

potential_ss behaves in a similar manner as potential (i.e., one of rapdf, rvpdf or nvpdf) but calculates the scores of interactions between atoms in different secondary structure elements, as given by ss-file (in DSSP format). The options are as follows:

e - calculates scores between atoms involved in nonlocal strand-strand interactions
h - calculates scores between atoms involved in nonlocal helix-helix interactions
m - calculates scores between atoms involved in nonlocal helix-strand interactions
s - calculates scores between atoms involved in all secondary structure interactions

The reference for this program is: Samudrala R, Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. Journal of Molecular Biology 275:893-914, 1998.

score_profile

Usage: score_profile conformation-file scoring-function-file [output-file]

Breaks down scores on a per-residue contribution basis.

score_profile decomposes the contribution of single residues based on their interactions with the surrounding environment for a given conformation. The program uses the RAPDF all-atom pairwise function (supplied as the second argument) to perform this analysis, with the implicit assumption that contributions can be divided in a pairwise manner.

The residue number and score contribution are output to the optional output file argument or to stdout if no output file is specified. All the pairwise side chain contributions are output to stderr.

sequence_score, sequence_similarity_score, sequence_identity_score

Usage: sequence_score [alignment-file|alignment-file-list] [output-file]

Computes a sequence score (similarity and/or identity) given an alignment.

sequence_score computes a sequence score (similarity, when invoked as sequence_similarity_score, and identity, when invoked as sequence_identity_score; both when invoked as sequence_score) given an alignment (which is specified as two lines in a file with the same line length).

The similarity score is computed using a matrix which is specified by SEQUENCE_SIMILARITY_MATRIX_FILE in src/common/defines.h.
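To make the identity part concrete, here is a small sketch that scores an alignment given as two lines of equal length. It is an illustration only: the gap character '-' and the normalisation by alignment length are assumptions, and the program itself may count or normalise differently.

```python
def sequence_identity(line1, line2, gap="-"):
    """Fraction of alignment columns holding the same non-gap residue."""
    if len(line1) != len(line2):
        raise ValueError("the two alignment lines must have the same length")
    if not line1:
        raise ValueError("the alignment is empty")
    matches = sum(1 for a, b in zip(line1, line2) if a == b and a != gap)
    return matches / len(line1)
```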

solvation

Usage: solvation [conformation-file|conformation-file-list] [output-file]

Computes the solvation energy for protein conformations.

solvation computes the total solvation energy between the atoms in a protein molecule.

ss_score

Usage: ss_score ss-file|ss-file-list ss-file [output-file]

Compares the secondary structure between two conformations.

ss_score compares the secondary structure between two conformations, specified using the DSSP format. The first argument can be specified as a single file or as a list of files, and in the latter case the comparison occurs between all the files in the ss-file-list. The utility of this program is to compare the secondary structure of a conformation constructed using modelling techniques to a predicted secondary structure to ensure that the final conformation results in a conformation colinear with the predicted secondary structure. Normally this would work best in the case of predicting sheets since it requires more than just local geometry to be predicted accurately.

tdensity_score

Usage: tdensity_score [conformation-file-list] [output-file]

Does an all-against-all torsion RMSD calculation between a set of conformations and calculates a score based on the distances.

tdensity_score is similar to density_score except that phi/psi values are used instead of Cartesian coordinates.
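A torsion RMSD between two conformations can be sketched as below. How tdensity_score treats angle periodicity is not stated here; the sketch assumes angles in degrees and wraps each difference into the range 0 to 180 so that 350 and 10 are 20 degrees apart. Both function names are made up for the example.

```python
import math


def angular_difference(a, b):
    """Smallest absolute difference between two angles given in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def torsion_rmsd(torsions1, torsions2):
    """Root mean square of the wrapped differences between two lists of angles."""
    if len(torsions1) != len(torsions2) or not torsions1:
        raise ValueError("two non-empty lists of equal length are required")
    squares = [angular_difference(a, b) ** 2 for a, b in zip(torsions1, torsions2)]
    return math.sqrt(sum(squares) / len(squares))
```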

See also density_score.

vdw

Usage: vdw [conformation-file|conformation-file-list] [output-file]

Computes the VdW energy for protein conformations.

vdw computes the total Van der Waals energy between the atoms in a protein molecule.

potential_filter

Usage: potential_filter conformation-file|conformation-file-list|loop-data-file scoring-function-file [output-file]

Filters protein conformations using different scoring functions.

potential_filter filters protein conformations using a combination of many of the other scoring functions described above, including but not limited to electrostatics, vdw, hcf, potential_rapdf, and other terms such as volume, radius of gyration, and so on. The exact combination and the scoring functions to be used for filtering can be controlled via potential_filter.c and potential_filter.h.

4.2 Scoring function related programs

compile_raw_counts

Usage: compile_raw_counts [-aehmsw] [conformation-file[-pair]|conformation-file[-pair]-list] [raw-counts-file]

Compiles raw counts from a database of proteins.

compile_raw_counts takes a conformation file or a list of conformation files and compiles raw counts for all pairs of atoms (167x167) for 18 distance bins.

The flags control the behaviour of how the raw counts are compiled:

a - compiles raw counts for all pairs of atoms (default)
e - compiles raw counts between atoms involved in nonlocal strand-strand interactions
h - compiles raw counts between atoms involved in nonlocal helix-helix interactions
m - compiles raw counts between atoms involved in nonlocal helix-strand interactions
s - compiles raw counts between atoms involved in nonlocal secondary structure interactions
w - compiles raw counts for pairs of atoms within a residue

A DSSP-format file with the extension dssp must be present for the secondary structure interaction calculation.

If instead of a single conformation file, a conformation-file-pair specification is given (which is essentially two conformation file names separated by a "|" character; for example: filename1.pdb|filename2.pdb), then the program operates on only the interatomic contacts between the two files (all the other behaviour is identical). This is useful to compile raw counts for interface/interacting regions between different chains or domains.

The reference for this program is: Samudrala R, Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. Journal of Molecular Biology 275:893-914, 1998.

See also compile_scores, potential, remove_raw_counts.

compile_scores

Usage: compile_scores raw-counts-file [output-file]

Calculates conditional probability scores given a set of counts.

compile_scores takes a set of raw counts for all pairs of atoms (167x167) for 18 distance bins and generates the negative log conditional probability scores for the conformations.
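The conversion from raw counts to scores can be sketched as follows. This follows my reading of the formulation in the reference cited for this program: the probability of a distance bin given an atom pair, divided by the probability of that bin over all pairs, then the negative log. Treat it as an assumption rather than a description of the source code. How the program handles empty bins is not stated here, so the sketch stores None for them.

```python
import math


def negative_log_scores(counts):
    """counts maps an atom-pair key to a list of raw counts, one per distance bin."""
    num_bins = len(next(iter(counts.values())))
    # Counts per bin summed over all atom pairs, and the overall total.
    bin_totals = [sum(row[b] for row in counts.values()) for b in range(num_bins)]
    grand_total = sum(bin_totals)
    scores = {}
    for pair, row in counts.items():
        pair_total = sum(row)
        pair_scores = []
        for b in range(num_bins):
            if row[b] == 0:
                pair_scores.append(None)  # assumption: undefined for empty bins
                continue
            p_bin_given_pair = row[b] / pair_total
            p_bin = bin_totals[b] / grand_total
            pair_scores.append(-math.log(p_bin_given_pair / p_bin))
        scores[pair] = pair_scores
    return scores
```

A bin that is seen less often for a pair than for atom pairs in general gets a positive (unfavourable) score, and a bin seen more often gets a negative one.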

The reference for this program is: Samudrala R, Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. Journal of Molecular Biology 275:893-914, 1998.

See also compile_raw_counts, plot_scores, potential.

plot_scores

Usage: plot_scores groups-list|group-pair scoring-function-file [output-file]

Plots the conditional probability scores for a set of atom/group pairs.

plot_scores plots the negative log conditional probability scores for a set of atom/group pairs from the given scoring function file. This is useful for viewing the distribution in a graphical manner (the output can easily be piped into a program such as gnuplot).

A single pair of groups can be specified as an argument in the form ACA-WN (using the one-letter amino acid code to indicate an alanine alpha-carbon and a tryptophan nitrogen), or a file containing a list of such groups (separated by '-' or a space) can be given.

Features: Obviously a file containing a list of groups cannot have a '-' in its name.
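The group notation can be read mechanically: the first character is the one-letter amino acid code and the rest is the atom name. The sketch below splits a pair such as ACA-WN on that basis. It is an illustration only, and parse_group_pair is a made-up name rather than part of RAMP.

```python
import re


def parse_group_pair(spec):
    """Split 'ACA-WN' (or 'ACA WN') into [(residue, atom), (residue, atom)]."""
    parts = re.split(r"[- ]+", spec.strip())
    if len(parts) != 2 or any(len(part) < 2 for part in parts):
        raise ValueError("expected two groups separated by '-' or a space")
    # First character: one-letter amino acid code; remainder: atom name.
    return [(part[0], part[1:]) for part in parts]
```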

The reference for this program is: Samudrala R, Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. Journal of Molecular Biology 275:893-914, 1998.

See also compile_scores, potential.

remove_raw_counts

Usage: remove_raw_counts [-ahsw] conformation-file|conformation-file-list raw-counts-file

Removes raw counts in a database of proteins from a set of raw counts.

remove_raw_counts takes a conformation file or a list of conformation files and subtracts the raw counts for all pairs of atoms (167x167) for 18 distance bins in those conformations from a previously compiled set of raw counts. This is useful for jack-knifing.

The flags behave in the same manner as in compile_raw_counts.

The reference for this program is: Samudrala R, Moult J. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. Journal of Molecular Biology 275:893-914, 1998.

See also compile_raw_counts, compile_scores, potential.

4.3 Main chain generation related programs

mcgen_exhaustive

Usage: mcgen_exhaustive [-discrete-state-model] sequence-file scoring-function-file [output-file-prefix] [num-conformations]

Exhaustively enumerates a sequence using an n-state phi/psi model.

mcgen_exhaustive exhaustively enumerates a sequence using an n-state phi/psi model. The default n-state phi/psi model is the one in the file ramp/lib/phi_psi_allowed_angles.default. See ramp/lib/phi_psi_allowed_angles.* for a list of files containing the available n-state models. The string representing discrete-state-model is appended to ramp/lib/phi_psi_allowed_angles (with a "." between them) to produce the filename containing the corresponding model.
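The way the model filename is put together can be shown in a few lines. The sketch assumes the option is passed with a single leading "-", as in -syssearch.14, and that the default model name is "default"; model_path is a made-up helper, not part of RAMP.

```python
def model_path(option=None, base="ramp/lib/phi_psi_allowed_angles", default="default"):
    """Build the phi/psi model filename from a -discrete-state-model option."""
    if option is None:
        model = default
    elif option.startswith("-"):
        model = option[1:]  # drop the single leading '-' of the option
    else:
        model = option
    return base + "." + model
```

For example, model_path("-syssearch.14") gives ramp/lib/phi_psi_allowed_angles.syssearch.14.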

The sequence is specified using one-letter amino acid code (generally in a single line) in sequence-file.

The output-file-prefix, if specified, will be the prefix used for the output file, suffixed with .best.pdb. If num-conformations is specified, then many files of the form output-file-prefix.bestn.pdb (up to num-conformations) will be generated.

Features: There are various limitations on the number of residues/atoms that can be handled, which you will find out about as you use the program.

See also mcgen_exhaustive_loop, mcgen_exhaustive_loop_ss, mcgen_exhaustive_ss.

mcgen_exhaustive_loop

Usage: mcgen_exhaustive_loop [-discrete-state-model] loop-data-file scoring-function-file [output-file-prefix] [num-conformations]

Generates loop (missing) regions given a framework.

mcgen_exhaustive_loop generates loops for a given region (as specified in the loop-data-file) by exhaustively enumerating all possible main chain conformations for that loop using an n-state phi/psi model and selecting the best ones using the RAPDF scoring function.

The default n-state phi/psi model is the one in the file ramp/lib/phi_psi_allowed_angles.loop_search. See ramp/lib/phi_psi_allowed_angles.* for a list of files containing the available n-state models. The string representing discrete-state-model is suffixed to ramp/lib/phi_psi_allowed_angles (with a "." between them) to produce the filename containing the corresponding model. An example invocation would be:

mcgen_exhaustive_loop -syssearch.14 loop-data-file ramp/lib/scores

where ramp/lib/phi_psi_allowed_angles.syssearch.14 will represent the discrete state model, and the scoring will be specified by the file ramp/lib/scores.

The format of the loop-data-file is as follows:

205 212 RERDYRLD
1vfa.pdb
loops.pdb 10
constraint 205 212 5.50 1.0
constraint 212 213 3.80 1.0

In the above example, 205-212 is the residue range of the loop, RERDYRLD is the sequence, 1vfa.pdb is the framework upon which the loop will be built, loops.pdb is the file which will contain the 10 loop conformations (residues 205-212 only) that will be output, and the remaining lines are a set of CA loop constraints. The string "none 0" can be used to suppress the output of the loops file. The format of the constraints is similar to that used in constraints_filter, essentially representing CA distances and (+/-) tolerance for pairs of residues.
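A reader for this format might look like the sketch below. It is an illustration built only from the example above: the dictionary keys and both function names are made up, and the check in constraint_satisfied (the CA distance lies within the tolerance of the target) is my reading of "distance and (+/-) tolerance".

```python
def parse_loop_data(text):
    """Parse a loop-data-file laid out like the example into a dictionary."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    constraints = []
    for row in rows[3:]:
        if len(row) != 5 or row[0] != "constraint":
            raise ValueError("unexpected line: " + " ".join(row))
        # residue 1, residue 2, CA distance, tolerance
        constraints.append((int(row[1]), int(row[2]), float(row[3]), float(row[4])))
    return {
        "start": int(rows[0][0]),
        "end": int(rows[0][1]),
        "sequence": rows[0][2],
        "framework": rows[1][0],
        "loops_file": rows[2][0],
        "num_loops": int(rows[2][1]),
        "constraints": constraints,
    }


def constraint_satisfied(ca_distance, target, tolerance):
    """True when the measured CA distance lies within target +/- tolerance."""
    return abs(ca_distance - target) <= tolerance
```

In the example the sequence has eight letters, matching the eight residues from 205 to 212, which is a useful sanity check before starting a run.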

The output-file-prefix, if specified, will be the prefix used for the output file, suffixed with .best.pdb. If num-conformations is specified, then many files of the form output-file-prefix.bestn.pdb (up to num-conformations) containing the coordinates for all the residues will be generated. The maximum number of conformations output will be capped by MAX_MAINCHAINS in ramp/src/common/defines.h.

Features: since the loops are generated in torsion space, the psi value for the residue before the loop region begins must be known. But this can be known only if the CA atom is known for the first residue in the loop. Thus an extra residue must always be present (and therefore be built) on the framework to provide the psi angle for the first residue of the loop. If the loop is an N-terminal loop, then the phi angle for the last residue in the loop must be calculable from the coordinates, since the chain will be built in the reverse direction.

Constraints must be chosen wisely. The CA distance between the end of the loop and the next residue in the framework must also be used to tether the loop to the other end.

See also constraints_filter, mcgen_exhaustive_loop_ss, mcgen_semfold_loop, mcgen_semfold_loop_ss.

mcgen_exhaustive_loop_ss

Usage: mcgen_exhaustive_loop_ss [-discrete-state-model] loop-data-file scoring-function-file ss-file [output-file-prefix] [num-conformations]

Generates loop (missing) regions given a framework and a secondary structure specification.

mcgen_exhaustive_loop_ss generates loops for a given region (as specified in the loop-data-file) by exhaustively enumerating all possible main chain conformations for that loop using an n-state phi/psi model and selecting the best ones using the RAPDF scoring function, taking into account the secondary structure specified in ss-file (DSSP format). In all other ways, its behaviour is identical to mcgen_exhaustive_loop.

See also constraints_filter, mcgen_exhaustive_loop, mcgen_semfold_loop, mcgen_semfold_loop_ss.

mcgen_exhaustive_ss

Usage: mcgen_exhaustive_ss [-discrete-state-model] sequence-file scoring-function-file ss-file [output-file-prefix] [num-conformations]

Exhaustively enumerates a sequence using an n-state phi/psi model keeping secondary structure fixed.

mcgen_exhaustive_ss exhaustively enumerates a sequence using an n-state phi/psi model while assigning idealised phi/psi values to helix and sheet residues (see ramp/src/common/defines.h). In all other ways, its behaviour is similar to mcgen_exhaustive.

See also mcgen_exhaustive, mcgen_exhaustive_loop, mcgen_ficsa_ss, mcgen_semfold_ss, phd_to_dssp.

mcgen_ficsa_ss

Usage: mcgen_ficsa_ss sequence-file scoring-function-file ss-file ntuplet-database-file [output-file] [random-seed]

Apply the fragment insertion conformational space annealing (FICSA) technique for predicting protein structure between secondary structure units.

mcgen_ficsa_ss applies the FICSA method for predicting protein structure, inserting fragments only for residues between secondary structure units such as helices or sheets or both (depending on how it is compiled). The secondary structure units are assigned default phi/psi angles (see ramp/src/common/defines.h).

The FICSA method is based on randomly inserting small (usually three-residue) fragments, either from a database or generated using discrete phi/psi models, and using a conformational space annealing procedure to find combinations of these fragments that have the lowest score. The ntuplet-database-file is generated using a combination of mcgen_ntuplets and prune_ntuplet_db.

See also mcgen_exhaustive_ss, mcgen_semfold_ss, mcgen_ntuplets, prune_ntuplet_db.

mcgen_fit

Usage: mcgen_fit [-discrete-state-model] conformation-file [output-file-prefix] [num-conformations]

Fits an n-state phi/psi model to a given conformation.

mcgen_fit fits an n-state phi/psi model to a given conformation, minimising the CA RMSD between the model and the conformation.

The default n-state phi/psi model is the one in the file ramp/lib/phi_psi_allowed_angles.default. See ramp/lib/phi_psi_allowed_angles.* for a list of files containing the available n-state models. The string representing discrete-state-model is suffixed to ramp/lib/phi_psi_allowed_angles (with a "." between them) to produce the filename containing the corresponding model.

The output-file-prefix, if specified, will be the prefix used for the output file, suffixed with .best.pdb. If num-conformations is specified, then many files of the form output-file-prefix.bestn.pdb (up to num-conformations) will be generated.

Features: There are various limitations on the number of residues/atoms that can be handled, which you will find out about as you use the program.

See also mcgen_fit_ss, phd_to_dssp.

mcgen_fit_ss

Usage: mcgen_fit_ss [-discrete-state-model] conformation-file ss-file [output-file-prefix] [num-conformations]

Fits secondary structures to main chains using an n-state phi/psi model.

mcgen_fit_ss fits to a given conformation using an n-state phi/psi model minimising the CA RMSD but using idealised secondary structure values for helix and sheet residues.

The default n-state phi/psi model is the one in the file ramp/lib/phi_psi_allowed_angles.default. See ramp/lib/phi_psi_allowed_angles.* for a list of files containing the available n-state models. The string representing discrete-state-model is suffixed to ramp/lib/phi_psi_allowed_angles (with a "." between them) to produce the filename containing the corresponding model.

The first thing you need is a ss-file (DSSP format) containing the secondary structure. This can be created by running dssp on the conformation-file (PDB format) and then modifying the secondary structure assignment to your liking. Alternately, if you have a PHD prediction, then you can save the prediction to a file, and run phd_to_dssp on it.

Once you have created the ss-file, and have the conformation-file to which you want to fit the secondary structure, just type in the command with the conformation-file and the ss-file as the arguments.

The output-file-prefix, if specified, will be the prefix used for the output file, suffixed with .best.pdb. If num-conformations is specified, then many files of the form output-file-prefix.bestn.pdb (up to num-conformations) will be generated.

Features: There are various limitations on the number of residues/atoms that can be handled, which you will find out about as you use the program.

See also mcgen_fit, phd_to_dssp.

mcgen_idealise

Usage: mcgen_idealise conformation-file num-iterations|rmsd-limit [output-file]

Converts an experimental structure to have ideal geometry.

mcgen_idealise represents an experimental structure using ideal bond lengths and bond angles by randomly perturbing the torsion angles and using a steepest descent minimisation to lower the RMSD (as specified by rmsd-limit or DEFAULT_RMSD_LIMIT in ramp/src/mcgen/mcgen_idealise.c) for a given number of iterations (as specified by num-iterations or DEFAULT_NUM_ITERATIONS in ramp/src/mcgen/mcgen_idealise.c).

Features: Since mcgen_idealise uses steepest descent minimisation, the optimisation process is not very efficient.

mcgen_list

Usage: mcgen_list [torsion-angle-list] [output-file]

Generates a conformation using specified torsion values.

mcgen_list takes as input a list of torsion values, specified using the format output by list_torsions:

C -    1 -  psi  is 230.022
C -    1 -  chi1 is 299.959
R -    2 -  phi  is 295.965
R -    2 -  psi  is 144.819
R -    2 -  chi1 is 311.778
R -    2 -  chi2 is 172.907
R -    2 -  chi3 is 187.699
R -    2 -  chi4 is 100.294

and generates a new conformation. Since experimental structures don't use idealised bond lengths and bond angles, reproducing the experimental structure using only a list of torsion values is not possible.
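For scripts that need to read or write such lists, one line of this format can be parsed as sketched below. The field layout is inferred from the example lines only, and parse_torsion_line is a made-up name.

```python
def parse_torsion_line(line):
    """Parse 'R -    2 -  phi  is 295.965' into (residue, number, torsion, value)."""
    fields = line.split()
    if (len(fields) != 7 or fields[1] != "-" or fields[3] != "-"
            or fields[5] != "is"):
        raise ValueError("unrecognised torsion line: " + line)
    # residue code, residue number, torsion name, angle value
    return fields[0], int(fields[2]), fields[4], float(fields[6])
```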

See also mcgen_list_ss.

mcgen_list_ss

Usage: mcgen_list_ss conformation-file ss-file [output-file]

Generates a conformation using the phi/psi values from a secondary structure (DSSP) file.

mcgen_list_ss takes as input a conformation (PDB) file and a secondary structure (DSSP) file and generates a new conformation using the torsion angles specified in the DSSP file. Idealised values are used for helix (H) and sheet (E) residues, and the secondary structure specification field in the DSSP file (using F, for "fixed") can also be used to ask the program to use the existing phi/psi values for a given residue in conformation-file.

Features: A PDB file is used to specify the sequence (this is bad design in some ways, but a program like mcgen_list can be used to create the PDB file, and this enables the use of phi/psi values already present in the conformation).

See also mcgen_list.

mcgen_ntuplets

Usage: mcgen_ntuplets [conformation-file-list] [n-tuple-size] [output-file]

Generate a database of phi/psi angles for fragments.

mcgen_ntuplets takes as input a set of conformation files and creates a database of fragments (with size specified by n-tuple-size) and the list of phi/psi angles that are found in the fragments. For each conformation in conformation-file-list, the fragments are generated from the N-terminus until the end of the chain is reached. Fragments that do not possess chain connectivity are discarded.
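The fragment generation can be pictured as a sliding window along the chain. The sketch below assumes the window advances one residue at a time, which is not stated above, and represents chain breaks as a set of indices i meaning that residues i and i+1 are not connected. The function name and the data layout are made up for the example.

```python
def fragments(residues, size, breaks=frozenset()):
    """Return all connected windows of the given size, starting at the N-terminus.

    residues: per-residue records in chain order (for example (code, phi, psi)).
    breaks: indices i where residue i is not connected to residue i + 1.
    """
    result = []
    for start in range(len(residues) - size + 1):
        links = range(start, start + size - 1)
        if any(i in breaks for i in links):
            continue  # discard fragments without chain connectivity
        result.append(residues[start:start + size])
    return result
```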

See also prune_ntuplet_db.

mcgen_semfold_fit_ss

Usage: mcgen_semfold_fit_ss sequence-file scoring-function-file ss-file ntuplet-database-file [output-file] [num-conformations] [num-iterations] [random-seed]

Apply the segment matching and folding (SEMFOLD) technique to generate a fold that fits a particular structure.

mcgen_semfold_fit_ss is identical to mcgen_semfold_ss, except that it tries to mimic a known experimental conformation by minimising the RMSDs of the decoys generated to the known structure (which is specified as a file in PDB format called exp.pdb in the directory where this program is called).

This program is experimental.

See also mcgen_semfold_ss.

mcgen_semfold_loop

Usage: mcgen_semfold_loop loop-data-file scoring-function-file ntuplet-database-file [output-file-prefix] [num-conformations] [num-iterations] [random-seed]

Apply the segment matching and folding (SEMFOLD) technique for predicting protein structure in a specified loop region.

mcgen_semfold_loop applies the SEMFOLD method for predicting protein structure, inserting fragments only for residues in a specified loop region.

The SEMFOLD method is based on randomly inserting small (usually three-residue) fragments and using a Monte Carlo/simulated annealing procedure to find combinations of these fragments that have the lowest score. The ntuplet-database-file is generated using a combination of mcgen_ntuplets and prune_ntuplet_db. See mcgen_exhaustive_loop for the format of the loop-data-file and more information on features and usage.

See also constraints_filter, mcgen_exhaustive_loop, mcgen_exhaustive_loop_ss, mcgen_semfold_loop_ss.

mcgen_semfold_loop_ss

Usage: mcgen_semfold_loop_ss loop-data-file scoring-function-file ss-file ntuplet-database-file [output-file-prefix] [num-conformations] [num-iterations] [random-seed]

Apply the segment matching and folding (SEMFOLD) technique for predicting protein structure in a specified loop region taking secondary structures into account.

mcgen_semfold_loop_ss applies the SEMFOLD method for predicting protein structure, inserting fragments only for residues in a specified loop region, taking secondary structures into account as specified by the secondary structure (DSSP) file. In all other respects, its behaviour is identical to mcgen_semfold_loop.

See also constraints_filter, mcgen_exhaustive_loop, mcgen_exhaustive_loop_ss, mcgen_semfold_loop.

mcgen_semfold_region_ss

Usage: mcgen_semfold_region_ss sequence-file scoring-function-file ss-file ntuplet-database-file region [output-file] [num-conformations] [num-iterations] [random-seed]

Apply the segment matching and folding (SEMFOLD) technique for predicting protein structure between secondary structure units for a particular region.

mcgen_semfold_region_ss is identical to mcgen_semfold_ss, except that only a particular region is allowed to move (the rest of the protein is held fixed according to the torsion values specified in ss-file).

region is specified by two residue numbers separated by -, i.e., 22-40.

See also mcgen_semfold_loop_ss, mcgen_semfold_ss.

mcgen_semfold_ss

Usage: mcgen_semfold_ss sequence-file scoring-function-file ss-file ntuplet-database-file [output-file] [num-conformations] [num-iterations] [random-seed]

Apply the segment matching and folding (SEMFOLD) technique for predicting protein structure between secondary structure units.

mcgen_semfold_ss applies the SEMFOLD method for predicting protein structure, inserting fragments only for residues between secondary structure units such as helices or sheets or both (depending on how it is compiled). The secondary structure units are assigned default phi/psi angles (see ramp/src/common/defines.h).

The SEMFOLD method is based on randomly inserting small (usually three-residue) fragments, either from a database or generated using discrete phi/psi models, and using a Monte Carlo/simulated annealing procedure to find combinations of these fragments that have the lowest score (on a cluster of machines, with the appropriate shell scripts, a genetic algorithm can be performed between trajectories to further optimise the scores). The ntuplet-database-file is generated using a combination of mcgen_ntuplets and prune_ntuplet_db in cases where fragments are obtained from a database; the file can also be generated by exhaustively enumerating the discrete state phi/psi models included with the distribution.
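For readers unfamiliar with Monte Carlo/simulated annealing, the usual acceptance step is the Metropolis criterion, sketched below. Whether SEMFOLD uses exactly this rule, and how it schedules the temperature, is not stated here, so treat the function as a generic illustration with made-up names.

```python
import math
import random


def accept_move(old_score, new_score, temperature, rng=random.random):
    """Metropolis criterion for a fragment insertion (lower scores are better)."""
    if new_score <= old_score:
        return True  # improvements are always accepted
    # Worse moves are accepted with a probability that shrinks as the
    # score increase grows or as the temperature is lowered.
    return rng() < math.exp(-(new_score - old_score) / temperature)
```

In an annealing run the temperature would be lowered gradually, so that uphill moves become rarer as the search settles on low-scoring combinations of fragments.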

The reference for these programs is: Samudrala R, Levitt M. A comprehensive analysis of 40 blind protein structure predictions. BMC Structural Biology 2: 3-18 (2002).

See also mcgen_exhaustive_loop, mcgen_semfold_loop, mcgen_ntuplets, prune_ntuplet_db.

prune_ntuplet_db

Usage: prune_ntuplet_db sequence-file ntuplet-database-file ntuplet-size [output-file]

Filters a database containing phi/psi angles, keeping only identical fragments.

prune_ntuplet_db takes as input an ntuplet database of phi/psi angles, as created by mcgen_ntuplets, and filters it so that only tuplets identical to those present in the sequence of interest are output. This reduces the database size considerably, saves memory, and enables storage of large fragments for a given sequence. This program is ideal for fragments of length <= 3 where exact matches are available.
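The exact-match filtering can be sketched in a few lines of Python. The (residues, angles) pair layout used here is hypothetical and is not the on-disk format written by mcgen_ntuplets; the function name is also made up for the example.

```python
def prune_ntuplets(sequence, database, size):
    """Keep only database entries whose residue string occurs in `sequence`.

    database: iterable of (residues, angles) pairs. This in-memory layout is
    an assumption, not the real ntuplet database format.
    """
    # Every window of length `size` in the sequence of interest.
    wanted = {sequence[i:i + size] for i in range(len(sequence) - size + 1)}
    return [(res, ang) for res, ang in database
            if len(res) == size and res in wanted]
```

Using a set of sequence windows makes each lookup constant time, so the cost is dominated by a single pass over the database.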

Additionally, by default, terminal ntuplets (for which the phi and psi angles are not defined, respectively) are discarded. This latter behaviour can be modified by commenting out the ELIMINATE_TERMINAL_TUPLETS variable in ramp/src/mcgen/prune_ntuplet_db.c.

See also mcgen_ntuplets, prune_ntuplet_db_ss.

prune_ntuplet_db_ss

Usage: prune_ntuplet_db_ss sequence-file scoring-function-file ss-file ntuplet-database-file ntuplet-size [output-file]

Filters a database containing phi/psi angles, keeping only similar fragments.

prune_ntuplet_db_ss takes as input an ntuplet database of phi/psi angles, as created by mcgen_ntuplets, and filters it so that only tuplets similar to those present in the sequence of interest are output. This reduces the database size considerably, saves memory, and enables storage of large fragments for a given sequence.

"Similarity" of sequences is determined by the secondary structure compatibility and by scoring-function-file, which is expected to be a matrix in the same format as the one used for BLOSUM62. This program is ideal for fragments of length >= 4 where exact matches are unlikely.

Additionally, by default, terminal ntuplets (for which the phi and psi angles are not defined, respectively) are discarded. This latter behaviour can be modified by commenting out the ELIMINATE_TERMINAL_TUPLETS variable in ramp/src/mcgen/prune_ntuplet_db.c.

See also mcgen_ntuplets, prune_ntuplet_db.

4.4 Side chain generation related programs

scgen, scgen_single, scgen_double

Usage: scgen conformation-file scoring-function-file [output-file] [num-sidechains]

Generates low scoring side chain conformations.

scgen (which is just a link to scgen_single or scgen_double) takes an existing conformation (PDB), a scoring function, and generates the top scoring side chains (the number is determined by num-sidechains) as ranked by the scoring function.

The reference for these programs is: Samudrala R, Moult J. Determinants of side chain conformational preferences in protein structures. Protein Engineering 11: 991-997, 1998.

Features: scgen will not generate any new coordinates; it will only explore (twirl) existing side chains. Use scgen_build to generate a set of built side chains if no side chains are present.

scgen_build

Usage: scgen_build conformation-file chi-angle-list [output-file]

Builds side chains given a list of chi angles.

scgen_build takes a list of chi angles, in the format generated by list_torsions:

C -    1 -  chi1 is 299.959
R -    2 -  chi1 is 311.778
R -    2 -  chi2 is 172.907
R -    2 -  chi3 is 187.699
R -    2 -  chi4 is 100.294

and builds the side chains in a given conformation to match the values in the list (for those values that are specified). The side chains are built from scratch using standard geometry, i.e., the side chain coordinates in the conformation file are discarded. This has the advantage that side chains that don't exist can be built with this program, but if the side chains do exist, then it is better to transform (twirl) the existing coordinates using scgen_list.

See also scgen_build_missing, scgen_library, scgen_list.

scgen_build_missing

Usage: scgen_build_missing [conformation-file] [output-file]

Builds side chains for those residues with missing side chain atoms.

scgen_build_missing builds side chains for those residues with missing side chain atoms (using a standard library defined by CHI_ALLOWED_ANGLES_FILE). The side chains are built from scratch using standard geometry, i.e., the side chain coordinates in the conformation file are discarded. This has the advantage that side chains that don't exist can be built with this program, but if the side chains do exist, then it is better to transform (twirl) the existing coordinates using scgen_list.

The reference for this program is: Samudrala R, Huang ES, Koehl P, Levitt M. Constructing side chains on near-native main chains for ab initio protein structure prediction. Protein Engineering, 7: 453-457, 2000.

See also scgen_build, scgen_library, scgen_list.

scgen_library

Usage: scgen_library conformation-file [output-file]

Generates side chains with the rotamer library values closest to the experimental structure values.

scgen_library takes a conformation file with all the atoms and twirls the side chains so the rotamers represent the closest library values to the corresponding experimental structure values.

The reference for this program is: Samudrala R, Moult J. Determinants of side chain conformational preferences in protein structures. Protein Engineering 11: 991-997, 1998.

See also scgen_build, scgen_list.

scgen_list

Usage: scgen_list conformation-file chi-angle-list [output-file]

Generates side chains given a list of chi angles.

scgen_list takes a list of chi angles, in the format generated by list_torsions:

C -    1 -  chi1 is 299.959
R -    2 -  chi1 is 311.778
R -    2 -  chi2 is 172.907
R -    2 -  chi3 is 187.699
R -    2 -  chi4 is 100.294

and twirls the side chains in a given conformation to match the values in the list.

The reference for this program is: Samudrala R, Moult J. Determinants of side chain conformational preferences in protein structures. Protein Engineering 11: 991-997, 1998.

See also scgen_build, scgen_library.

scgen_mutate

Usage: scgen_mutate conformation-file alignment [output-file]

Mutates (changes) amino acids in a parent/template structure based on an alignment.

scgen_mutate takes an alignment between a parent/template sequence and a target sequence, the template conformation file, and creates a minimum perturbation model of the target conformation using the information in the template structure.

The alignment is specified in a file containing two lines in one-letter amino acid code, the first being the template sequence and the second being the target. Insertions and deletions are specified by dashes (and are not constructed). The coordinates for the main chain are copied over as is, as are those of residues that are identical between the template and target sequences. For nonidentical residues, a chi-angle equivalence matrix is used to specify which chi angle values from the parent structure are used to set the values in the target structure. Chi angle values that are not taken (or available) from the parent structure are obtained from a library containing a list of the most frequently observed chi angles, as used by Randy Read's program MUTATE, which performs an identical task. For example, if a V is mutated to a K, then the chi 1 of V is used as the chi 1 for K, and chi 2, chi 3, and chi 4 for K are set from a library of standard values.

The reference for this program is: Samudrala R, Huang ES, Koehl P, Levitt M. Constructing side chains on near-native main chains for ab initio protein structure prediction. Protein Engineering, 7: 453-457, 2000.

This program is useful for producing an initial model for comparative/homology modelling.

4.5 Graph theory programs

cf, cf_single

Usage: cf possibility-file-list scoring-function-file [output-file-prefix] [num-conformations]

Selects the best scoring conformation mixing and matching between a set of possible conformations.

cf selects the best scoring conformation by mixing and matching between a set of possible conformations (specified as nodes in a graph). The PDB file format is used to specify a possibility-file. Each possibility-file can contain one main chain conformation for all the residues in the file. Multiple side chain conformations for a given residue can also be specified in the file (the last atom record should have the largest residue number). The possibility-file-list is a list of different possibility files, where each file represents one main chain conformation for a range of residues and one or more side chain conformations per residue. cf will find the optimal arrangement of the different possibilities (as evaluated by the all-atom distance-dependent pairwise scoring function, potential_rapdf) by representing each possibility as a node in a graph and each interaction between two possible conformations (nodes) as a weighted edge, and then finding the maximal completely connected subgraph (clique) with the lowest total score.

The format of a possibility-file-list is as follows:

file1.pdb
file2.pdb
   .
   .
   .
fileN.pdb

Each PDB file (up to MAX_MAINCHAINS) defines a single main chain. Each main chain position can have many different side chains (stored in alt_atoms in the residue data structure). To make nodes, the program obtains the score of every conformation with respect to the local main chain region. Since each file contains a single main chain, this is already given. However, the score of a node is ignored in the final summation for a clique score to avoid double counting. A zero-weighted edge means no edge, and any nonzero value indicates the strength of the interaction between the two possible residue conformations represented by the node pair. Thus weighted edges handle interconnectedness and nodes are measures of local goodness. In the abstract, a node is specified by a residue number, a possible conformation of a side chain, and a possible conformation of a main chain. An edge is specified by a pair of nodes.
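To make the node/edge picture concrete, here is a small Python sketch that picks one candidate per residue position so that the summed edge weights are lowest. It is a brute-force enumeration, not the clique-finding algorithm used by cf, and it marks a missing edge with None rather than a zero weight; the function name and data layout are assumptions for illustration only.

```python
from itertools import combinations, product

def best_combination(options, edge_score):
    """Exhaustively pick one option per position with the lowest summed edge score.

    options: list (one entry per residue position) of lists of candidate labels.
    edge_score: callable (pos_i, opt_i, pos_j, opt_j) -> float, or None when
                the two candidates are incompatible (no edge between the nodes).
    """
    best, best_total = None, float("inf")
    for choice in product(*options):
        total = 0.0
        complete = True
        for (i, a), (j, b) in combinations(enumerate(choice), 2):
            weight = edge_score(i, a, j, b)
            if weight is None:
                # A missing edge means this choice is not completely connected.
                complete = False
                break
            total += weight
        if complete and total < best_total:
            best, best_total = choice, total
    return best, best_total
```

The enumeration grows exponentially with the number of positions, which is why a real implementation needs a proper clique search with pruning.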

For the purposes of evaluating covalent consistency, a set of crossover regions is required. These regions are specified as residue ranges (pairs of residue numbers separated by whitespace) in the file possibility-file-list.crossover_regions. For each pair of crossover regions a and b, if residue a and residue b for nodes i and j are within a crossover region, then the main chain for node i must be the same as the main chain for node j.

Using a crossover region of "1 N" where N is the size of the protein means that no crossovers will be allowed. Using a crossover region of "0 0" will allow all positions to crossover.

Possible side chain conformations can be generated using scgen. Main chain conformations for short regions can be generated using mcgen_exhaustive_loop or mcgen_semfold_loop or any other loop generation program. The default single residue per node representation does not exclude the idea of twirling sidechains in a pairwise manner using scgen_double.

The lowest scoring cliques are stored in a data structure in a queue fashion. For each list of cliques, the highest scoring clique found thus far is stored, along with its index. When a clique with a lower score is found, it replaces the existing highest scoring clique, and the highest scoring clique's index is recalculated.
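One simple way to keep a fixed number of lowest-scoring items while tracking the worst retained entry is sketched below in Python. This is only an illustration of the bookkeeping; the function name keep_lowest and the list-of-tuples layout are assumptions, not the data structure used in the source.

```python
def keep_lowest(best, capacity, score, item):
    """Insert (score, item) into `best`, a list holding at most `capacity` entries.

    When the list is full, the entry with the highest (worst) score is replaced
    if the new score is lower. Returns True when the item was stored.
    """
    if len(best) < capacity:
        best.append((score, item))
        return True
    # Index of the worst retained entry; recomputed on every call for simplicity.
    worst = max(range(len(best)), key=lambda k: best[k][0])
    if score < best[worst][0]:
        best[worst] = (score, item)
        return True
    return False
```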

To speed up evaluation, only cliques matching a certain size are evaluated. Editing the environment of the protein to remove extraneous bits also results in a significant speedup.

Features: Since the graph uses one byte of storage space for its edges, the score of an edge is rounded so it can fit in a single byte. As a result, a small amount of precision is lost.

The reference for this program is: Samudrala R, Moult J. A graph-theoretic algorithm for comparative modelling of protein structure. Journal of Molecular Biology 279:287-302 (1998).

See also scgen, mcgen_exhaustive_loop, mcgen_semfold_loop.

4.6 Miscellaneous programs

1line_to_fasta

Usage: 1line_to_fasta [input-file] [identifier] [output-file]

Converts a 1line file to a FASTA file.

1line_to_fasta takes a sequence in a 1line formatted file (the sequence on a single line, containing only the single-letter amino acid characters) and converts it into a FASTA formatted file.

See also fasta_to_1line.

add_to_resno

Usage: add_to_resno conformation-file value [output-file]

Changes the residue numbering in a conformation (PDB) file.

add_to_resno changes the residue numbering in a conformation (PDB) file by adding the number specified by value to the existing residue numbers. This is useful if you're working with certain programs that require residue numbers to start from 1, and you need to go back and forth between different numberings.
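The renumbering amounts to editing one fixed-width field. The Python sketch below shifts the residue sequence number, which the PDB format stores in columns 23-26 of ATOM and HETATM records. Whether the real program also touches HETATM or other record types is an assumption here.

```python
def add_to_resno(lines, value):
    """Shift the residue number (PDB columns 23-26) of ATOM/HETATM records by `value`."""
    out = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            # Columns 23-26 (1-based) are characters 22..25 in 0-based slicing.
            resno = int(line[22:26]) + value
            line = line[:22] + f"{resno:4d}" + line[26:]
        out.append(line)
    return out
```

A negative value shifts the numbering down, which is how you would get back to the original numbering after working with a file renumbered from 1.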

See also clean_pdb, insert_chain_id.

average_xyz

Usage: average_xyz conformation-file-list [output-file]

Averages the XYZ coordinates in a set of conformation (PDB) files.

average_xyz averages the XYZ Cartesian coordinates in a set of conformation (PDB) files specified by conformation-file-list. Averaging the Cartesian coordinates of a set of conformations around a native structure in RMSD space has been known to result in a conformation that is closer to the native structure than any of the individual conformations. However, since geometry information is not taken into account during the averaging process, a minimisation step may be required to make the final conformation look decent.
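The averaging itself is a per-atom mean, as in this minimal Python sketch. It assumes the conformations list the same atoms in the same order and are already in a common frame of reference; the source does not say whether average_xyz superimposes the structures first.

```python
def average_xyz(conformations):
    """Average matching atoms across conformations.

    conformations: list of conformations, each a list of (x, y, z) tuples
    given in the same atom order.
    """
    n = len(conformations)
    return [
        tuple(sum(coords[k] for coords in atoms) / n for k in range(3))
        for atoms in zip(*conformations)  # groups atom i from every conformation
    ]
```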

ca_only, mc_only, sc_only

Usage: ca_only [conformation-file] [output-file]

Extracts the appropriate atoms from a conformation file.

ca_only extracts the CA atoms from a conformation (PDB) file. mc_only extracts the main chain atoms from a conformation (PDB) file. sc_only extracts the side chain atoms from a conformation (PDB) file.

clean_pdb, renumber_pdb, striphyd

Usage: clean_pdb [conformation-file] [output-file]

Cleans up a conformation (PDB) file.

clean_pdb removes extraneous information from a conformation (PDB) file and cleans it up so it is usable by many of the programs in the RAMP suite. This includes stripping the conformation of any hydrogen atoms (the only action performed when invoked as striphyd), removing alternate atom conformations, and renumbering the residues (the only action performed when invoked as renumber_pdb).

Features: Since PDB files can be so varied in how information is recorded, this program might not always work. Also, you should probably use extract_pdb_chain to extract the chain of interest before applying this program.

See also add_to_resno, extract_pdb_chain.

compare_structures

Usage: compare_structures [-aclmors] conformation-file conformation-file|conformation-file-list [output-file]

Compares two structures with different sequences.

compare_structures is similar to fit except that it can compare two structures that are not identical in sequence. The options are also similar (though they're not all guaranteed to work), except for the -s option which outputs the structure based sequence alignment (which is ignored if conformation-file-list is specified).

This program is experimental. It is based on the ALIGN program of Gerson Cohen from the NIH.

See also fit.

compare_torsions

Usage: compare_torsions torsions-list torsions-list [angle] [output-file]

Compares two sets of torsions.

compare_torsions takes two files containing lists of torsions, as generated by list_torsions, and outputs the percentage of torsions that match within a particular cutoff (the default is 30 degrees, set by MIN_TORSION_CUTOFF in ramp/src/misc/compare_torsions.c).
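A minimal Python sketch of such a comparison follows. The dictionary layout, the function name, and the wrap-around handling of angles (so that 350 and 10 degrees differ by 20) are assumptions; the source does not state how compare_torsions treats periodicity.

```python
def percent_matching(torsions_a, torsions_b, cutoff=30.0):
    """Percentage of shared torsions whose angular difference is within `cutoff` degrees.

    torsions_a, torsions_b: dicts keyed by (residue number, torsion name)
    with angles in degrees as values.
    """
    shared = torsions_a.keys() & torsions_b.keys()
    if not shared:
        return 0.0

    def diff(x, y):
        d = abs(x - y) % 360.0
        return min(d, 360.0 - d)  # shortest way around the circle

    hits = sum(1 for k in shared if diff(torsions_a[k], torsions_b[k]) <= cutoff)
    return 100.0 * hits / len(shared)
```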

See also list_torsions.

consensus_distances

Usage: consensus_distances [conformation-file-list] [output-file]

Generates the consensus distances from a set of conformations.

consensus_distances takes a set of conformations and calculates the most frequently observed, i.e., "consensus", distances between atoms within that set.

Features: Currently only CA distances are calculated.

constraints_filter

Usage: constraints_filter conformation-file-list constraints-file [output-file]

Filters a set of structures using distance constraints.

constraints_filter takes a set of constraints specified in constraints-file and filters a set of conformations specified in conformation-file-list, outputting only those conformation file names that satisfy all the specified constraints.

The format of the constraints-file is as follows:

constraint 10 22 6.0 1.0
constraint 58 73 6.0 1.0
           .
           .
           .
constraint 98 143 6.0 1.0

where each line starts with the string constraint followed by the residues involved in the constraint, the distance between them, and a (+/-) tolerance value for that distance. A negative tolerance value indicates that the distance is used as an absolute upper bound (i.e., all atom pairs closer than the distance specified will be considered to satisfy the constraint).
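The format and the two kinds of tolerance can be sketched in Python as follows. The function names and the dictionary of CA coordinates are assumptions for illustration; lines that do not match the five-field pattern are simply skipped.

```python
import math

def parse_constraints(text):
    """Parse lines of the form 'constraint RES1 RES2 DISTANCE TOLERANCE'."""
    constraints = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[0] == "constraint":
            constraints.append(
                (int(fields[1]), int(fields[2]), float(fields[3]), float(fields[4]))
            )
    return constraints

def satisfies(ca_coords, constraints):
    """Return True if every constraint holds.

    ca_coords: dict mapping residue number -> (x, y, z) of the CA atom.
    """
    for res_a, res_b, dist, tol in constraints:
        d = math.dist(ca_coords[res_a], ca_coords[res_b])
        if tol < 0:
            # Negative tolerance: the distance is an upper bound.
            if d >= dist:
                return False
        elif abs(d - dist) > tol:
            return False
    return True
```

A filter over a list of conformations would call satisfies once per structure and print the file names for which it returns True.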

Features: Currently only CA constraints are used.

See also potential_filter.

dssp_consensus

Usage: dssp_consensus [-n] [ss-file-list] [output-file]

Generates the consensus of a set of secondary structure assignments.

dssp_consensus takes a list of secondary structure assignments in DSSP format and generates the consensus. It is useful for combining different secondary structure prediction methods. The probabilities output are based on the frequency of occurrence of each type of secondary structure.

dssp_to_ss

Usage: dssp_to_ss [-n] [ss-file] [output-file]

Reformats secondary structure assignments.

dssp_to_ss takes the secondary structure assignments in DSSP format and converts them to a single line. It is useful for comparison with various secondary structure assignments. The -n option will output 1 for helix, 2 for sheet, and 3 for other.

extract_nmr_model

Usage: extract_nmr_model conformation-file model-number [output-file]

Extracts a particular model from an NMR conformation (PDB) file.

extract_nmr_model extracts a model from a specified NMR conformation (PDB) file. The model numbers are integers, usually from 1-20.

See also extract_pdb, extract_pdb_chain.

extract_pdb

Usage: extract_pdb conformation-file [extract-regions-file] [output-file]

Extracts regions from a conformation (PDB) file.

extract_pdb extracts regions specified via standard input or extract-regions-file from a specified conformation (PDB) file. The regions must be specified as pairs of numbers which represent residue ranges (inclusive).

See also extract_nmr_model, extract_pdb_chain.

extract_pdb_chain

Usage: extract_pdb_chain conformation-file chain [output-file]

Extracts chains from a conformation (PDB) file.

extract_pdb_chain extracts chains from a specified conformation (PDB) file. The chain is specified in the form of a one character string, e.g., "A" or "B".

See also extract_nmr_model, extract_pdb.

fasta_to_1line

Usage: fasta_to_1line [input-file] [output-file]

Converts a FASTA file into a 1line file.

fasta_to_1line takes a standard FASTA file and converts it such that each line contains a single sequence (i.e., the delimiter between sequences is a newline as opposed to a line beginning with >).
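The conversion can be sketched in Python as below. This sketch drops the header lines entirely and joins wrapped sequence lines; whether the real program keeps the identifiers anywhere is not stated in the source, so treat that as an assumption.

```python
def fasta_to_1line(text):
    """Return one sequence per line, dropping FASTA header lines."""
    sequences, current = [], None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            # A header closes the previous record and opens a new one.
            if current is not None:
                sequences.append("".join(current))
            current = []
        elif line:
            if current is None:
                current = []  # sequence data before any header
            current.append(line)
    if current is not None:
        sequences.append("".join(current))
    return "\n".join(sequences)
```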

See also 1line_to_fasta.

fit

Usage: fit [-aclmor] conformation-file conformation-file|conformation-file-list [output-file]

Superimposes pairs of conformations.

fit superimposes pairs of conformations. If the second argument is a conformation file (determined by the extension), then fit will superpose the second conformation onto the first one and output the superposed conformation (the RMSD will be output to stderr). If the second argument is a list (determined by the extension), then fit will superpose all the conformations specified in the list onto the first one and output the RMSDs. The following options are applicable:

-a all atom fit/RMSD
-c CA atom fit/RMSD
-m main chain atom fit/RMSD
-l arguments are only conformation-files
-o overwrite with rotated conformation
-r output only the RMSD after fitting

See also rmsd.

get_distance

Usage: get_distance conformation-file [atom-pair-list|atom-pair] [output-file]

Outputs specific distance or distances in a conformation (PDB) file.

get_distance outputs the specific distances between the two atoms in atom-pair or a list of such atom pairs. atom-pair is specified as N-A M-B, where N, M are residue numbers and A, B are atom types.

See also get_distances.

get_distances

Usage: get_distances conformation-file distance-cutoff [output-file]

Outputs distances in a conformation (PDB) file.

get_distances outputs distances within distance-cutoff from a specified conformation (PDB) file. Distances within CLASH_CUTOFF (see ramp/src/common/defines.h) are marked with ****. Neighbouring residues are marked with ####.

See also get_distance.

insert_chain_id

Usage: insert_chain_id conformation-file chain-id [output-file]

Inserts a chain ID in a conformation (PDB) file.

insert_chain_id inserts a single character chain ID to the ATOM records in the specified conformation (PDB) file containing a single chain.

See also add_to_resno.

list_torsions

Usage: list_torsions [-ms] [conformation-file] [output-file]

Lists the torsion angles in a conformation (PDB) file.

list_torsions calculates the phi, psi, and chi angles in a conformation and outputs them (by default) in this format:

C -    1 -  psi  is 230.022
C -    1 -  chi1 is 299.959
R -    2 -  phi  is 295.965
R -    2 -  psi  is 144.819
R -    2 -  chi1 is 311.778
R -    2 -  chi2 is 172.907
R -    2 -  chi3 is 187.699
R -    2 -  chi4 is 100.294

-m outputs only the main chain torsion (phi/psi) angles, and -s outputs only the side chain torsion (chi) angles.
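Since several programs read this listing, a small parser may be handy. The Python sketch below assumes each line splits on whitespace into exactly seven fields, as in the sample shown; the function name and the returned dictionary layout are made up for the example.

```python
def parse_torsions(text):
    """Parse lines such as 'R -    2 -  chi1 is 311.778'.

    Returns a dict mapping (residue number, torsion name) to
    (one-letter residue code, angle in degrees).
    """
    torsions = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected fields: residue, '-', number, '-', name, 'is', angle
        if (len(fields) == 7 and fields[1] == "-"
                and fields[3] == "-" and fields[5] == "is"):
            torsions[(int(fields[2]), fields[4])] = (fields[0], float(fields[6]))
    return torsions
```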

See also compare_torsions.

logodds

Usage: logodds input-file value [output-file]

Calculates the log probability of a particular value occurring by chance.

logodds takes a value as input and calculates the probability of it occurring by chance based on the set of values that are observed (this is simply the negative log of the number of values that are less than the input value divided by the total number of values).
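The calculation can be written out as a short Python sketch. The natural logarithm and the handling of the case where no observed value is lower are assumptions; the source does not specify the base of the log or what happens when the fraction is zero.

```python
import math

def logodds(values, value):
    """Negative log of the fraction of observed values that are below `value`."""
    below = sum(1 for v in values if v < value)
    if below == 0:
        # The log of zero is undefined; returning infinity is an assumed choice.
        return math.inf
    return -math.log(below / len(values))
```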

pdb_to_casp

Usage: pdb_to_casp [conformation-file] [output-file]

Converts a conformation (PDB) file into the CASP format.

pdb_to_casp takes a PDB file and rearranges the atoms so that it matches the CASP specification. The occupancy fields are all assigned a value of "1.00" and the error estimate fields (last column) are all assigned a value of "9.00".

See also pdb_to_sequence, pdb_to_vdb.

pdb_to_sequence

Usage: pdb_to_sequence [-p1] [conformation-file] [output-file]

Converts a conformation (PDB) file into different sequence formats.

pdb_to_sequence takes a conformation (PDB) file and outputs the amino acid sequence in different formats:

-p output the sequence in PIR format
-1 output the sequence as a single line

The default output is a two-column format (residue number, one letter amino acid code).

See also pdb_to_casp, pdb_to_vdb.

pdb_to_vdb

Usage: pdb_to_vdb [conformation-file] [output-file]

Converts a conformation file with "normal" atoms (PDB) into a conformation file with virtual atoms (VDB).

pdb_to_vdb takes a conformation (PDB) file with the "normal" heavy atoms and collapses the atoms into "virtual" atoms as defined by Head-Gordon and Brooks, Biopolymers: 77-100, 1991.

See also pdb_to_casp, pdb_to_sequence.

phd_to_dssp

Usage: phd_to_dssp [-cf] [phd-generated-file] [output-file]

Converts a PHD prediction into DSSP format.

phd_to_dssp converts a PHD prediction of secondary structure for a given sequence into DSSP format, with confidence information. When the -f option is specified, only the high confidence predictions are output (defaults are specified in the source file). The -c option checks the number of residues between secondary structures and prints a warning message if it's below the number defined in DEFAULT_INTER_SS_GAP_SIZE in the source file. The -m option forces the default-sized gap by eliminating neighbouring helix or sheet elements.

properties

Usage: properties [sequence-file] [output-file]

Calculates a variety of properties for a protein.

properties calculates a variety of sequence properties for a protein, such as mass.

psipred_to_dssp

Usage: psipred_to_dssp [-cfm] [psipred-generated-file] [output-file]

Converts a PSIPRED prediction into DSSP format.

psipred_to_dssp converts a PSIPRED prediction of secondary structure for a given sequence into DSSP format, with confidence information. When the -f option is specified, only the high confidence predictions are output (defaults are specified in the source file). The -c option checks the number of residues between secondary structures and prints a warning message if it's below the number defined in DEFAULT_INTER_SS_GAP_SIZE in the source file. The -m option forces the default-sized gap by eliminating neighbouring helix or sheet elements.

sam_to_dssp

Usage: sam_to_dssp [sam-generated-file] [output-file]

Converts a SAM prediction into DSSP format.

sam_to_dssp converts a SAM (HMM/NN based prediction from the UCSC Bioinformatics group) prediction of secondary structure for a given sequence into DSSP format, with confidence information.

rmsd

Usage: rmsd [-acmstd] [conformation-file] [conformation-file] [output-file]

Calculates the RMSD between two conformations.

rmsd calculates the root mean square deviation (RMSD) between two conformations at various levels:

-a is the all atom RMSD
-c is the CA atom RMSD
-m is the main chain atom RMSD
-s is the side chain atom RMSD
-t is the phi/psi torsion RMSD
-d is the detail flag (gives the deviation on a CA level)

See also fit, rmsd_matrix.

rmsd_matrix

Usage: rmsd_matrix [-acm] [conformation-file-list] [output-file]

Does an all-against-all RMSD calculation between a set of conformations.

rmsd_matrix takes a list of conformation (PDB) files, does an all-against-all RMSD calculation between the set of conformations, and outputs the resulting matrix. The following options are used:

-a is the all atom RMSD
-c is the CA atom RMSD
-m is the main chain atom RMSD

See also fit, rmsd.

ss_consensus

Usage: ss_consensus [ss-alignment] [output-file]

Generates the consensus of a set of secondary structure assignments.

ss_consensus takes a set of one-letter secondary structure assignments and produces an "average" secondary structure assignment in DSSP format. The first line of the ss-alignment file must contain the amino acid string.

sw_to_2line, clustalw_to_2line

Usage: sw_to_2line sequence-identifier sequence-identifier [alignment-file] [output-file]

Generates a 2line style alignment.

sw_to_2line and clustalw_to_2line use two sequence identifiers, specified as the arguments, to parse an alignment file (in the format defined by the calling argument) and concatenate all the strings appropriately to generate a two line alignment of the sequences.

transform

Usage: transform conformation-file transformation-matrix [output-file]

Transforms the coordinates in a conformation file using a given transformation matrix.

transform takes a conformation (PDB) file and applies the transformation matrix to the x, y, and z coordinates of all the atoms. The 4x3 transformation matrix consists of a rotation matrix (3x3) followed by the translation vector (3x1).
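Applying such a matrix to coordinates can be sketched in Python as follows. The sketch assumes that the first three rows hold the rotation matrix R, the fourth row holds the translation t, and that each point is transformed as x' = R x + t; the actual row/column convention expected by transform should be checked against the source before relying on this.

```python
def transform(coords, matrix):
    """Apply a rotation followed by a translation to a list of (x, y, z) points.

    matrix: four rows of three numbers; rows 0-2 are the rotation R and
    row 3 is the translation t (assumed layout). Computes x' = R x + t.
    """
    rot, trans = matrix[:3], matrix[3]
    return [
        tuple(sum(rot[r][c] * xyz[c] for c in range(3)) + trans[r] for r in range(3))
        for xyz in coords
    ]
```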

volume, rg

Usage: volume [-am] conformation-file [output-file]

Calculates the volume of a conformation.

volume takes a conformation (PDB) file and calculates the volume (or the radius of gyration, if called as rg). The -a flag computes the volume (or the radius of gyration) for all the atoms, and the -m flag computes the volume (or the radius of gyration) for only the main chain atoms.
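For reference, the radius of gyration can be computed as the root mean square distance of the atoms from their centroid, as in the Python sketch below. This version weights all atoms equally; whether rg uses atomic masses is not stated in the source, so the unweighted form is an assumption.

```python
import math

def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) points."""
    n = len(coords)
    centroid = [sum(c[k] for c in coords) / n for k in range(3)]
    # Mean squared distance of the points from the centroid.
    mean_sq = sum(
        sum((c[k] - centroid[k]) ** 2 for k in range(3)) for c in coords
    ) / n
    return math.sqrt(mean_sq)
```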

zscore

Usage: zscore input-file cutoff

Calculates the average Z-score of a set of values within a specified cutoff.

zscore takes a list of values as input and returns the average Z-score (the number of standard deviations above the mean) over all the values within (less than) the specified cutoff.
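A minimal Python sketch of this calculation is given below. It assumes that the mean and standard deviation are taken over the full list of values and that the population standard deviation is used; the source does not say which convention zscore follows.

```python
import statistics

def average_zscore(values, cutoff):
    """Average Z-score of the values below `cutoff`.

    The mean and (population) standard deviation are computed over all
    values, which is an assumption of this sketch.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    selected = [v for v in values if v < cutoff]
    if not selected or sd == 0:
        return None  # nothing below the cutoff, or no spread in the data
    return sum((v - mean) / sd for v in selected) / len(selected)
```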

4.7 Music related programs

structure_to_music

Usage: structure_to_music [ -gp parametre-file] conformation-file ss-file [output-file]

Converts a 3D structure into a midge file that can be converted into a MIDI file.

structure_to_music takes as input a protein structure specified by the coordinates and the secondary structure and converts it to a musical notation used by the program midge to generate MIDI files that can be run on your favourite sequencer. A more detailed description of how this is done, along with compositions by yours truly, can be found on my personal Proteomusic history page. An optional file can be specified using the -p option to modify various parametres such as instrument patches and scales. The current parametre file can be obtained using the -g option and used as a starting point. In other words, there's no need to modify the source code unless absolutely necessary.

This program is under constant development, so I might have a later version of the executable/source than what is on the public web site. The uses of this program are best illustrated by my Proteomusic project and also the Protinfo Proteomusic web server module.

