LatFit - Help


Introduction

LatFit calculates a low-deviation on-lattice model of a given full-atom protein structure in Protein Data Bank (PDB) format. It uses a greedy approach that optimizes either the distance RMSD or the coordinate RMSD while successively fitting the structure's monomers onto the lattice. It supports backbone-only and sidechain-including models on various lattices.
Besides the final deviations and the coordinate data of the resulting lattice protein model, an absolute move string representation is generated.
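
To illustrate what an absolute move string is, the following minimal sketch encodes a chain of cubic-lattice points as one letter per bond. The letter assignment (F, B, L, R, U, D) and the function name are assumptions made for this example only; the alphabet LatFit actually emits may differ.

```python
# Hypothetical letter assignment for the six unit moves of the cubic lattice.
# This mapping is an assumption for illustration, not LatFit's documented alphabet.
MOVES = {
    (1, 0, 0): "F", (-1, 0, 0): "B",
    (0, 1, 0): "L", (0, -1, 0): "R",
    (0, 0, 1): "U", (0, 0, -1): "D",
}

def absolute_move_string(points):
    """Encode successive lattice points as one letter per bond."""
    letters = []
    for (x0, y0, z0), (x1, y1, z1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0, z1 - z0)
        letters.append(MOVES[step])  # raises KeyError if step is not a unit move
    return "".join(letters)

model = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1), (0, 1, 1)]
print(absolute_move_string(model))  # FLUB
```

Each letter names a direction in the fixed lattice frame, so the string describes the fold independently of where the first monomer is placed.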

Different Parameters

PDB ID

Explanation for this parameter

Atom to Fit

Gives the PDB atom identifier whose coordinates are extracted and used to derive the lattice protein model. The special string "CoM" denotes that the centroid of each amino acid's sidechain atoms is calculated and fitted.

Default values are:


NOTE: In case models including sidechains are fitted, the given atom string denotes the position of the sidechain monomers to fit. The backbone monomers are fitted onto the C_alpha atom positions.

Chain Identifier

Specifies which protein chain within the PDB file is to be handled. The default is chain "A". If no chain identifier is given within the PDB file, please use "_" instead of a whitespace character.

Model Number

Some PDB files contain several models of the same protein. This parameter specifies which model to fit.

Lattice Protein Type

Defines the type of lattice protein model to fit: either a backbone-only or a sidechain-including model.
For backbone-only models, each amino acid is represented by a single monomer. This is usually done to represent the backbone (C_alpha) trail of a protein chain.
Models including sidechains represent each amino acid with two monomers, typically one representing the C_alpha backbone position and one to represent the centroid of the sidechain group of each amino acid.

Lattice Form

The lattice model to use for the fitting. Currently, LatFit supports the



CA-CA bond length

Since lattice proteins have a fixed distance between all connected monomers, the length of these connections has to be specified. The user can specify the C_alpha-C_alpha distance, which is nearly constant within proteins and therefore well suited to scale the lattice protein to the provided protein's coordinate data. The default distance is 3.8 Angstroms, the average distance between successive C_alpha atoms in proteins. This distance is also close to the mean distance between the C_alpha atom and the centroid of an amino acid's sidechain (about 3.6 Angstroms).
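
The scaling can be sketched as follows: integer lattice points are multiplied by a factor so that one neighbor vector has exactly the requested bond length. The function name and the two lattices used (cubic and face-centered cubic) are illustrative assumptions and say nothing about which lattices LatFit offers.

```python
import math

def scale_to_bond_length(points, neighbor_vector, bond_length=3.8):
    """Scale integer lattice points so that each bond has the given length (Angstroms)."""
    # Length of one neighbor vector in unscaled lattice units.
    unit = math.sqrt(sum(c * c for c in neighbor_vector))
    factor = bond_length / unit
    return [tuple(factor * c for c in p) for p in points]

# Cubic lattice: neighbor vectors like (1, 0, 0) have length 1.
cubic = scale_to_bond_length([(0, 0, 0), (1, 0, 0), (1, 1, 0)], (1, 0, 0))
# Face-centered cubic lattice: neighbor vectors like (1, 1, 0) have length sqrt(2).
fcc = scale_to_bond_length([(0, 0, 0), (1, 1, 0), (2, 1, 1)], (1, 1, 0))

print(math.dist(cubic[0], cubic[1]))
print(math.dist(fcc[1], fcc[2]))
```

Both printed bond lengths come out at the requested 3.8 Angstroms, up to floating-point rounding, although the underlying integer steps differ between the lattices.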

Optimization Mode

LatFit offers two heuristics to guide the greedy optimization that creates the lattice protein models. It either searches for a model that minimizes the distance RMSD (dRMSD) between the original and the produced lattice protein, or it minimizes the coordinate RMSD (cRMSD).

The two optimization strategies differ considerably in technical terms, since the cRMSD depends heavily on the superposition (relative positioning) of the two structures. Thus, one has to find (a) the best set of lattice points to represent the protein and (b) the best rotation of the lattice relative to the orientation of the original protein in 3D space.

In contrast, the dRMSD calculation is independent of the relative orientation of the two proteins, since it compares only distances internal to each structure. Therefore, 'only' the best set of lattice points to represent the protein has to be identified, independently of the orientation of the lattice dimensions in 3D space.
The final structure of the dRMSD-based fitting procedure might be mirrored compared to the original structure, since dRMSD does not account for reflection. To return the lattice fit in the right orientation, we also generate the mirrored structure and return whichever of the two yields the lower cRMSD when superimposed on the original using the Kabsch algorithm.
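
The difference between the two measures, and why dRMSD cannot detect a mirror image, can be seen in a small sketch. The function names are chosen for this example, and the cRMSD here is computed for the coordinates as given, without the Kabsch superposition step that a full implementation would apply first.

```python
import math
from itertools import combinations

def drmsd(a, b):
    """Distance RMSD: compares all intra-structure pair distances of a and b."""
    pairs = list(combinations(range(len(a)), 2))
    sq = sum((math.dist(a[i], a[j]) - math.dist(b[i], b[j])) ** 2 for i, j in pairs)
    return math.sqrt(sq / len(pairs))

def crmsd(a, b):
    """Coordinate RMSD for the placement as given (no superposition is performed)."""
    sq = sum(math.dist(p, q) ** 2 for p, q in zip(a, b))
    return math.sqrt(sq / len(a))

original = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (3.8, 3.8, 3.8)]
mirrored = [(x, y, -z) for x, y, z in original]  # reflection through the XY plane
rotated = [(-y, x, z) for x, y, z in original]   # 90 degree rotation about Z

print(drmsd(original, rotated))   # rotation leaves all internal distances unchanged
print(drmsd(original, mirrored))  # so does reflection
print(crmsd(original, mirrored))  # but the coordinates themselves differ
```

Both dRMSD values are zero (up to rounding), while the cRMSD to the mirror image is not. Because these four points do not lie in one plane, no rotation can bring the mirror image onto the original, which is why a superposition-based cRMSD check is needed to select the correct handedness.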

Max. to keep per Iteration

The RMSD-optimizing fitting procedures of LatFit build the lattice model sequentially, starting from the amino terminus of the original protein. A greedy chain-growth procedure is used, i.e. only the best lattice models are considered for elongation to derive the next longer fit. The "Max. to keep per Iteration" parameter determines how many of the best structures are considered for the next iteration.
This parameter therefore directly influences the runtime of the program.
Generally, for dRMSD optimization a high value (about 100-1000) is useful, while for cRMSD optimization a lower number (10-100) should be used since the correct lattice rotation has to be determined as well.
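
The chain-growth idea can be sketched as a small beam search on the cubic lattice. This is a simplified illustration under stated assumptions, not LatFit's implementation: the names `greedy_fit`, `sq_dist_error` and `keep` are hypothetical, the score is a plain sum of squared distance differences that is recomputed from scratch for every candidate, and only self-avoiding chains are extended.

```python
import math

# The six neighbor vectors of the cubic lattice.
CUBIC = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def sq_dist_error(model, target, bond):
    """Sum of squared differences between the internal distances of the scaled
    lattice prefix and those of the equally long target prefix."""
    err = 0.0
    for i in range(len(model)):
        for j in range(i + 1, len(model)):
            d_model = bond * math.dist(model[i], model[j])
            err += (d_model - math.dist(target[i], target[j])) ** 2
    return err

def greedy_fit(target, keep=10, bond=3.8):
    """Grow the lattice chain monomer by monomer, keeping the `keep` best prefixes."""
    beam = [[(0, 0, 0)]]
    for _ in range(1, len(target)):
        candidates = []
        for chain in beam:
            x, y, z = chain[-1]
            for dx, dy, dz in CUBIC:
                nxt = (x + dx, y + dy, z + dz)
                if nxt not in chain:  # keep the chain self-avoiding
                    candidates.append(chain + [nxt])
        candidates.sort(key=lambda c: sq_dist_error(c, target, bond))
        beam = candidates[:keep]  # "Max. to keep per Iteration"
    return beam[0]

# A target that lies exactly on the scaled cubic lattice.
target = [(3.8 * x, 3.8 * y, 3.8 * z)
          for x, y, z in [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1), (0, 1, 1)]]
fit = greedy_fit(target, keep=10)
print(fit, sq_dist_error(fit, target, 3.8))
```

Each iteration scores at most `keep` times the number of neighbor vectors candidates, which is why a larger value increases the runtime while reducing the risk of discarding a prefix that would have led to a better final fit.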

Rotation Steps / Interval

The cRMSD-optimizing fitting procedure allows for a fast, additive coordinate RMSD update along the chain extension, but depends on the relative orientation of the protein within the lattice. Thus, we follow Miao et al. (JMB, 2004) to find the best fit.
In general, a user-defined number of rotation intervals R is trialled for each of the XYZ rotation axes. For each rotation, we transform the original protein coordinates to obtain the rotated target structure. Applying the cRMSD-based fitting procedure then yields the best fit for the current rotation. In this way, we successively evaluate the best fit for all trialled rotations. To improve the result, a further rotational refinement step can be applied around the best resulting model.
The run time of LatFit scales with the lattice coordination number (i.e. the number of neighboring vectors), the max. number of structures to keep per iteration, and most importantly the number of rotation intervals R trialled.
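
A rotation screen of this kind can be sketched as a grid over three axis angles. The sketch assumes that the axes are sampled independently and that angles are spread evenly over a quarter turn; both are assumptions for illustration, and the names `rotate_xyz` and `rotation_grid` are hypothetical. The angle range that LatFit actually covers is not stated here.

```python
import math
from itertools import product

def rotate_xyz(point, ax, ay, az):
    """Rotate a point about the X axis, then Y, then Z (angles in radians)."""
    x, y, z = point
    y, z = y * math.cos(ax) - z * math.sin(ax), y * math.sin(ax) + z * math.cos(ax)
    x, z = x * math.cos(ay) + z * math.sin(ay), -x * math.sin(ay) + z * math.cos(ay)
    x, y = x * math.cos(az) - y * math.sin(az), x * math.sin(az) + y * math.cos(az)
    return (x, y, z)

def rotation_grid(steps, max_angle=math.pi / 2):
    """All (ax, ay, az) combinations with `steps` evenly spaced angles per axis.
    The default quarter-turn range is an assumption of this sketch."""
    angles = [k * max_angle / steps for k in range(steps)]
    return list(product(angles, repeat=3))

grid = rotation_grid(5)
print(len(grid))  # 125 candidate rotations, each requiring one complete fit

protein = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
rotated = [rotate_xyz(p, *grid[-1]) for p in protein]
print(math.dist(rotated[0], rotated[1]))  # bond lengths are preserved by rotation
```

Under these assumptions the number of fits grows with the cube of R, which matches the remark that the number of rotation intervals dominates the run time.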

Refinement Rotation

As explained for the rotation steps, determining the correct lattice rotation is essential for the fitting quality when applying a cRMSD-optimizing fitting procedure. Thus, once the best rotation for the given rotation steps has been determined, one can apply an additional refinement rotation to find an even better lattice rotation close to the rotation angles determined by the first rotation screen.