Translation

Program Translation

This program is designed to translate the sequences of inserts from either the New England Biolabs (Ph.D.-12™ or Ph.D.-C7C™) libraries, or any other library of interest if provided with the start and end sequences of the vector. The program will automatically locate the position of the insert, translate the insert, and indicate any possible errors in the insert sequence (such as unexpected codons or errors in the surrounding sequence).
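
The locate-translate-check sequence described above can be sketched in a few lines. This is only an illustration, not the program's actual code: the function name `translate_insert`, the flank arguments, and the choice to flag every stop codon as "unexpected" are assumptions, and the standard genetic code is used throughout.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_insert(read, left_flank, right_flank):
    """Return (peptide, warnings) for the insert between two vector flanks.

    The flank sequences are supplied by the caller (hypothetical values in
    the tests); stop codons and unreadable codons are reported as warnings.
    """
    read = read.upper()
    start = read.find(left_flank)
    if start == -1:
        return None, ["left flank not found"]
    start += len(left_flank)
    end = read.find(right_flank, start)
    if end == -1:
        return None, ["right flank not found"]

    insert = read[start:end]
    warnings = []
    if len(insert) % 3:
        warnings.append("insert length is not a multiple of 3")

    peptide = []
    for i in range(0, len(insert) - len(insert) % 3, 3):
        codon = insert[i:i + 3]
        aa = CODON_TABLE.get(codon, "X")  # "X" marks codons with ambiguous bases
        if aa in "*X":
            warnings.append(f"unexpected codon {codon} at position {i // 3 + 1}")
        peptide.append(aa)
    return "".join(peptide), warnings
```

A real library may tolerate codons that this sketch flags, so the warning rule would need to be adapted to the library design.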

DNA2PRO

Peptide translation from DNA sequence 

Characterization of PEPTIDE populations

These 4 programs have been designed to analyze the statistical properties of a peptide population. These data are particularly valuable when calculated in conjunction with randomly chosen members of the unselected library, as library-specific biases can be identified and subtracted.

AAFREQ

This program calculates the frequency of amino acids within a peptide population as a function of position of the insert.
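
A position-by-position frequency table of this kind can be sketched as follows. The function name `position_frequencies` is hypothetical, and the sketch assumes all peptides have the same length, as in a fixed-length library.

```python
from collections import Counter


def position_frequencies(peptides):
    """Return one dict per insert position: amino acid -> fraction of peptides."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    freqs = []
    for column in zip(*peptides):  # one column per insert position
        counts = Counter(column)
        freqs.append({aa: n / len(peptides) for aa, n in counts.items()})
    return freqs
```

Running the same function on unselected library members gives the baseline against which library-specific biases can be compared.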

POPDIV

The diversity of the population and of individual positions is calculated.
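
The page does not state which diversity measure is used. As one plausible illustration, the sketch below uses a normalized inverse Simpson index, d = 1 / (N * sum of p_i squared) with N = 20 amino acids, for each position, and takes the mean over positions as a population value. Both the formula and the function names are assumptions made for this example.

```python
from collections import Counter


def positional_diversity(column, alphabet_size=20):
    """Normalized inverse Simpson index for one position (assumed measure).

    Ranges from 1/alphabet_size (one residue only) to 1.0 (all residues
    equally frequent).
    """
    counts = Counter(column)
    total = len(column)
    sum_sq = sum((n / total) ** 2 for n in counts.values())
    return 1.0 / (alphabet_size * sum_sq)


def population_diversity(peptides):
    """Return (per-position diversities, their mean) for equal-length peptides."""
    per_position = [positional_diversity(col) for col in zip(*peptides)]
    return per_position, sum(per_position) / len(per_position)
```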

AADIV

This program performs the calculations of the 2 previous programs simultaneously.

INFO

This program calculates an information measure associated with each peptide (e.g. Rodi et al., 1999; 2002) as a reflection of the likelihood of observing the peptide by chance. A peptide with a relatively low information content has a sequence representative of peptides more common in the population, indicative of peptides that facilitate viral growth. A peptide with a relatively high information content has a sequence relatively uncommon in the population, probably reflecting a relatively low growth rate. Since affinity selection usually involves alternating steps selecting for high affinity and then for good growth, the information measure allows the investigator to judge whether or not a peptide might be present on the basis of its growth characteristics rather than its binding properties.
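
To make the idea concrete, the sketch below scores a peptide as the sum of the negative log frequencies of its residues at each position in a reference population. This is an assumed stand-in for the published measure, not a reproduction of it; the function name and the add-one pseudocount are choices made for the example.

```python
import math
from collections import Counter


def information_content(peptide, population):
    """Sum over positions of -ln(p), with p estimated from the population.

    An add-one pseudocount (an assumption) keeps unseen residues finite.
    """
    columns = [Counter(col) for col in zip(*population)]
    n = len(population)
    info = 0.0
    for counts, aa in zip(columns, peptide):
        p = (counts[aa] + 1) / (n + 20)
        info -= math.log(p)
    return info
```

Under this scoring, a peptide built from residues that are common at each position receives a low value, and a peptide built from rare residues receives a high one.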

DIVAA

DIVAA is a quantitative measure of amino acid sequence diversity, and provides a simple means to generate hypotheses concerning the contribution of individual residues to the functional and evolutionary relationships among proteins.

Peptide Motif Identification

A number of algorithmic and heuristic approaches have been taken to detect weak sequence similarities within practicable computation times, including the Smith-Waterman algorithm, FASTA, BLAST and ParAlign. These bioinformatic tools, however, have been developed with, and optimized for, long protein sequences. They are ill-suited to the analysis of combinatorial phage display data, which consist of short peptides. Weak sequence motifs within short peptide sequence populations, however, can be readily identified with these three programs that search for motifs within the peptide population.

MOTIF1

This program searches for continuous motifs within the peptide population. Allowing for conservative substitutions, segments of specified length that occur more than once are identified.
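
One way to allow for conservative substitutions is to map each residue to a similarity class and compare segments class by class. The sketch below does this; the grouping in `GROUPS` and the function name are hypothetical and are not taken from the program.

```python
from collections import defaultdict

# Hypothetical conservative-substitution classes covering the 20 amino acids.
GROUPS = ["AGST", "ILVM", "FWY", "DE", "KRH", "NQ", "C", "P"]
CLASS_OF = {aa: str(i) for i, group in enumerate(GROUPS) for aa in group}


def continuous_motifs(peptides, length):
    """Return class-pattern -> list of (peptide, offset, segment) seen more than once."""
    hits = defaultdict(list)
    for pep in peptides:
        for i in range(len(pep) - length + 1):
            segment = pep[i:i + length]
            key = "".join(CLASS_OF.get(aa, aa) for aa in segment)
            hits[key].append((pep, i, segment))
    return {key: found for key, found in hits.items() if len(found) > 1}
```

Note that a segment repeated within a single peptide also counts as occurring more than once in this sketch.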

MOTIF2

This program identifies discontinuous short motifs of 3 amino acids within a peptide population.
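
A brute-force version of such a search is sketched below: every non-contiguous triple of residues in a peptide is written as a pattern with `x` marking the gaps, and patterns shared by several peptides are reported. The function name, the pattern notation, and the `min_peptides` threshold are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations


def discontinuous_motifs(peptides, min_peptides=2):
    """Return gapped three-residue patterns found in at least min_peptides peptides."""
    seen = defaultdict(set)
    for idx, pep in enumerate(peptides):
        for i, j, k in combinations(range(len(pep)), 3):
            if k - i == 2:
                continue  # contiguous triple, not a discontinuous motif
            pattern = (pep[i] + "x" * (j - i - 1) +
                       pep[j] + "x" * (k - j - 1) + pep[k])
            seen[pattern].add(idx)
    return {p: sorted(ids) for p, ids in seen.items() if len(ids) >= min_peptides}
```

Because the number of triples grows with the cube of the peptide length, this enumeration is only practical for short inserts such as 7-mers and 12-mers.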

Comparison of PEPTIDE Population to Sequence of a known Structure

These 3 programs use PDB files as a basis for analysis of protein-ligand interactions.

CLOSEcon

This program provides a list of the amino acid residues that are in contact (defined by a maximum interatomic distance) with a ligand on the basis of crystallographic coordinates.
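
The contact criterion can be illustrated with a short sketch that reads the fixed-width coordinate columns of PDB `ATOM` and `HETATM` records. The function names, the 4.0 Å default cutoff, and the decision to skip water (`HOH`) are assumptions; treating every other `HETATM` record as ligand is a simplification.

```python
import math


def parse_atoms(pdb_lines):
    """Split coordinate records into protein atoms (ATOM) and ligand atoms (HETATM)."""
    protein, ligand = [], []
    for line in pdb_lines:
        record = line[0:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        res_name = line[17:20].strip()
        if record == "HETATM" and res_name == "HOH":
            continue  # ignore water molecules
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        atom = (res_name, line[21], int(line[22:26]), xyz)
        (protein if record == "ATOM" else ligand).append(atom)
    return protein, ligand


def contact_residues(pdb_lines, max_distance=4.0):
    """Return sorted (chain, residue number, residue name) within max_distance of a ligand atom."""
    protein, ligand = parse_atoms(pdb_lines)
    contacts = set()
    for res_name, chain, res_seq, xyz in protein:
        if any(math.dist(xyz, lig[3]) <= max_distance for lig in ligand):
            contacts.add((chain, res_seq, res_name))
    return sorted(contacts)
```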

HETEROalign

This program provides two visualizations of the similarity between a protein sequence and a population of peptides. The first is a three-dimensional representation of the similarity. The program calculates the similarity and replaces the 'temperature factor' in a pdb file with the similarity so that any standard three-dimensional visualization package can be used to visualize the similarity when the colors of the image are coded to 'temperature factor'. The second output file is the sequence of the protein with the peptides exhibiting similarity aligned to the sequence.
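
The temperature-factor substitution can be sketched as below. In the PDB format the temperature factor occupies columns 61 to 66 of each coordinate record. The function name, the `similarity` dictionary keyed by (chain, residue number), and the default of 0.0 for residues without a score are assumptions; how the similarity itself is computed is not shown.

```python
def write_similarity_as_bfactor(pdb_lines, similarity):
    """Return PDB lines with the temperature-factor field replaced by a similarity score.

    `similarity` maps (chain, residue number) to a per-residue score
    (hypothetical input); residues without a score are written as 0.00.
    """
    out = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 66:
            key = (line[21], int(line[22:26]))
            score = similarity.get(key, 0.0)
            line = line[:60] + f"{score:6.2f}" + line[66:]
        out.append(line)
    return out
```

Coloring the resulting file by temperature factor in a molecular viewer then displays the similarity on the structure.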

DistSim

This program uses the pdb output from HETEROalign to calculate the relationship between the similarity (temperature factor) and the distance from the ligand. A single distance and a single temperature factor are associated with each amino acid. A scatterplot of distance vs similarity for each residue can show where the amino acids in the protein sequence are most similar to those of the peptide population, in relation to their distance from the hetero group of the protein.

Analysis of Single or Multiple FASTA Sequences

These 3 programs carry out optimal sequence alignments between affinity-selected peptides and protein sequences for which there is only a text sequence (no PDB coordinates).

MATCH

This program calculates the similarity between a sequence and a population of peptides when a PDB file is not available. The output is a set of aligned peptides similar to the second output of HETEROalign, except that here the contact points are not known.

FASTAcon

Given a FASTA list of proteins (as in a genome), this program provides a list of the proteins containing a user-defined short consensus sequence, which may be either continuous or discontinuous.
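
A search of this kind can be sketched with a small FASTA reader and a regular expression. The function names and the convention that a lowercase `x` in the consensus matches any residue are assumptions made for this example, not the program's documented input syntax.

```python
import re


def read_fasta(lines):
    """Return a list of (header, sequence) pairs from FASTA-formatted lines."""
    records, header, chunks = [], None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif line:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def proteins_with_consensus(lines, consensus):
    """Return headers of proteins containing the consensus; 'x' matches any residue."""
    pattern = "".join("." if c == "x" else re.escape(c.upper()) for c in consensus)
    regex = re.compile(pattern)
    return [header for header, seq in read_fasta(lines) if regex.search(seq)]
```

A continuous consensus is written with no `x` characters, for example `SHD`, and a discontinuous one with gaps, for example `WxxHD`.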

FASTAskan

This program calculates the similarity between a peptide population and a large set of protein sequences in FASTA format stored in one or more text files. The output is a list of the proteins with the highest peak value of the similarity, ordered according to the value of the peak score. Scores are generated by calculating the similarity of each peptide sequence to the protein sequence along its length.
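
The peak-score ranking can be illustrated with a sliding-window sketch. Identity scoring (one point per matching residue) is used here purely as a placeholder for whatever similarity scoring the program applies; the function names and the `top` parameter are also assumptions.

```python
def peak_similarity(protein, peptides):
    """Return the best ungapped window score of any peptide against the protein.

    Scoring is simple identity counting, a placeholder assumption.
    """
    best = 0
    for pep in peptides:
        for i in range(len(protein) - len(pep) + 1):
            window = protein[i:i + len(pep)]
            score = sum(a == b for a, b in zip(pep, window))
            best = max(best, score)
    return best


def rank_proteins(records, peptides, top=10):
    """Rank (name, sequence) records by peak score, highest first."""
    scored = [(peak_similarity(seq, peptides), name) for name, seq in records]
    return sorted(scored, key=lambda item: -item[0])[:top]
```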

Copyright © 2003 Biosciences Division, Argonne National Laboratory