Welcome to the GenomeWeb
Protein General Sequence Analysis


This is a collection of protein sequence analysis utilities.

[info] BIOLOGY WORKBENCH
[info] MOWSE - search by molecular weight fingerprint
[info] NetOglyc - Prediction of Mucin type O-glycosylation sites
[info] Prediction of GlcNAc O-glycosylation site in Dictyostelium discoideum pr
[info] PSORT - Analyze and predict protein sorting signals.
[info] ProtComp - sub-cellular localization of Eukaryotic proteins
[info] PROPSEARCH - database query by amino acid composition
[info] Peptide MW Calculator
[info] Compute pI/Mw tool
[info] ProtParam tool
[info] PEST - identify proteins with short half-lives
[info] SAPS - Statistical Analysis of Protein Sequences
[info] CBRG at ETHZ
[info] PeptideSearch Database searching by mass spectrometric data
[info] SPAC - identify polypeptide using amino-acid composition
[info] GeneFIND Family Identification System
[info] GeneQuiz
[info] PEDANT - Protein Extraction, Description, and ANalysis Tool
[info] A280/A260 calculator
[info] Pratt Pattern Discovery


Detailed information on the above options


BIOLOGY WORKBENCH
The Biology Workbench is a point-and-click WWW interface to an integrated set of programs and database searching tools. It lets you carry out sequence analysis without having to log into a remote computer site.

The service is free to non-commercial researchers (you just need to register). There is a comprehensive demonstration to help you get started.


MOWSE - search by molecular weight fingerprint
You can use this page to submit a MOWSE database search. MOWSE searches the OWL protein database with your protein fragment information and returns the protein(s) that most likely correspond to your peptide data.


NetOglyc - Prediction of Mucin type O-glycosylation sites
The specificities of the UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase family, which links the carbohydrate GalNAc to the side chain of certain serine and threonine residues in mucin type glycoproteins, are presently unknown. The specificity seems to be modulated by sequence context, secondary structure and surface accessibility. The sequence context of glycosylated threonines was found to differ from that of glycosylated serines, and the sites were found to cluster. Non-clustered sites had a sequence context different from that of clustered sites. Charged residues were disfavoured at positions -1 and +3. A jury of artificial neural networks was trained to recognize the sequence context and surface accessibility of 299 known and verified mucin type O-glycosylation sites extracted from O-GLYCBASE. The cross-validated NetOglyc network system correctly found 83% of the glycosylated and 90% of the non-glycosylated serine and threonine residues in independent test sets, thus proving more accurate than matrix statistics and vector projection methods.


Prediction of GlcNAc O-glycosylation site in Dictyostelium discoideum pr
The server is at an experimental stage for now because the dataset is limited to 39 experimentally known glycosylation sites. Its authors hope to improve the accuracy of the server as more experimental data become available.


PSORT - Analyze and predict protein sorting signals.
Analyze and predict protein sorting signals coded in amino acid sequences.


ProtComp - sub-cellular localization of Eukaryotic proteins
The program is based on complex neural-network recognizers, which estimate the probability that a protein is localized in the nucleus, plasma membrane, extracellular space, cytoplasm, mitochondria, chloroplast, endoplasmic reticulum, peroxisomes, lysosomes or Golgi apparatus.


PROPSEARCH - database query by amino acid composition
PROPSEARCH reads your amino acid compositional analysis data and performs a protein database query to identify the protein. The result is emailed back to you. Searches take about 20 minutes.
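As a rough illustration of what a composition-based query involves, the sketch below computes the percentage composition of a sequence and ranks candidate sequences by Euclidean distance to a query composition. This is not PROPSEARCH's actual method, which is not described here; the function names and the choice of distance measure are assumptions made for the example.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(sequence):
    """Return the percentage of each standard residue in the sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return {a: 100.0 * counts[a] / total for a in AMINO_ACIDS}


def composition_distance(comp_a, comp_b):
    """Euclidean distance between two composition vectors (assumed metric)."""
    return sum((comp_a[a] - comp_b[a]) ** 2 for a in AMINO_ACIDS) ** 0.5


def rank_candidates(query_comp, database):
    """Sort the names in a {name: sequence} dict, nearest composition first."""
    return sorted(
        database,
        key=lambda name: composition_distance(query_comp, composition(database[name])),
    )
```

For example, `rank_candidates(composition("AAGG"), {"p1": "WWWW", "p2": "GGAA"})` places `"p2"` first, because its composition is identical to the query.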


Peptide MW Calculator
This calculates the molecular weight of your peptide.
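The underlying arithmetic can be sketched as follows: sum the average residue masses and add one water molecule for the free termini. The function name is hypothetical, and the calculator itself may handle modifications or use slightly different mass values.

```python
# Average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # mass of H2O added for the N- and C-terminal groups


def peptide_mw(sequence):
    """Average molecular weight of an unmodified linear peptide."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    # A KeyError here signals a non-standard residue code.
    return sum(AVERAGE_RESIDUE_MASS[r] for r in seq) + WATER
```

With these values, `peptide_mw("G")` gives about 75.07, the molecular weight of free glycine.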


Compute pI/Mw tool
Compute pI/Mw is a tool which allows the computation of the theoretical pI (isoelectric point) and Mw (molecular weight) for a protein sequence.
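A theoretical pI is the pH at which the predicted net charge of the sequence is zero. The sketch below finds it by bisection on a Henderson-Hasselbalch charge model. The pKa values are illustrative assumptions only; published tables differ, and the table used by Compute pI/Mw is not given here, so results will not match the tool exactly.

```python
# Illustrative pKa values (assumed for this example; published tables differ).
PK_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PK_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PK_N_TERM = 8.6
PK_C_TERM = 3.6


def net_charge(sequence, ph):
    """Predicted net charge of the sequence at the given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - PK_N_TERM))   # protonated N-terminus
    charge -= 1.0 / (1.0 + 10 ** (PK_C_TERM - ph))  # deprotonated C-terminus
    for residue in sequence.upper():
        if residue in PK_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PK_POSITIVE[residue]))
        elif residue in PK_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PK_NEGATIVE[residue] - ph))
    return charge


def isoelectric_point(sequence, tolerance=1e-4):
    """Bisect pH 0-14 for the point where the net charge crosses zero."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positive: the pI lies at a higher pH
        else:
            high = mid
    return (low + high) / 2.0
```

Bisection works here because the modelled net charge decreases monotonically as pH rises.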


ProtParam tool
ProtParam is a tool which allows the computation of various physical and chemical parameters for a protein sequence.

The computed parameters include the molecular weight, theoretical pI, amino acid composition, extinction coefficient, estimated half-life, instability index, aliphatic index and grand average of hydropathicity (GRAVY).
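As an example of one of these parameters, GRAVY is the sum of the Kyte-Doolittle hydropathy values of all residues divided by the sequence length. A minimal sketch, with a hypothetical function name:

```python
# Kyte-Doolittle hydropathy values.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    # A KeyError here signals a non-standard residue code.
    return sum(KYTE_DOOLITTLE[r] for r in seq) / len(seq)
```

Positive values indicate a hydrophobic sequence overall, negative values a hydrophilic one.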


PEST - identify proteins with short half-lives
Proteins with intracellular half-lives of less than two hours are found to contain regions rich in proline, glutamic acid, serine and threonine (P, E, S and T). These so-called PEST regions are generally flanked by clusters of positively charged amino acids.

The PEST search utility identifies possible PEST regions in a submitted probe using the molecular fraction of the P, E, S and T components and the hydrophobicity index of the region.


SAPS - Statistical Analysis of Protein Sequences
This program, written by the group of Samuel Karlin, analyses proteins for statistically significant features like charge-clusters, repeats, hydrophobic regions, compositional domains etc. One of its options is to generate self-explanatory output.


CBRG at ETHZ



PeptideSearch Database searching by mass spectrometric data
This is an advanced tool for protein database searching by mass spectrometric data, such as peptide mass maps or (partial) amino acid sequences.


SPAC - identify polypeptide using amino-acid composition
SPAC searches protein or nucleic acid databases for the sequence corresponding to a protein or peptide for which only the amino acid composition and molecular weight are known. The algorithm is aimed particularly at retrieving partial sequences, a task that other available software performs poorly. Because it can assign a protein fragment to a sequence accurately, it could serve as an easy and economical first step before more sophisticated and expensive methods in proteomic research.


GeneFIND Family Identification System
The GeneFIND family identification system aims at high-throughput, full-scale gene family identification by taking advantage of the strengths of various search methods and incorporating ProClass family information. Multi-level filters are used, starting with the fastest, the MOTIFIND neural networks, followed by BLAST search, SSEARCH dynamic programming, and motif pattern search. The current implementation allows large-scale identification of 942 protein families.


GeneQuiz
GeneQuiz provides highly automated analysis of biological sequences.

GeneQuiz derives functional annotation for protein sequences and provides supporting evidence, including family alignments.


PEDANT - Protein Extraction, Description, and ANalysis Tool
PEDANT is a software system for completely automatic and exhaustive analysis of protein sequence sets - from individual sequences to complete genomes.

PEDANT analyses the predicted open reading frames from fully sequenced genomes using a combination of sequence comparison and prediction techniques.


A280/A260 calculator
This can be used to calculate the molecular weight (using average isotopic mass), extinction coefficient, the concentration, and the formal charge of a protein.
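The method used by this calculator is not described here. A common way to estimate the extinction coefficient at 280 nm is to add the contributions of tryptophan, tyrosine and cystine, then apply the Beer-Lambert law to get concentration. The sketch below assumes the widely cited values of 5500, 1490 and 125 M^-1 cm^-1; the function names are hypothetical.

```python
# Assumed molar extinction contributions at 280 nm, in M^-1 cm^-1.
EXT_TRP = 5500
EXT_TYR = 1490
EXT_CYSTINE = 125


def extinction_coefficient(sequence, cystines=0):
    """Estimated molar extinction coefficient at 280 nm.

    cystines is the number of disulfide bonds; it cannot be read
    from the sequence alone, so the caller supplies it.
    """
    seq = sequence.upper()
    return EXT_TRP * seq.count("W") + EXT_TYR * seq.count("Y") + EXT_CYSTINE * cystines


def molar_concentration(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * path length)."""
    if epsilon <= 0:
        raise ValueError("extinction coefficient must be positive")
    return a280 / (epsilon * path_cm)
```

A sequence with no tryptophan or tyrosine gives a coefficient near zero, so A280 is not a reliable measure of concentration for such proteins.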


Pratt Pattern Discovery
Pratt is a tool that allows the user to search for patterns conserved in a set of protein sequences. The user can specify what kind of patterns should be searched for, and how many sequences should match a pattern to be reported.
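The reporting criterion, how many sequences must match a pattern, can be illustrated by checking one PROSITE-style pattern against a set of sequences. The sketch below only covers matching, not Pratt's discovery algorithm, and supports a subset of the notation (residues, `x`, `[...]`, `{...}` and repeat counts). The function names are hypothetical.

```python
import re

# One pattern element: a core (x, residue, [set] or {excluded set})
# followed by an optional repeat such as (3) or (2,4).
ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.strip(".").split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."                          # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"      # any residue except these
        if low is None:
            repeat = ""
        elif high is None:
            repeat = "{" + low + "}"
        else:
            repeat = "{" + low + "," + high + "}"
        parts.append(core + repeat)
    return "".join(parts)


def count_matching(pattern, sequences):
    """Number of sequences that contain at least one match to the pattern."""
    regex = re.compile(prosite_to_regex(pattern))
    return sum(1 for seq in sequences if regex.search(seq))
```

A pattern would then be reported only if `count_matching(pattern, sequences)` reaches the minimum number of sequences the user asked for.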


Any Comments, Questions? Support@hgmp.mrc.ac.uk