Welcome to the GenomeWeb
Nucleic Acid Databases


This is a collection of nucleic acid database sites.

Search Major Sequence Databases

[info] Search Databases at EBI (EMBL)
[info] Search Databases at NCBI (GenBank)
[info] Search Databases at GSDB
[info] Search Databases at DDBJ

BLAST searches

[info] BLAST search of databases at NCBI
[info] BLAST search of databases at NCGR
[info] BLAST search of human chromosome databases
[info] BLASTula, the server of Blast servers
[info] BLAST2 search of databases at EMBL
[info] INCA with BLAST / Entrez

Other Searches

[info] Expressed Sequence Tags (dbEST)
[info] dbGSS - Genome Survey Sequence
[info] SRS-FASTA: Similarity Search of GenBank Subsets
[info] Sequence Retrieval System (SRS)
[info] WWW-Query - sequence data and multivariate analysis
[info] GeneNet

Miscellaneous Nucleic Databases

[info] REBASE The Restriction Enzyme Database
[info] Multi-Cut - A Data Base of Restriction Endonuclease Buffers
[info] Sequence Tag Alignment and Consensus Knowledgebase (STACK)
[info] Codon Usage Database
[info] ImMunoGeneTics Database (IMGT)
[info] EPD Eukaryotic Promoter Database
[info] The Tumor Gene Database
[info] Nucleic Acid Database (NDB) Project
[info] DNA Patents Database
[info] Molecular Probe Data Base (MPDB)
[info] HIV Sequence Database
[info] Euchromatin Network
[info] Human Tumor Gene Index (hTGI)
[info] Intron Sequence Information System (ISIS)
[info] Ares Lab Yeast Intron Database
[info] Exon-Intron Database


Detailed information on the above options


Search Databases at EBI (EMBL)
The EBI provides facilities to search for sequences by text or by sequence similarity and to submit new sequences.


Search Databases at NCBI (GenBank)
The NCBI provides facilities to search for sequences by text or by sequence similarity and to submit new sequences.


Search Databases at GSDB
The GSDB provides facilities to search for sequences by text or by sequence similarity and to submit new sequences.


Search Databases at DDBJ
The DDBJ provides facilities to search for sequences by text or by sequence similarity and to submit new sequences.


BLAST search of databases at NCBI
BLAST is a program that allows you to search for similarity between your query sequence and the gene sequences held at the NCBI.


BLAST search of databases at NCGR
BLAST is a program that allows you to search for similarity between your query sequence and the gene sequences held at the NCGR.


BLAST search of human chromosome databases
Searches a DNA database containing all human sequence data available from the Sanger Centre.

The sequence data contains:


BLASTula, the server of Blast servers
BLASTula, the server of BLAST servers, is a group of pages offering a single point of access to more than 40 different BLAST servers worldwide, operating on original sets of sequences.


BLAST2 search of databases at EMBL
BLAST2 is a program that allows you to search for similarity between your query sequence and the gene sequences held at EMBL. It is similar to the original BLAST program, but it includes gaps in the alignments.


INCA with BLAST / Entrez
Iterative Neighborhood Cluster Analysis

INCA is a Java 1.02 applet that runs BLAST. Give INCA a starter sequence and it finds related sequences: it runs BLAST on the starter sequence, then runs BLAST on the matching sequences, and keeps track of all the results. INCA originally used the predefined sequence neighbors in Entrez; it now uses the BLAST server to find sequence neighbors dynamically. Using BLAST instead of Entrez lets you adjust search parameters as needed, which can improve search results.
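The iterative idea can be illustrated with a minimal Python sketch. It is not INCA's code and does not call BLAST: the `toy_similarity` function (shared 3-letter words), the `EXAMPLE_DB` sequences, and the threshold are all hypothetical stand-ins chosen for illustration.

```python
from collections import deque

# Hypothetical toy database; real use would query a BLAST server instead.
EXAMPLE_DB = {
    "A": "AAACCC",
    "B": "ACCCGG",
    "C": "CCGGTT",
    "D": "TTTTTT",
}

def kmers(seq, k=3):
    """Return the set of all k-letter words in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def toy_similarity(a, b, k=3):
    """Jaccard similarity of k-mer sets: a crude stand-in for a BLAST score."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)

def iterative_neighbors(start_id, database, threshold=0.3, max_rounds=3):
    """Search from the starter, then from each match, recording the round
    in which every sequence was first reached."""
    found = {start_id: 0}
    queue = deque([start_id])
    while queue:
        current = queue.popleft()
        if found[current] >= max_rounds:
            continue  # do not expand beyond the allowed number of rounds
        for other_id, other_seq in database.items():
            if other_id in found:
                continue
            if toy_similarity(database[current], other_seq) >= threshold:
                found[other_id] = found[current] + 1
                queue.append(other_id)
    return found

print(iterative_neighbors("A", EXAMPLE_DB))
```

In this toy data, sequence C shares nothing with the starter A and is reached only through B, which is the kind of indirect relative an iterative search is meant to pick up.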


Expressed Sequence Tags (dbEST)
Expressed Sequence Tags (ESTs) are 'single pass' partial DNA sequences derived from clones randomly selected from cDNA libraries. EST data are maintained by NCBI and included in the GenBank database. Because these data differ from traditional GenBank entries and thus require special processing and annotation, NCBI also makes them available in a separate database, dbEST. The full reports contain information on the availability of physical cDNA clones and, in collaboration with the Genome Data Base at Johns Hopkins University, mapping data.


dbGSS - Genome Survey Sequence
Contains contact information about the contributors, experimental conditions and genetic map locations of the Genome Survey Sequence division of GenBank/EMBL.


SRS-FASTA: Similarity Search of GenBank Subsets
This searches your query sequence against subsets of nucleic acid and protein databanks. You choose the subsets by keyword selection on the sequence documentation.

There may be times when you will get better information by eliminating unwanted sections of the databanks before performing a sequence search. Given the large size and constant updates to the biosequence databanks, it is difficult to produce subsets of these data directly for similarity searching. By coupling similarity search software (FastA) with keyword selection software (SRS), one can provide such searches fairly efficiently.
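The coupling of keyword selection with similarity search can be sketched in Python as follows. This is only an illustration of the two-step idea: it uses neither SRS nor FastA, the records and keywords are invented, and the shared-word count is a hypothetical stand-in for a real similarity score.

```python
# Hypothetical records: each has an identifier, keywords, and a sequence.
RECORDS = [
    {"id": "X1", "keywords": {"human", "kinase"}, "sequence": "ACGTACGT"},
    {"id": "X2", "keywords": {"mouse", "kinase"}, "sequence": "ACGTACGA"},
    {"id": "X3", "keywords": {"human", "globin"}, "sequence": "TTTTGGGG"},
    {"id": "X4", "keywords": {"human", "kinase"}, "sequence": "ACGTTTTT"},
]

def select_subset(records, required_keywords):
    """Step 1 (keyword selection): keep records carrying every keyword."""
    required = set(required_keywords)
    return [r for r in records if required <= r["keywords"]]

def shared_words(a, b, k=4):
    """Count distinct k-letter words common to both sequences."""
    words_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    words_b = {b[i:i + k] for i in range(len(b) - k + 1)}
    return len(words_a & words_b)

def search_subset(records, required_keywords, query, k=4):
    """Step 2 (similarity search): score only the selected subset."""
    subset = select_subset(records, required_keywords)
    hits = [(r["id"], shared_words(query, r["sequence"], k)) for r in subset]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

print(search_subset(RECORDS, {"human", "kinase"}, "ACGTACGT"))
```

Here the mouse entry X2 would score as well as X1, but it never reaches the similarity step because the keyword selection removes it first.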


Sequence Retrieval System (SRS)
A powerful search tool with links between more than 20 molecular biology databases (EMBL, SwissProt, PIR, PDB, Prosite ...), allowing complex searches.


WWW-Query - sequence data and multivariate analysis
This is a World-Wide Web server for accessing sequence collections indexed with ACNUC and for performing multivariate analyses on sequences. General collections like GenBank or EMBL can be accessed, as well as specialized data banks like Hovergen or NRSub.

Indexing with ACNUC makes it possible to build queries using many criteria to retrieve sequences. Criteria are based on mnemonics, accession numbers, keywords, taxonomic data, bibliographic references, dates of insertion in the bank, the nature of the genome from which a sequence has been obtained, etc. The notion of subsequence introduced in ACNUC also makes it possible to retrieve independently genomic fragments of biological interest such as CDS, tRNAs, rRNAs, snRNAs, etc.

The result of each query is a list of sequences, which is stored temporarily on our server. In this way, a previous list can be reused to build more complex queries or to process a set of sequences. So far, the available processing consists mainly of programs for performing multivariate analyses on CDS or proteins: Principal Component Analysis (PCA), Correspondence Analysis (COA), and Multiple Correspondence Analysis (MCA).


GeneNet
GeneNet is a meta-search system for sequence similarity analysis, designed to help biologists analyse sequences efficiently via the WWW. It also performs periodic searches, which saves biologists from repeatedly analysing the same sequence.

GeneNet can communicate simultaneously with four widely used databases (GenBank at NCBI, PDB, BLOCKS, and KEGG). Protein sequences are searched against all four databases; for DNA sequences, only GenBank analysis is possible.


REBASE The Restriction Enzyme Database
REBASE is a collection of information about restriction enzymes, methylases, the microorganisms from which they have been isolated, recognition sequences, cleavage sites, methylation specificity, the commercial availability of the enzymes, and references, covering both published and unpublished observations.


Multi-Cut - A Data Base of Restriction Endonuclease Buffers
Multi-Cut is a database of restriction endonuclease buffers. It finds compatible buffers for a list of enzymes that you want to use in a multiple restriction endonuclease digest. Multi-Cut searches through activity data from the catalogs of several major restriction endonuclease manufacturers and finds buffers that will work with all of the endonucleases in the reaction.
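A buffer compatibility lookup of this kind might be sketched in Python as below. This is not Multi-Cut's implementation: the enzyme names, buffer names, and activity percentages are invented placeholders, and the 50% cut-off is an assumed default rather than a recommendation.

```python
# Hypothetical activity table: percent activity of each enzyme in each buffer.
ACTIVITY = {
    "EnzymeA": {"Buffer1": 100, "Buffer2": 75, "Buffer3": 10},
    "EnzymeB": {"Buffer1": 50, "Buffer2": 100, "Buffer3": 100},
    "EnzymeC": {"Buffer1": 25, "Buffer2": 100, "Buffer3": 75},
}

def compatible_buffers(activity, enzymes, min_activity=50):
    """Return buffers in which every listed enzyme reaches min_activity,
    best first (ranked by the activity of the weakest enzyme)."""
    if not enzymes:
        return []
    # Only buffers with data for every enzyme can be judged at all.
    candidates = set.intersection(*(set(activity[e]) for e in enzymes))
    scored = []
    for buffer_name in candidates:
        worst = min(activity[e][buffer_name] for e in enzymes)
        if worst >= min_activity:
            scored.append((worst, buffer_name))
    # Highest worst-case activity first; ties broken by buffer name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [buffer_name for _, buffer_name in scored]

print(compatible_buffers(ACTIVITY, ["EnzymeA", "EnzymeB"]))
print(compatible_buffers(ACTIVITY, ["EnzymeA", "EnzymeB", "EnzymeC"]))
```

Ranking by the weakest enzyme reflects the point of a multiple digest: a buffer is only as good as the least active enzyme in it.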


Sequence Tag Alignment and Consensus Knowledgebase (STACK)
STACK aims to build the most comprehensive representation of the sequence of each expressed gene in the human genome.


Codon Usage Database
A query box is provided for searching for an organism's codon usage table, by either Latin name or common name.

Alphabetical lists of all organisms, and of organisms with 100 or more CDSs in GenBank, are also available.
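For readers unfamiliar with the term, a codon usage table can be sketched in Python as below. This only illustrates the general idea of tallying codons across coding sequences; the function names are hypothetical, the example sequences are invented, and the per-thousand scaling is an assumption, not necessarily how this database is compiled.

```python
from collections import Counter

def codon_counts(cds):
    """Count the codons of one coding sequence, read in frame from the
    first base. Trailing bases that do not fill a codon are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))

def codon_usage_per_thousand(cds_list):
    """Pool codon counts over many CDSs and express each codon's
    frequency per thousand codons."""
    total = Counter()
    for cds in cds_list:
        total.update(codon_counts(cds))
    n = sum(total.values())
    if n == 0:
        return {}
    return {codon: 1000.0 * count / n for codon, count in total.items()}

# Two short invented coding sequences.
usage = codon_usage_per_thousand(["ATGGCTGCTTAA", "ATGGCTAAATAA"])
for codon in sorted(usage):
    print(codon, usage[codon])
```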


ImMunoGeneTics Database (IMGT)
IMGT, the international ImMunoGeneTics database, is a high-quality integrated database specialising in Immunoglobulins (Ig), T cell receptors (TcR) and Major Histocompatibility Complex (MHC) molecules of all vertebrate species, created in 1989 by Marie-Paule Lefranc (Université Montpellier II, CNRS). IMGT, a European project since 1992, works in close collaboration with EBI. At present, IMGT includes two databases: IMGT/LIGM-DB, a comprehensive database of Ig and TcR from human and other vertebrates, with translation for fully annotated sequences, and IMGT/HLA-DB, a database of the human MHC, referred to as HLA (Human Leucocyte Antigens). The IMGT server provides common access to all immunogenetics data.


EPD Eukaryotic Promoter Database
The Eukaryotic Promoter Database (EPD) is an annotated non-redundant collection of experimentally characterised eukaryotic POL II promoters.


The Tumor Gene Database
A database of genes associated with tumorigenesis and cellular transformation. This database includes oncogenes, proto-oncogenes, tumor suppressor genes/anti-oncogenes, regulators and substrates of the above, regions believed to contain such genes (such as tumor-associated chromosomal break points and viral integration sites), and other genes and chromosomal regions that seem relevant.


Nucleic Acid Database (NDB) Project
The goal of the Nucleic Acid Database Project is to assemble and distribute structural information about nucleic acids.

Structures may be selected by making choices based on a large variety of structural and experimental characteristics.

The user can then view the structure's coordinates in either NDB or PDB format, view the structure's full NDB entry, view the structure using either a local viewer or the remote viewer (RasMol), or display the structure's atlas entry.


DNA Patents Database
The DNA Patents Database, compiled by the National Academy of Sciences (USA), contains the full text of patents. It is set up to provide the key biological information about each patent: which genes are included, the techniques used in their discovery, and the precise extent of the claims made in each patent.


Molecular Probe Data Base (MPDB)
Contains information on ca. 4000 synthetic oligonucleotides with a sequence of up to 100 nucleotides.


HIV Sequence Database
The HIV Sequence Database focuses on five primary goals:

DB Search does a straightforward search on a large number of database fields. Output comes in the form of GenBank-style sequences.

HIV-MAP allows searching on fewer fields, but can find all sequences that overlap or partially overlap a region, optionally clip out that region from a longer sequence, and even produce an alignment of the selected sequences. This interface allows searches by subtype, country, and accession number or sequence name, and can produce output files in GenBank, FASTA, and IntelliGenetics format.
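The overlap-and-clip step can be illustrated with a short Python sketch. It is not HIV-MAP's code: the records are invented, the `start` and `end` fields are assumed to be 1-based inclusive positions on some reference, and each sequence is assumed to map to that reference without gaps.

```python
# Hypothetical records with 1-based inclusive coordinates on a reference.
RECORDS = [
    {"name": "r1", "start": 1, "end": 10, "sequence": "AAAAACCCCC"},
    {"name": "r2", "start": 6, "end": 15, "sequence": "CCCCCGGGGG"},
    {"name": "r3", "start": 20, "end": 25, "sequence": "TTTTTT"},
]

def overlaps(record, region_start, region_end):
    """True if the record covers at least one position of the region."""
    return record["start"] <= region_end and region_start <= record["end"]

def clip(record, region_start, region_end):
    """Return the part of the record's sequence that lies in the region."""
    lo = max(record["start"], region_start)
    hi = min(record["end"], region_end)
    offset = lo - record["start"]  # 0-based index of the first kept base
    return record["sequence"][offset:offset + (hi - lo + 1)]

def find_overlapping(records, region_start, region_end, clip_region=True):
    """List (name, sequence) for every record touching the region,
    clipped to the region unless clip_region is False."""
    hits = []
    for record in records:
        if overlaps(record, region_start, region_end):
            seq = clip(record, region_start, region_end) if clip_region else record["sequence"]
            hits.append((record["name"], seq))
    return hits

print(find_overlapping(RECORDS, 8, 12))
```

Record r1 only partially overlaps positions 8 to 12, so it contributes just the three bases that fall inside the region.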


Euchromatin Network
The Euchromatin Network is designed to help researchers and others interested in the latest developments in our studies of this most active part of the genome within the cell nucleus.

With increasing accuracy, resolution, and sensitivity, our cell biology methods are revealing new and important information about the role of active euchromatin in the life of the cell, during embryogenesis and cell differentiation, during the hormone response and the immune response, during neoplasia and organ regeneration.

Proteins have been described as the "agents" whereby the cell accomplishes its many metabolic functions. DNA has been described as the "library" whereby the cell stores the structural blueprints for each protein of that individual. RNA is now being recognized as the "spark" whereby the cell activates specific genes of the genome for expression as proteins in the cell. Such "riboregulators" are being recognized in the animal, the plant and even the bacterial world.

Euchromatin is that unique combination of DNA, RNA and proteins which allows this magnificent cellular program within the cell nucleus to proceed with accuracy, safety and flexibility.


Human Tumor Gene Index (hTGI)
The Human Tumor Gene Index (hTGI) has two major goals:


Intron Sequence Information System (ISIS)
This contains information on spliceosomal introns. ISIS contains phylogenetic and protein homology categories, information about individual sequences and various bioinformatics analyses of taxonomical groupings of sequences using non-redundant subsets of the data.


Ares Lab Yeast Intron Database
This site contains information about the spliceosomal introns of the yeast Saccharomyces cerevisiae. This class of introns presents special problems for the annotation and analysis of eukaryotic genome sequences. Splice sites themselves are information-poor, and their recognition by the splicing apparatus is highly context-dependent. At present we do not understand splice site context well enough to predict which potential splice sites will be used, and thus how the genomic sequences will be expressed.


Exon-Intron Database
An exhaustive database of protein-coding intron-containing genes.


Any Comments, Questions? Support@hgmp.mrc.ac.uk