DNA Repeats

This is a collection of sites for finding repeats in DNA sequences.

[info] Dst - Locate human repeats
[info] Tandem Repeats Finder
[info] Large Dot Plots
[info] REPuter - Fast Computation of Maximal Repeats in Complete Genomes
[info] RepeatMasker - mask out repeat sequences
[info] CENSOR - mask out repeat sequences

Detailed information on the above options

Dst - Locate human repeats
Two search modes are available: a search for human repeats using the file BR3X, and a self-homology search that finds repeats more than 2 kb apart. The images produced here after searching the repeat database are simplified versions of those produced by the standalone program.

Tandem Repeats Finder
A tandem repeat in DNA is two or more adjacent, approximate copies of a pattern of nucleotides. Tandem Repeats Finder is a program to locate and display tandem repeats in DNA sequences. To use the program, the user submits a sequence in FASTA format. There is no need to specify the pattern, the size of the pattern or any other parameter. The program's analysis is sent back to the user's web browser as two files: a summary table file and an alignment file. The summary table contains information about each repeat, including its location, size, number of copies and nucleotide content. Clicking on the location indices for one of the table entries opens a second browser window that shows an alignment of the copies against a consensus pattern. The program is very fast, analyzing sequences on the order of 0.5 Mb in just a few seconds. Submitted sequences may be up to 5 Mb in length. Repeats with a pattern size in the range from 1 to 500 bases are detected.
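To make the definition of a tandem repeat concrete, here is a minimal Python sketch that reports runs of adjacent copies of a short pattern. It is a deliberate simplification and not the Tandem Repeats Finder algorithm: it only finds exact copies, whereas the program also detects approximate ones. The function name and the `max_period` and `min_copies` parameters are assumptions made for illustration.

```python
def find_exact_tandem_repeats(seq, max_period=6, min_copies=2):
    """Return (start, period, copies, pattern) tuples for exact tandem repeats.

    Illustrative only: exact matching, short periods, 0-based start positions.
    """
    seq = seq.upper()
    n = len(seq)
    results = []
    for period in range(1, max_period + 1):
        i = 0
        while i + period <= n:
            pattern = seq[i:i + period]
            # Skip patterns that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ACAC"); those are reported at the shorter period.
            if pattern in (pattern + pattern)[1:-1]:
                i += 1
                continue
            # Count how many adjacent copies of the pattern follow position i.
            copies = 1
            while seq[i + copies * period:i + (copies + 1) * period] == pattern:
                copies += 1
            if copies >= min_copies:
                results.append((i, period, copies, pattern))
                i += copies * period  # jump past the whole repeat
            else:
                i += 1
    return results


if __name__ == "__main__":
    for start, period, copies, pattern in find_exact_tandem_repeats(
        "TTACACACGG", max_period=3, min_copies=3
    ):
        print(f"start={start} period={period} copies={copies} pattern={pattern}")
```

Each tuple carries the same kind of information as a row of the summary table described above (location, pattern size and number of copies), although the real table format is not reproduced here.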

Large Dot Plots
This page accesses a very fast dot plot algorithm designed for large DNA sequences. This demonstration only allows a sequence to be compared with itself. The algorithm uses a word size of eight and remaps the matches onto a 500x500 grid. As the query becomes very large, it is necessary to set a threshold for plotting (the cutoff score below). If the cutoff is not raised, an interesting effect is observed: the plot is largely black, and the only light areas are regions of unusual sequence organization like the ones found by Zinfo.

The default output is currently a PostScript file that will be displayed externally to the WWW browser. The GIF conversion is not currently producing a high-quality image. This program is derived from an X-windows based interactive version that saves the internal mapping so that changes in the cutoff score can be displayed rapidly.
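The word matching, grid remapping and cutoff described for the dot plot can be sketched in a few lines of Python. This is an assumed reconstruction for illustration, not the code behind the page: the function names are hypothetical, and the cutoff is interpreted here as a minimum number of word matches per grid cell, which the description does not state explicitly.

```python
from collections import defaultdict


def dot_plot_grid(seq, word_size=8, grid_size=500):
    """Count word matches of a sequence against itself on a coarse grid."""
    seq = seq.upper()
    n_words = len(seq) - word_size + 1
    grid = [[0] * grid_size for _ in range(grid_size)]
    if n_words <= 0:
        return grid

    # Index every word (k-mer) by the positions at which it occurs.
    positions = defaultdict(list)
    for i in range(n_words):
        positions[seq[i:i + word_size]].append(i)

    # Every pair of positions sharing a word is a match; remap it from
    # sequence coordinates onto the grid and count it in that cell.
    for hits in positions.values():
        for i in hits:
            row = i * grid_size // n_words
            for j in hits:
                col = j * grid_size // n_words
                grid[row][col] += 1
    return grid


def apply_cutoff(grid, cutoff):
    """Return a boolean grid: True where a cell would be plotted."""
    return [[count >= cutoff for count in row] for row in grid]


if __name__ == "__main__":
    grid = dot_plot_grid("ACGTACGT", word_size=2, grid_size=4)
    for row in apply_cutoff(grid, cutoff=2):
        print("".join("#" if cell else "." for cell in row))
```

Because the grid has a fixed size, a longer query packs more matches into each cell, which is one way to see why a low cutoff turns the plot largely black. Keeping the raw counts in `grid` and thresholding separately also mirrors the idea of saving the internal mapping so that a new cutoff can be displayed without recomputing the matches.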

REPuter - Fast Computation of Maximal Repeats in Complete Genomes
REPuter computes all maximal duplications, as well as reverse, complemented and reverse complemented repeats, in a DNA input sequence.
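The four repeat types differ only in how the second copy relates to the first. The following Python sketch shows the transformations and a brute-force search for fixed-length repeats of each type. It is an illustration under stated assumptions, not REPuter's method: it does not extend matches to maximal length, and the names `transforms` and `find_repeats` are hypothetical.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def transforms(word):
    """Return the four versions of a word, keyed by repeat type."""
    complemented = word.translate(COMPLEMENT)
    return {
        "direct": word,
        "reverse": word[::-1],
        "complemented": complemented,
        "reverse complemented": complemented[::-1],
    }


def find_repeats(seq, length):
    """Return (kind, i, j) with i < j for fixed-length repeats of each type."""
    seq = seq.upper()
    n_words = len(seq) - length + 1

    # Index every substring of the given length by its start positions.
    index = {}
    for i in range(n_words):
        index.setdefault(seq[i:i + length], []).append(i)

    found = []
    for i in range(n_words):
        for kind, target in transforms(seq[i:i + length]).items():
            for j in index.get(target, []):
                if j > i:  # report each pair once, never a word with itself
                    found.append((kind, i, j))
    return found


if __name__ == "__main__":
    for kind, i, j in find_repeats("AACGCGTT", 4):
        print(kind, i, j)
```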

RepeatMasker - mask out repeat sequences
RepeatMasker screens DNA sequences in FASTA format against a library of repetitive elements and returns a masked query sequence ready for database searches, as well as a table annotating the masked regions.
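The two outputs described above, a masked sequence and an annotation table, can be illustrated with a small Python sketch. It is only a stand-in: it uses exact substring matching against a tiny made-up library, which is an assumption for illustration and not how RepeatMasker compares sequences with its library. The function name, the library entry and the choice of `N` as the masking character are likewise assumptions.

```python
def mask_repeats(seq, library):
    """Mask exact occurrences of library elements with 'N'.

    Returns the masked sequence and a table of (name, start, end) rows,
    using 0-based, end-exclusive coordinates.
    """
    seq = seq.upper()
    masked = list(seq)
    table = []
    for name, element in library.items():
        element = element.upper()
        if not element:
            continue  # an empty element would match everywhere
        start = seq.find(element)
        while start != -1:
            end = start + len(element)
            masked[start:end] = "N" * len(element)
            table.append((name, start, end))
            # Continue from the next position so overlapping hits are found.
            start = seq.find(element, start + 1)
    table.sort(key=lambda row: row[1])
    return "".join(masked), table


if __name__ == "__main__":
    # Hypothetical one-entry library, for demonstration only.
    library = {"hypothetical_AC3": "ACACAC"}
    masked, table = mask_repeats("GGACACACTT", library)
    print(masked)
    for row in table:
        print(row)
```

Replacing repeats with `N` leaves the sequence length and coordinates unchanged, so hits from a later database search still map back to the original sequence.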

CENSOR - mask out repeat sequences
