DNA sequences in eukaryotes are classified according to the number of copies present in an average genome, which can range from one to more than ten thousand. The three main types are highly-repetitive, middle-repetitive, and single-copy sequences. Highly-repetitive DNA can be further classified as satellite, minisatellite, or microsatellite depending on the size of the repeat units. Middle-repetitive DNA includes the mobile transposable elements. Single-copy DNA – which can have up to nine copies – carries most of the genome’s functional information.
Highly-repetitive DNA (HR-DNA) sequences occur 10 000 or more times per genome. In mammals and insects they are concentrated at the centromeres; in plants the sequences are interstitial and telomeric. HR-DNA sequences range from hundreds of kbp to several Mbp in length and contribute to the inertness of heterochromatin. The individual repeats contained in highly-repetitive DNA can be arranged in clusters, in blocks of repeat units, in long tandem arrays, or distributed as satellite DNA. Highly-repetitive DNA is not transcribed and does not contain any genes.
Satellite HR-DNA consists of repeat units ranging from microsatellite (1-4 bp) through minisatellite (up to 64 bp) to satellite (5-171 bp). Some of it is concentrated in certain regions of the genome, such as minisatellite DNA in the telomeres or satellite DNA in the centromeres. Microsatellite DNA is interspersed throughout the genome, as is some minisatellite DNA.
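To illustrate what a microsatellite looks like at the sequence level, the following minimal Python sketch scans a DNA string for tandem repeats whose unit is 1 to 4 bp long, the microsatellite range given above. The function name `find_microsatellites`, the `min_repeats` cut-off of three copies, and the example sequence are assumptions made for illustration and are not taken from any standard tool.

```python
import re


def find_microsatellites(seq, max_unit=4, min_repeats=3):
    """Return (start, unit, copies) for each tandem repeat found in seq.

    max_unit    -- longest repeat unit considered (4 bp, per the text)
    min_repeats -- hypothetical cut-off: fewest consecutive copies reported
    """
    # ([ACGT]{1,n}?) captures the shortest candidate unit; \1{k,} then
    # requires at least k further back-to-back copies of that same unit.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Illustrative sequence: five tandem copies of the dinucleotide CA.
print(find_microsatellites("GGCACACACACATT"))  # [(2, 'CA', 5)]
```

The lazy quantifier makes the search prefer the shortest unit, so a run such as AAAAAA is reported as six copies of A rather than three copies of AA.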
Most middle-repetitive DNA (MR-DNA) has between 10 and 1000 copies per genome. Clustered MR-DNA sequences are transcribed by RNA pol I or III and include ribosomal RNA, transfer RNA, and histone genes. Middle-repetitive DNA also includes genes coding for proteins such as actin and myosin that are required in large amounts by cells. For these genes, the high transcript output comes from the large number of gene copies rather than from a high rate of transcription, so these sequences require a high level of redundancy. Interspersed MR-DNA sequences include transposable elements (TEs), which have evolved the ability to move around within the genome. MR-DNA constitutes from 5 to 80% of the eukaryote genome; around 44% of the human genome consists of TEs and their derivatives.
Repetitive DNA arises through several different mechanisms. Duplication can take place in non-coding regions of the genome. Repeats can also be amplified by unequal crossing-over between chromosomes or between sister chromatids, or by replication slippage.
Unique or single-copy DNA is present 10 or fewer times per genome. In eukaryotes it makes up between 30 and 85% of the genome. Single-copy DNA holds a large portion of the organism’s genetic information, including protein-coding genes and regulators of gene expression. Most protein-coding genes are single-copy.
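The copy-number thresholds given in this article can be summarised in a short Python sketch. The function name `classify_by_copy_number` is hypothetical. Two choices in it are assumptions rather than statements from the text: a count of exactly 10 is assigned to the single-copy class, and counts between 1000 and 10 000, which the stated ranges do not cover, are returned as unclassified.

```python
def classify_by_copy_number(copies):
    """Map a per-genome copy number to the sequence class named in the text."""
    if copies < 1:
        raise ValueError("copy number must be at least 1")
    if copies <= 10:
        # Assumption: the boundary value 10 is treated as single-copy.
        return "single-copy"
    if copies <= 1000:
        return "middle-repetitive"
    if copies >= 10_000:
        return "highly-repetitive"
    # Assumption: the text assigns no class to counts between 1000 and 10 000.
    return "unclassified"


for n in (1, 500, 5000, 50_000):
    print(n, classify_by_copy_number(n))
```

Because the article says only that most middle-repetitive DNA falls in the 10 to 1000 range, the upper cut-off should be read as a rough guide rather than a strict boundary.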