Duplication can affect segments of DNA ranging from individual structural or functional domains to entire genomes. Whatever the size of the duplicated region, it can contribute to evolution: genes (and therefore proteins) can be extended, copied genes can diverge to serve new purposes, and entire organisms can become polyploid.
Internal gene duplication elongates a gene by duplicating structural and/or functional domains. The result can be dosage repetition (as in ubiquitin), structural extension (as in collagen), or domain divergence (as in the immunoglobulins). Internal gene duplication arises from unequal crossing-over between chromosomes or chromatids, crossovers within exons, or replication slippage.
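As a rough illustration of how unequal crossing-over can elongate a gene, the sketch below treats a gene as an ordered list of domain labels and recombines two copies at different positions. The function name, the domain labels, and the list representation are hypothetical simplifications chosen for this example; real recombination acts on nucleotide sequence, not on tidy domain units.

```python
def unequal_crossover(chromatid_a, chromatid_b, break_a, break_b):
    """Recombine two chromatids, each given as a list of domain labels.

    break_a and break_b are the break positions on each chromatid.
    When they differ, the chromatids were misaligned, and the two
    products carry a duplication and a deletion respectively.
    """
    product_1 = chromatid_a[:break_a] + chromatid_b[break_b:]
    product_2 = chromatid_b[:break_b] + chromatid_a[break_a:]
    return product_1, product_2


# Hypothetical gene with three domains, present on both chromatids.
gene = ["A", "B", "C"]

# Misaligned exchange: break after domain B on one copy, before B on the other.
elongated, shortened = unequal_crossover(gene, gene, 2, 1)
print(elongated)  # ['A', 'B', 'B', 'C']
print(shortened)  # ['A', 'C']
```

One product carries an extra copy of domain B while the reciprocal product has lost it; an aligned exchange (equal break positions) leaves both copies unchanged.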
The product of a single-gene duplication may or may not retain its original function, or it may gain a new one, leading to the formation of a gene family. Single-gene duplications arise from unequal crossovers between chromosomes or chromatids, or from integration of cDNA into the genome. Gene duplication speeds the evolution of the genome by providing a spare copy that is under little or no selective pressure to conserve its function. The spare copy is free to accumulate mutations and may either diverge to fulfill a new function or deteriorate into a pseudogene.
Another feature of eukaryote genome evolution is the duplication of non-coding regions. This has increased the size of the genome and of the non-coding spaces between genes and within them (introns). Sub-genomic duplication is largely due to the activity of transposable elements such as LINEs (for example L1) and SINEs (for example Alu).
Small segments of chromosomes can be duplicated in duplicative transposition events, where a section of chromosome is copied and inserted into another chromosome.
The entire genome is thought to have been duplicated twice during chordate evolution. The number of rounds can be inferred from the number of HomC complexes in each lineage. On this view, the first round occurred between the Cephalochordata and the Agnatha, and the second between the Chondrichthyes and the Osteichthyes. Further evidence comes from the homeobox and associated genes, which are often found in quadruplicate. In general, higher vertebrates have more gene family members on more chromosomes than lower vertebrates, which in turn have more gene family members on more chromosomes than invertebrates. Although chromosomes have not remained intact through evolution, groups of linked genes have (conserved synteny).
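The inference from complex counts is simple doubling arithmetic, sketched below. It assumes that each whole-genome duplication exactly doubles the number of complexes and that none are lost or gained afterwards, which is an idealisation. The function name is hypothetical, and the counts of one ancestral complex and four in a vertebrate are illustrative inputs consistent with the quadruplicate pattern described above.

```python
def duplication_rounds(ancestral_complexes, observed_complexes):
    """Return how many doublings turn the ancestral count into the observed one.

    Assumes every round doubles the count and no complexes are lost
    afterwards; raises ValueError if the counts do not fit that model.
    """
    if ancestral_complexes <= 0 or observed_complexes < ancestral_complexes:
        raise ValueError("counts must be positive and non-decreasing")
    rounds = 0
    count = ancestral_complexes
    while count < observed_complexes:
        count *= 2
        rounds += 1
    if count != observed_complexes:
        raise ValueError("observed count is not a pure doubling of the ancestral count")
    return rounds


# Illustrative counts: one complex before duplication, four afterwards.
rounds = duplication_rounds(1, 4)
print(rounds)  # 2
```

A count that is not a power-of-two multiple of the ancestral number, such as three complexes, would indicate loss or independent duplication rather than clean whole-genome doubling, so the sketch rejects it instead of guessing.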
In plants, genome duplication is an efficient way of increasing genome size. The resulting cell can remain tetraploid or undergo chromosome divergence, returning to a diploid state but with double the original number of chromosomes. In vertebrates, genome duplication has produced paralogous chromosome segments: clusters of genes on different chromosomes that resemble each other more closely than they resemble their neighbouring genes.
From gene segments to whole genomes, there is a wide variety of possible duplication events. While some produce nothing more than pseudogenes, others may significantly affect the structure and function of part or all of the genome.