
Repeat after me: what are repeat expansion disorders?

We learn how repeating sequences in our DNA can impact on health and how genome sequencing can make a diagnostic difference

Repeat expansion disorders are a group of over 40 inherited neurological conditions that collectively affect one in every 3,000 people.

Much of the human genome is composed of repeating sequences. Often, these sections are three base pairs (DNA ‘letters’) long, but larger repeating sequences are also seen.

In many of these repetitive regions, the number of repeats varies between people, and this does not cause any problems. In some locations, however, a high number of repeats can cause a repeat expansion disorder.
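To make the idea of a "number of repeats" concrete, here is a minimal Python sketch that counts the longest uninterrupted run of a repeating unit in a DNA string. The function name and the toy sequences are hypothetical and chosen only for illustration; real repeat genotyping relies on specialised bioinformatics tools rather than a simple pattern match.

```python
import re


def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    # Match one or more consecutive copies of the motif, e.g. CAGCAGCAG...
    pattern = re.compile(f"(?:{re.escape(motif.upper())})+")
    runs = [len(match.group(0)) // len(motif)
            for match in pattern.finditer(sequence.upper())]
    return max(runs, default=0)


# Hypothetical toy sequences, not real genomic data.
person_a = "GGT" + "CAG" * 12 + "CCA"
person_b = "GGT" + "CAG" * 45 + "CCA"

print(longest_repeat_run(person_a, "CAG"))
print(longest_repeat_run(person_b, "CAG"))
```

The two toy sequences differ only in how many times CAG is repeated, which is the kind of variation between people described above.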

Repeats and complications

The best-known repeat expansion disorder is Huntington disease, a rare condition caused by a high number of repeats of the CAG sequence in the Huntingtin gene.

Most people have between 10 and 35 repeats of this sequence and are unaffected. People with 40 or more repeats always develop the disease, and people with between 36 and 39 may or may not.
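The ranges above can be written out as a small Python sketch. The function name and the wording of the labels are our own, and the treatment of counts below 10 (which the ranges above do not cover) is an assumption. It is an illustration of the thresholds only, not a diagnostic tool.

```python
def classify_htt_cag(repeats: int) -> str:
    """Map a CAG repeat count in the Huntingtin gene to the ranges quoted in this article."""
    if repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if repeats >= 40:
        return "develops the disease"
    if repeats >= 36:
        return "may or may not develop the disease"
    # Assumption: counts below 10 are treated like the usual 10 to 35 range.
    return "unaffected"


for count in (20, 37, 42):
    print(count, "->", classify_htt_cag(count))
```

Writing the ranges this way makes the two boundaries explicit: 36 is the lowest count in the uncertain range, and 40 is the lowest count at which the disease always develops.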

In Huntington disease, the CAG repeats result in the production of an abnormal Huntingtin protein which disrupts cell functions in the neurons, but in some other repeat expansion disorders, the repeats appear in regions of the genome that do not code for proteins.

For example, fragile X syndrome is caused by a CGG repeating sequence in a non-coding part of a gene called FMR1 on the X chromosome. Because the repeats lie outside the protein-coding sequence, they do not produce a harmful protein. Instead, they cause the gene to be epigenetically silenced by methylation, so the corresponding protein is not made.

The symptoms of the disorder, which include intellectual disability, are caused by a shortage of FMRP, the protein made from the FMR1 gene, which helps form connections between neurons.


Although they are genetic diseases, repeat expansion disorders can seem to appear spontaneously. This is because stretches of the genome with a high number of repeats are more prone to becoming unstable.

In people with a higher number of repeats, extra repeats can be added during cell division. If this happens when sperm or eggs are formed, the number of repeats passed on can cross the threshold, causing the child to be affected by a repeat expansion disorder.

For example, normally, there are between 5 and 40 CGG repeats in the FMR1 gene, but the methylation that causes fragile X syndrome is not triggered until more than 200 repeats are present.

People with more than 40 repeats, but not enough to trigger methylation, are unaffected by fragile X syndrome but are classed as having a ‘premutation’. Women with this premutation are known to be at increased risk of having an affected child.
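As a rough illustration, the three FMR1 bands described above can be looked up from a short table of cut-offs. The names below are hypothetical, and the boundaries are simply the repeat counts quoted in this article; cut-offs used in clinical practice may be drawn differently.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first two bands, taken from the counts quoted above.
FMR1_UPPER_BOUNDS = [40, 200]
FMR1_LABELS = ["typical range", "premutation", "full mutation"]


def classify_fmr1_cgg(repeats: int) -> str:
    """Return the band that a CGG repeat count in FMR1 falls into."""
    if repeats < 0:
        raise ValueError("repeat count cannot be negative")
    # bisect_left counts how many upper bounds lie strictly below `repeats`,
    # which is the index of the band the count belongs to.
    return FMR1_LABELS[bisect_left(FMR1_UPPER_BOUNDS, repeats)]


for count in (30, 90, 250):
    print(count, "->", classify_fmr1_cgg(count))
```

Keeping the cut-offs in one table means they can be changed in a single place without touching the lookup logic.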

The diagnostic challenge

Repeat expansion disorders can be difficult to diagnose for several reasons: they are rare and can produce very diverse symptoms, and as described above, there may not be a family history of the condition due to accumulation of repeats over generations.

Historically, some conditions have also been difficult to detect with genome sequencing, because next-generation sequencing – the type used in the 100,000 Genomes Project – was not very effective at accessing loci containing repeat expansions.

New research published in the Lancet shows that whole genome sequencing is now a viable option for diagnosing these disorders because of improved bioinformatics systems that specifically look for repeats.

Another new type of sequencing called long-read sequencing is also useful for identifying repeating sequences. Because a single long read can cover an entire repetitive region, it is less error-prone for this purpose: when smaller reads are reassembled, as in next-generation sequencing, repeats may be overlooked or duplicated.
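The toy Python sketch below shows why read length matters. It builds two made-up versions of the same region, one with 20 CAG repeats and one with 50, and collects every possible error-free read of a given length from each. The flanking sequences, read lengths and function names are all hypothetical, and real sequencing reads are longer and noisier than this.

```python
def reads_of_length(sequence: str, read_length: int) -> set[str]:
    """Return every distinct error-free read of `read_length` bases from `sequence`."""
    return {sequence[i:i + read_length]
            for i in range(len(sequence) - read_length + 1)}


# Hypothetical 20-base flanking sequences, not real genomic data.
LEFT_FLANK = "ACGTTGCAAGGCTTAACGGA"
RIGHT_FLANK = "TTGACCGATAGGCTAGCATC"


def make_locus(repeat_count: int) -> str:
    """Build a toy region: left flank, CAG repeats, right flank."""
    return LEFT_FLANK + "CAG" * repeat_count + RIGHT_FLANK


short_allele = make_locus(20)   # 100 bases in total
long_allele = make_locus(50)    # 190 bases in total

# Reads shorter than the repeat tract look the same for both versions.
short_reads_identical = (
    reads_of_length(short_allele, 30) == reads_of_length(long_allele, 30)
)

# Reads long enough to span the shorter tract and both flanks tell them apart.
long_reads_identical = (
    reads_of_length(short_allele, 100) == reads_of_length(long_allele, 100)
)

print("30-base reads identical: ", short_reads_identical)
print("100-base reads identical:", long_reads_identical)
```

In this toy case the 30-base reads carry no information about how many repeats are present, because no single read reaches from one flank to the other. That is the gap that dedicated repeat-finding software and longer reads each help to close.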

For more information on repeating sequences in the genome and their impact on health, read our article on copy number variants.

Please note: This article is for informational or educational purposes, and does not substitute professional medical advice.