What are copy number variants?

Everyone has unique variants in their genome – and many are harmless – but which types can have an impact on our health and how do they occur?

Large parts of the human genome are made up of repeating sequences. Some are short sequences of two or three base pairs (DNA ‘letters’) repeated tens or hundreds of times; others can include duplications of whole genes or larger sections of chromosomes. Copy number variants (CNVs) are where the number of repeats varies between individuals, and may account for almost 10% of an individual’s genome.

Many of these variants appear to have no effect on health, but some are associated with disease, or can have other clinically relevant effects. Two types of CNVs – trinucleotide repeats and whole gene duplications – can have a particularly large impact on the health of affected individuals.

Trinucleotide repeats and disease

One of the best-known examples of a disease-causing CNV is Huntington disease, which is caused by a repeating sequence of three base pairs (known as a trinucleotide repeat) near the start of the coding region of the HTT gene.

In unaffected people, a ‘CAG’ sequence is repeated between 10 and 35 times, but people with 40 or more repeats will be affected by Huntington disease. Large numbers of repeats are possible (exceeding 200), and higher numbers of repeats – especially over 60 – are associated with earlier disease onset. The disease shows reduced penetrance in people with between 36 and 39 repeats.
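The repeat ranges above amount to a simple lookup on the repeat count. The following is a minimal sketch in Python, not a clinical tool. The function names `longest_cag_run` and `classify_htt_repeats` and the toy sequence are hypothetical, made up for illustration. Treating counts below 10 as unaffected is an assumption that the text does not state, and sizing repeats from real sequencing data is considerably more involved than this.

```python
import re


def longest_cag_run(sequence: str) -> int:
    """Return the longest uninterrupted run of 'CAG', counted in repeat units."""
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_htt_repeats(repeat_count: int) -> str:
    """Map a CAG repeat count to the categories described in the text."""
    if repeat_count < 0:
        raise ValueError("repeat count cannot be negative")
    if repeat_count >= 40:
        return "affected"
    if repeat_count >= 36:
        return "reduced penetrance"
    # Assumption: anything below 36 is treated as unaffected,
    # although the text only describes 10 to 35 repeats.
    return "unaffected"


# Toy sequence (not real HTT sequence): 42 CAG units with arbitrary flanks.
toy_sequence = "ATG" + "CAG" * 42 + "CCGCCA"
count = longest_cag_run(toy_sequence)
print(count, classify_htt_repeats(count))  # prints: 42 affected
```

The sketch only counts perfect, uninterrupted repeats, which is a simplification chosen to keep the example short.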

Trinucleotide repeats are also responsible for Fragile X syndrome and its associated conditions (known as ‘premutation’ disorders), which occur due to a ‘CGG’ repeat in a non-coding region of the FMR1 gene.

Unaffected individuals will have between 5 and 40 repeats, but health conditions appear in individuals with a greater number of repeats. Those with between 55 and 200 repeats have a premutation, and are at risk of a variety of clinical disorders, including neurodevelopmental problems, psychiatric disorders and premature menopause. Those with over 200 repeats have Fragile X syndrome. This quantity of repeats leads to the gene’s promoter becoming inactivated and the gene switching off, even though the coding regions are unaffected.
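The CGG ranges described above can be written as a small lookup table. This is a sketch only: the names `FMR1_CGG_RANGES` and `classify_fmr1_repeats` are hypothetical, and the table contains just the ranges given in the text. Counts that fall outside those ranges (for example, 41 to 54) are reported as not covered rather than guessed at.

```python
from typing import Optional

# (lowest count, highest count or None for no upper limit, category)
# Ranges are taken directly from the text; nothing else is assumed.
FMR1_CGG_RANGES: list[tuple[int, Optional[int], str]] = [
    (5, 40, "unaffected"),
    (55, 200, "premutation"),
    (201, None, "Fragile X syndrome"),
]


def classify_fmr1_repeats(repeat_count: int) -> str:
    """Return the category for a CGG repeat count, using the ranges in the text."""
    for low, high, label in FMR1_CGG_RANGES:
        if repeat_count >= low and (high is None or repeat_count <= high):
            return label
    return "not covered by the ranges in the text"


for example_count in (30, 45, 100, 250):
    print(example_count, classify_fmr1_repeats(example_count))
```

Keeping the ranges in a table rather than in chained conditions makes the gaps between categories explicit and easy to spot.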

Whole gene duplications: implications for healthcare

CNVs can also involve the duplication of whole genes within a genome. One example is the CYP2D6 gene, which codes for a cytochrome P450 enzyme in humans. This P450 enzyme is important in breaking down substances not produced by the body, including medications.

As well as the presence of several different alleles that can cause variation in P450 activity, some individuals also have duplications of this gene, which can make them ultra-high metabolisers (often termed ultrarapid metabolisers). This can have significant clinical impact, as these individuals will respond to drug doses very differently from others in the population.

In many cases, ultra-high metabolisers will break down drugs more quickly, and so a normal dosage will be ineffective for the patient. P450 is involved in the metabolism of several antidepressants and antipsychotics, making this an area of interest within psychiatric prescribing.

Some drugs such as codeine, however, take advantage of P450. These are compounds (known as pro-drugs) which are transformed by the body into the active form – in codeine’s case, into morphine. For ultra-high metabolisers, this effect is increased, so a patient could end up with a much greater concentration of morphine in the bloodstream than the prescriber intended.

To learn more about the types of genomic variants that can affect human health, read our explainer article, ‘Various types of variant: What is genomic variation?’