
Molecular Evolution of the SARS-CoV-2 Virus (Part 2)

I discussed, in another post, that the SARS-CoV-2 sequences published in NCBI GenBank are mostly incomplete. It is not possible, at this point in time, to pinpoint deletions that might be part of the natural evolution of the virus, as happened with SARS-CoV-1.

However, it is possible to study the evolution of SARS-CoV-2 over the past months by counting point mutations, or Single Nucleotide Variants (SNVs), in each published sequence with respect to the original virus sequence, published in GenBank in January 2020. If the virus is mutating over time, it should be accumulating mutations, getting weaker and weaker, as happens with the common cold and common flu viruses.

I used a process similar to the one used in looking for deletions; the process is described in another post. The general idea is to call out SNVs, while ignoring deletions, as it is not possible to tell genuine deletions from sequencing gaps.

Figure 1: The daily average number of SNVs per viral genome, by virus sample collection date.

In Figure 1, each SARS-CoV-2 viral genome that has been sequenced in the US is aligned to the oldest SARS-CoV-2 genome available in NCBI GenBank; point mutations are summed up, while deletions and sequencing gaps are ignored. For every viral genome collection date, the average number of point mutations is plotted. This approach somewhat underestimates the true number of mutations, because sequencing gaps and deletions are ignored.
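To make the counting concrete, here is a minimal Python sketch of the idea. It is not the actual pipeline behind Figure 1. It assumes each genome has already been pairwise-aligned to the reference, so the two strings have equal length and use `-` for gaps. The function names `count_snvs` and `daily_mean_snvs` are made up for illustration, and treating `N` and other ambiguity codes like gaps is an assumption.

```python
from collections import defaultdict
from datetime import date

# Only unambiguous bases are compared; gaps ('-'), 'N' and other
# ambiguity codes are skipped (an assumption of this sketch).
BASES = set("ACGT")


def count_snvs(ref_aln: str, sample_aln: str) -> int:
    """Count point mutations between two pre-aligned sequences.

    Positions where either sequence has a gap or an ambiguous base
    are ignored, so deletions and sequencing gaps never count.
    """
    if len(ref_aln) != len(sample_aln):
        raise ValueError("aligned sequences must have the same length")
    return sum(
        1
        for r, s in zip(ref_aln.upper(), sample_aln.upper())
        if r in BASES and s in BASES and r != s
    )


def daily_mean_snvs(records):
    """Average SNV count per collection date.

    `records` is an iterable of (collection_date, snv_count) pairs.
    Returns a dict mapping each date to its mean, in date order.
    """
    totals = defaultdict(lambda: [0, 0])  # date -> [sum, genome count]
    for day, snvs in records:
        totals[day][0] += snvs
        totals[day][1] += 1
    return {day: total / n for day, (total, n) in sorted(totals.items())}


# Toy example with invented sequences: one mismatch (A -> T),
# one gap and one 'N' that are both skipped.
toy_count = count_snvs("ACGTACGTAC", "ACGTTCG-NC")
```

Because gapped positions are skipped rather than counted, a genome with many sequencing gaps can only report fewer SNVs than it really has, which is where the underestimate comes from.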

Figure 1 shows that SARS-CoV-2 has been accumulating mutations since its appearance on US soil. The accumulation has been steady and has continued into recent months at a roughly constant rate.
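One simple way to put a number on "constant rate" is to fit a straight line to the daily averages and read off the slope as SNVs gained per day. The sketch below does an ordinary least-squares fit in plain Python. The function name `fit_mutation_rate` and the data points are invented for illustration and are not taken from Figure 1.

```python
from datetime import date


def fit_mutation_rate(daily_means):
    """Least-squares line through (days since first date, mean SNVs).

    `daily_means` maps collection dates to average SNV counts.
    Returns (slope, intercept); the slope is in SNVs per day.
    """
    days = sorted(daily_means)
    if len(days) < 2:
        raise ValueError("need at least two collection dates")
    first = days[0]
    xs = [(d - first).days for d in days]
    ys = [daily_means[d] for d in days]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented data points lying exactly on a line: +1 SNV every 10 days.
example = {
    date(2020, 3, 1): 2.0,
    date(2020, 3, 11): 3.0,
    date(2020, 3, 21): 4.0,
}
slope, intercept = fit_mutation_rate(example)
```

Fitting the line separately to different date ranges is one way to check whether the rate really stays constant or differs between periods.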

It is interesting to note that from March through May the mutation rate appears to be higher. One interesting hypothesis is that the mutation rate is somewhat proportional to the number of hosts visited by a virus. March through May was the “first wave” of infections. At the time of this writing, we are in the “second wave”. As more genomes are published in GenBank, it will be interesting to see whether the mutation rate picks up again during the second wave. A second intriguing hypothesis is that a more lethal virus has a higher mutation rate than its less lethal evolved version; as a virus gets weaker and weaker, its mutation rate slows down as it visits more and more hosts. Both of these hypotheses need work to be proven or disproved.

The interesting fact here is that the virus is mutating over time, and the increase in mutation rate seems to coincide with the reduction in Case Fatality Rate we observed in another post.

RNA virus mutation rates are so damn high that it should not be surprising that SARS-CoV-2 is accumulating mutations over time. It also fits the natural history of respiratory viruses, which become less infectious after winter’s end. Is the disappearance of common cold and flu viruses a consequence of the accumulation of mutations? It is certainly a fascinating hypothesis. The Spanish flu virus disappeared into thin air as well [1]. Nobody has ever explained why and how.

Perhaps SARS-CoV-2 is not mutating as fast as we would like. But its mutation rate surely opens the door to some wishful thinking that, with the virus getting weaker and vaccines becoming available in 2021, the pandemic might be over soon. How soon? See this other post for a hint.

  1. Gina Kolata, Flu: The Story of the Great Influenza Pandemic of 1918 and the Search for the Virus That Caused It, Atria Books, 2001
Last modified: November 28, 2020