In a typical year, the surveillance team at the University of Washington's clinical virology lab runs about 50,000 tests to identify viruses. Since the first COVID-19 case hit Seattle, where the lab is based, it has done about 2 million. "Forty years of testing, in one year," says its assistant director, Alex Greninger.

That lab is also one of many, spread across state, private, and university facilities, that are reading the viral genomes of positive test samples to look for worrisome changes in the virus. The importance of that search became more obvious in December, with the reporting of the first "variant of concern," B.1.1.7, out of the United Kingdom. It has mutations that let it spread more easily than the original SARS-CoV-2 coronavirus.

The rise of that variant, along with B.1.351 from South Africa and P.1 from Brazil, was among the factors leading to a renewed focus on surveillance by sequencing, that is, cataloging the order of chemical subunits of the virus's genetic material. The Biden administration has pledged almost $200 million to boost the sequencing effort, and Congress approved a $1.75 billion infusion for a program of the Centers for Disease Control and Prevention that includes sequencing.

Sequencing in the U.S. has been stymied by decentralized health providers and payers, a slow ramp-up in testing, and underfunding. The situation is improving, and the CDC and its partners are now reporting well over the agency's initial goal of 7,000 sequences per week. However, that's still far less than 5 percent of new cases, which some experts see as a good benchmark for genomic surveillance.
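To put the two numbers above side by side, here is a minimal arithmetic sketch in Python. The function names are illustrative and not taken from any CDC tool; the only figures used are the 7,000-sequences-per-week goal and the 5 percent benchmark quoted in the text.

```python
def coverage_percent(sequences_per_week: int, new_cases_per_week: int) -> float:
    """Percentage of a week's new cases that were sequenced."""
    if new_cases_per_week <= 0:
        raise ValueError("new_cases_per_week must be positive")
    return sequences_per_week * 100 / new_cases_per_week


def max_cases_at_benchmark(sequences_per_week: int, benchmark_percent: int = 5) -> float:
    """Largest weekly case count at which this sequencing volume still meets the benchmark."""
    return sequences_per_week * 100 / benchmark_percent


# The initial goal of 7,000 sequences per week reaches the 5 percent
# benchmark only if weekly new cases stay at or below this number.
print(max_cases_at_benchmark(7_000))  # 140000.0
```

In other words, 7,000 sequences a week meets the 5 percent mark only when new cases number 140,000 a week or fewer; any week with more cases than that falls short unless sequencing volume rises with it.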

How does sequencing help against COVID-19?

Reading the SARS-CoV-2 genome is a key part of surveillance (which also includes testing, tracking cases, and contact tracing). Once a genetic sample from someone's nose or throat is confirmed to be COVID-19-positive, scientists take that sample — a DNA copy of the virus's RNA-based genome — and sequence all of it. They chop the genetic material into bits and use machines to read the sequence of genetic letters (the chemical bases known for short as A, C, T, and G) contained in those bits. They can figure out the whole viral genome from the overlapping pieces.
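The idea of rebuilding a genome from overlapping pieces can be illustrated with a toy example. The Python sketch below uses a simple greedy strategy: find the two fragments that overlap the most, merge them, and repeat. It is only an illustration under strong assumptions (error-free reads, a made-up 14-letter "genome," hypothetical function names); the software labs actually use is far more elaborate.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> str:
    """Merge the best-overlapping pair of reads until no overlaps remain.

    Returns the longest assembled piece.
    """
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads that sit entirely inside another read.
    reads = [r for r in reads if not any(r != s and r in s for s in reads)]
    while len(reads) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to join
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return max(reads, key=len)


# Three made-up fragments that overlap by four letters each.
fragments = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]
print(greedy_assemble(fragments))  # ATGCGTACGTTAGC
```

The greedy choice keeps the sketch short, but it can be fooled by repeated stretches of sequence, which is one reason real assembly tools rely on more careful methods.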

The purpose of sequencing depends, in part, on the progress of the outbreak, says Steve Schaffner, a computational biologist at the Broad Institute of MIT and Harvard University. "In the very beginning, you need to know what it is that's infecting people," says Schaffner, who coauthored a summary of genome analysis during viral outbreaks for the Annual Review of Virology. For SARS-CoV-2, the sequences first obtained in China were quickly used to begin vaccine design.

As the disease spread, scientists used differences in the sequences to build a sort of family tree for the virus, figuring out how it traveled from person to person and place to place. That has important implications for public health measures, says Schaffner. If a virus is spreading only locally, for example, then closing the borders won't do much good. Or if a disease tends to "superspread" from one person to many, as COVID-19 does, then contact tracing should focus on finding the first person to instigate a cluster of cases.
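A rough sense of how sequence differences turn into a family tree can be had from a small Python sketch. It counts the letters that differ between already-aligned sequences and attaches each sample to its closest relative already in the tree. The sample names and 10-letter sequences are invented for illustration, and the nearest-neighbor approach is a stand-in; published analyses use statistical models of evolution, not this shortcut.

```python
def differences(a: str, b: str) -> int:
    """Count positions where two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def build_tree(samples: dict[str, str], root: str) -> dict[str, str]:
    """Return a child -> parent map.

    Starting from `root`, repeatedly attach the outside sample that is
    closest to any sample already in the tree (ties broken by name).
    """
    in_tree = [root]
    remaining = [name for name in samples if name != root]
    parent: dict[str, str] = {}
    while remaining:
        _, child, par = min(
            (differences(samples[c], samples[p]), c, p)
            for c in remaining
            for p in in_tree
        )
        parent[child] = par
        in_tree.append(child)
        remaining.remove(child)
    return parent


# Invented sequences: A and C each differ from "ref" at one position;
# B carries A's change plus one more.
samples = {
    "ref": "ACGTACGTAC",
    "A": "ACGTACGTAT",
    "B": "ACGAACGTAT",
    "C": "TCGTACGTAC",
}
print(build_tree(samples, root="ref"))  # {'A': 'ref', 'B': 'A', 'C': 'ref'}
```

Here B hangs off A because it shares A's change and adds one of its own, the kind of pattern that lets researchers infer that one cluster of cases descended from another.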

Now, as the pandemic moves into its later stages (we hope) and the virus evolves, "all the focus is on these variants," says Schaffner.

Read the rest of the article at Knowable Magazine, an independent journalistic endeavor from Annual Reviews.