Determining the effects of viral mutations without experiments

Jesse Bloom

Fred Hutch Cancer Center / HHMI

Slides: https://slides.com/jbloom/grc2025

Richard Neher

Some viruses evolve very rapidly

Determining effects of viral mutations is important

Interpret consequences of mutations seen during viral surveillance.
Inform design of drugs and vaccine updates.
Understand function and mechanisms of viral proteins.

Different patterns of evolution at different sites

Traditional way to determine effect of mutations is experiments

My group tries to do such experiments at large scale via deep mutational scanning

Yeast display or lentiviral pseudotype libraries allow us to measure many mutants at once by pooling them all together and reading out effects of mutations by deep sequencing (Starr et al, 2020; Dadonaite et al, 2023)

Limitations of using experiments to understand effects of mutations

Laborious: even with deep mutational scanning, it's a lot of effort.

Lab assays measure effects of mutations in cells or mice, not humans. This is not the same as fitness in the real world.

Some viral proteins have poorly understood functions that lack good lab assays.

Nature is "testing" effects of viral mutations in humans all the time

Average neutral single-nucleotide mutation has occurred ~30,000 independent times in human-transmitted SARS-CoV-2

Viral substitution rate at synonymous sites: ~7.5e-4 substitutions per site per year (Neher, 2022)
Typical infection duration: ~5 days = 0.01 years/infection
Total human infections with SARS-CoV-2: ~12e9 infections
So total synonymous substitutions per site: 7.5e-4 x 0.01 x 12e9 = 90,000
There are three possible mutations per site: 90,000 / 3 = 30,000
Mutation spectrum uneven, so some mutations have occurred more than others:
- C->T mutations have occurred ~100,000 times
- A->C mutations have occurred ~2,000 times
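
The arithmetic above can be checked in a few lines. This is only a restatement of the rounded values quoted on the slide; the variable names are made up for illustration.

```python
# Back-of-envelope estimate; all inputs are the rounded values quoted above.
subs_per_site_per_year = 7.5e-4   # synonymous substitution rate
years_per_infection = 0.01        # ~5 days, rounded as on the slide
total_infections = 12e9           # cumulative human infections

# Substitutions per site, summed over every human infection.
subs_per_site = subs_per_site_per_year * years_per_infection * total_infections

# Each site can mutate to three other nucleotides.
occurrences_per_mutation = subs_per_site / 3

print(f"{subs_per_site:,.0f} substitutions per site")
print(f"{occurrences_per_mutation:,.0f} occurrences per mutation on average")
```

The average hides the uneven mutation spectrum noted above, so individual mutation types fall well above or below it.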

We can use publicly available human SARS-CoV-2 sequences to "read out" effects of viral mutations on human transmission

We use the ~10 million public sequences in the UShER mutation-annotated tree
These sequences represent ~0.1% of all human SARS-CoV-2 infections

First calculate how often each mutation is expected to be observed without selection by analyzing 4-fold degenerate sites

Bloom, Beichman, Neher, Harris (2023)
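
One way to picture this step, as a rough sketch and not the actual pipeline in the SARS2-mut-fitness repository: mutations at 4-fold degenerate sites are synonymous, so they are treated here as approximately neutral, and for each mutation type (such as C->T) the occurrences are averaged over all sites with that parent nucleotide. The function name, the input layout, and the toy numbers are all hypothetical.

```python
from collections import defaultdict

def expected_counts_by_type(fourfold_sites):
    """Average neutral occurrences per site for each mutation type.

    fourfold_sites: list of (parent_nt, {mutant_nt: occurrences}) tuples,
    one per 4-fold degenerate site (hypothetical input layout).
    """
    total = defaultdict(int)    # occurrences summed over sites, per type
    n_sites = defaultdict(int)  # number of sites with each parent nucleotide
    for parent, counts in fourfold_sites:
        n_sites[parent] += 1
        for mutant, n in counts.items():
            total[(parent, mutant)] += n
    return {
        f"{parent}->{mutant}": total[(parent, mutant)] / n_sites[parent]
        for (parent, mutant) in total
    }

# Toy input (made-up numbers, for illustration only).
toy_sites = [
    ("C", {"T": 600, "A": 40, "G": 20}),
    ("C", {"T": 800, "A": 60, "G": 10}),
    ("A", {"C": 12, "G": 90, "T": 30}),
]
expected = expected_counts_by_type(toy_sites)
print(expected["C->T"], expected["A->C"])
```

The per-type average then serves as the expected count for any mutation of that type at any site, which is the quantity compared against the actual count.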

We count unique occurrences of mutation, not number of sequences with mutation

Bloom and Neher (2023)
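
The distinction between occurrences and sequences can be illustrated with a toy tree. This sketch assumes a simplified, hypothetical layout in which each branch lists the mutations that arose on it and the number of sequences descending from it; it does not use the UShER file format or any UShER tooling, and it ignores reversions and nested repeats of the same mutation.

```python
def count_mutation(branches, mutation):
    """Return (independent occurrences, sequences carrying the mutation).

    branches: list of dicts with keys "mutations" (set of str) and
    "n_tips" (number of sequences descending from that branch).
    Hypothetical layout, for illustration only.
    """
    occurrences = 0
    sequences = 0
    for branch in branches:
        if mutation in branch["mutations"]:
            occurrences += 1               # one independent mutational event
            sequences += branch["n_tips"]  # every descendant inherits it
    return occurrences, sequences

# Toy branches (made-up values): C100T arises three separate times.
toy_branches = [
    {"mutations": {"C100T"}, "n_tips": 5000},
    {"mutations": {"C100T", "A200G"}, "n_tips": 1},
    {"mutations": {"G300T"}, "n_tips": 12},
    {"mutations": {"C100T"}, "n_tips": 3},
]
occurrences, sequences = count_mutation(toy_branches, "C100T")
print(occurrences, sequences)
```

In the toy data one occurrence happens to sit on a branch with 5,000 descendants, so the sequence count is dominated by a single event while the occurrence count is not.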

Mutations expected to be observed ~10 to ~700 times in absence of selection

Bloom and Neher (2023)

There are enough sequences to calculate effects on a per-mutation basis

Bloom and Neher (2023)

We calculate effect as log of actual versus expected mutation counts

fitness effect of mutation = log (actual counts / expected counts)

An effect of zero indicates a neutral mutation; a negative effect indicates a deleterious mutation
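
A minimal sketch of the log-ratio calculation. Two details here are assumptions not stated on the slide: the log is taken as the natural log, and a pseudocount is added to both counts so that a mutation never observed still gets a finite value. The function name and the default pseudocount of 0.5 are illustrative choices.

```python
import math

def mutation_effect(actual, expected, pseudocount=0.5):
    """Natural log of actual versus expected counts.

    The pseudocount (an assumption of this sketch) is added to both
    counts so that actual == 0 gives a finite negative value.
    """
    return math.log((actual + pseudocount) / (expected + pseudocount))

neutral = mutation_effect(actual=100, expected=100)     # counts match
deleterious = mutation_effect(actual=0, expected=100)   # never observed
print(neutral, deleterious)
```

With a pseudocount, the most negative value a mutation can receive is bounded by its expected count, so mutations with low expected counts cannot be shown to be as strongly deleterious as those with high expected counts.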

Distribution of effects of all mutations

Bloom and Neher (2023)

We can see which genes are under strong purifying selection

Bloom and Neher (2023)

Among accessory genes, ORF3a is under strongest selection against stop codons

Experiments show that the only accessory gene deletion that strongly attenuates the virus in animal models is deletion of ORF3 (McGrath et al, 2022)

Crucially, we see effect of each mutation

Key sites in proteins of unknown function

These maps can identify constrained sites

Estimated mutation effects are robust to sequence sampling location

Estimated mutation effects are robust to viral clade identity

Estimated mutation effects correlate well with deep mutational scanning

Two spike deep mutational scans using different underlying methodologies: lentiviral pseudotyping of spike or yeast display of RBD

Maps of mutation effects to all viral proteins

Areas for future work and limitations

Quantitative relationship between the ratio of observed versus expected counts and fitness depends on sampling intensity

There is additional information in dynamics of mutation after it occurs that our method currently does not leverage

Accuracy of our approach depends critically on:

Having a dataset free of sequencing/bioinformatic errors
Accurately estimating per-site mutation rate

This overall approach could be applied to many viruses / organisms with enough sequencing

Thanks

Estimates of mutation rate

Kelley Harris, Annabel Beichman

Assistance with UShER

Angie Hinrichs, Russ Corbett-Detig

Richard Neher

These slides: https://slides.com/jbloom/grc2025

GitHub: https://github.com/jbloomlab/SARS2-mut-fitness

Estimating effects of mutations to all SARS-CoV-2 proteins from actual versus expected mutation counts in natural sequences