AI Is Not Coming for AAV Scientist Jobs. We Are Worried About the Wrong Thing (~2min read)


TL;DR: AI doesn't have enough data to replace you. You are the one generating the data it learns from.

The fear is understandable. The headlines are real. But the anxiety is aimed at the wrong target.

Here's why — three reasons, each one sharper than the last.


One: The Data Gap Is Structural, Not Temporary

AI succeeds where data is abundant, cheap to label, and fast to validate. Language models were trained on trillions of tokens, a large fraction of all digitized human text. Image models on billions of labeled examples, with feedback measured in milliseconds.

AAV runs on a completely different economy. Labels cost hundreds to thousands of dollars each. Feedback runs on animal-model timelines: weeks to months per experimental round, with in vivo readouts that don't compress no matter how much compute you throw at them. The characterized fraction of biologically relevant sequence space is a rounding error compared to what powered the AI revolutions you've been reading about.
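
To make the scale concrete, here is a back-of-envelope sketch in Python. The combinatorics are exact; the label count and token count are loudly illustrative assumptions, not measured values.

    import math

    # Back-of-envelope on the data gap. Counts below are rough
    # assumptions for illustration; only the combinatorics are exact.

    AA = 20                  # amino-acid alphabet
    PEPTIDE_LEN = 7          # a common capsid peptide-insertion library design
    CAPSID_LEN = 735         # approximate AAV VP1 length in residues

    possible_7mers = AA ** PEPTIDE_LEN    # 1.28e9 variants in this one slice
    labeled_variants = 1e6                # generous guess: all published screens combined
    llm_tokens = 1e13                     # rough order of magnitude for frontier LLMs

    print(f"7-mer insertion space:      {possible_7mers:.2e}")
    print(f"Fraction ever labeled:      {labeled_variants / possible_7mers:.1e}")
    print(f"Full capsid sequence space: ~10^{CAPSID_LEN * math.log10(AA):.0f}")
    print(f"LLM tokens per AAV label:   {llm_tokens / labeled_variants:.0e}x")

Even in that one nine-digit slice, and even with a generous guess at the label count, models have seen well under a tenth of a percent of the space. The full capsid space is beyond enumeration.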

This isn't a gap that better algorithms close. The models don't have enough to learn from. You are the learning signal.


Two: The Field Learns in Private — By Design

In computer science, sharing is the currency. Open code, public benchmarks, reproducible baselines. When something works, everyone knows within weeks and builds on top of it instead of recreating it.

AAV engineering operates under patents, NDAs, sponsored research agreements, and publication timelines shaped by commercial incentives. The most important methodological knowledge — what worked in NHP, what the real failure rate is, what the field has already ruled out — circulates privately if at all. Every organization solves the same problems in parallel, in silence, with no shared benchmark to orient against.

AI tools trained on public data are therefore working from a picture of this field that is systematically incomplete in the ways that matter most.


Three: The Most Consequential Decision Stays Human

The goal of AAV engineering is to reach functional capsids that don't exist in nature. That is, by definition, out of distribution. And when a model operates out of distribution, its confidence scores stop tracking its reliability. It returns a number that looks like the other numbers, with no flag that it's extrapolating into space it has never seen characterized.
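
You can reproduce this failure mode in a few lines. A toy sketch, not an AAV model: just scikit-learn on synthetic 2-D data standing in for any fitness predictor.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    # Two well-separated classes near the origin: the "characterized" region.
    X = np.vstack([rng.normal(-2, 1, (200, 2)), rng.normal(2, 1, (200, 2))])
    y = np.array([0] * 200 + [1] * 200)
    clf = LogisticRegression().fit(X, y)

    in_dist = np.array([[2.0, 2.0]])     # resembles the training data
    far_ood = np.array([[80.0, -90.0]])  # nothing like the training data

    print(clf.predict_proba(in_dist).max())  # high confidence, earned
    print(clf.predict_proba(far_ood).max())  # ~1.0 confidence, meaningless

The second number looks exactly like the first. Nothing in the model's output tells you it has left the characterized region.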

The person who catches that is the experimental scientist with a working intuition for how ML models behave. The one who knows the training data, knows the biology, and knows what a confident prediction looks like when it's about to send a program in the wrong direction. That judgment is not a temporary placeholder. It is the work.
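
Part of that judgment can even be encoded. A minimal sketch of the kind of sanity check an ML-literate scientist might run before trusting a prediction; the nearest-neighbor featurization and the 95% threshold are assumptions to tune per dataset, not a standard recipe.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def extrapolation_flags(X_train, X_query, quantile=0.95):
        """Flag query points farther from the training set than 95% of
        training points are from their own nearest neighbor."""
        nn = NearestNeighbors(n_neighbors=2).fit(X_train)
        # Distance from each training point to its nearest *other* point.
        train_dist = nn.kneighbors(X_train)[0][:, 1]
        threshold = np.quantile(train_dist, quantile)
        query_dist = nn.kneighbors(X_query, n_neighbors=1)[0][:, 0]
        return query_dist > threshold  # True = extrapolating; distrust confidence

A True flag doesn't mean the prediction is wrong. It means the confidence score has stopped carrying information, and wet-lab validation belongs back in the loop.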


The Real Question

The job isn't at risk. But the role is evolving toward something more important: the person who generates data worth learning from, and who understands the system well enough to know when it's working and when it isn't.

The question isn't "will AI take my job?"

It's "am I generating the data that makes AI worth using?"


Full version with data analysis, IP deep dive, and a concrete NHP scenario: [Link]
Medium version: [Link]

PS: This is what The AIxAAV Interpreter is for: translating ML methods into actionable AAV engineering strategies. Follow me on LinkedIn for more practical insights that accelerate bio-innovation.

