• For years, the big promise of AI in biology was interpretation. Models could read papers, analyze genomic data, classify images, and suggest hypotheses faster than any human team. Over the last two weeks, the story has started to feel more concrete. The frontier is no longer just AI that understands biology. It is AI that can participate in the experimental loop itself, proposing tests, learning from the results, and steering the next round of lab work. That shift became especially visible this month through new reporting on autonomous biology experiments and through continued discussion around models that can now generate short genomic sequences.

    The clearest example came from OpenAI and Ginkgo Bioworks. In work highlighted by both OpenAI and Scientific American, GPT-5 was connected to Ginkgo’s cloud laboratory to optimize cell-free protein synthesis, a widely used method for making proteins without living cells. According to OpenAI, the system ran more than 36,000 unique reactions across 580 automated plates and achieved a 40 percent reduction in protein production cost, along with a 57 percent improvement in reagent cost. Scientific American described the broader significance well: this was not just a chatbot commenting on biology, but an AI system designing experiments, receiving data back from a robotic lab, and iterating at a speed that would be difficult for a human team to match.

    That matters because biology has always resisted the hype cycle that dominates other areas of AI. In coding or mathematics, answers can often be checked quickly. In biology, the real bottleneck is usually experimentation. Wet-lab work is slow, expensive, noisy, and full of physical constraints. If AI can meaningfully reduce the cost and time of iteration, the impact could spill into drug discovery, diagnostics, synthetic biology, and biomanufacturing. Cell-free protein synthesis may sound niche, but proteins sit at the center of modern therapeutics, diagnostics, enzymes, and research tools. Lowering the cost of making and testing them is not a side improvement. It changes how fast real science can move.

    At the same time, another strand of the story is developing on the design side. Nature reported on March 4 that the Evo 2 genomic language model can generate short genome sequences, although researchers quoted in the piece stressed that there is still a major gap between writing plausible DNA strings and creating genomes that function reliably inside living cells. That distinction is important. It shows how quickly the field is moving while also reminding us that biological reality is still the final judge. AI can now propose increasingly sophisticated biological designs, but living systems remain far more complex than text, images, or code.

    This is exactly why the most interesting development is not raw model capability on its own. It is the coupling of models to instruments, protocols, and validation layers. OpenAI’s writeup makes clear that the experimental loop included strict programmatic checks so the AI could not submit experiments that looked good in text but could not actually run on the automation platform. Scientific American also reported an instructive failure case, where the model tried to assign a negative amount of water when exploring a new condition space. That is not a trivial anecdote. It is a reminder that useful AI in medicine and biology will depend on constraints, guardrails, and interfaces to the physical world. Real progress is going to come from systems that are not only creative, but also grounded.
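    To make the idea of a programmatic guardrail concrete, a check of this kind can be as simple as a pre-submission validator. The sketch below is purely illustrative: the reagent names, volume limits, and function are invented for this post, not Ginkgo's actual checks.

```python
def validate_reaction(volumes_ul, plate_max_ul=200.0):
    """Reject proposed reaction conditions that cannot physically run.

    Illustrative guardrail: every reagent volume must be non-negative,
    and the total must fit within the well's capacity. Reagent names
    and the 200 uL limit are invented for this example.
    """
    errors = []
    for reagent, vol in volumes_ul.items():
        if vol < 0:
            errors.append(f"{reagent}: negative volume {vol} uL")
    total = sum(max(v, 0) for v in volumes_ul.values())
    if total > plate_max_ul:
        errors.append(f"total {total} uL exceeds well capacity {plate_max_ul} uL")
    return (len(errors) == 0), errors

# The failure mode from the article, a negative water volume,
# is caught before anything reaches the robot:
ok, errs = validate_reaction({"lysate": 50.0, "dna": 5.0, "water": -10.0})
```

    A layer like this sits between the model's proposals and the automation platform, so creative but physically impossible conditions never consume lab time.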

    There is also a necessary caution here for anyone tempted to treat every impressive accuracy number as biological understanding. A University of Warwick study released on March 2 warned that some AI pathology models may rely on shortcuts and confounding signals rather than truly detecting the underlying biology they claim to measure. In other words, a model can perform well on paper while still learning the wrong lesson. That warning lands at exactly the right moment. As AI tools move deeper into medicine, the question is no longer whether they can generate plausible outputs. The real question is whether they are discovering meaningful biological structure or only exploiting correlations that break when conditions change.

    That tension is what makes this moment worth writing about for a general audience. We are watching AI in biology become more physical, more operational, and more useful, but also more exposed to the discipline of reality. The next phase will not be won by the model that sounds smartest in a demo. It will be won by systems that can survive the messiness of experiments, the variability of cells and tissues, and the rigor required for medical evidence. If the last era was about AI reading biology, the next one may be about AI doing biology, one validated experiment at a time.

    Sources

    https://openai.com/index/gpt-5-lowers-protein-synthesis-cost/

    https://www.scientificamerican.com/article/openai-and-ginkgo-bioworks-show-how-ai-can-accelerate-scientific-discovery/

    https://www.nature.com/articles/d41586-026-00681-y

    https://www.eurekalert.org/news-releases/1118118

  • This week, a research team at the University of Illinois Urbana-Champaign reported something that used to sound like science fiction: a full life cycle simulation of a living cell, from DNA replication and metabolism to growth and division. They did it for a genetically minimal bacterium, and they did it at nanoscale resolution, tracking how the cell’s molecules behave throughout the cycle.

    The trick was choosing the right organism and the right computing strategy. The team used a “minimal cell” called JCVI-syn3A, engineered to carry only the genes needed for basic life functions, which makes the modeling problem hard but not impossible. Even so, the simulation still had to account for every gene, protein, RNA molecule, and chemical reaction faithfully enough that the timing of cellular events came out close to reality.

    What makes the story feel like a real milestone is the engineering detail. One part of the biology, chromosome replication, was so computationally expensive that it almost doubled the runtime. The team ended up dedicating a separate GPU to DNA replication while another GPU handled the rest of the cell dynamics, which is the kind of pragmatic systems decision you only make after you have actually tried to run the whole thing. With that split, they simulated a 105-minute cell cycle in six days of compute time on the Delta supercomputing system at the National Center for Supercomputing Applications.
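    That division of labor can be sketched schematically. The code below uses Python threads and toy update functions as stand-ins for the two GPUs: one worker advances replication, the other advances the rest of the cell state, and the results are merged at each timestep. All names and numbers are illustrative; this is the shape of the design decision, not the team's code.

```python
from concurrent.futures import ThreadPoolExecutor

def step_replication(state):
    # Stand-in for the expensive DNA-replication kinetics update.
    return {"replicated_bp": state["replicated_bp"] + 1000}

def step_metabolism(state):
    # Stand-in for metabolism and gene-expression updates.
    return {"atp": state["atp"] * 0.99}

def simulate(state, n_steps):
    """Advance both halves of the model in parallel, syncing each step."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        for _ in range(n_steps):
            f1 = pool.submit(step_replication, state)
            f2 = pool.submit(step_metabolism, state)
            # Barrier: both workers finish before the merged state
            # becomes the input to the next timestep.
            state = {**state, **f1.result(), **f2.result()}
    return state

final = simulate({"replicated_bp": 0, "atp": 1.0}, n_steps=10)
```

    The key constraint the real system shares with this toy is the synchronization point: the two devices can run concurrently within a timestep, but neither can race ahead of the other.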

    This is not an atom-by-atom digital cell, and it is not a replacement for experiments. The point is leverage. A whole cell model that can predict many cellular properties at once is like running hundreds of coordinated experiments in silico, then using real data to keep the model honest and refine it. If this approach scales, it changes what “understanding a cell” can mean, because you can start asking systems questions that are too entangled to isolate in the lab one variable at a time.

    Sources:
    https://news.illinois.edu/team-simulates-a-living-cell-that-grows-and-divides/
    https://www.ncsa.illinois.edu/2026/03/10/simulating-the-life-cycle-of-a-cell-with-ncsas-delta/

  • In the last couple of weeks, the most interesting shift in biology-focused AI has not been a better single structure predictor. It is the jump from predicting shapes to predicting interactions and designing the parts that create them. A Nature report described a new proprietary drug discovery model from Isomorphic Labs that impressed researchers because it appears to predict how drug-sized molecules interact with protein targets at a level people compare to a hypothetical next-generation AlphaFold, but now aimed at binding, selectivity, and chemistry relevant signals rather than only static structure. (Nature)

    The important technical point is that biology is not only geometry. Drug action is about ensembles, pockets that breathe, water and ions, and the coupling between protein motion and ligand chemistry. If a model can learn interaction landscapes well enough to propose molecules that survive real-world constraints, then the bottleneck shifts from “can we model a protein” to “can we close the loop from target to candidate with fewer wet-lab cycles”. That is why pharma partnerships keep clustering around models that explicitly predict binding and other interaction level properties, not just sequence to structure. (Reuters)

    In parallel, Nature also highlighted how generative biology tools are moving up the abstraction ladder toward designing biological components more directly, including higher level assemblies and genomes, with the same pattern: you get value when the model is constrained by what can actually function inside cells and what can actually be built. The takeaway is that the frontier is becoming system level. The winning models will not just output a plausible sequence. They will output a design that fits a manufacturable path, a measurable assay, and a safety envelope. (Nature)

    Sources

    https://www.nature.com/articles/d41586-026-00365-7
    https://www.reuters.com/business/healthcare-pharmaceuticals/takeda-deepens-ai-drug-discovery-push-with-17-billion-iambic-deal-2026-02-09/
    https://www.nature.com/articles/d41586-026-00566-0
  • Most data breaches fade with time. Passwords get rotated. Credit cards get replaced. Even medical facts can become stale. Genomic data is different because it is persistent, inherently identifying, and useful far beyond the context in which it was collected. Once a genome is out, it is out forever, and it can be linked back to a person in ways that keep improving as more reference data becomes public.

    That persistence creates a mismatch between how teams think about privacy and how genomic privacy actually works. Many organizations treat privacy as a compliance perimeter. They focus on access controls, encryption, and policies. Those are necessary, but they are not sufficient because the risk is not only unauthorized access. The risk is also unintended inference, reidentification, and downstream use that was never anticipated when the data was shared or the consent was signed.

    NIST has been pushing the conversation toward risk-based practice rather than checkbox security. The NIST Privacy Framework is meant to help organizations identify and manage privacy risk as part of enterprise risk management, not as an afterthought bolted onto engineering. https://www.nist.gov/privacy-framework

    For genomics specifically, NIST has also published work that frames genomic cybersecurity and privacy as a combined problem, because in real systems the privacy failures often happen through security failures, and the security failures matter because of the privacy outcomes. A relevant example is NIST’s Genomic Data Cybersecurity and Privacy community profile work, which explicitly positions genomic data as requiring a structured approach to both privacy and cybersecurity capabilities. https://csrc.nist.gov/pubs/ir/8467/2pd

    The research ecosystem has learned this the hard way, which is why controlled access has become the norm for many human datasets. NIH’s Genomic Data Sharing policy lays out expectations for responsible sharing, and the dbGaP access process makes it clear that access is not just a technical permission, it is a governance decision with terms, renewals, and institutional accountability.
    https://grants.nih.gov/policy-and-compliance/policy-topics/sharing-policies/gds/overview
    https://grants.nih.gov/policy-and-compliance/policy-topics/sharing-policies/accessing-data/dbgap

    This governance direction is also why machine readable identity and authorization are becoming central in federated genomics. GA4GH Passports formalize the idea that a researcher presents verifiable permissions, called visas, that communicate what they are authorized to access across systems without manual reapproval at every boundary. It is not just an implementation detail. It is an architectural choice that assumes access decisions must be portable, auditable, and harder to spoof. https://www.ga4gh.org/product/ga4gh-passports/ 
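    To make that architectural choice concrete: a visa travels as a JWT whose payload carries a `ga4gh_visa_v1` claim. The sketch below builds a toy token and checks the claim using only the standard library. The payload fields follow the general shape of the Passport v1 claim, but the issuer, dataset identifiers, and decision logic are invented for illustration, and a real relying party must verify the token's signature against the broker's published keys before trusting any claim (that step is deliberately omitted here).

```python
import base64
import json
import time

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

# Hypothetical visa payload, loosely following the ga4gh_visa_v1 shape.
visa_payload = {
    "iss": "https://broker.example.org",
    "sub": "researcher-123",
    "exp": int(time.time()) + 3600,
    "ga4gh_visa_v1": {
        "type": "ControlledAccessGrants",
        "value": "https://dac.example.org/datasets/phs000001",
        "source": "https://dac.example.org",
        "by": "dac",
    },
}

# Toy unsigned token: header.payload.signature (signature left empty).
token = ".".join([
    b64url(json.dumps({"alg": "RS256"}).encode()),
    b64url(json.dumps(visa_payload).encode()),
    "",
])

def visa_grants_access(token: str, dataset: str) -> bool:
    """Decode the visa claim and check whether it grants the dataset.

    WARNING: production code must verify the JWT signature first;
    an unverified claim is just a string anyone could have written.
    """
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore padding
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    visa = claims.get("ga4gh_visa_v1", {})
    return (
        claims.get("exp", 0) > time.time()
        and visa.get("type") == "ControlledAccessGrants"
        and visa.get("value") == dataset
    )
```

    The point of the structure is portability: any system that can verify the broker's signature can make the same access decision from the same claim, without a manual reapproval step at each boundary.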

    People often assume that legal protections solve the discrimination problem, but the reality is narrower. In the United States, GINA makes it illegal for employers to discriminate based on genetic information and restricts how genetic information can be used in employment decisions. That matters, but it does not erase the risk landscape, and it does not automatically cover every scenario a person worries about. The EEOC summary captures the core employment protections under Title II. https://www.eeoc.gov/genetic-information-discrimination 

    So what should a genomics team do differently, in practical terms, if it takes persistence seriously?

    First, design for least data, not just least privilege. The simplest way to reduce genomic privacy risk is to avoid moving raw or near raw data when you do not need it. If a workflow can be done on derived representations, summary statistics, or privacy preserving features, that is a real risk reduction because it narrows what an attacker can steal and what a partner can misuse.
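    As a small illustration of the least-data principle, the sketch below reduces a toy genotype matrix to site-level allele frequencies before anything leaves the workflow. The variant names and values are invented; a real pipeline would derive this from a VCF.

```python
def allele_frequencies(genotypes: dict) -> dict:
    """Reduce per-individual genotypes to site-level alt-allele frequencies.

    Input maps each variant site to a list of alt-allele counts
    (0, 1, or 2) per individual. Sharing only the derived frequencies,
    rather than the raw matrix, narrows what a breach can expose about
    any single participant.
    """
    freqs = {}
    for site, calls in genotypes.items():
        freqs[site] = sum(calls) / (2 * len(calls))  # diploid: 2 alleles each
    return freqs

# Toy data: rows are variant sites, values are per-individual counts.
genotypes = {
    "rs0001": [0, 1, 2, 1, 0],
    "rs0002": [2, 2, 1, 0, 1],
}
freqs = allele_frequencies(genotypes)
```

    Note that even summary statistics can enable membership inference in some settings, so a reduction like this narrows the attack surface rather than eliminating it. That is exactly the point of engineering risk down rather than declaring it solved.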

    Second, treat consent and data use limits as technical requirements, not just documents. NIH’s approach to controlled access is a reminder that “allowed use” is part of the system specification, and it has to be enforceable through identity, logging, and process, not simply written down. https://grants.nih.gov/policy-and-compliance/policy-topics/sharing-policies/accessing-data/using-genomic-data 

    Third, assume linkage will get easier. A dataset that looks deidentified today can become linkable tomorrow because reference panels grow, genealogy databases expand, and methods improve. Your threat model should assume that your future adversary will have better tools than your present self.

    Genomic data is powerful because it compresses a lifetime of biology into a format that machines can search, aggregate, and predict from. That same power is what makes it uniquely dangerous to handle casually. The organizations that earn trust in genomics will not be the ones that say they care about privacy. They will be the ones that build systems where privacy risk is engineered down as a default property of how data is collected, accessed, analyzed, and shared.

  • AI is slowly becoming a real collaborator in understanding life. Over the past few months, AI systems have gone from predicting structures or gene expression to actually helping design molecules, simulate cells, and guide lab experiments.

    Much of this progress comes from a new generation of foundation models in biology, massive systems trained on DNA, protein, and multi-omics data. These models can learn patterns across biology, making them useful for everything from genome decoding to protein design. According to a recent review, such models are starting to connect different biological layers—genes, cells, tissues—in a unified framework.

    Another example is a single-cell foundation model described in Nature Communications Biology, which can integrate cellular data from different species and conditions to reveal hidden regulatory links.

    https://www.nature.com/articles/s12276-025-01547-5

    Why does this matter? Because the way we do biology is changing. The time between hypothesis and experiment is shrinking dramatically. An idea that once took months to test can now move from model to lab in days. The space for innovation is also expanding. These systems let scientists ask questions that span molecules, cells, and tissues rather than treating them separately. And finally, the responsibility is growing. As AI starts generating biological designs, researchers must make sure results are reproducible, safe, and interpretable.

    https://arxiv.org/abs/2505.23579

    If you work in genomics, metabolomics, or synthetic biology, this shift affects you directly. Do you have the right datasets to fine-tune these models? Can your infrastructure support rapid cycles of prediction and validation? Do you track provenance and reproducibility for AI-generated hypotheses? The labs that can answer yes to these questions will lead the next phase of digital biology.

    AI in biology is moving from being an assistant to becoming a creative partner. The next generation of discoveries will not just come from analyzing data but from collaborating with intelligent systems that can imagine new forms of life and help us test them responsibly.

    References

    Baek S, et al. “Single-cell foundation models: bringing artificial intelligence to biology.” Nature Communications Biology, 2025.

    https://www.nature.com/articles/s12276-025-01547-5

    Le Song, Eran Segal, Eric Xing. “Toward AI-Driven Digital Organism: Multiscale Foundation Models for Predicting, Simulating and Programming Biology at All Levels.” arXiv preprint, December 2024.

    https://arxiv.org/abs/2412.06993

    “Foundational Models for AI in Biology.” Ardigen, 2025.

    “Foundation models in drug discovery: phenomenal growth in biotech.” ScienceDirect, 2025.

    https://www.sciencedirect.com/science/article/pii/S1359644625002314

  • A new paper published on arXiv, “Protein generation with embedding learning for motif diversification” (arXiv:2510.18790), introduces an approach to protein design that combines deep learning embeddings with generative modeling. The paper is available at https://arxiv.org/abs/2510.18790

    The study addresses a long-standing challenge in computational biology: generating new protein structures that preserve key functional motifs while introducing meaningful diversity. Conventional design pipelines often fail to balance these goals. Small modifications maintain stability but limit innovation, while large ones disrupt the structural or functional integrity of the protein.

    The authors propose a model that learns high-dimensional embeddings of protein motifs and structures, allowing controlled perturbations in embedding space rather than direct coordinate manipulations. This makes it possible to generate diverse but still functional variants. Using a diffusion-based architecture, the system produces proteins that preserve biochemical motifs while varying scaffold backbones in a realistic manner.
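    The core move, perturbing in embedding space rather than coordinate space, can be shown abstractly. The toy sketch below adds bounded Gaussian noise to an embedding vector; it illustrates the general idea, not the authors' model, and a real pipeline would decode the perturbed embedding back into a structure with a learned generative decoder.

```python
import math
import random

def perturb_embedding(embedding, scale=0.1, seed=None):
    """Apply a small Gaussian perturbation to an embedding vector.

    Larger `scale` trades motif preservation for diversity; the point
    of working in embedding space is that small moves here tend to map
    to coherent structural variants, unlike raw coordinate jitter.
    """
    rng = random.Random(seed)
    return [e + rng.gauss(0.0, scale) for e in embedding]

def l2_distance(a, b):
    """Euclidean distance, here used to bound how far a variant drifted."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Invented 4-dimensional embedding standing in for a learned motif code.
motif_embedding = [0.5, -1.2, 0.3, 0.9]
variant = perturb_embedding(motif_embedding, scale=0.05, seed=42)
```

    In the paper's setting the decoder is diffusion-based, so the perturbed code seeds a generation process rather than being mapped back deterministically.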

    Applied to three benchmark systems, including a protein-protein interface and a transcription-factor complex, the model produced substantially more viable structures than existing baselines. The generated designs were predicted to fold stably and retain the target motifs, suggesting the embeddings capture key biophysical constraints.

    This work demonstrates how generative AI can move beyond prediction and toward active biological design. By integrating structural embeddings with diffusion processes, the model opens a path to broader exploration of sequence-structure space while maintaining biological plausibility. As experimental validation follows, methods like this may accelerate the creation of new enzymes, therapeutic proteins, and synthetic scaffolds.

    It is another sign that AI is beginning to influence the creative side of molecular biology, offering not just analysis but generation of functional biological matter.

  • In the past two weeks, one of the world’s largest pharmaceutical companies made a move that signals a turning point for modern biotechnology. AstraZeneca announced a $555 million collaboration with Algen Biotechnologies, a company emerging from Jennifer Doudna’s Berkeley lab, to merge artificial intelligence with CRISPR-based gene editing. The partnership marks one of the most ambitious attempts yet to fuse computational prediction with biological precision.

    The premise is simple but transformative. AI has already revolutionized how scientists analyze genomic data, but AstraZeneca’s deal pushes the technology deeper into discovery itself. The new platform will use AI not only to interpret results but to generate hypotheses — identifying which genes to edit, which mutations to target, and which pathways are most likely to lead to successful therapies.

    This approach represents a shift from analysis to design. Traditionally, drug discovery has been a long and costly process of trial and error. AI promises to change that by training on vast biological datasets and predicting, with increasing confidence, which interventions will work. CRISPR then acts as the experimental engine, rapidly testing those predictions in living systems. Together, the two technologies could compress years of lab work into months.

    AstraZeneca is focusing this collaboration on immunology, where the genetic underpinnings of diseases like asthma, arthritis, and inflammatory disorders remain only partially understood. By combining AI-driven target discovery with CRISPR validation, the company hopes to uncover new therapeutic pathways that conventional screening would miss.

    The financial structure of the deal — with $555 million in milestone payments — underscores how seriously the pharmaceutical industry now treats AI as a strategic core, not just an experimental add-on. Algen retains ownership of its platform, while AstraZeneca secures rights to commercialize any therapies that emerge, creating a model for how AI start-ups and established drug makers can work together.

    Still, expectations are high, and reality will demand patience. Despite the hype, no AI-designed drug has yet won full regulatory approval. Biology remains unpredictable, and algorithms that perform well in silico must still face the rigorous constraints of real cells, tissues, and patients. Yet even partial success would represent a leap forward in productivity and precision.

    The convergence of AI and CRISPR may ultimately redefine what it means to discover a drug. Instead of searching through chemical space blindly, researchers will navigate biological systems as if guided by a map. With each iteration, the AI will learn from both failure and success, evolving alongside the science it helps create.

    AstraZeneca’s new partnership is not just a business deal — it is a declaration that biology’s next revolution will be computational. The merger of AI and gene editing promises a future where designing cures is not a matter of chance, but of code.

    References
    https://www.ft.com/content/c4b5153f-be07-454d-911f-31bb011f09ae
    https://www.nature.com/articles/d41586-024-02549-5
    https://www.science.org/doi/10.1126/science.adj3475

  • Biology has entered a new era, one defined not by microscopes but by algorithms. Artificial intelligence is reshaping how scientists understand life, from the level of molecules to entire ecosystems. What once took years of manual experimentation can now happen in weeks, driven by models that learn directly from biological data. Far from replacing scientists, AI is expanding their reach, revealing patterns that no human could see unaided.

    In genetics, AI is accelerating the decoding of complex traits. Machine learning models analyze entire genomes to uncover subtle combinations of mutations that influence health and disease. This approach is allowing researchers to predict risk factors and uncover previously hidden genetic relationships. In cancer research, AI algorithms sift through tumor data to identify new therapeutic targets and match treatments to patient-specific molecular signatures.

    Protein science is another frontier transformed by AI. Deep learning models like AlphaFold have solved one of biology’s hardest problems: predicting how amino acid sequences fold into three-dimensional structures. This breakthrough has opened the door to designing new enzymes, antibodies, and materials, turning biology into a field where researchers can not only read nature’s code but also write it.

    Even in medicine, AI is enabling a more personal understanding of the human body. By combining genomic, imaging, and clinical data, AI can detect disease earlier, suggest targeted therapies, and guide precision interventions. Doctors are beginning to use AI not as a replacement for judgment but as a companion that brings molecular insight into every decision.

    The impact extends beyond humans. Ecologists use AI to monitor biodiversity, predict ecosystem shifts, and track endangered species. Synthetic biologists use AI-driven design tools to create sustainable materials and biofuels. The same techniques that once optimized web searches are now helping decode the language of life.

    This is the quiet optimism of modern biology. Artificial intelligence is not an intruder in the life sciences but a collaborator. It turns vast biological complexity into actionable knowledge and brings the scientific imagination closer to creation itself. For the first time, we are not just observing life — we are beginning to understand its algorithms.

    References
    https://www.nature.com/articles/d41586-021-03819-2
    https://www.science.org/doi/10.1126/science.abh1809
    https://www.cell.com/cell/fulltext/S0092-8674(22)01350-4

  • Cybersecurity has always been an arms race. As attackers develop new tactics, defenders scramble to respond with updated rules, signatures, and monitoring systems. The scale and sophistication of modern threats, however, are overwhelming traditional approaches. Artificial intelligence is now reshaping the battlefield, offering tools that can adapt, learn, and defend in ways that static methods cannot.

    AI excels at anomaly detection. Instead of relying on predefined rules, machine learning models learn what “normal” network behavior looks like and flag deviations that may indicate intrusions. This allows early detection of zero-day exploits or insider threats that would slip past conventional firewalls. Deep learning further refines this ability, correlating signals across logs, traffic, and endpoints to reveal patterns invisible to human analysts.
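    The baseline-and-deviation idea fits in a few lines. The sketch below models “normal” as a mean and standard deviation over one toy feature (requests per minute, with invented numbers); production systems learn far richer multivariate baselines, but the principle is the same: model normal, then alert on deviation rather than on a fixed signature.

```python
import statistics

def fit_baseline(samples):
    """Learn a simple model of normal behavior from historical observations."""
    return statistics.mean(samples), statistics.stdev(samples)

def is_anomalous(value, baseline, threshold=3.0):
    """Flag observations more than `threshold` standard deviations from normal."""
    mean, stdev = baseline
    return abs(value - mean) / stdev > threshold

# Toy history of requests per minute from one host.
normal_traffic = [98, 102, 95, 101, 99, 103, 97, 100, 96, 104]
baseline = fit_baseline(normal_traffic)
```

    The appeal is that nothing about the attack needs to be known in advance: a sudden burst to 500 requests per minute trips the detector even if no signature for it exists, which is exactly the zero-day advantage described above.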

    Automation is another advantage. AI-driven security orchestration platforms can respond in real time, isolating compromised devices, blocking malicious traffic, or rolling back suspicious changes before damage spreads. This reduces response times from hours to seconds, critical in stopping fast-moving ransomware or distributed denial-of-service attacks.

    Adversarial AI adds a new dimension to the conflict. Attackers are beginning to use machine learning to generate phishing campaigns, craft malware variants, or probe defenses intelligently. Defenders must counter with equally adaptive models, leading to a dynamic contest of algorithms. Research into adversarial robustness is essential to ensure that defensive AI cannot be fooled by manipulated inputs.

    Challenges remain in trust and transparency. Security teams must understand why an AI flagged a particular event, otherwise they risk being overwhelmed by false positives or missing real threats. Hybrid approaches that combine AI-driven detection with human expertise are emerging as the most reliable strategy.

    AI will not eliminate cyberattacks, but it is redefining how defense operates. The future of cybersecurity lies in systems that learn continuously, adapt dynamically, and fight back at machine speed. In this frontier, intelligence itself has become the strongest line of defense.

    References
    https://arxiv.org/abs/2006.00564

    https://www.nature.com/articles/s41586-019-1716-1

    https://www.science.org/doi/10.1126/science.aar3787

  • Proteins are the molecular machines of life, and designing them has long been one of biology’s greatest challenges. Traditional methods rely on trial and error or evolutionary insights, but artificial intelligence is opening a new frontier. By learning the rules of protein folding and function, AI systems can now generate novel proteins with tailored shapes and properties.

    AlphaFold’s success in predicting protein structure was a turning point, proving that deep learning could capture the complex physics of folding with near-experimental accuracy. But the field has moved beyond prediction to creation. Generative models such as diffusion-based frameworks and language-model-inspired architectures treat amino acid sequences like text, enabling the design of entirely new proteins.

    Applications are already emerging. AI-designed enzymes can catalyze chemical reactions not found in nature, promising greener industrial processes. In medicine, custom proteins are being explored as therapeutics, binding with high specificity to disease targets. Even materials science is benefitting, with AI-generated proteins forming the basis of new biomaterials and nanostructures.

    Challenges remain in bridging the gap between in silico design and in vivo performance. Proteins designed computationally must still fold correctly, remain stable in biological environments, and function as intended. Experimental validation is costly and slow, making it the bottleneck. Researchers are working on better feedback loops, where experimental data continuously refines generative models.

    AI in protein design is redefining what is possible in biotechnology. Instead of searching nature’s catalog for useful molecules, we are beginning to write our own entries in the book of life.

    References
    https://www.nature.com/articles/s41586-021-03819-2
    https://www.science.org/doi/10.1126/science.ade6501
    https://arxiv.org/abs/2304.04181