Huge data generation efforts
You may have now seen a few announcements detailing efforts to generate giant single-cell resolved datasets specifically for AI applications. The two that jump out:
13JAN25: Tahoe-100M by Vevo Therapeutics (with Parse Biosciences and Ultima), and
6FEB25: Billion Cells Project by CZI (with 10x Genomics and Ultima)
Naturally, this is exciting from a sheer numbers perspective, but let’s take a shallow dive!
Tahoe-100M: Here, various cancer models are pooled and cultured in vivo and mice are treated with a drug of interest. Technically, this accounts for 60,000 conditions (1,200 drug treatments across 50 different tumor models), very impressive. However, where this breaks down is that 100M cells spread over 60,000 conditions works out to ~1,667 cells per condition, substantially more than the standard ~100ish cells you’d normally expect. Therefore, one has to wonder if it would be more reasonable to view this as a ~10M cell atlas in practical terms?
Billion Cells Project: Unfortunately the BCP provides substantially less information, but we can see that the focus is on “diverse model organisms” (mouse, zebrafish, etc). As a science project, this sounds amazing, but one has to wonder how generalisable the results will be to human disease.
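For anyone who wants to check the Tahoe-100M back-of-envelope numbers, here is a minimal sketch in Python. The inputs are the figures quoted above; the 100-cells-per-condition baseline is our rough rule of thumb rather than a published standard, and the variable names are just for illustration. On these assumptions the "equivalent" atlas comes out nearer 6M cells, the same order of magnitude as the ~10M we quote.

```python
# Figures quoted in the text; the baseline is a rough rule of thumb (assumption).
TOTAL_CELLS = 100_000_000
DRUG_TREATMENTS = 1_200
TUMOR_MODELS = 50
TYPICAL_CELLS_PER_CONDITION = 100  # assumed "standard" depth per condition

# Each drug applied to each tumor model counts as one condition.
conditions = DRUG_TREATMENTS * TUMOR_MODELS

# Average number of cells sequenced per condition.
cells_per_condition = TOTAL_CELLS / conditions

# Atlas size if every condition were sampled at the typical depth instead.
equivalent_atlas = conditions * TYPICAL_CELLS_PER_CONDITION

print(f"conditions:           {conditions:,}")
print(f"cells per condition:  {cells_per_condition:,.0f}")
print(f"equivalent atlas:     {equivalent_atlas:,} cells")
```

Changing TYPICAL_CELLS_PER_CONDITION to ~170 gets you to the ~10M figure, so the conclusion is not very sensitive to the exact baseline.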
Quick thoughts:
Generating data for AI purposes won’t lead to “normal” experimental designs: a true AI-first dataset will be a dataset designed by a machine learning algorithm to enable it to understand the world (e.g. using reinforcement learning or active learning). For example, would an ML system really want to see the same drug applied to 50 tumour models, or would it rather see it in 10 models at a range of time points to understand the PK/PD as the drug is metabolized?
Automation will be needed to generate data on demand: On the assumption that the AI will design the data, these experiments will likely be really frustrating for humans to perform, necessitating automation. Consider a model that decides it needs hourly resolution on a cell model so it can understand transcriptional cascades: how happy will humans be to listen to its requests?
Maximum translation of models: Historically, I’ve always found it challenging to extrapolate animal data and cancer cell model data to healthy humans; these systems are just wired differently! One infamous paper comes to mind in which virtually every knockout did nothing (compared to the non-targeting control, NTC), as genomic control in this melanoma model had truly been lost many cell passages ago!
Both projects are super exciting from a headline-figure perspective, but I wonder about the true long-term value of these initiatives for AI-enabled drug discovery. The real winner is of course Ultima, who have now shafted Illumina as top sequencing provider for all things CRISPR-omics.
Side note: if you want to get our view on building foundation models for regulatory biology, check out our recent preprint (Warning, math heavy!).
Cellanome – moving the needle:
It’s very rare that we come across a platform that genuinely makes us go “wow, now that’s different”. The most recent contender is Cellanome, who culture cells in a flow cell, trap them, treat them with media changes, image them and sequence them… all within a single device… exceedingly cool! Unfortunately, there’s not a lot public (yet!), but we were lucky enough to catch them at SLAS 2025 in San Diego, where they showed off their technology in two separate sessions – shout out to the excellent Gary Schroth and Shawn Levy!
Watch this space for more!!
Market news:
10JAN25 - Akoya acquired by Quanterix
Quanterix is known for its ultrasensitive clinical diagnostics platform, so at face value this acquisition looks like a route for Akoya to get increased distribution. However, we do wonder if there are synergies in IP portfolios relating to the Simoa paramagnetic particle technology. Needless to say, we watch with anticipation.
15JAN25 - Curio acquired by Takara
Congratulations to the Curio Bioscience team on an acquisition so soon after launching in 2021. With the sheer size of Takara, it’s hard to know exactly what will happen next. Naturally, Takara own the SMART-Seq protocols, so is there a strategy in place to think about full-length spatial transcriptomics?
4FEB25 - Parse is now a free house elf (or analogous LOTR reference): 10x Genomics patent claims kicked out! If you remember our original commentary, you’ll notice what a big deal this is! Congratulations Parse Biosciences.
Other news:
23DEC24: Singular Genomics acquired by Deerfield – check cash on the books for the real story ;)
Rumour mill: Roche is doing something big in the NGS space, but details are thin! Great blogpost here.
Making this a living review!
You may have heard us reference scTrends as a living review. We’ve wanted to make sure our resources are up to date for anyone to access, so here goes: