A LETTER FROM THE FOUNDER

Why we built Nabu

A few months ago, a molecular biologist described three weeks of her life she'd never get back. She was studying molecular changes during cancer progression. The reviews she was reading all said the same thing - a specific class of molecules shifts during disease development. Authoritative claim, repeated across the literature, hundreds of citations behind it. So she traced the citations back to the primary data.

There was no primary data. Or rather, the data existed, but it didn't say what the reviews claimed it said. The directionality was unclear, the timing murky. Three weeks of careful work to discover that a foundational claim in her field was, at best, a long game of academic telephone.

She wasn't surprised. She was tired.

I've spoken with hundreds of researchers, and almost none have a paper-finding problem. Search and preprint access are no longer the bottleneck. The problem is the opposite - too much literature, no reliable signal for what's worth their time. The signals that exist are leaky. Even in well-known journals, around one in ten papers is poor quality - by the estimates of researchers I've interviewed, and by independent analyses of methodological reporting. Citation counts are slow and gameable. Author reputation works inside your home field, fails the moment you cross fields, and locks out anyone whose career hasn't yet earned them a name.

The judgment everyone actually trusts is peer review. But peer review itself is inconsistent and unstructured - we expect the world from reviewers who are unpaid, time-pressed, and working without a shared rubric. Editors I've spoken with describe an open secret: the reviewer pool is depleted, acceptance bars fluctuate with whoever happens to be available, and the scale of submissions long ago outpaced the scale of qualified human eyes. Researchers don't need more papers in their inbox. They need a second opinion on the ones already there.

Almost a decade inside scholarly publishing and communication gave me a view of every stakeholder in this ecosystem - researchers, editors, librarians, executives. People across roles know the citation-and-impact-factor machine is broken. They're trapped by the same lack of alternatives as everyone else. Replacing the dominant signal requires three things at once: a credible alternative, a way to operate it at scale, and a way for the field to verify its outputs. Until recently, those three had never existed at the same time.

What I believe is simple. Paper quality is assessable. The dimensions a thoughtful reviewer applies - does the work contribute something genuine, is it well executed, is it clearly written, does it situate itself honestly in the context of prior and future work - are real, articulable criteria. Until now, though, they have rarely been written down and applied consistently, at scale, with traceable reasoning. Our research has codified these criteria into a concrete rubric that can be applied across papers while accounting for their field and methodology type.

Until about two years ago, applying this kind of rubric to a corpus of any size meant either trusting a single proxy metric - citations, journal name, h-index - or finding human reviewers. Today, frontier language models can read a paper end to end, apply a structured rubric, and write out their reasoning.

That's what changes the math.

But "AI can read papers" is what made the last generation of literature tools dangerous. A research lead I met described her team using an AI tool that returned fifty candidate papers in minutes; half were fabricated or off-topic. Her junior staff now spend more time verifying than they used to spend searching. So I want to be precise about what Nabu is and isn't.

Nabu doesn't summarize papers. It doesn't tell you what a paper says. It doesn't judge whether the paper answers your question or moves your research forward. It evaluates a paper you already have, against criteria a domain expert would apply, and shows its reasoning in writing. Multiple independent models score the paper; an editorial layer reconciles their evidence-based reasoning. No invented citations. Every score is traceable to its source. The point of the multi-model architecture isn't novelty - it's verification. One model can be miscalibrated. Multiple independent models with different training, scoring against the same rubric, produce a signal you can interrogate.

The methodology - the rubric, the validation work, the inter-rater reliability data - lives on the methodology page for anyone who wants to interrogate it.

Our vision is that the molecular biologist who lost three weeks doesn't lose them. The corporate research team doesn't ship hallucinated citations to a regulator. The early-career researcher working a field over from her training has the same signal a senior PI's two decades of intuition would give her. The systematic reviewer building an evidence base has a quality-weighted ranking, not a journal-weighted one.

We're not trying to replace peer review. We're trying to give every researcher who's ever stared at a stack of papers, late on a Sunday, the second opinion they should have had all along.

Founder, Nabu

nabu.science