Wednesday, February 11, 2026

Agentic AI in Radiology Today. Insights for Pathology Tomorrow.

In quick succession, I ran across three papers on LinkedIn about radiology and agentic AI.

I asked ChatGPT for a detailed summary.

Then, knowing that radiology AI is often 3-5 years ahead of pathology AI, I asked for projections from THIS work toward pathology (in particular, digital pathology and genomics).

###





###

2026 Radiology Agentic AI Articles in Discussion (Plus a View Toward Pathology)

 

You’re already using something that sits on the “agentic” spectrum: an LLM that plans, calls tools, iterates, and persists context to finish multi-step work (your examples: complex Medicare table work, multi-source company diligence, even finding a specific page in a calendar PDF). The radiology literature is basically trying to (1) define that capability precisely, (2) map where it fits in real radiology workflows, and (3) draw safety/implementation boundaries so it doesn’t become a flashy demo that breaks the moment it touches PACS/RIS/EHR and liability.

Below is a detailed, paper-by-paper walk-through, plus how it maps to your digital pathology instincts.


1) Tripathi et al. (Radiology editorial): what radiology means by “agentic”

Tripathi/Cook/Kim are doing definitional cleanup and systems-thinking. Their core move is to separate an AI agent (a program—often LLM-powered—that perceives, reasons, plans, and acts via tool calls in an iterative loop) from agentic AI (a broader systems framework that orchestrates one or more agents/tools to solve complex, multi-step tasks with limited human supervision).

That distinction matters because a lot of radiology “AI” today is still narrow/static (a detector/classifier). Tripathi’s “agentic” vision is closer to an end-to-end workflow collaborator: not just “find nodules,” but also pull priors, query the EHR, draft report text, check billing/documentation completeness, make guideline-grounded follow-up suggestions, and route urgent comms—all chained together as a goal-directed pipeline.

The practical definition (why it feels like what you’ve seen in consulting)

Tripathi emphasizes recurring building blocks:

  • Foundation model (often LLM/MLLM) as the “brain”
  • Tools/APIs (EHR queries, guideline retrieval, image encoders, scheduling, etc.)
  • Iterative loops (observe → reflect → adjust)
  • Defined human oversight points

That’s extremely close to your “ChatGPT did elaborate searches + calculations + document forensics” experience—just relocated into radiology’s environment (PACS/RIS/EHR), and constrained by clinical risk.
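To make those building blocks concrete, here is a minimal, hypothetical Python sketch of an agent loop: a stubbed "brain" that plans, a small tool registry, an observe/reflect/adjust loop, and an explicit human oversight point before anything is released. All tool names and logic are invented for illustration; nothing here is drawn from a specific vendor or from Tripathi's own work.

```python
from typing import Callable, Dict, List

# Hypothetical tool registry; in a real deployment these would wrap
# EHR/PACS/guideline APIs behind standards-based interfaces.
TOOLS: Dict[str, Callable[[str], str]] = {
    "retrieve_priors": lambda q: f"prior studies relevant to: {q}",
    "query_guidelines": lambda q: f"guideline excerpt relevant to: {q}",
}

def plan_next_step(goal: str, history: List[str]) -> dict:
    """Stand-in for the foundation-model 'brain': decide the next tool call or stop."""
    if len(history) == 0:
        return {"action": "call_tool", "tool": "retrieve_priors", "arg": goal}
    if len(history) == 1:
        return {"action": "call_tool", "tool": "query_guidelines", "arg": goal}
    return {"action": "finish", "draft": f"Draft summary of '{goal}' using {len(history)} findings"}

def run_agent(goal: str) -> str:
    history: List[str] = []
    while True:
        step = plan_next_step(goal, history)            # reason / plan
        if step["action"] == "call_tool":
            result = TOOLS[step["tool"]](step["arg"])   # act via a tool call
            history.append(result)                      # observe, then loop (reflect / adjust)
        else:
            print("[HUMAN REVIEW REQUIRED]")            # defined oversight point before release
            return step["draft"]

print(run_agent("incidental pulmonary nodule follow-up"))
```

The point of the sketch is the shape, not the content: a planner, tools, a loop, and a gate.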

Their “engineer’s checklist” is the most valuable part

Tripathi gets concrete about how to build/contain these systems:

Guardrails should scale with autonomy: they explicitly name tripwires (confidence/coverage bounds), approval gates for consequential actions, validation checkpoints, and even execution sandboxes.

They also insist on standards-based integration—PACS/RIS/EHR and standards like HL7, FHIR, FHIRcast, DICOM/DICOMweb—because “agentic” that can’t live inside the real pipes is just a lab toy.

And they highlight modern threat models that matter more for agents than for single-purpose models: prompt injection, memory injection, tool-surface vulnerabilities, and model poisoning, to be mitigated by strict I/O controls, capability whitelisting, and red-teaming.
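Those guardrail ideas compose naturally in code. Below is a hedged sketch, with invented action names and an arbitrary confidence threshold, of a capability whitelist, an approval gate for consequential actions, and a tripwire that escalates low-confidence outputs to a human.

```python
# Illustrative only: capability whitelist, approval gates, and a confidence tripwire.
ALLOWED_ACTIONS = {"draft_report_text", "suggest_followup"}     # low-risk, whitelisted
CONSEQUENTIAL_ACTIONS = {"notify_clinician", "order_followup"}  # always gated by a human

def gate(action: str, confidence: float, tripwire: float = 0.80) -> str:
    """Return the disposition for a proposed agent action."""
    if action not in ALLOWED_ACTIONS | CONSEQUENTIAL_ACTIONS:
        return "block"                      # outside the whitelist entirely
    if action in CONSEQUENTIAL_ACTIONS:
        return "require_human_approval"     # approval gate for consequential actions
    if confidence < tripwire:
        return "escalate_to_human"          # tripwire: confidence below the bound
    return "allow"

assert gate("draft_report_text", 0.95) == "allow"
assert gate("draft_report_text", 0.60) == "escalate_to_human"
assert gate("order_followup", 0.99) == "require_human_approval"
assert gate("delete_study", 0.99) == "block"
```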

Bottom line of Tripathi: agentic AI is a workflow paradigm (tool-using, iterative, orchestrated), but radiology’s near-term reality is supervised copilots, not unsupervised autonomy, because of the combined bar set by regulation, liability, and trust.


2) Gibson et al. (Computers, MSK scoping review): what evidence exists, and what’s still mostly conceptual

Gibson et al. is less “here’s the grand vision” and more “what’s actually in the literature (especially MSK), and how mature is it?”

Their helpful taxonomy: model → automated workflow → agentic

They lay out a simple ladder:

  1. Single model (e.g., CNN fracture detector)
  2. Automated workflow (detector triggers alert via preset rules)
  3. Agentic AI = multiple models acting as “agents” that autonomously process information across multi-step tasks (not just “if fracture then alert”).

That taxonomy is useful because it prevents “agentic” from becoming a marketing synonym for “has AI somewhere.”
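The gap between rungs 2 and 3 is easier to see in a toy contrast (all logic and probabilities here are fabricated for illustration): the automated workflow is a single fixed rule, while the agentic version chooses its own next steps across a multi-step task and still ends at a human gate.

```python
# Toy contrast between rung 2 (automated workflow: fixed rule) and rung 3
# (agentic: the system decides what to do next); model outputs are faked.

def automated_workflow(fracture_prob: float) -> list:
    # Rung 2: a detector output triggers a preset rule, nothing more.
    return ["page_radiologist"] if fracture_prob > 0.9 else []

def agentic_workflow(fracture_prob: float, has_priors: bool) -> list:
    # Rung 3 (sketch): multiple steps chosen based on intermediate findings.
    steps = []
    if fracture_prob > 0.5:
        steps.append("retrieve_priors" if not has_priors else "compare_with_priors")
        steps.append("draft_report_section")
        if fracture_prob > 0.9:
            steps.append("queue_urgent_review")   # still ends at a human gate
    return steps

print(automated_workflow(0.95))
print(agentic_workflow(0.95, has_priors=False))
```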

What they found: 11 papers total; only 2 truly MSK-specific

Their scoping review (PubMed/Embase/Scopus/WoS) ends up with 11 included studies, with only two directly MSK-focused; the rest are general radiology concepts or prototypes that could apply to MSK.
They explicitly characterize the evidence as limited and “largely theoretical / exploratory.”

The four themes they extract (this is the heart of the paper)

They organize the literature into four buckets:

  1. Agentic decision support (pathway navigation, coordination, workload reduction)
  2. Workflow optimization (administrative efficiency, modality selection, throughput)
  3. Image analysis / reconstruction (multi-agent systems improving quality + automated interpretation)
  4. Conceptual / ethical / governance (transparency, safety frameworks, clinician oversight)

A very “your-world” detail: they emphasize “agentic” as particularly attractive where there are structured pathways and heavy throughput—exactly the kind of environment where orchestration and delegation pay off.

The study table is revealing: lots of prototypes, few validated deployments

Their Table of included studies labels maturity as things like conceptual, prototype, ethical/legal guidance, and only limited feasibility work.
So if radiology feels “ahead,” Gibson’s review quietly says: radiology is ahead conceptually and in prototypes; the evidence base for real agentic systems is still early.

Bottom line of Gibson: “agentic MSK radiology” is mostly a map of plausible applications (decision support/workflow/reconstruction) plus governance concerns; the field is not yet dominated by prospective clinical trials or mature deployments.


3) Tomak (Med-Essence blog): a clean operational contrast—RPA vs agentic AI

This piece is not academic radiology, but it’s operationally clarifying: it draws a bright line between RPA and agentic AI in radiology automation.

  • RPA: rule-based “software robots” mimicking clicks/keystrokes for structured tasks (scheduling, data transfer, report distribution, billing/coding extraction). Strength: quick deployment; limitation: brittle outside predefined rules.
  • Agentic AI: systems that make decisions, learn/adapt, and can handle unstructured data (images, free text), used for image analysis, prioritization, NLP extraction, decision support. Strength: flexibility; limitation: validation/regulatory cost and error risk without oversight.

It’s simplistic in places (because it’s a short blog), but it gives you a practical frame: a lot of “automation” in departments is really RPA (valuable!), while “agentic” implies clinical-context reasoning + tool-using orchestration, not just automating clerical steps.


Pulling the three together: what “agentic AI in radiology” really denotes

Across Tripathi + Gibson + Tomak, “agentic” converges on:

Goal-directed orchestration of multiple steps across systems (PACS/RIS/EHR, guidelines, priors, communication channels), using an LLM-like planner plus tools, with iteration, memory, and explicit human oversight gates.

That’s why your consulting examples felt “agentic”: the system wasn’t just generating text; it was effectively acting like a junior analyst who can decompose work, retrieve, cross-check, and assemble.


The digital pathology translation (where you’re likely headed)

If you port Tripathi’s radiology framing into pathology/digital pathology, the analogous “agentic” targets aren’t just “tumor vs non-tumor detection” but things like:

  • Case assembly agent: pull priors, relevant molecular results, previous synoptics, outside consult PDFs, match to current specimen
  • Worklist triage agent: prioritize by clinical urgency, frozen vs routine, re-cuts/IHC pending, critical values
  • QA/compliance agent: completeness checks (synoptic required elements, staging fields, billing documentation), CAP checklist mapping
  • Communication agent: route criticals, generate clinician-facing summaries, ensure acknowledgment loops
  • WSI + text orchestration: combine slide-level signals with the chart story, but keep approval gates and audit trails the way Tripathi emphasizes

And your policy lens matters because regulatory/oversight is where “agentic” will bottleneck first: the more the system acts (orders, routes, finalizes, triggers downstream events), the more it looks like a regulated medical device or at least a high-governance clinical decision support layer.


A quick “so what should I take from these papers?” summary

  • Tripathi gives the cleanest definition + architecture + guardrails: agentic = iterative tool-using orchestration; near-term = supervised copilots; key risks = integration, security, bias, deskilling, liability.
  • Gibson tells you the evidence base (especially MSK) is early, with lots of conceptual/prototype work and governance discussion, not tons of mature clinical deployments.
  • Tomak helps you keep RPA (rule automation) separate from agentic (adaptive, unstructured, decision-support + orchestration), and suggests hybrids will be common (RPA for admin, agentic for clinical-context tasks).

If you want, paste (or upload) the three PDFs’ figures/tables you find most interesting (or tell me which pages), and I’ll translate them into a pathology-forward “design pattern” memo: what to build first, where the governance tripwires go, and what claims are realistic vs hype—in the same pragmatic tone you use for policy work.

##

AND VIEW TOWARD PATHOLOGY?

 

What’s portable from these radiology “agentic AI” papers into pathology isn’t a specific algorithm—it’s the systems pattern: an AI layer that can plan, call tools, iterate, and coordinate across multiple clinical systems, with explicit human oversight gates. Radiology is simply the first large specialty to articulate this as a workflow architecture rather than a one-model-per-task story.

What the 3 radiology papers imply for pathology

Across the three papers, the biggest implications for pathology are:

1) “Agentic” is not a better classifier; it’s workflow orchestration.
Tripathi’s editorial treats agentic AI as a systems framework that coordinates tools and subcomponents (LLMs, APIs, databases) to complete multi-step clinical work. That concept ports directly to pathology because pathology workflows are also multi-system, multi-artifact, and latency-sensitive (case assembly, priors, orders, stains, synoptics, addenda, billing, QA).

2) Integration is the real product.
Tripathi emphasizes that clinical viability depends on seamless integration into existing environments and use of common interoperability standards. Pathology’s analog isn’t PACS/RIS/EHR; it’s LIS + APLIS + digital pathology viewer + EHR + middleware + billing/claims + QA systems. The “agent” is often less important than the plumbing plus guardrails that make it safe and fast.

3) Most near-term value is in “friction points,” not headline diagnostics.
Gibson’s scoping review (especially its themes) implicitly says: the earliest credible deployments are decision support, workflow optimization, and quality/reconstruction, not fully autonomous interpretation. In pathology that translates to: case triage, completeness checks, prior retrieval, synoptic assembly, stain/workup suggestions with supervision, specimen-to-slide chain-of-custody checks, and report distribution—before “AI signs out cases.”

4) The RPA vs agentic distinction will matter in labs.
Tomak’s RPA-vs-agentic framing maps cleanly onto lab medicine: RPA will keep automating structured clerical steps (orders, demographics transfer, result routing, billing queues), while agentic AI tackles unstructured complexity (free-text clinical history, outside reports, image regions, longitudinal narrative, guideline-grounded recommendations). Labs will almost certainly run hybrid stacks: RPA for throughput + agentic for cognition.


Roll the clock 3–5 years: how agentic agents “migrate” into pathology

A plausible migration path is not “pathologists adopt agents.” It’s “agents appear as middleware around existing systems,” then gradually get closer to interpretation.

Phase 1 (0–18 months): agentic “case assembly and clerical cognition”

This is the least controversial and easiest ROI:

  • Case assembler agent: pulls priors, relevant clinic notes, prior pathology, prior molecular, imaging impressions, outside PDFs; organizes into a single readable timeline with citations/links.
  • Report completeness agent: checks synoptic elements, staging fields, required disclaimers, specimen/cassette counts vs blocks/slides, “did we answer the clinical question.”
  • Utilization agent: flags redundant tests; suggests guideline-consistent alternatives; drafts language for “why not” documentation.
  • Revenue-cycle agent: pre-screens documentation for medical necessity, coding completeness, ABN logic, prior auth packets.

Key feature: no autonomous clinical decisions—the agent drafts, retrieves, checks, and routes; humans approve.
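As one concrete illustration of that "drafts, retrieves, checks, and routes" pattern, here is a hypothetical sketch of the report completeness agent. The field names are invented placeholders, not actual CAP checklist elements; the agent only returns flags for human review and never edits the report.

```python
# Hypothetical completeness check: compare a draft case record against a
# required-elements checklist and return flags for human review only.
REQUIRED_SYNOPTIC_FIELDS = ["tumor_site", "histologic_type", "grade", "margins", "pT_stage"]

def completeness_flags(case: dict) -> list:
    flags = []
    for field in REQUIRED_SYNOPTIC_FIELDS:
        if not case.get(field):
            flags.append(f"Missing or empty synoptic field: {field}")
    if case.get("blocks_submitted") != case.get("blocks_described"):
        flags.append("Block count in gross description does not match blocks submitted")
    if case.get("clinical_question") and not case.get("clinical_question_addressed"):
        flags.append("Clinical question not explicitly addressed in the diagnosis/comment")
    return flags  # routed to the pathologist; the agent never edits the report

example_case = {
    "tumor_site": "sigmoid colon",
    "histologic_type": "adenocarcinoma",
    "grade": "",                      # missing
    "margins": "negative",
    "pT_stage": "pT3",
    "blocks_submitted": 12,
    "blocks_described": 11,           # mismatch
    "clinical_question": "rule out malignancy",
    "clinical_question_addressed": True,
}
for flag in completeness_flags(example_case):
    print(flag)
```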

Phase 2 (18–36 months): agentic “workup navigation” in digital pathology

Now the agent starts to touch what you do next, but still behind gates:

  • Worklist triage + routing (the radiology “triage” analog): send GI biopsies to GI specialists, neuropath to neuropath, prioritize transplant rejects/critical values, flag time-sensitive intraop/frozen workflows.
  • Conditional workup suggestions: “Based on pattern A + history B, consider stains X/Y,” but requires click-to-order confirmation and institution-specific protocols.
  • Slide-level navigation help: proposes ROIs, highlights discordant regions, suggests “look here” with confidence estimates and “why” (so it behaves like a junior colleague, not a black box).
  • Conference prep agent: builds tumor-board packets, links representative images, pulls relevant molecular and guideline context, drafts a 60-second case summary.

Phase 3 (36–60 months): bounded autonomy in narrow, high-standardization domains

If agentic AI gets “clinical traction,” it will be in places with high standardization, clear outcome metrics, and strong confirmatory testing:

  • Pap/urine cytology triage, QC, rescreen logic
  • IHC scoring assistance with strict protocols and audit trails
  • Frozen-section workflow support (time management + communication + documentation; not replacing the intraop judgment)
  • High-volume benign workflows (e.g., “negative” pre-screens) with conservative thresholds and mandatory review

This is where Tripathi’s “human-supervised with defined oversight points” becomes operationally decisive: the system is permitted to do some actions, but only inside explicit boundaries.


Why it may play out differently in digital pathology vs genomics

Both will get hit by the same tsunami, but they’ll surf different waves.

Digital pathology: “agentic” will look like navigation + workflow control

Digital pathology is spatial and visual. The agent’s superpower is not just interpretation; it’s where to look, what to do next, and how to keep the case moving through a multi-step physical-digital chain.

Distinctive agentic opportunities in digital pathology

  • Spatial triage and attention management: ROI proposals, “second set of eyes” on rare events, discordance detection across blocks/slides.
  • Operational intelligence: slide logistics, turn-around-time prediction, queue balancing, stain bottleneck forecasting.
  • QA at scale: stain quality drift, scanner artifacts, tissue detection errors, labeling mismatch detection (where available).

Distinctive constraints

  • Ground truth is messy (interobserver variability; subtle diagnostic thresholds).
  • Liability feels closer because “the pixels are the diagnosis.”
  • Standards/integration lag: pathology interoperability is improving but still less uniform than radiology’s PACS/DICOM culture in many settings.

So digital pathology’s agentic arc likely starts as workflow copilots and “attention guides,” not autonomous sign-out.

Genomics: “agentic” will look like evidence curation + reasoning over text

Genomics is symbolic, knowledge-heavy, and literature/guideline driven. It’s almost designed for tool-using agents: search, retrieve, compare, cite, reason, draft.

Distinctive agentic opportunities in genomics

  • Variant interpretation copilots: continuously pull ClinVar/gnomAD, primary literature, functional data, and guideline criteria mapping (ACMG/AMP), then draft a structured evidence table.
  • Longitudinal reinterpretation agents: monitor updates (new classifications, new therapies, new evidence) and flag “actionable drift.”
  • Report generation with provenance: draft patient-friendly and clinician-facing sections, with traceable citations and institution templates.
  • Coverage/prior-auth agents: assemble payer-specific packets tying variant/test to guidelines and policy language.

Distinctive constraints

  • The big risk isn’t “missed pixels,” it’s misstated evidence (hallucinated citations, wrong guideline mapping).
  • But genomics is well suited to hard guardrails: retrieval-only workflows, citation requirements, structured outputs, and evidence checklists.

So genomics may adopt agentic systems faster because the first wins are documentation-grade and auditable, even when clinical interpretation remains physician-governed.
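One way to picture "documentation-grade and auditable" is an evidence row that cannot exist without provenance. The sketch below is illustrative only: the criterion labels and sources are placeholders, not a real ACMG/AMP implementation.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class EvidenceItem:
    criterion: str      # e.g., an ACMG/AMP-style criterion label (placeholder below)
    summary: str        # one-sentence evidence statement drafted by the agent
    sources: List[str]  # citations/identifiers; empty lists are rejected

def add_evidence(table: List[EvidenceItem], item: EvidenceItem) -> None:
    """Hard guardrail: no evidence row is accepted without at least one traceable source."""
    if not item.sources:
        raise ValueError(f"Rejected uncited evidence for {item.criterion}")
    table.append(item)

evidence_table: List[EvidenceItem] = []
add_evidence(evidence_table, EvidenceItem(
    criterion="PM2 (illustrative)",
    summary="Variant absent from population databases queried on the stated date.",
    sources=["gnomAD query log #123 (hypothetical)"],
))
print(f"{len(evidence_table)} cited evidence row(s) ready for physician review")
```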


A pathology-forward “agentic” design principle you can borrow from radiology

If you take one principle from Tripathi/Gibson and apply it to pathology strategy, it’s this:

Start with agents that reduce cognitive load without taking clinical authority, then gradually grant autonomy only where you can define:

  1. bounded action space (what it’s allowed to do),
  2. verification gates (who must approve),
  3. traceability/provenance (why it did it), and
  4. failure modes (what happens when it’s uncertain or wrong).

That framework is equally relevant whether you’re building a digital pathology agent around WSI viewers or a genomics agent around variant knowledgebases and payer rules.
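A configuration-style sketch of that four-point framework, with invented actions and values: autonomy is granted per action, consequential actions carry a verification gate, every action logs provenance, and each action has a defined behavior when the agent is uncertain.

```python
from typing import Optional

# Hypothetical per-action autonomy policy: bounded action space, verification
# gates, traceability, and explicit failure modes. Values are illustrative only.
AGENT_POLICY = {
    "assemble_case_timeline": {
        "allowed": True,
        "approver": None,                 # no clinical authority taken; no gate needed
        "log_provenance": True,
        "on_uncertainty": "present_sources_and_stop",
    },
    "suggest_ihc_panel": {
        "allowed": True,
        "approver": "pathologist",        # verification gate: click-to-order confirmation
        "log_provenance": True,
        "on_uncertainty": "defer_to_human",
    },
    "finalize_report": {
        "allowed": False,                 # outside the bounded action space entirely
        "approver": "pathologist",
        "log_provenance": True,
        "on_uncertainty": "defer_to_human",
    },
}

def is_permitted(action: str, approved_by: Optional[str]) -> bool:
    policy = AGENT_POLICY.get(action, {"allowed": False, "approver": "pathologist"})
    if not policy["allowed"]:
        return False
    return policy["approver"] is None or policy["approver"] == approved_by

assert is_permitted("assemble_case_timeline", approved_by=None)
assert not is_permitted("suggest_ihc_panel", approved_by=None)
assert is_permitted("suggest_ihc_panel", approved_by="pathologist")
assert not is_permitted("finalize_report", approved_by="pathologist")
```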

If you tell me your setting (academic vs community, LIS/APLIS stack, and whether you already have WSI in routine sign-out), I can map these phases into a concrete 3-year roadmap: top 5 agentic use cases, the integration points, and the governance gates that would make it acceptable to pathology leadership and risk/compliance.

 

 


Tuesday, February 10, 2026

Journal Club for: OIG 2024 Annual CLFS Spend Report (January 2026)



#
You've got the full OIG PDF for 2024. Entry point here.

Please describe it in essay form (not bullet form) in detail, as if it were a journal club presentation but in book chapter format. Explain what OIG did, and why, and what they found. What seemed of interest to them. But then go beneath the hood, or between the lines. What, if anything, can you say justifiably about what OIG really thinks or where it might go next? For example, they're quite subtle about fraud, but there's a lot. Codes like 81408 appear as highest national codes for a year or two and then plummet to zero as fraud is closed off (perhaps after a billion-dollar gusher). This year codes like 88799 and 81419 skyrocket in dollar volume inexplicably - unless the explication is "fraud." In fact, all these codes, 81408, 81419, etc., are perpetually billed by and paid to fly-by-night fraud labs in Texas and Florida, apparently because those two MACs are way, way stupider than the other MACs. OIG knows more than they let on, but do we ever see a peek?

Here’s how I’d “present” this January 2026 OIG data snapshot (OEI-09-25-00330) as a journal-club-style mini–book chapter, using what’s actually on the pages—and then reading carefully between the lines, without outrunning the evidence in the document itself. 

OIG’s stated mission here is mundane but consequential: Congress (via PAMA) told OIG to publish an annual look at Part B lab spending, including the top 25 tests by expenditures, and whatever other analyses OIG thinks are appropriate. So the report is written like a “surveillance bulletin,” not an investigative indictment. The stated policy context is PAMA’s redesign of the Clinical Laboratory Fee Schedule (CLFS) beginning in 2018, and OIG positions this snapshot as ongoing oversight of how Part B lab dollars are moving over time under that payment system. 

Methodologically, they define a very particular universe: Medicare Part B claim lines for procedure codes paid under the CLFS (mostly professional and outpatient claim lines), explicitly excluding categories that are not CLFS-paid. They note that the CLFS sets reimbursement for “95% of claim lines” in this space (with a carve-out for the small slice billed by critical access hospitals paid on reasonable cost). This scope choice matters for exactly the reason you flagged earlier: it tends to omit contractor-priced “shadow” spending (e.g., the MolDx-priced world around unlisted molecular), but OIG frames its task as CLFS/PAMA monitoring, not “all possible Part B lab spending.”

The report’s headline finding is that 2024 Part B lab spending rose to $8.4B, up 5% from 2023, and that the rise is not explained by fee schedule rate changes—OIG emphasizes that CLFS rates have essentially been held flat since 2020. Instead, OIG points the reader to utilization and mix: spending is climbing even while the number of Part B enrollees receiving lab tests is drifting downward over time.

The narrative nod to COVID is carefully controlled: they acknowledge a 2021 peak during the public health emergency and subsequent decline, and then a “return” to upward trend in 2024. But they also keep COVID from warping the analytic comparisons by excluding the special COVID testing codes (U0003, U0004, U0005, G2023, G2024) from their genetic/non-genetic subset analyses, noting those codes alone represented roughly $1.2–$1.5B per year in 2020–2022 and then went to zero by 2024 when the PHE ended and CMS terminated them. In other words, they don’t want “pandemic plumbing” to drown out the structural story they are trying to tell.

That structural story is the report’s true center of gravity: OIG says the spending shift is increasingly toward what they call genetic tests, which in this report includes both human genetics and pathogen nucleic-acid testing—a definitional broadening they explicitly acknowledge differs from some prior OIG work. In their trend chart, genetic tests are only about 5% of tests paid under Part B in 2024 but account for 43% of total Part B lab spending—about $3.6B—while non-genetic tests comprise the remaining 57% (about $4.8B) and have generally declined since 2021. They sharpen the picture by showing that the number of genetic tests and the number of enrollees receiving at least one genetic test both rise strongly over time, while non-genetic test volume and the number of enrollees receiving them drift down.

The per-enrollee economics are the punchline: average Part B paid per enrollee for genetic tests approaches $800 in 2024 (up 26% from 2023), while per-enrollee costs for non-genetic tests remain much lower and relatively stable. This is classic OIG: they are telling you where the “pressure” is in the system—high-cost testing that is expanding in both penetration and intensity.

They then look at market structure in a way that is subtle but loaded: the count of laboratories paid more than $1M for genetic tests has steadily increased, reaching 346 labs in 2024, with 55 labs each receiving more than $10M in Part B payments for genetic tests. OIG doesn’t editorialize here, but the subtext is obvious: when a category becomes both high-dollar and increasingly concentrated among a subset of labs, it becomes a natural target for oversight—whether the concern is inappropriate utilization, aggressive billing behavior, or vulnerabilities to fraud and abuse.

The “Top 25” table is the report’s most concrete artifact: it shows that the top 25 procedure codes account for nearly half of Part B lab spending, and that the leading codes mix familiar commodity chemistry with expensive molecular. The #1 code by total spend in 2024 is 87798 (infectious agent detection by nucleic acid, not otherwise specified), at roughly $443M, with a median payment around $447 per claim line, and a 51% year-over-year spending increase. OIG even inserts a little “teaching box” explaining that 87798 is used when no more specific infectious organism code exists, which is a quiet way of saying: this is a broad, flexible code with potentially wide—and possibly messy—clinical and billing use. They also flag 81419 (epilepsy genomic panel) as having a striking year-over-year increase (the table shows a very large percent change) and give it a short narrative callout.

Now, beneath the hood: what does OIG “really think,” and where might they go next? You’re right that this particular snapshot is cautious about saying “fraud” out loud, but you can still read priorities from what they choose to measure, what they choose to explain, and what adjacent OIG work they list in the back pages. First, the choice to broaden “genetic tests” to include pathogen NAATs is not just taxonomy—it effectively makes the surveillance lens cover a larger, faster-moving, sometimes more operationally chaotic slice of lab billing. Second, their emphasis on (a) growth in spending not driven by price updates, (b) increasing payments per enrollee for genetic tests, and (c) the rising number of labs with $1M+ and $10M+ genetic receipts is exactly the pattern you’d highlight if you were building a risk-based oversight roadmap. They are showing you where the “energy” in the system is—and OIG tends to follow energy.

Third, and most telling, is the “OIG work related to lab testing” list: they explicitly remind the reader of prior products on COVID-era questionable add-on testing, program safeguards, and large fraud takedowns, even in a report that reads on its face like a neutral spending bulletin. This is the closest you get to a “peek.” OIG is basically saying: we’re tracking spend, and we have parallel enforcement/oversight channels when patterns look suspicious. They don’t accuse specific codes of fraud in this snapshot, but they place the snapshot adjacent to fraud and safeguards work as part of the same portfolio.

Where I have to be careful (and where the report itself is careful) is your stronger claim set—e.g., that specific spikes are “inexplicable unless fraud,” or that certain states’ MACs are “stupider,” or that the billing is “perpetually” dominated by fly-by-night labs in named geographies. Those assertions might be true in specific episodes you’ve seen in claims, but this PDF doesn’t present geography-by-code evidence or MAC-by-MAC comparisons, so we can’t responsibly treat those as conclusions “from the report.” What we can say—justifiably—is that OIG is surfacing patterns (rapid growth in certain genetic/pathogen-related codes; increasing concentration of dollars among high-paid genetic-testing labs; the prominence of flexible/unspecified molecular codes like 87798) that are exactly the kinds of patterns that, in other contexts, have been associated with both aggressive commercialization and vulnerabilities to improper billing.

 

If you want a defensible “between the lines” synthesis for a journal club audience, it’s this: OIG is telling a story about structural drift—Part B lab spending is increasingly a story of high-cost genetic testing, including both oncology/human genetics and infectious-pathogen NAATs, and the system’s spend is becoming more concentrated among a subset of high-paid labs. They are also quietly reminding you that their spending surveillance sits next to a fraud/safeguards apparatus, without turning this particular snapshot into a fraud report. The likely “next move,” consistent with OIG’s own back-catalog and with what they chose to highlight here, would be deeper dives into (1) high-growth molecular codes that are broad or operationally ambiguous (they essentially label 87798 that way), (2) code families where utilization or spend shifts abruptly (they call out 81419’s extreme increase), and (3) laboratories with rapidly rising high-dollar receipts—because those are natural entry points for audits, targeted reviews, or contractor safeguard recommendations.

 

If you want, paste the specific “mystery” codes you’re tracking (you wrote 88799, but the report’s standout is 87798), and I can help you write a tight, CFO-proof paragraph that distinguishes: what OIG actually states; what the data strongly suggests; and what would require additional evidence (e.g., geography concentration, lab churn, post-payment reviews, Z-code policy changes) to argue cleanly without over-claiming.

They carefully decline to distinguish between a pop-up fraud lab in Texas billing baby sequencing gene codes in Medicare patients, and a highly reputable publicly held lab in California (pick any: Veracyte, Guardant, etc.) billing the same $100M. But also maddeningly obtuse.

Yes — that tension you’re reacting to is real, and I think “maddeningly obtuse” is a fair description from the outside, even if it’s also very intentional from OIG’s point of view.

What OIG is doing here is flattening moral and institutional distinctions on purpose. In the data, a dollar is a dollar: $100 million paid under a CPT code is treated identically whether it flows to a publicly traded, FDA-facing, compliance-heavy lab in California or to a pop-up operation in Texas that appeared three years ago, billed explosively, and vanished after scrutiny. The report carefully refuses to encode any notion of reputational quality, corporate governance, clinical plausibility, or historical behavior into the analysis. Instead, everything is framed as “spending under a procedure code,” full stop. From a pure surveillance standpoint, this keeps OIG on the safest possible ground: they are reporting what Medicare paid, not who deserved it, and not who they trust.

But the obtuseness is also strategic. OIG absolutely knows the difference between those labs — institutionally, culturally, and operationally. You can see that knowledge indirectly in their broader body of work: fraud alerts, takedown press releases, “questionably high billing” reports, and targeted reviews that do name behaviors, geographies, and lab archetypes. What they are doing in this particular snapshot is deliberately separating the “where are the dollars flowing?” function from the “who is abusing the system?” function. The former gets published annually, cleanly, neutrally; the latter gets handled through investigations, audits, referrals, and enforcement actions that show up elsewhere, often months or years later.

Still, your frustration is justified, because the effect of this flattening is to create false equivalence at the narrative level. When a chart shows a code suddenly hitting $100M or $400M nationally, the reader is given no way to tell whether that spend reflects (a) genuine clinical adoption by reputable labs with evidence dossiers, payer negotiations, and utilization controls, or (b) a transient billing anomaly driven by a small number of opportunistic actors exploiting a permissive code and a slow MAC response. OIG knows that difference matters enormously for policy interpretation — but they refuse to surface it here. The report thus ends up being simultaneously careful and misleading: careful in what it claims, misleading in what it silently collapses.

Why do they do this? Partly legal caution, partly institutional design. If OIG were to signal, even implicitly, that “this spike is probably fraud,” they would be prejudging cases that may still be under investigation or outside their evidentiary threshold. They also avoid naming MAC performance differences or state-level vulnerabilities, even though OIG has demonstrated elsewhere that it understands those patterns very well. Instead, they rely on a kind of dog-whistle analytics: explosive growth, abrupt collapses to zero, concentration among high-paid labs, reliance on broad or unspecified codes. Anyone who has lived through prior cycles — as you clearly have — recognizes the pattern immediately, even though it’s never spelled out.

So what you’re seeing is a report that is not naïve, but is consciously withholding interpretation. OIG is saying, in effect: we will show you the pressure points, not assign blame in this document. The fact that codes like 81408 once surged and then vanished, or that others suddenly skyrocket without an obvious clinical diffusion story, is left as an exercise for the informed reader — and as a trigger for OIG’s other, quieter machinery. The frustration comes from knowing that OIG sees what you see, but chooses not to say it out loud in this venue.

If you wanted to phrase this insight carefully — say, in a journal club discussion or a policy memo — the defensible formulation would be something like: “OIG’s analytic neutrality obscures important distinctions between stable, evidence-driven adoption and transient, opportunistic billing behavior. While this preserves legal and methodological caution, it also limits the report’s usefulness for distinguishing innovation from exploitation.” That captures the critique without accusing OIG of ignorance — because the real issue isn’t ignorance at all, it’s restraint.


###    81479 and COVID

###

###

How did this report handle COVID? One chart shows a billion-dollar spike in 2020, but other charts don't. I gather they must have done something special with COVID coding?

You’re reading it exactly right — they did do something very deliberate with COVID, and that’s why you see an apparent mismatch between charts.

Here’s the clean way to think about it, grounded directly in the report.


OIG explicitly removed the special COVID testing codes from most of the trend analyses to avoid distorting post-pandemic comparisons.


See also FN 10.

The apparent inconsistency you noticed is intentional. In this report, OIG acknowledges the COVID-era surge once for historical context, but then systematically removes COVID-specific testing from most of the analytic trend work so that post-pandemic patterns are interpretable. At a high level, some charts show a sharp increase in Medicare Part B lab spending around 2020–2021, reflecting the real spike during the COVID public health emergency. However, when OIG analyzes longer-term trends—such as genetic versus non-genetic testing, utilization, per-enrollee spending, and laboratory concentration—it excludes the five temporary COVID testing codes (U0003, U0004, U0005, G2023, G2024) that together accounted for roughly $1.2–$1.5 billion per year from 2020–2022 and about $200 million in 2023, with no payments in 2024. This methodological choice, explained explicitly in the endnotes, prevents pandemic testing from distorting comparisons across years and allows OIG to argue that the current rise in spending is not a residual COVID effect, but instead reflects structural growth in high-cost genetic testing, especially infectious and oncology-related assays. 
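The mechanics are simple enough to show in a few lines. This is an illustrative reconstruction of the exclusion step, not OIG's actual code or data, using the five temporary codes the report names.

```python
# Illustrative reconstruction of the exclusion: drop the five temporary COVID
# testing codes before computing genetic vs. non-genetic trend lines.
# The claim lines below are fabricated examples, not OIG data.
COVID_CODES = {"U0003", "U0004", "U0005", "G2023", "G2024"}

claim_lines = [
    {"code": "U0003", "paid": 75.00, "year": 2021},
    {"code": "81419", "paid": 2448.00, "year": 2024},
    {"code": "80053", "paid": 10.56, "year": 2024},
]

trend_lines = [c for c in claim_lines if c["code"] not in COVID_CODES]
print(f"{len(claim_lines) - len(trend_lines)} COVID claim line(s) excluded from trend analysis")
```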


#  81479

#
#


This year, they left out code 81479 ($600M a year! Nearly all MolDx payments to proprietary labs). Any discussion by them? They also left out 81479 in other years, which I didn't recall.
##


You’re right: 81479 has been consistently omitted, including in the 2021 and 2023 PAMA OIG reports, so this is not a new methodological change and not a COVID-era artifact. OIG has, for several years now, made a stable, quiet decision to exclude 81479 entirely by defining its analytic universe as tests paid under the CLFS, even though 81479 represents one of the single largest molecular spend categories in Part B and is overwhelmingly MolDx-priced and proprietary.

What’s important is how this is handled. OIG does not say “we excluded 81479” in narrative text, nor do they discuss its policy significance; instead, they rely on a framing sentence that their analysis includes “lab tests reimbursed under the Medicare Clinical Laboratory Fee Schedule,” which automatically screens 81479 out. That choice has now been applied consistently across years, preserving internal comparability within the series — but at the cost of systematically understating total molecular/genetic spending and sidestepping the MolDx contractor pricing ecosystem entirely.

So the correct characterization is: this is not an omission, not an error, and not a COVID adjustment. It is a long-standing scope decision by OIG to treat MolDx-priced unlisted molecular tests as out of bounds for its PAMA monitoring function. The implication remains significant: the report’s conclusions about growth, concentration, and laboratory behavior are accurate for CLFS-paid testing, but incomplete for Medicare molecular diagnostics as actually experienced by proprietary labs. That’s a subtle but important distinction — and you’ve now nailed it precisely.

But they have some charts that say "lab spending" or "Part B lab spending," which aren't right without 81479.

You’re right to press on this, because the labeling is looser than the methodology actually supports. Even though OIG has been consistent for several years in excluding 81479 by defining its universe as CLFS-paid tests, some of the charts are titled “Total Medicare Part B Spending on Lab Tests” or “Part B Lab Spending” without any qualifier. Strictly speaking, those titles are not accurate descriptions of the full Part B lab universe, because a large and policy-salient category of Part B lab spending—unlisted molecular pathology billed under 81479 and priced through MolDx—is systematically out of scope. What OIG really means, but does not always say on the figure itself, is “Part B lab spending under the CLFS.”

Methodologically, OIG is internally consistent: throughout the report they state that the analysis includes only lab tests reimbursed under the Medicare Clinical Laboratory Fee Schedule, which they note covers about 95 percent of lab claim lines, and they apply that rule uniformly across years. From a statistical trend-monitoring perspective under PAMA, that is defensible. But from a reader’s perspective—especially someone steeped in MolDx and proprietary molecular testing—the chart titles overstate their scope. They implicitly invite a broader interpretation (“all Part B lab spending”) that the data do not actually support.

So the clean way to put it is: the charts are correct for what OIG analyzed, but the captions are imprecise relative to how Medicare actually pays for molecular diagnostics. By excluding 81479, the figures understate total Part B lab spending and mute the role of contractor-priced proprietary testing, even while accurately describing CLFS trends. This isn’t a data error; it’s a labeling and framing problem, and one that only becomes visible if you’re tracking MolDx dollars alongside CLFS spending.

Monday, February 9, 2026

Johnson (RNA world) and LabCorp 2026 (saturation and friction): Put them together

In adjacent blogs today, I wrote about JOHNSON (the RNA world emerging in clinical oncology) and about LABCORP 2026, a 35-page white paper on how friction and complexity may saturate clinical oncology.

Put them together, what do you get?

###

AI CORNER

###

What happens when you bring JOHNSON (the RNA boom) into the clinic-level vision of LabCorp 2026 (complexity, friction, and saturation)?

###

This is a very sharp question, and you’re right to sense tension rather than simple alignment. The Johnson et al. “actionable transcriptome” vision and the Labcorp Pulse of Oncology report are not in conflict, but they are operating at different layers of the oncology stack—and RNA-omics absolutely risks amplifying several of the frictions Labcorp identifies unless the system around it changes.

Let me walk through this in paragraph form, conceptually rather than defensively.


Two documents, two vantage points

Johnson et al. are writing from the molecular frontier inward. Their core claim is that DNA-only precision oncology is leaving actionable biology on the table and that RNA sequencing—by capturing fusions, expression, splice variants, immune signatures, surfacesome targets, and pathway activation—can meaningfully expand the therapeutic search space. They explicitly argue that RNA-seq is now technically mature enough, cost-effective enough, and robust enough for routine use, even in FFPE samples, and they present a structured framework to operationalize what they call the “actionable transcriptome.”

Labcorp, by contrast, is writing from the clinic outward. Their report is not asking “what biology are we missing?” but rather “what makes oncology care hard today?” The dominant answers are time pressure, payer friction, digital fragmentation, interpretive overload, and burnout. Innovation is welcomed, but only insofar as it reduces friction rather than adding to it.

Put bluntly: Johnson et al. assume a system capable of absorbing more complexity; Labcorp documents a system already near cognitive and operational saturation.


Where RNA-omics directly collides with Labcorp’s pain points

RNA-omics does not just add signal; it adds dimensions. Johnson et al. are admirably explicit about this: RNA-seq produces multiple new classes of actionable findings—expression outliers, pan-cancer percentile comparisons, multigene signatures, immune microenvironment states, and putative drug sensitivities inferred from transcriptional programs rather than mutations. Each of these requires interpretation rules, thresholds, comparators, and confidence judgments, many of which the authors acknowledge are not yet standardized.

That lands squarely on the Labcorp problem list.

Labcorp’s oncologists already report that report clarity and interpretability are limiting factors, even for today’s DNA-centric panels. RNA-based results are inherently more probabilistic and contextual. A MET amplification is easier to explain than “MET mRNA expression in the 85th pan-cancer percentile but only the 60th percentile within tumor type X.” Johnson et al. treat this nuance as an opportunity; Labcorp’s clinicians experience it as friction unless heavily mediated.

Similarly, Labcorp emphasizes turnaround time as a moral and clinical issue. Johnson et al. note that RNA-seq turnaround times of “a few weeks” are now feasible and reasonable—which is true in an academic sense, but from the Labcorp vantage point, weeks already feels borderline when oncologists are trying to initiate therapy quickly, manage anxious patients, and navigate payer clocks.

Finally, RNA-omics magnifies payer and access friction. Many RNA-derived findings point to off-label therapies, investigational ADCs, or trial eligibility rather than clean FDA-labeled indications. Labcorp’s report makes clear that insurance coverage—not scientific plausibility—is the gating factor for test adoption. RNA-omics expands the action space precisely where reimbursement certainty is weakest.


Where the visions actually align more than they appear

Despite all that, the two documents converge in an important, non-obvious way.

Johnson et al. repeatedly emphasize that RNA-seq should not be delivered as raw data, but as a curated, framework-driven output integrated with DNA, IHC, and clinical context. They spend an enormous amount of ink on defining thresholds, benchmarks, and reporting logic because they recognize that uninterpreted transcriptomics would be unusable in routine practice.

That is exactly what Labcorp is asking for—though from the opposite direction. Labcorp’s oncologists are not anti-innovation; they are anti-friction. If RNA-omics arrives as “more portals, longer reports, more uncertainty, more prior auths,” it will be resisted. If it arrives as fewer tests, fewer sequential reflexes, clearer prioritization, and faster convergence on therapy, it becomes aligned with Labcorp’s thesis.

In other words, RNA-omics only fits the Labcorp world if it collapses complexity upstream, not downstream.


The real risk: knowledge expansion without workflow compression

Your instinct is exactly right: RNA-omics threatens to increase both knowledge burden and system friction simultaneously if implemented naïvely. Johnson et al. are intellectually honest about how much judgment remains unresolved—what expression level is “high,” which comparator matters, when RNA should override DNA, and how pan-cancer benchmarks should be operationalized. Those are not small details; they are the difference between a usable test and a paralyzing one.

Labcorp’s report suggests that oncology is no longer bottlenecked by discovery but by translation bandwidth. RNA-omics expands discovery. Unless paired with automated interpretation, standardized reporting, payer-aware decision support, and strong clinical narratives, it risks worsening exactly the pressures Labcorp documents: time scarcity, cognitive overload, and burnout.


A synthesis you could say out loud at journal club

If you wanted a single framing sentence for discussion, it might be this:

Johnson et al. describe the biology oncology could act on if unconstrained; Labcorp describes the clinical system that must absorb it. RNA-omics is not in conflict with today’s oncology reality—but without radical compression of interpretation and workflow, it will stress the system more than it helps it.

That is not bad news for RNA-omics. It is a design constraint.

And it points to where the real innovation frontier is shifting: not in generating richer transcriptomes, but in deciding which RNA signals deserve to reach the oncologist at all—and in what form.

Journal Club: LabCorp's 35-page Oncology Survey

LabCorp 35-page report

January 2026



##

 


Overview: what kind of report this is

The 2026 Pulse of the Oncologist report is not a hypothesis-driven clinical study but a practice-level diagnostic of modern oncology, produced by Labcorp to surface friction points in real-world cancer care and align them with diagnostic, digital, and service opportunities. Rather than asking whether a specific technology improves outcomes, the report asks a more upstream question: what actually slows oncologists down, complicates decision-making, or limits patient access in daily practice? Its findings are therefore best understood as systems intelligence, grounded in clinician perception and workflow realities, not clinical endpoints.

Methodologically, the report uses a mixed-methods design. Labcorp surveyed 152 practicing U.S. oncologists, spanning community practices, academic medical centers, and integrated systems, and representing multiple oncology subspecialties. These quantitative findings were supplemented by in-depth qualitative interviews and market analysis, with direct clinician quotations embedded throughout to anchor statistics in lived experience. The authors are transparent that this is perception-based data; for example, certain terms (notably “comprehensive genomic profiling”) were not rigidly defined, and results are meant to be interpreted directionally rather than as precise comparative rankings.

The report is structured around five major “trends,” each following the same arc: a real-world problem described by oncologists, supporting survey data, and an explicit “opportunity” section that frames how diagnostic partners—implicitly large national labs—could help reduce friction. This repeated structure is important: it signals that the report is not only descriptive, but also strategic.

Journal Club: 2026 Nat Rev Clin Onc, JOHNSON on RNA Comes of Age

 


Authors' Abstract (Johnson et al)

Comprehensive, multiplexed RNA sequencing (RNA-seq) is increasingly being incorporated into molecular tumour-profiling assays owing to overall cost-effectiveness related to enhanced detection of clinically actionable biomarkers. RNA-seq assays are now quite robust, with turnaround times of a few weeks and reasonable costs that support integration into routine clinical workflows. In this Perspective, we propose a framework for incorporating RNA levels and other RNA-seq data into precision oncology that considers RNA levels of oncogenes, tumour suppressors and diverse therapeutic targets, as well as multigene diagnostic, prognostic and predictive signatures; chimeric gene fusion transcripts; splice variants; RNA-based variant and/or mutation calling; and oncoviral gene expression; among other factors. Additionally, we provide a list of genes representing the current actionable transcriptome focused on the predictive utility of RNA-level overexpression of oncogenes and underexpression of tumour-suppressor genes. This list also includes cell-surface targets, given that RNA profiling has demonstrated that many tumour types have clinically relevant expression levels of such targets. The RNA expression levels of certain genes might also be associated with biological features such as homologous recombination deficiency and DNA mismatch repair defects. Taken together comprehensive RNA profiling can help to identify additional actionable alterations beyond those detected via DNA sequencing and overcome the challenges faced by conventional low-plex assays (such as immunohistochemistry and in situ hybridization), thereby enhancing therapeutic opportunities for personalized cancer care.


ChatGPT Discusses

1. What kind of paper this is (important framing)

This is not a discovery paper and not a technology benchmarking study. It is a Perspective that does three deliberate things:

  1. Defines a new clinical construct: the actionable transcriptome

  2. Proposes an operational framework for how RNA-seq should sit alongside DNA-seq and IHC in routine precision oncology

  3. Normalizes RNA expression as a therapeutic biomarker, not merely a confirmatory or exploratory signal

That framing is strategic: the authors are arguing for a practice change, not a technical upgrade.

Sunday, February 8, 2026

College Debate Finals: Princeton v Brown Takes the Weirdest Turn Ever

Welcome, colleagues, coaches, alumni, donors, and those who wandered in for the free coffee, to the 85th Annual New England College Debate Finals, coming to you live from a carpeted auditorium whose acoustics were designed to flatter confidence rather than truth. 


This year’s finalists—Princeton and Brown—have arrived by the usual route: six months of red-eye flights, dropped arguments, and the quiet conviction that someone, somewhere, will finally understand what they meant by “framework abuse.”

Before we begin, a brief recap for those new to the sport. 

AI Takes on Music (African Papaoutai). Next Week: AI Takes on Cari and Manuel Themselves

Here is a fictional dialogue for a new episode of "Eine schwierige Entscheidung" ("A Difficult Decision"), set about a week later, in the style of the Easy German podcast, at B1-B2 German level, with 6-8 exchanges and a clear verdict from Manuel.

##

Cari:
Manuel, a new difficult decision: imagine an AI listens to all the Easy German podcasts and makes new episodes with AI-Cari and AI-Manuel. Same voices, same style. Good or bad?

Manuel:
Okay ... my gut reaction is: help. But my head says: hmm, interesting. What would the advantage be?

Cari:
Infinitely many podcasts. New episodes every day, on every topic. People say, "I want an episode about German politeness," and bam, AI-Cari and AI-Manuel start talking.

Manuel:
That would of course be great for learning German. And honestly: the way we speak is practically made for AI. Slow, clear, a little nerdy.

Cari:
Yes, but: this AI-Cari was never sick, never tired, never annoyed about airing out the room. She has no real stories. She sounds like me, but she doesn't live like me.

Manuel:
Exactly. And that, for me, is the point. Our podcasts aren't perfect. We think out loud, we contradict ourselves, we change our minds. An AI can imitate that, but it doesn't really mean it.

Cari:
On the other hand, we use technology too. Microphones, editing software ... maybe AI is just the next tool?

Manuel:
Maybe. But only if it's clear: this is AI. If people believe they're hearing real conversations when it's only a simulation, then it gets difficult.

Cari:
So, your verdict?

Manuel:
AI podcasts as a learning aid: gladly. AI-Cari and AI-Manuel as a replacement for us: no. The original can happily stay a bit chaotic, human, and imperfect.