Saturday, June 27, 2026

Scrooge and the Three Ghosts - The Re-Boot

In which the tale of Scrooge is retold, but with the energetic and playful capuchin monkey, Mr. Throttlebottom-Smith.



Ebenezer Scrooge had scarcely recovered from the alarming visitation of his late partner, Jacob Marley, when the first of the spirits entered his bedchamber.

THE FIRST GHOST

To Scrooge's surprise, the Ghost of Christmas Past was not any of the expected types. Instead, there stood Mr. Throttlebottom-Smith, an energetic and mischievous capuchin monkey of approximately thirty pounds. No sooner had Scrooge noticed him than the monkey climbed the bedpost, removed Scrooge's nightcap, inspected it for insects, and placed it upon his own head.

Chattering excitedly, the monkey opened a glowing portal into the past.

There followed scenes from Scrooge's lonely childhood, his apprenticeship under the jovial merchant Fezziwig, and his courtship of Belle. Mr. Throttlebottom-Smith insisted on interrupting each one by swinging from chandeliers, stealing pocket watches, or disappearing with the bowl of walnuts.

The next moment Scrooge found himself in his bedroom, the clock chiming two.

THE SECOND GHOST


The Ghost of Christmas Present appeared as an enormous capuchin monkey nearly six feet tall and weighing, by conservative estimate, over three hundred pounds. He wore magnificent robes trimmed with holly and an expression of limitless good cheer.

They traveled around London, seeing families of modest means who were nonetheless truly happy, at least until they noticed their tables left near-empty by the ravenous passing ghost. Back at Scrooge's bedroom, the ghost vanished as the clock struck three.

THE THIRD GHOST


The next thing Scrooge knew, he was shivering in a cold, snowy cemetery, standing at his own grave, and being pelted by endless mounds of monkey poo until he was covered from head to toe.

Unfortunately, in this version of the tale, either the monkey or the man has tripped an interstellar circuit breaker, and the scene repeats.   

Noble visitor, we advise you to withdraw and live a better life.  


Is an ICD an AI? Musings on Strange Behavior Orbiting Around Appendix S

CLAUDE: IS AN ICD AI?

An implantable cardioverter-defibrillator (ICD) includes a pacemaker function, but it also continuously monitors the heart rhythm, converting it in real time into internal parameters using advanced, flexible, adaptive software. When certain thresholds are exceeded, the device delivers a cardioversion shock. Is this autonomous AI?


Yes — read literally against the 2027 definitions, the ICD is autonomous, and specifically Level III. Walk it through the spine: the output is clinically useful (it treats arrhythmia, squarely inside the §201(h) verbs); the software converts the raw electrogram into derived parameters (rate-zone assignment, interval stability, onset, morphology-match scores, SVT discriminators) that are categorical/quantitative and well beyond mere summation, so it clears the assistive boundary; and it doesn't stop at a parameter — it reaches a conclusion ("this is a shockable rhythm") and initiates the intervention itself. That conclusion-plus-action is the autonomous trigger. Then the level: it automatically initiates a management action without a human in the loop and keeps operating unless a physician intervenes by reprogramming or deactivating. The 2027 Level III text — automatically initiates management, requires physician oversight and review of performance, continues unless a physician chooses to intervene — reads almost like it was written from a device-clinic follow-up note. Even the new transparency requirement is satisfied: the ICD logs and reports its derived parameters and episode electrograms on interrogation.

And that is precisely the embarrassment that proves your opening complaint. Nobody calls an ICD "AI." It's a deterministic, decades-old, FDA-regulated device.

But because the taxonomy pointedly refuses to define AI and the criteria never read AI-ness, the taxonomy has no mechanism to exclude it. A 1985 rate-only defibrillator would clear the same gates. So taken at its word, the taxonomy reclassifies the entire installed base of automatic implantable defibrillators, rate-adaptive pacemakers, and closed-loop insulin pumps as "autonomous AI, Level III." When your AI taxonomy swallows the whole history of automatic implantable therapy, the word "AI" in the title is doing no work — which is exactly what you said in your first message, now with a pacemaker in it.

Two refinements worth keeping. First, the only thing that actually keeps the ICD out is the application-stage trigger — the taxonomy is invoked by code-change applications "describing use of AI-enabled medical services," and nobody files an AI-software CCA for an ICD's detection algorithm. The device is paid through implantation, interrogation, and DRG/pass-through pathways; CPT codes the procedure, not the algorithm. So the undefined term "AI-enabled" does the covert gatekeeping that the classification criteria can't — your residual incoherence, made concrete.

Second, the sophistication inverts the cleanliness. A modern discriminator-laden ICD derives parameters and is unambiguously Level III. But a primitive rate-only ICD detects without really deriving a parameter, yet still auto-acts — which drops it into the off-diagonal empty cell I flagged earlier (detection-without-a-parameter wired to autonomous action), a combination the taxonomy has no box for. So the cruder the device, the more it falls into the taxonomy's blind spot; the more advanced, the more cleanly it's captured. That's backwards from what an "AI" taxonomy should do.
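The walk-through above can be sketched as a small decision function. This is only an illustration of the reasoning in this reply, not the Appendix S text: the `Device` fields, the `classify` function, and the returned labels are all hypothetical names, and the two example devices are simplified. The thing to notice is that no field anywhere asks whether the software is "AI."

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical summary of a device, using only the gates named above."""
    name: str
    clinically_useful: bool         # output diagnoses or treats something
    derives_parameters: bool        # categorical/quantitative, beyond summation
    reaches_conclusion: bool        # e.g. "this is a shockable rhythm"
    auto_initiates: bool            # starts management with no human in the loop
    continues_until_override: bool  # keeps acting unless a physician intervenes


def classify(d: Device) -> str:
    """Apply the gates in the order the walk-through uses them (a sketch)."""
    if not d.clinically_useful:
        return "outside the taxonomy"
    # The off-diagonal case: action without a derived parameter.
    if d.auto_initiates and not d.derives_parameters:
        return "no box: auto-acts without a derived parameter"
    if d.derives_parameters and d.reaches_conclusion:
        if d.auto_initiates and d.continues_until_override:
            return "autonomous, Level III"
        return "autonomous, level not settled by this sketch"
    return "below the autonomous trigger"


# Simplified, assumed inputs for the two devices discussed above.
modern_icd = Device("discriminator-laden ICD", True, True, True, True, True)
rate_only_icd = Device("rate-only ICD", True, False, True, True, True)

for device in (modern_icd, rate_only_icd):
    print(f"{device.name}: {classify(device)}")
```

Under these assumed inputs, the modern ICD lands in Level III and the rate-only ICD lands in the blind spot, which is the inversion described in the second refinement.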

#

#

Chat GPT - Same Question

#

#

Under Appendix S as written, your implantable cardioverter-defibrillator example strongly exposes the taxonomy’s problem.

My best answer: functionally, it behaves like autonomous Level III software, but Appendix S may not clearly classify it as “autonomous AI” unless the AMA first agrees that the service is within the AI/software-output taxonomy.

Here is why.

An implantable cardioverter-defibrillator does several things that match the autonomous side of Appendix S:

It continuously monitors physiologic input.

It derives internal parameters from rhythm data.

It applies adaptive software rules to determine whether a dangerous rhythm threshold has been crossed.

It initiates a therapeutic intervention—a cardioversion or defibrillation shock—without waiting for a physician to approve it in real time.

That looks very close to autonomous Level III in the 2027 Appendix S framework, where software automatically initiates management actions based on interpretations or conclusions drawn from derived parameters, and those actions continue unless a physician intervenes. The 2027 text defines autonomous software as deriving parameters and independently generating clinically meaningful interpretations or conclusions, with Level III automatically initiating management actions subject to physician oversight.

But this example also shows why Appendix S is under-engineered. The taxonomy is framed around “AI-enabled” services, while also saying that “AI” is not defined. An ICD may use sophisticated software, adaptive thresholds, signal processing, and decision logic, but many people would call it an advanced implanted device rather than “AI.” Others could plausibly say it is algorithmic autonomous clinical software. Appendix S does not give a gatekeeping rule to settle that dispute.

So the answer depends on the question being asked.

If the question is, “Does an ICD behave like autonomous software under the Appendix S control model?” then yes, very much so. It detects/derives rhythm parameters, reaches a clinically meaningful conclusion, and initiates therapy without concurrent physician involvement.

If the question is, “Would AMA necessarily classify an ICD as autonomous AI under Appendix S?” then not necessarily. The taxonomy does not define AI, does not provide a flow chart, and does not explain whether long-established embedded device algorithms are inside or outside the AI category.

This is a perfect stress-test example. It separates method label from clinical behavior. By behavior, the ICD is autonomous. By branding or regulatory convention, it may not be called AI. That is exactly the paradox: Appendix S classifies output roles better than it defines the universe of things to which the classification applies.

#

#

One might also argue that AMA CPT provides codes for the surgical procedure of implanting an ICD. CPT doesn't really care whether the surgeon is implanting an ICD or a teacup. Of course, the code specifies an ICD to support RVU surveys of physician time and so on. But what is coded is this morning's surgical procedure, not a distant future electronic event.

#

Yes — that is exactly the counterargument, and it is a strong one.

The ICD example only becomes an “autonomous AI” puzzle if we treat the implanted device’s future behavior as the service being classified. But CPT may be doing something much narrower. It is coding the physician service performed today: implanting, replacing, programming, interrogating, or managing an ICD. The code descriptor names the ICD because the device type affects physician work, surgical complexity, survey valuation, practice expense, risk, and clinical context. But CPT is not necessarily trying to classify every future algorithmic event the device may perform after implantation.

So the teacup analogy is useful, with a caveat. CPT does care that it is an ICD rather than a teacup because the surgical work and clinical purpose are different. But the unit of coding is still the medical service, not the autonomous downstream life of the object. A surgeon implants an ICD in the morning. Months later, the device detects ventricular tachycardia and delivers a shock. That future shock is not the surgeon’s separately coded morning procedure.

This helps distinguish two things that Appendix S may blur:

1. A service that uses software output as part of the coded service.
Example: retinal imaging software analyzes an image and produces a diagnostic or referral output that is itself central to the billed service.

2. A device whose implanted function includes autonomous therapeutic behavior.
Example: an ICD that continuously monitors rhythm and later delivers therapy, but where CPT may be coding implantation, interrogation, programming, or device management—not each future internal decision event.

That distinction weakens the argument that ICD implantation itself must be “autonomous AI.” The implanted ICD may contain autonomous control logic, but the CPT code for implantation is not necessarily a code for the autonomous logic. It is a code for placing the device.

This is another reason a logic tree would help. It should begin with a gate question such as:

What is the CPT service being classified?
Is the software output itself part of the billed medical service, or is the software merely a capability of a device that is implanted, maintained, or monitored under separate procedural codes?

Without that gate, Appendix S risks confusing the coded physician service with the future behavior of an algorithmic device. That is not a small distinction. It is central to how CPT works.
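As a rough sketch, the gate question above could sit in front of any Appendix S classification like this. Everything here is hypothetical: `CodedService`, `appendix_s_gate`, and the two example services are illustrative names and simplified inputs, not anything AMA has published.

```python
from dataclasses import dataclass


@dataclass
class CodedService:
    """Hypothetical description of the service a CPT code pays for."""
    descriptor: str
    software_output_is_billed_service: bool  # is the output itself the service?


def appendix_s_gate(service: CodedService) -> str:
    """Ask what is being coded before asking how autonomous the software is."""
    if service.software_output_is_billed_service:
        return "apply the taxonomy to the software output"
    return "skip the taxonomy: the code pays for a procedure or device service"


# Simplified versions of the two examples in the text.
retinal = CodedService("retinal imaging with software-generated output", True)
icd_implant = CodedService("ICD implantation", False)

for svc in (retinal, icd_implant):
    print(f"{svc.descriptor}: {appendix_s_gate(svc)}")
```

The design choice is simply ordering: the gate runs first, so a device's future algorithmic behavior never reaches the classification step unless the software output is what is being billed.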

Tuesday, June 23, 2026

LCD Manual, Chapter 13, the Obscure "Exhibit 3" for CAC Membership

 Older CAC instructions ("PIM Chapter 13, LCDs") refer to Exhibit 3, which is hard to find today.  It is a listing of state-level CAC specialists.   

See one example here:

https://www.cms.gov/regulations-and-guidance/guidance/transmittals/downloads/r71pi5.pdf

BQ file copy, 2004 CMS 0409 PIM Ch 13 LCDs Exhibit 3 here R71PI5.pdf 

###

Yes. I found the legacy CMS Program Integrity Manual Exhibits PDF online, in CMS Transmittal R71PI5 from April 9, 2004. It contains “Exhibit 3 — Description of CAC Members”, which is exactly the missing companion to the old PIM Chapter 13 CAC section. The exhibit begins on PDF page 9 of the transmittal. (Centers for Medicare & Medicaid Services)

The key language is quite revealing. Exhibit 3 says Medicare defined “physicians” for CAC purposes as doctors of medicine, osteopathy, dental surgery/dental medicine, chiropractors, podiatrists/surgical chiropodists, and optometrists — and then stated bluntly: “Do not include other practitioners on this committee.” It also said carriers should select committee representatives from names recommended by state medical societies and specialty societies. (Centers for Medicare & Medicaid Services)

The exhibit then instructed carriers to try to include physician representatives from state medical/osteopathic societies, the National Medical Association where applicable, Medicare managed care organizations, and physician representatives for chiropractic, maxillofacial/oral surgery, optometry, and podiatry. (Centers for Medicare & Medicaid Services)

The specialty list in the 2004 Exhibit 3 included:

Allergy; Anesthesia; Cardiology; Cardiovascular/Thoracic Surgery; Dermatology; Emergency Medicine; Family Practice; Gastroenterology; Gerontology; General Surgery; Hematology; Internal Medicine; Infectious Disease; Medical Oncology; Nephrology; Neurology; Neurosurgery; Nuclear Medicine; Obstetrics/Gynecology; Ophthalmology; Orthopedic Surgery; Otolaryngology; Pathology; Pediatrics; Peripheral Vascular Surgery; Physical Medicine and Rehabilitation; Plastic and Reconstructive Surgery; Psychiatry; Pulmonary Medicine; Radiation Oncology; Radiology; Rheumatology; and Urology. (Centers for Medicare & Medicaid Services)

It also had separate sections for a clinical laboratory representative, two beneficiary representatives — one representing elderly issues and one representing disabled beneficiaries — and “other organizations,” including the State Hospital Association, QIO medical director, intermediary medical director, Medicaid medical director or designee, and a representative of an administrative-practice organization such as MGMA. (Centers for Medicare & Medicaid Services)

One important later wrinkle: CMS issued Transmittal R106PI in March 2005 specifically to add interventional pain management specialists to CAC membership. The transmittal says carriers were required to maintain CACs and that certain specialties “must include interventional pain management specialists”; the business requirement stated that contractors shall include one physician representative from the clinical specialty of interventional pain management on their CAC.

So the best reconstruction is:

2004 Exhibit 3 = the long baseline roster of required/recommended CAC specialties and organizational representatives.

2005 R106PI = a targeted amendment adding interventional pain management to the CAC specialty representation list.

This is a useful find because it confirms that the old CAC was not just a vague advisory body. It was a highly specified, state/specialty-society roster model — much closer to a standing medical parliament for the carrier’s state or jurisdiction than to the later 2019-style evidence panel.

Monday, June 22, 2026

For Claude: How to get cut-paste blog text as output from Claude

 The phrase that did the work was asking for it rendered in the chat, for copy-paste into Blogger Compose view — not as a file or HTML.

Here's a request you can reuse:

"Write this as a blog post and render it directly in the chat as formatted text — headings, bold, a table, and live hyperlinks — so I can select it, copy, and paste into Blogger's Compose view. Don't give me an HTML file or a code block; I want the formatted output itself."

The three things that make it land:

  1. "Render it in the chat as formatted text" — this tells me to output the post as normal rich Markdown in my reply, which your browser turns into formatted text on the page. That's what carries clean into Compose when you copy it.

  2. "Not an HTML file / not a code block" — this steers me away from the two formats that break in Compose. A file gives you nothing to copy-paste as text; a code block (the gray monospace boxes) pastes raw <a href> source that Compose shows as literal characters, not links.

  3. "Live hyperlinks I can paste into Compose" — confirms you want clickable links to survive the copy, so I use inline [text](url) Markdown rather than bare URLs or footnote-style references.

One honest caveat so it never surprises you: the parts that paste most reliably are the prose, bold, headings, and links. Tables are the one element that sometimes arrives plain in Compose, depending on browser and paste target. If a table ever comes in looking stripped, just say "give me that table as a simpler layout for Compose" and I'll reshape it.

And a shorthand once we've done it once: "Same as the CAC post — rendered for Blogger Compose" will get you the identical treatment without spelling it all out.

Sunday, June 21, 2026

Unedited AI: Why My 2008 and 2014 Papers Were Good

https://www.twentyfirstcenturymedicine.org/wp-content/uploads/2014/02/Coverage_and_Reimbursement_for_Molecular_Diagnostics_Current_Issues.pdf

 It looks remarkably good after twenty years — not merely “good for its time,” but structurally correct.

The strongest surviving insight is this: you framed diagnostics reimbursement not as one problem, but as five payer operations — benefit category, billing, coding, pricing, and coverage. That was exactly the right map. Most later debates still get lost because they talk about “coverage” or “value” as if that solves coding, pricing, claims flow, and benefit rules. Your paper already separated them.

The second thing that aged very well is the warning that administrative barriers can become as important as scientific barriers. That is still the field. Molecular diagnostics got technically better, but the bottleneck often remained Z-codes, CPT/PLA, LCDs, DOS rules, NCD uncertainty, coding specificity, and CLFS pricing. You wrote that legacy administrative conventions should not become a greater barrier than natural scientific challenges; twenty years later, that sentence could still open a conference panel.

The best part, given our current discussion, is your 2008 “note on economics of pricing.” You already made the exact point that many 2020s value-pricing papers still miss: if a test costs $1 and saves $1,000, but many competitors can produce it, the price tends toward cost, not value. You also noted that value-based pricing cannot establish a single price without competition, alternatives, buyer uncertainty, segmentation, and negotiating power. That is not dated; that is the core missing distinction between value creation and value capture.

What looks dated is mostly the machinery: code-stacking, ICD-9, S-codes, old DOS rules, and the pre-PAMA world. But those are surface features. The deeper architecture — the mismatch between fast diagnostic innovation and slow public payment systems — is still intact.

My overall read: this was not just a white paper. It was the beginning of a durable mental model. The field changed; the model survived.

##

Frueh and Quinn, 2014


Yes — I think your instinct is right. The originality-per-page is high.

What is clever is that the paper does not merely rename “clinical utility.” It changes the unit of analysis. Instead of the usual AV/CV/CU triad, it asks: what clinical situation, what new information, what action, what outcome, what real-world behavior, and what affordability? That is a much more operational framework than the standard literature.

The key intellectual move is the bridge from improved clinical validity to improved clinical utility. The paper says the gain in test information must causally drive a gain in outcomes, and that both gains must be framed comparatively. That seems obvious only after someone has drawn it. The Figure 1 visual framework is strong because it makes the otherwise mushy concept of “utility” into a flow of linked questions.

I especially like the “against what comparator, in what units, with what uncertainty” formulation. That is very compact, but it does a lot of work. It forces payers and developers out of vague statements like “the test provides independent information” or “adds prognostic value,” and into a disciplined argument about comparator, measurement, statistical uncertainty, pragmatic uncertainty, and conceptual uncertainty.

The supplement is also unusually good. The three axes — functional category, value proposition, and outcome metric — explain why one generic clinical-utility definition never works across all diagnostics. A screening test, a reflex test, a no-gold-standard cancer-of-unknown-primary test, and a rule-out biopsy test are not the same policy object.

My honest critique: the prose is dense, and the six questions may look deceptively simple. But that is also the sign of a good framework. It compresses a lot of tacit payer logic into a usable checklist.

So yes: not warmed-over. The paper has a real idea, a real architecture, and a visual syntax. It reads like a consulting mind and a regulatory/reimbursement mind jointly trying to make payer judgment less mysterious. That is exactly why it still feels alive 12 years later.

Unedited AI: Why "Value" Is Missing from New Lab Fee Schedules & Pricing

 (Chat GPT 5.5, 6/2026, unedited)

##

Below is a blog-ready essay draft. I verified the key source anchors, especially Phillips/Science, CMS/MedPAC CLFS mechanics, Garau/Towse/Garrison, Sireci/AMP, and the ASPE molecular diagnostics report. (Book Cafe)

Diagnostics May Create Value, But Medicare Lab Pricing Usually Does Not Price Value

In a recent Science article, Kathryn A. Phillips, David M. Horn, and Robert M. Califf argue that diagnostics are underfunded and undervalued relative to the burden of disease they can address. Their article, “Diagnostics investments and disease burden,” makes an important and timely point: diagnostics can shape care pathways, reduce uncertainty, guide treatment, avoid ineffective interventions, and sometimes produce value well beyond the cost of running the test itself.

Full link: https://bookcafe.yuntsg.com/ueditor/jsp/upload/file/20260416/1776310800792028847.pdf

Phillips et al., 2026, Diagnostic investments and disease burden.  Science 392:151-3.  

This argument is correct as far as it goes. But it leaves out a critical distinction. It is one thing to say that diagnostics create value. It is another thing to design a payment system that allows the developer or performing laboratory to capture that value. In the U.S. laboratory market, especially under Medicare’s Clinical Laboratory Fee Schedule, that second step is often missing.

Medicare lab pricing is not a miniature health technology assessment system. It does not ask, in any routine way, whether a test avoids a CT scan, prevents a hospitalization, substitutes for a biopsy, prevents use of a futile drug, or changes a clinical decision in a way that produces downstream savings. Instead, new laboratory tests generally enter Medicare payment through crosswalk, gapfill, or—after PAMA implementation—private-payer median pricing.

CMS describes the CLFS process directly. For most clinical diagnostic laboratory tests, Medicare now pays based on the weighted median of private-payer rates reported by applicable laboratories. For new tests, CMS uses the annual CLFS process. Crosswalking occurs when a new test is judged similar to an existing test, so an existing code or group of codes is used to set payment. Gapfilling occurs when no comparable existing test is available, in which case Medicare Administrative Contractors develop local payment amounts and CMS later calculates a national median.

CMS CLFS page:
https://www.cms.gov/medicare/payment/fee-schedules/clinical-laboratory-fee-schedule-clfs

CMS CLFS annual public meetings page:
https://www.cms.gov/medicare/payment/fee-schedules/clinical-laboratory-fee-schedule-clfs/annual-public-meetings

CMS MLN CLFS fact sheet:
https://www.cms.gov/files/document/mln006818-clinical-laboratory-fee-schedule.pdf
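To make the weighted-median idea concrete, here is a minimal sketch of a volume-weighted median over reported rates. The function name, the sample rates and volumes, and the tie rule (return the lower rate when the halfway point falls exactly between two rates) are all assumptions for illustration; CMS's actual data handling is not reproduced here.

```python
def weighted_median(rates_and_volumes):
    """Return the volume-weighted median of (rate, volume) pairs.

    Conceptually: list every paid test at its rate, sort, and take the
    middle one. Assumed tie rule: the lower of the two middle rates.
    """
    if not rates_and_volumes:
        raise ValueError("no rates reported")
    ordered = sorted(rates_and_volumes)           # sort by rate
    total = sum(volume for _, volume in ordered)  # total test volume
    if total <= 0:
        raise ValueError("total volume must be positive")
    running = 0
    for rate, volume in ordered:
        running += volume
        if running * 2 >= total:                  # reached the halfway point
            return rate


# Hypothetical reported data: (private-payer rate in dollars, test volume).
reported = [(10.00, 100), (12.00, 300), (20.00, 50)]
print(weighted_median(reported))
```

With these made-up numbers, the $12.00 rate covers the halfway point of the 450 reported tests, so it becomes the payment amount. Nothing in the calculation refers to what the test result was worth clinically.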

This is not value-based pricing. It is administrative pricing, comparability pricing, resource pricing, or market-reported pricing, depending on the route. The test may be valuable, but “value” is not the central input into the Medicare price.

MedPAC’s description of the CLFS makes the same point in more policy-neutral terms. Before PAMA, Medicare lab payments were based on historical charges and national limitation amounts. After PAMA, Medicare shifted to private-payer-based rates. For gapfilled tests, MedPAC describes the process as using information such as charges, discounts to charges, and resources required to perform the test. Again, those are not measures of clinical utility or downstream value.

MedPAC June 2021 report:
https://www.medpac.gov/wp-content/uploads/import_data/scrape_files/docs/default-source/default-document-library/jun21_ch9_medpac_report_to_congress_sec.pdf

MedPAC Payment Basics, Clinical Laboratory Services Payment System:
https://www.medpac.gov/wp-content/uploads/2024/10/MedPAC_Payment_Basics_25_clinical_lab_FINAL_SEC.pdf

The Association for Molecular Pathology’s “Molecular Pathology Economics 101” paper is also useful background. It walks through coding, coverage, and reimbursement for molecular diagnostics, including CPT coding, PLA codes, coverage determinations, CLFS pricing, PAMA, and related mechanisms. It is an excellent professional overview because it shows how much of molecular diagnostic reimbursement is determined by coding architecture, coverage policy, claims processing, and fee schedule mechanics—not by a formal economic valuation of the test’s contribution to the care pathway.

Sireci AN et al., “Molecular Pathology Economics 101: An Overview of Laboratory Reimbursement,” Journal of Molecular Diagnostics, 2020:
https://pmc.ncbi.nlm.nih.gov/articles/PMC7267794/

This distinction—value creation versus value capture—is the missing center of the debate.

Suppose a diagnostic test costs $10 to perform and avoids a $400 CT scan. A simple value argument says the test is worth much more than $10. Perhaps Medicare should pay $100, $200, or even something near the avoided cost, because the system saves money. But the CLFS does not normally work that way. Crosswalk asks what similar tests are paid. Gapfill asks about charges, costs, resources, and payer rates. PAMA asks what private payers have paid. None of these mechanisms asks: “How much downstream medical spending did this test avoid?”

The problem becomes even sharper in an open laboratory market. Imagine Medicare, or a value assessment body, concludes that a diagnostic test produces $1,000 of system value. That does not mean the performing laboratory can actually charge $1,000. If the test is technically replicable, Quest might offer it to a payer for $900. Labcorp might bid $800. Quest might respond at $700. A regional lab might go lower still. The final price may drift toward the competitive cost of producing the service, not the social value created by the information.
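The bidding sequence above can be written as a toy loop. This is a sketch under stated assumptions, not a market model: the $100 unit cost and the fixed $100 undercut step are hypothetical numbers chosen only to continue the $900, $800, $700 sequence in the text.

```python
def undercutting_bids(first_bid: int, unit_cost: int, step: int) -> list[int]:
    """Rivals keep undercutting by `step` while the next bid still covers cost."""
    if step <= 0:
        raise ValueError("step must be positive")
    bids = [first_bid]
    while bids[-1] - step >= unit_cost:
        bids.append(bids[-1] - step)
    return bids


# Hypothetical inputs: $1,000 of system value, first bid $900, unit cost $100.
system_value = 1000
bids = undercutting_bids(first_bid=900, unit_cost=100, step=100)
print(bids)
print(f"final price {bids[-1]} vs. system value {system_value}")
```

In this toy run the bidding stops at the assumed unit cost, and the system value never enters the loop at all. That is the value creation versus value capture gap in miniature.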

This is not just a thought experiment. A 2008 HHS ASPE report on coverage and reimbursement for complex molecular diagnostics states the economic point unusually clearly. It asks the reader to imagine a test that costs $1 to run but saves $1,000 in health care costs. If many competitors can produce the test, economics suggests that the price will tend toward the cost of production. In other words, high clinical value does not automatically produce high market price.

ASPE report, “Current Issues and Options: Coverage and Reimbursement for Complex Molecular Diagnostics,” 2008:
https://aspe.hhs.gov/reports/current-issues-options-coverage-reimbursement-complex-molecular-diagnostics-0

PDF version:
https://www.twentyfirstcenturymedicine.org/wp-content/uploads/2014/02/Coverage_and_Reimbursement_for_Molecular_Diagnostics_Current_Issues.pdf

That is probably the cleanest published statement of the issue. Diagnostics may create system value, but unless the payment system has a way to assign and preserve that value, the price can collapse back toward cost, comparability, or payer-negotiated market rates.

The health economics literature has recognized parts of this problem for years. Garau, Towse, Garrison, Housman, and Ossa asked directly whether value-based pricing could be applied to molecular diagnostics. Their answer was broadly yes in principle, especially for companion diagnostics and other tests closely linked to drug use. But they also recognized that current diagnostic pricing systems are often driven by administrative practices and expected production cost. That observation remains central. Value-based pricing can be conceptually attractive, but conceptually attractive pricing is not the same thing as an enforceable payment system.

Garau M, Towse A, Garrison L, Housman L, Ossa D. “Can and should value-based pricing be applied to molecular diagnostics?” Personalized Medicine, 2013:
https://pubmed.ncbi.nlm.nih.gov/29783475/

Office of Health Economics working paper version:
https://www.ohe.org/wp-content/uploads/2014/07/369-Can-and-Should-VBP-Diagnostics-Garau-2012.pdf

The issue is especially difficult because diagnostics are not drugs. A patented drug often has a period of market exclusivity. A drug manufacturer may be able to defend a value-based price because competitors cannot immediately produce the same molecule. Diagnostics are different. Some proprietary tests have trade secrets, algorithms, FDA status, brand recognition, or evidence packages that create differentiation. But many laboratory tests are more vulnerable to replication, substitution, parallel development, or payer-driven contracting. If multiple laboratories can offer sufficiently similar information, the payer’s purchasing logic becomes competitive procurement, not value capture.

The stratified medicine literature has described this problem in another way. Trusheim and colleagues have emphasized that diagnostic developers often face lower reimbursement and weaker protection than drug manufacturers, even when their tests are essential to the value of a therapy. The diagnostic may be the gatekeeper for the right drug, the right patient, and the right timing, but the economic surplus is often captured elsewhere—by the drug manufacturer, the payer, the provider system, or the patient—not by the diagnostic developer.

Trusheim MR et al., NBER working paper, “An Overview of the Stratified Economics of Stratified Medicine”:
https://www.nber.org/system/files/working_papers/w21233/w21233.pdf

Trusheim MR et al., “The clinical benefits, ethics, and economics of stratified medicine,” Personalized Medicine, 2015:
https://pubmed.ncbi.nlm.nih.gov/26542060/

This creates a policy paradox. The diagnostic may be essential to value-based care, but the diagnostic itself may be paid under a commodity-like laboratory fee schedule. A $5,000 drug decision can depend on a $500 test. A $100,000 oncology regimen can depend on a molecular profile. A costly Alzheimer’s disease care pathway may depend on a biomarker result. But if the lab payment system prices by crosswalk, gapfill, or private-payer medians, the price of the diagnostic is not naturally linked to the value of the downstream decision.

Phillips, Horn, and Califf are therefore right to argue that diagnostics deserve more attention in health policy. Diagnostics can be a neglected infrastructure for better medicine. But the hard part is not only proving value. The hard part is designing institutions that convert diagnostic value into durable payment.

Several mechanisms could, in theory, do this. A payer could create a shared-savings arrangement in which a test developer receives part of the downstream savings. A bundled payment model could include diagnostic-informed management and reward the provider group for choosing high-value tests. A coverage-with-evidence-development model could pay more during a defined evidence-building period. A companion diagnostic could be valued jointly with the therapy it enables. A public payer could create special statutory payment rules for certain high-impact diagnostics. A procurement system could award exclusive or preferred status in exchange for evidence, access commitments, and price discipline.

But without some such mechanism, “value-based pricing” is mostly a slogan. The open market will not necessarily preserve a value price. Payers seek discounts. Laboratories seek volume. Competitors enter. Fee schedules normalize. PAMA collects market prices and feeds them back into Medicare rates. The system can take an initially high-value test and gradually re-express it as a code, a crosswalk, a weighted median, or a contract line item.

This is why the simple phrase “diagnostics should be reimbursed based on value” can be misleading. It compresses three separate questions into one.

First, does the diagnostic create clinical and economic value? Often yes.

Second, can that value be measured credibly? Sometimes yes, sometimes no, depending on the evidence base, timing, comparator, and downstream assumptions.

Third, can the payment system allow the test developer or performing laboratory to capture an appropriate share of that value? That is the hardest question, and in many cases the current answer is no.

The Medicare CLFS is particularly ill-suited to this task. It is a fee schedule, not a value-capture engine. Its logic is administrative, procedural, and code-based. It can assign a payment amount, but it does not typically model the test’s value in the care pathway. Nor does it protect that payment amount from later market erosion when private-payer rates are collected under PAMA. In fact, PAMA can make the problem circular: private market discounts influence Medicare rates, and Medicare rates then influence the broader market.
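The pull of a volume-weighted median can be seen in a small sketch. The contract rates and volumes below are invented for illustration, and the function covers only the weighted-median step, not CMS's reporting rules, applicable-laboratory definitions, or phase-in limits.

```python
def weighted_median(rates_and_volumes):
    """Return the rate at which cumulative test volume first reaches
    half of total volume, after sorting contracts from lowest to highest rate.

    rates_and_volumes: list of (payment_rate, test_volume) tuples.
    """
    if not rates_and_volumes:
        raise ValueError("at least one (rate, volume) pair is required")
    ordered = sorted(rates_and_volumes)
    total_volume = sum(volume for _, volume in ordered)
    cumulative = 0
    for rate, volume in ordered:
        cumulative += volume
        if cumulative * 2 >= total_volume:
            return rate


# Hypothetical private-payer contracts for one test code: (rate in dollars, volume)
contracts = [
    (500.0, 100),   # small payer that still pays the original "value" price
    (350.0, 300),   # mid-sized payer with a negotiated discount
    (300.0, 600),   # large payer with the deepest discount and most volume
]

simple_average = sum(rate for rate, _ in contracts) / len(contracts)
print(f"Simple average of contract rates: ${simple_average:.2f}")
print(f"Volume-weighted median rate:      ${weighted_median(contracts):.2f}")
```

In this made-up example the weighted median lands on the largest payer's discounted rate, and the $500 contract has no influence on the result at all. That is the sense in which a fee schedule built on market medians re-expresses a value price as a volume price.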

That does not mean value-based diagnostic payment is impossible. It means it requires explicit policy design. If the goal is to reward diagnostics for avoided costs, better outcomes, reduced uncertainty, or improved targeting of therapy, then those goals must be built into the payment mechanism. Otherwise, value remains a narrative used in coverage arguments, investor decks, and policy essays, while the actual payment amount is determined by crosswalk, gapfill, private-payer medians, or competitive contracting.

The most precise formulation may be this: diagnostics often create value, but laboratory payment systems usually price services. Value creation and service pricing are not the same thing.

Phillips and colleagues have performed a useful service by elevating diagnostics as a neglected category of medical innovation. But the next article needs to go one layer deeper. It should ask not only whether diagnostics are valuable, but how a fragmented and competitive laboratory market can preserve a value-based price. Without that answer, policy may continue to celebrate diagnostic value while reimbursing diagnostics as commodities.

A Few Facts about Contractor Advisory Committee Meetings

Historic: A standing state-level body of specialists on call to the MAC.

Today: A special committee, often of national experts, brought together by Zoom for a two-hour structured evidence discussion, with a dozen questions circulated in advance.

####

Historically (2000-2010 or later), MACs were asked to have standing committees of advisors in 15-20 medical specialties. Neither the appointment process nor the length of appointment was defined. I believe that the LCD manual, called "IOM PIM Chapter 13," had a list of specialties either in the body of the chapter or in an appendix.

Today, the 2019 IOM PIM Chapter 13 does not list specialties, and I believe CAC panels are often assembled as small groups of experts for a single topical meeting.

https://www.cms.gov/regulations-and-guidance/guidance/manuals/downloads/pim83c13.pdf 

In addition, today the panels are often assembled to discuss a topic PRIOR to the MAC writing (or revising) an LCD on the topic.

Here is the webpage of past CAC meetings at the Noridian website:

https://med.noridianmedicare.com/web/jeb/policies/lcd/cac

While there is no overall list of standing CAC members (if there are any), and no simple list of CAC contributors for a specific topic, each meeting generates a transcript, so the number of experts and their names and affiliations are easy to reconstruct from it.

Noridian held a CAC on July 15, 2024, on biomarkers to risk-stratify patients with ductal carcinoma in situ (DCIS). There is an AGENDA, a BIBLIOGRAPHY, a set of KEY QUESTIONS, and a TRANSCRIPT.


https://med.noridianmedicare.com/documents/10546/20409183/071524+CAC+Agenda.pdf

https://med.noridianmedicare.com/documents/10546/20409183/071524+CAC+Bibliography.pdf

https://med.noridianmedicare.com/documents/10546/20409183/071524+CAC+Key+Questions.pdf

https://med.noridianmedicare.com/web/jeb/policies/lcd/cac/cac-transcript-071524

The agenda is trivial (Welcome, Discussion, Closing!).

The bibliography has 30 citations, which were likely provided to panelists in advance.

Likely the most pivotal MAC contribution is the Key Questions, of which there were 12.

The meeting generated a 121-page transcript. AI placed the names spoken in the transcript into a table:

Affiliations included MSK (two panelists), Columbia University, Moffitt, Baylor, and private practice.

Given the reasonable range of the key questions and the caliber and specialization of the faculty, one would expect this to have been a strong discussion.

The meeting ran exactly two hours.  There is no public comment, though the public can passively watch the meeting livestream.   Transcripts generally appear one or sometimes several months after the broadcast.


#
#
ChatGPT 5.5 reviews the transcript.
#
#

The July 15, 2024 MolDX CAC was not a conventional “yes/no” advisory vote; it was a structured expert conversation designed to help MolDX decide what evidence would be needed for Medicare coverage of biomarker tests that risk-stratify DCIS patients after breast-conserving surgery. Gabriel Bien-Willner opened by saying MolDX wanted to write “the best possible policy” using accurate evidence, and that diagnostic coverage would turn on analytical validity, clinical validity, and clinical utility. He emphasized that the hardest questions were not merely whether a test changes management, but whether it changes management to benefit the patient.

The panel itself was intentionally small despite substantial public interest. It included breast surgeons and radiation oncologists from academic and private-practice settings, including Eileen Connolly of Columbia, Kimberly Van Zee and Atif Khan of Memorial Sloan Kettering, John Williams in private practice in Northern Virginia, Brian Czerniecki of Moffitt, and Alastair Thompson of Baylor. The discussion therefore had a useful mix of radiation oncology, surgery, academic trial knowledge, and real-world practice experience.

The central clinical problem was that DCIS is biologically and clinically ambiguous. It is “cancer” in terminology and patient psychology, but its natural history, mortality risk, recurrence risk, and need for radiation vary widely. The panel generally agreed that current DCIS management is imperfect: radiation after lumpectomy reduces ipsilateral breast tumor recurrence, but often without a survival advantage, and many patients may receive radiation for a small absolute benefit. Atif Khan captured the mood bluntly, saying the current process for deciding radiation in DCIS is “very flawed,” while also noting that use of biological signatures to refine risk is a rational extension of oncology practice.

A major theme was the difference between prognosis and prediction of radiation benefit. Several experts accepted that clinicopathologic factors can estimate recurrence risk, but the more valuable clinical question is whether a patient will actually benefit from radiation. Eileen Connolly made this point clearly: one can predict recurrence risk, but that is not the same as predicting benefit from radiation. She argued that a test suggesting a patient will not benefit from radiation can materially help clinicians and patients avoid treatment, because many patients otherwise say, in effect, “even if my risk is low, I still get some benefit, so I want radiation.”

The panel wrestled with what “low risk” means. There was no single agreed threshold, and experts repeatedly emphasized that low risk depends on patient age, life expectancy, comorbidities, tolerance for risk, and personal preferences. John Williams noted that a “low risk” for a 90-year-old woman might be very different from a “low risk” for a 50-year-old. He suggested that roughly 5% to 10% might be low risk for a general population, while other discussion touched on values around 10% at 10 years or 15% at 15 years.

MolDX pressed the panel toward a policy-usable threshold. Bien-Willner asked what degree of recurrence risk, and what degree of radiation-associated risk reduction, should be considered meaningful enough to justify treatment or coverage of a test. He made clear that management change alone was not enough; the policy question was patient benefit. The discussion eventually centered on absolute, not merely relative, benefit. Connolly noted that with ordinary clinicopathologic features, radiation is often described as giving about a 50% relative risk reduction, but the absolute benefit may be small.

That distinction led to one of the meeting’s most important practical conclusions: a small absolute reduction, around 5 percentage points, may not be clinically meaningful for many DCIS patients. The transcript shows MolDX probing whether a low-risk patient with only modest expected absolute benefit from RT should be treated differently, and the later LCD summarizes the CAC as reaching general consensus that 5% or less absolute reduction in IBTR was a reasonable boundary for “no significant clinical impact” from RT.
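The arithmetic behind the relative-versus-absolute distinction is simple enough to sketch. The code below assumes the roughly 50% relative reduction quoted at the meeting applies uniformly at every baseline risk, which is exactly the assumption a truly predictive test would challenge. The baseline risks of 6% and 20% are hypothetical, and the number-needed-to-treat figure is a standard epidemiology add-on rather than something taken from the transcript.

```python
RT_RELATIVE_REDUCTION = 0.50     # "about a 50% relative risk reduction" (assumed uniform)
NO_IMPACT_BOUNDARY_POINTS = 5.0  # 5 points or less treated as "no significant clinical impact"


def absolute_reduction_points(baseline_risk_pct, relative_reduction=RT_RELATIVE_REDUCTION):
    """Absolute reduction in recurrence risk, in percentage points."""
    return baseline_risk_pct * relative_reduction


def number_needed_to_treat(baseline_risk_pct, relative_reduction=RT_RELATIVE_REDUCTION):
    """Patients irradiated per recurrence prevented (100 / absolute reduction)."""
    reduction = absolute_reduction_points(baseline_risk_pct, relative_reduction)
    if reduction <= 0:
        raise ValueError("absolute reduction must be positive to compute NNT")
    return 100.0 / reduction


def exceeds_boundary(baseline_risk_pct, relative_reduction=RT_RELATIVE_REDUCTION):
    """True only when the absolute benefit is larger than the 5-point boundary."""
    return absolute_reduction_points(baseline_risk_pct, relative_reduction) > NO_IMPACT_BOUNDARY_POINTS


# Hypothetical baseline recurrence risks without radiation, in percent.
for baseline in (6.0, 10.0, 20.0):
    reduction = absolute_reduction_points(baseline)
    nnt = number_needed_to_treat(baseline)
    verdict = "above boundary" if exceeds_boundary(baseline) else "at or below boundary"
    print(f"baseline {baseline:4.1f}% -> absolute reduction {reduction:4.1f} points, "
          f"NNT about {nnt:.0f}, {verdict}")
```

The same 50% relative reduction yields a 3-point benefit at a 6% baseline and a 10-point benefit at a 20% baseline. A patient at 10% sits right on the boundary, which is why a fixed threshold still leaves room for the patient-preference arguments the panel raised.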

There was also a strong patient-communication theme. Kimberly Van Zee described reframing recurrence statistics positively: instead of saying risk falls from 10% to 5%, she tells patients they have a 90% chance of no recurrence, or 95% with radiation. Connolly replied that some patients still focus intensely on the 5% difference, particularly in her majority-minority patient population, where trust and fear may shape decision-making. The exchange made clear that risk tools do not operate in a vacuum; they enter emotionally charged conversations where “small” absolute numbers may still matter deeply to patients.

The panel was divided on current biomarker tests. Some experts saw DCISionRT as valuable because it might identify patients with little or no radiation benefit, which is more clinically useful than generic recurrence prediction. Others were more skeptical. Van Zee argued that before adopting a new biosignature, the field needs published evidence on discrimination, calibration, and actual predictive ability. She also distinguished clearly between the better-established Oncotype DX use in invasive breast cancer and the much less settled evidence base for Oncotype DX DCIS.

A recurring concern was whether biomarker tests add enough beyond existing clinicopathologic tools and nomograms. The experts recognized limitations in tools such as VNPI, MSKCC nomograms, RTOG-style criteria, and physician judgment, but they also worried that biomarker tests might partly repackage or imperfectly improve on those same variables. The later LCD summarizes the CAC as identifying a lack of direct comparisons between biomarker tests and existing scoring systems as a major concern.

Finally, the meeting broadened the endpoint question. IBTR mattered, but some experts placed greater value on invasive recurrence and on predicting relative response to radiation. There was also discussion of “residual risk” groups: patients who might remain at high risk despite standard treatment and therefore may be undertreated rather than overtreated. This is important because a DCIS biomarker could have two opposed uses: sparing low-benefit patients from radiation, or identifying higher-risk patients who need more aggressive care.

Overall, the transcript reads as a sophisticated policy workshop. The experts did not reject biomarkers; most accepted that a genuinely predictive test would be clinically valuable. But the meeting also exposed why MolDX would be cautious: DCIS endpoints are delayed and relatively infrequent, treatment patterns have changed over time, patient preferences are powerful, and current tools must prove incremental accuracy and clinical utility over accessible clinicopathologic methods. The practical policy takeaway was not “biomarkers are useless,” but rather: coverage requires evidence that a test can reliably identify patients for whom radiation provides no meaningful clinical benefit, preferably using a clear absolute-benefit threshold and robust comparison to existing risk-stratification methods.