Try this. Upload a blood panel into any general-purpose AI chatbot, save the answer, and come back tomorrow with a fresh chat. Upload the same panel. Ask the same question. You will get a different answer.
Different priorities, different recommendations, sometimes a different overall read of what's going on with the patient. The technology is built that way, and clinicians who use it for lab review have started to notice.
I want to spend a few minutes on why this matters, because it's the latest chapter in a much longer story.
I started practice in 1998. Google search was new, and patients were arriving at appointments with printouts from the web, often convinced they knew more than the practitioner sitting across from them. That was the first wave. Information became something the patient brought to the appointment instead of something they left with.
The second wave was direct-to-consumer lab testing. Patients walked in with the actual diagnostic data, not just articles about it. Tests that previously required a practitioner to authorize became available to anyone with a credit card.
The third wave was the rise of consumer health platforms. Companies began bundling the lab order, an interpretation built on their own clinical methodology, and often a recommended set of supplements into one consumer-facing product. The patient no longer needs a practitioner to tell them what their labs mean. The platform tells them, and ships the bottles to their door.
AI is the fourth wave. The patient has the lab data and now has a tool that will read it for them in plain language, on demand, for free, regardless of where the data came from.
Every wave has narrowed the gap between what a practitioner offers and what a patient can access on their own. The honest question for any of us in practice is whether that gap has fully closed. For lab review I don't think it has, and the reason has to do with how AI actually produces an answer.
Large language models don't look up answers from a database. They generate them, word by word, by sampling from a probability distribution. The same input can produce a different output every time you run it, and most public chatbots are configured to behave this way by design.
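For readers who want to see the mechanism, here is a minimal sketch of word-by-word sampling. It is a toy, not how any particular chatbot is implemented: the word list and scores in `LOGITS` are made up for illustration, and the function names are my own. It shows why sampling from a probability distribution gives varying output, while always taking the top-scoring word does not.

```python
import math
import random

# Hypothetical next-word scores for a toy prompt. These numbers are
# invented for illustration and do not come from any real model.
LOGITS = {"iron": 2.0, "thyroid": 1.6, "inflammation": 1.3, "hydration": 0.4}


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = {word: score / temperature for word, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {word: math.exp(score - peak) for word, score in scaled.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}


def sample_word(logits, temperature, rng):
    """Pick the next word at random, weighted by its probability."""
    probs = softmax(logits, temperature)
    words = list(probs)
    weights = [probs[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]


def greedy_word(logits):
    """Always pick the highest-scoring word (no randomness)."""
    return max(logits, key=logits.get)


def run_many(n, temperature, seed):
    """Draw n next-word samples from a random generator with a fixed seed."""
    rng = random.Random(seed)
    return [sample_word(LOGITS, temperature, rng) for _ in range(n)]


if __name__ == "__main__":
    print("greedy :", [greedy_word(LOGITS) for _ in range(5)])
    print("sampled:", run_many(5, temperature=1.0, seed=None))
```

The greedy line prints the same word five times. The sampled line can change from run to run, because each draw is weighted chance rather than a lookup.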
Researchers have started measuring what this means in clinical contexts. A 2025 study evaluating LLM behavior on medical reasoning tasks found that an AI's ability to reach a correct answer was a separate question from its ability to reach the same answer twice. Accuracy and consistency turned out to be two different things.
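One way to picture the difference is to score the two properties separately. The sketch below is my own illustration, not the study's method: the answer lists are invented, and `accuracy` and `consistency` are hypothetical helper names. Accuracy asks how often repeated runs match the correct answer. Consistency asks how often they agree with each other.

```python
from collections import Counter


def accuracy(answers, correct):
    """Fraction of repeated runs that match the correct answer."""
    return sum(answer == correct for answer in answers) / len(answers)


def consistency(answers):
    """Fraction of repeated runs that agree with the most common answer."""
    most_common_count = Counter(answers).most_common(1)[0][1]
    return most_common_count / len(answers)


if __name__ == "__main__":
    # Invented example data: four repeated runs on each of two questions.
    stable_but_wrong = ["A", "A", "A", "A"]   # correct answer is "B"
    right_but_unstable = ["A", "B", "C", "A"]  # correct answer is "A"

    print(accuracy(stable_but_wrong, "B"), consistency(stable_but_wrong))
    print(accuracy(right_but_unstable, "A"), consistency(right_but_unstable))
```

The first case is perfectly consistent and never correct. The second is correct half the time and agrees with itself only half the time. Neither number tells you the other.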
The implication for lab review shows up immediately. A patient could get a reasonable interpretation on Monday and a worse one on Friday with no way to tell the difference. The same thing happens to a practitioner who runs the same panel through the same chatbot a couple of weeks apart.
Variable output is a design property of these systems, baked into how they generate text. It is not a bug waiting for a patch.
Whatever you think of any particular consumer health platform, those companies run on a fixed methodology. The same labs produce the same report. That puts them in a categorically different bucket from a generative chatbot, and the third wave got that part right.
A clinical framework built on Functional Blood Chemistry Analysis works on the same principle. The optimal ranges are fixed, the pattern logic is rule-based, and the condition assessments are computed against defined clinical criteria. The same labs produce the same report every time.
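To make the contrast concrete, here is a minimal sketch of what fixed ranges plus rule-based pattern logic look like in code. Everything in it is a placeholder: the marker names, the numeric ranges, and the single pattern rule are hypothetical and are not clinical values or the logic of any real product. The point is only that nothing in the pipeline involves chance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    low: float
    high: float


# Placeholder ranges for illustration only. These are NOT clinical
# reference values and are not taken from any real framework.
RANGES = {
    "marker_a": Range(10.0, 20.0),
    "marker_b": Range(1.0, 2.5),
}


def classify(name, value):
    """Compare one result against its fixed range."""
    fixed = RANGES[name]
    if value < fixed.low:
        return "low"
    if value > fixed.high:
        return "high"
    return "optimal"


def build_report(panel):
    """Flag each marker, then apply a fixed (hypothetical) pattern rule."""
    flags = {name: classify(name, panel[name]) for name in sorted(panel)}
    patterns = []
    if flags.get("marker_a") == "low" and flags.get("marker_b") == "high":
        patterns.append("pattern_x")
    return {"flags": flags, "patterns": patterns}


if __name__ == "__main__":
    panel = {"marker_a": 8.0, "marker_b": 3.0}
    print(build_report(panel))
    print(build_report(panel) == build_report(panel))
```

Run it today or in six months, and the same panel yields the same flags and the same pattern, which is what makes a follow-up comparison meaningful.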
Six months later, when that patient sits across from their practitioner for a follow-up, the comparison is honest. The biomarkers move against fixed reference points, and the patterns surface the same way they did before. A public chatbot cannot deliver that level of consistency. Chatbots are designed to produce fluent, plausible text on demand, which is a separate engineering problem from producing identical text on repeat.
This is also why a clinical framework like Optimal DX sits in the practitioner's hands rather than the patient's. It is built to support a practitioner's reasoning across visits, with a methodology a practitioner can explain to a patient and stand behind in writing.
I'm not against AI. I use it in my own work every day, and practitioners who refuse to engage with it are going to fall behind. My argument is narrower than that. It's about where AI belongs in the workflow.
A structured, reproducible report should be the foundation. AI works well as a layer on top of that foundation, a second-opinion tool that can help spot relationships across biomarkers, suggest lines of inquiry, and pressure-test the practitioner's thinking. The structure handles what needs to be repeatable, while AI handles what benefits from generative reasoning. The two together work better than either alone.
A patient uploading labs into a chatbot gets a different answer every Friday.
A practitioner working from a structured Functional Blood Chemistry Analysis framework gets the same answer every Friday, can explain why, and brings something to the conversation that 25 years of internet, direct-to-consumer testing, consumer health platforms, and AI have not been able to replace.
After this much time in practice, that's the part I find worth protecting.
To learn more about how Optimal DX supports practitioners with reproducible Functional Blood Chemistry Analysis, visit optimaldx.com/pricing.
