Hi!

Welcome to AIMedily.

I was reading the Stanford AI Index 2026 report on medicine this week, and it’s a really good snapshot of where things are right now.

AI is advancing across many areas, but most of what’s actually in use is tools that fit into existing workflows, like documentation. Much of the evidence is still based on simulated data, so there’s a gap between what we see in research and what’s happening in real clinical care.

It feels like the potential is very clear, but there’s still a lot to figure out in how this fits into practice.

Also, we had a tornado in Ann Arbor last night — not something I expected, but all good here.

Let’s dive into today’s issue.

🤖 AIBytes

LLMs Reach Diagnoses—but Struggle With Clinical Reasoning

Researchers evaluated 21 large language models (LLMs), comparing their performance on clinical reasoning tasks with their accuracy on final diagnoses.

🔬 Methods

Study design: Cross-sectional evaluation
Models: 21 large language models
Tasks evaluated:
- Differential diagnosis generation
- Clinical reasoning steps
- Final diagnosis accuracy
Evaluation approach: Compared model outputs to reference standards, focusing on reasoning quality, not only final answers.

📊 Results

LLMs performed better on final diagnosis tasks than on reasoning tasks.
Performance decreased significantly in:
- Differential diagnosis generation
- Step-by-step clinical reasoning
Models frequently made errors in:
- Intermediate reasoning steps
- Prioritization of diagnoses
Correct final answers were sometimes reached despite flawed reasoning pathways.

🔑 Key Takeaways

LLMs can produce correct diagnoses, but reasoning remains inconsistent and unreliable.
Weak performance in differential diagnosis limits clinical decision-making use.
Current models function best as support tools, not independent clinical reasoners.
Careful physician oversight is required for safe implementation.

🔗 Basu S, et al. Large language model performance and clinical reasoning tasks. JAMA Netw Open. 2026;9(4):e236580. doi:10.1001/jamanetworkopen.2026.6580

AI Chatbot Improves Psychiatric Symptoms

Researchers evaluated a conversational AI agent designed to treat psychiatric symptoms and build a therapeutic alliance in a randomized clinical trial.

🔬 Methods

Study design: Randomized clinical trial
Participants: Adults with psychiatric symptoms
Intervention: Conversational AI agent delivering therapeutic interactions
Comparison: Control condition (non-AI or standard approach)
Outcomes measured:
- Psychiatric symptom improvement
- Therapeutic alliance (patient–AI relationship quality)

📊 Results

The AI intervention group showed significant improvement in psychiatric symptoms compared to control.
Participants reported meaningful therapeutic alliance with the AI agent.
The system demonstrated the ability to:
- Engage users consistently
- Deliver structured therapeutic interactions
Outcomes suggest AI can replicate key elements of therapeutic engagement.

🔑 Key Takeaways

Conversational AI can improve psychiatric symptoms in a controlled setting.
Patients can form a therapeutic alliance with AI, a key factor in mental health care.
AI may expand access to mental health support, especially where resources are limited.
Further validation is needed before broad clinical implementation.

🔗 Fulmer R, Joerin A, Gentile B, et al. Efficacy of a conversational artificial intelligence agent for reducing symptoms of depression, anxiety, and stress: a randomized clinical trial. JAMA Netw Open. 2026;9(4) doi:10.1001/jamanetworkopen.2026.6713

🦾 TechTools

Qure.ai

AI platform used to analyze medical imaging and support early detection at scale.
Detects abnormalities in chest X-rays and CT scans, including lung nodules.
Prioritizes urgent findings to support faster clinical workflows.
Deployed globally in high-volume and resource-limited settings.

Docus.ai

AI platform designed to help users better understand their health data and medical information.
Allows users to upload labs and medical reports for analysis.
Provides AI-generated explanations in more accessible language.
Includes options for physician-reviewed second opinions.

Productivity Tool:

Arc Search

AI-native browser that turns the web into a single, structured answer.
Combines multiple sources into one clear, readable page.
Replaces tabs and links with direct, synthesized information.

🧬 AIMedily Snaps

AMA: Applications of AI in health care: Augmented intelligence vs artificial intelligence in medicine (Link).
NEJM Online Event: Value Alignment & Incentive Divergence in Clinical AI (Link).
NIH: WEST AI Algorithm May Help Speed Diagnosis of Rare Diseases (Link).
Jonathan Chen from Stanford on AI in Medicine: Promise, pitfalls, and practice (Link).
Mount Sinai Researchers Develop Machine Learning Model to Predict How CPAP Affects Cardiovascular Disease Risk in Patients With Obstructive Sleep Apnea (Link).
New WHO database helps countries turn health data into better policy (Link).

🧪 Research Signals

Nature: Innovating global regulatory frameworks for generative AI in medical devices is an urgent priority (Link).
NEJM AI: I Hope You Are Doing Well — Will AI Widen or Close Health Care’s Disparity Gap? (Link).
JAMA: Machine Learning Model to Predict Postmastectomy Breast Reconstruction Complications (Link).
The Lancet: A multiagent large language model-based system to simulate the liver transplant selection committee: a retrospective cohort study (Link).
NEJM AI: Health Systems Govern Only the Tip of the AI Iceberg (Link).
The Lancet: AI-based BRAIx risk score for the intermediate-term prediction of breast cancer: a population cohort study (Link).