Hello!

Welcome to LLM Friday.

Today, I will share one research study, two AI tools, and three news items on large language models.

Are you ready?

LLMs

This study evaluated how well ChatGPT 4.5 answered common patient questions about lower back pain (LBP), focusing on accuracy, readability, and clinical utility.

🔬 Methods

Questions: 25 standardized patient questions across 5 categories: diagnosis, medical guidance, treatment, self-care, and physical therapy.

Prompting styles tested:

  • No special instructions

  • Patient-friendly (plain language)

  • 4th-, 6th-, and 8th-grade reading levels

  • Reference-based (with citations)

Evaluation: Three blinded physicians rated each response against the International Pain and Spine Intervention Society guidelines:

1 = incorrect

2 = partially correct

3 = correct

📊 Results

Accuracy was high (mean score 2.81/3), with less than 6% of responses rated fully incorrect.

Prompt style matters:

  • 8th grade and Reference prompts were most accurate.

  • 4th grade prompts reduced accuracy, often omitting safety details.

Readability mismatch:

  • Most answers were above a 6th-grade reading level.

  • Only the 4th-grade prompts produced responses below an 8th-grade level, and they did so at the cost of accuracy.
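
For readers who want to check the reading level of an AI answer themselves, here is a minimal sketch. It assumes the Flesch-Kincaid grade formula, which is one common readability measure; this summary does not say which formula the study used. The syllable counter is a rough vowel-group heuristic, so treat the result as an estimate. The function names are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing "e" as silent ("spine"), but keep "-le" endings ("muscle").
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Estimate the U.S. school grade level of a text (Flesch-Kincaid formula)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
```

Calling `flesch_kincaid_grade(answer_text)` on a chatbot reply returns an approximate grade level. Short sentences and short words lower the score; long sentences and technical terms raise it.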

Verbosity doesn’t correlate with accuracy: Reference prompts were longer, but not more accurate than shorter prompts.

Errors: 18–30% of responses were rated partially correct, mainly because of missing safety details rather than false information.

🔑 Key Takeaways

  • How questions are phrased affects ChatGPT’s medical accuracy.

  • Moderate literacy prompts (8th grade level) give the most reliable balance between clarity and accuracy.

  • ChatGPT tends to write above the requested reading level, producing more technical responses.

    💡LLMs may support patient education but should not replace professional guidance.

🔗Basharat A, Shah R, Wilcox N, et al. ChatGPT and low back pain: Evaluating AI-driven patient education in the context of interventional pain medicine. Interventional Pain Medicine. 2025;4:100636. doi:10.1016/j.inpm.2025.100636

🦾TechTools

  • Gives you access to 100+ AI models for language, images, video, and code.

  • Creates visuals and videos with advanced tools like Ideogram and Runway.

  • Uses top LLMs such as GPT-4o, Claude, and DeepSeek.

  • Can write code, debug it, and handle other technical tasks.

  • Lets you ask questions and get trustworthy answers based on 420 peer-reviewed journals and medical guidelines.

  • Shows you the reasoning and evidence behind the advice.

  • Designed to integrate with Electronic Health Records and hospital systems.

  • Reduces the risk of hallucinations and misinformation.

🧬AIMedily Snaps

  • LLMs in neurology: a paper on the evidence of LLMs in neurological treatment (Link).

  • Do ChatGPT and Gemini's recommendations align with established guidelines for hand and upper extremity surgery? (Link).

  • Nonclinical information in patient messages, such as typos and extra white space, reduces the accuracy of LLMs (Link).

That’s all for today.

You’re already ahead of the curve in medical LLMs — don’t keep it to yourself. Forward AIMedily to a friend who’d appreciate it.

Before I go, help me answer these 2 questions (anonymous).

Thank you for your answers!

See you next week.

Itzel Fer, MD PM&R

Follow me on LinkedIn | Substack | X | Instagram
