30-Second Takeaway
- Most commercial radiology AI validations lack per‑subgroup performance reporting, limiting bias assessment.
- A single non‑open VLM matched or exceeded radiologists for ED chest x‑ray report acceptability.
- Radiomics ML for bladder cancer shows strong AUROCs but has high risk of bias and low certainty.
Week ending June 13, 2026
Selected AI, radiomics, and communication methods with immediate relevance to radiology practice
Most commercial radiology AI validations omit demographic subgroup performance.
This scoping review screened 545 validation studies covering 252 commercial radiology AI products; only 392 reported any demographic subgroup data, and just 77 presented subgroup performance results. That gap prevents robust assessment of algorithmic bias across sex, age, and race/ethnicity. Fourteen of 21 tuberculosis datasets were likely underpowered for post‑hoc subgroup meta‑analysis, limiting inference for minority groups. The authors conclude that fragmented reporting impedes clinician trust and call for mandatory, transparent subgroup performance reporting by stakeholders.
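For readers wondering what "subgroup performance reporting" looks like in practice, here is a minimal sketch that breaks a binary classifier's sensitivity and specificity down by demographic subgroup. It is not the review's method; the function name `subgroup_performance`, the record layout, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def subgroup_performance(records):
    """Per-subgroup sensitivity and specificity.

    records: iterable of (subgroup, y_true, y_pred) tuples with binary
    labels (1 = finding present). This layout is a hypothetical convention.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, y_true, y_pred in records:
        if y_true and y_pred:
            key = "tp"
        elif y_true and not y_pred:
            key = "fn"
        elif not y_true and y_pred:
            key = "fp"
        else:
            key = "tn"
        counts[group][key] += 1

    results = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["tn"] + c["fp"]
        results[group] = {
            "n": positives + negatives,
            # None marks a metric that cannot be computed for this subgroup.
            "sensitivity": c["tp"] / positives if positives else None,
            "specificity": c["tn"] / negatives if negatives else None,
        }
    return results


# Toy data, invented purely to exercise the function.
toy_records = [
    ("female", 1, 1), ("female", 1, 0), ("female", 0, 0),
    ("male", 1, 1), ("male", 0, 1), ("male", 0, 0),
]
toy_results = subgroup_performance(toy_records)
for group, metrics in sorted(toy_results.items()):
    print(group, metrics)
```

Reporting `n` alongside each metric matters: a subgroup with only a handful of cases yields estimates too imprecise to support conclusions, which is the underpowering problem the review describes.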
One vision‑language model produced more acceptable ED chest x‑ray reports than radiologists.
In 478 ED patients with same‑day CXR and CT, the VLM “AIRead” had higher clinical acceptability than radiologist reports (84.5% vs 74.3%). AIRead also showed a lower RADPEER 3b rate (5.3% vs 13.9%), with hallucination rates comparable to those of radiologists. Other tested VLMs had higher disagreement, lower acceptability, and more hallucinations, with variable sensitivity for common thoracic findings. These results support piloting select VLMs for preliminary CXR reporting, but any deployment should first be validated locally against CT and radiologist performance.
AI‑assisted DICOM label standardization reduced reading times and achieved high label accuracy.
A hybrid AI tool standardized DICOM labels across 422 CR images and 1503 CT series, with labeling accuracies of 83–100% for body part and 91–100% for plane. After implementation, mean reading times fell significantly for several CT protocols (eg, CT abdomen −2.9 minutes; temporal bone −2.5 minutes). Extrapolated savings amounted to 270 hours annually, with relative efficiency gains of 8–22% for the affected protocols. Evaluate accuracy on your local study mix, and note that some protocols (CT chest, sinus) showed no time benefit.
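To estimate what similar per‑study savings would mean at your own site, the extrapolation is simple arithmetic: minutes saved per study times annual volume, summed over protocols and converted to hours. The sketch below uses the two per‑study reductions quoted above; the annual volumes are hypothetical placeholders (the summary does not report the study's volumes), and the function name is an assumption.

```python
def annual_hours_saved(minutes_saved_per_study, annual_volumes):
    """Sum per-protocol time savings over a year and convert to hours.

    Both arguments are dicts keyed by protocol name. Protocols missing
    from annual_volumes are treated as zero volume.
    """
    total_minutes = sum(
        minutes * annual_volumes.get(protocol, 0)
        for protocol, minutes in minutes_saved_per_study.items()
    )
    return total_minutes / 60


# Per-study reductions reported in the summary above (minutes).
minutes_saved = {"CT abdomen": 2.9, "CT temporal bone": 2.5}

# HYPOTHETICAL annual study counts; replace with your own site's volumes.
local_volumes = {"CT abdomen": 3000, "CT temporal bone": 400}

hours = annual_hours_saved(minutes_saved, local_volumes)
print(f"Estimated annual reading time saved: {hours:.1f} hours")
```

Because the result scales directly with volume, protocols with small per‑study gains but high throughput can dominate the total, and protocols with no measured benefit contribute nothing.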