Wednesday, 5 February 2025

Hospitals adopt AI transcription tool, but accuracy concerns grow

Hospitals use OpenAI's Whisper for medical transcription, but accuracy concerns rise as AI "hallucinations" emerge, raising patient care risks.

Hospitals nationwide increasingly use an AI transcription tool powered by OpenAI’s Whisper model to record and summarise patient meetings. While the tool shows promising results in easing doctors’ documentation workload, researchers have raised concerns about its accuracy. Evidence suggests the tool sometimes “hallucinates”, a term for AI systems producing information that sounds plausible but is incorrect. In these cases, Whisper has been shown to generate completely fabricated phrases, which may be particularly troubling in medical settings.

Widespread use of Whisper in healthcare

According to ABC News, the transcription tool is developed by Nabla. This healthcare tech company estimates its software has processed approximately 7 million medical conversations across more than 30,000 clinicians and 40 health systems. While many doctors and healthcare providers report that the transcription tool improves efficiency, Nabla acknowledges the model’s potential for inaccuracies and states it is working to address the hallucination issue.

Whisper’s hallucinations range from inserting random, unrelated statements to inventing medical conditions that do not exist. Nabla has confirmed it is aware of these limitations and has reassured its clients that it is improving the model to ensure greater accuracy in clinical settings.

Study reveals concerning hallucinations in transcriptions

A recent study by researchers from Cornell University, the University of Washington, and other institutions explored Whisper’s performance under various conditions, including during moments of silence and when transcribing speech from people affected by language disorders such as aphasia. The researchers found that the model occasionally inserted words or whole sentences that had no basis in the conversation. Examples of these hallucinations include fabricated conditions and irrelevant comments such as “Thank you for watching!”, a phrase likely drawn from Whisper’s exposure to millions of hours of YouTube videos during its training.

The study found that Whisper hallucinated in about 1% of the transcriptions, a seemingly small percentage but one that can have serious implications in healthcare. While the researchers primarily used samples from TalkBank’s AphasiaBank, they argue that the tool’s tendency to generate content during silent pauses could affect various clinical situations, especially those involving communication difficulties.

OpenAI’s response and ongoing research

OpenAI is aware of these issues and has responded to the researchers’ findings by promising ongoing improvements. OpenAI spokesperson Taya Christianson emphasised that the company is actively refining Whisper to reduce hallucinations. OpenAI has also set strict usage guidelines for its API, advising against using Whisper in high-stakes decision-making contexts without additional checks. OpenAI’s model card for Whisper likewise advises developers against applying it in sensitive areas where accuracy is critical.

Despite Whisper’s potential as a transcription tool, its limitations may leave healthcare providers hesitant to rely on it entirely for medical documentation. For now, hospitals and clinicians may need to review transcriptions thoroughly, especially in sensitive situations where accuracy is paramount.
