Paging Dr. AI to the Emergency Room! Artificial Intelligence Passes U.S. Medical Licensing Exam

HIGHLIGHTS

ChatGPT and Flan-PaLM passed the U.S. Medical Licensing Examination (USMLE)

Many believe that AI tools can have significant applications in the medical field, particularly in clinical research

However, not everyone is convinced.

Two AI programs, including ChatGPT, have successfully passed the U.S. Medical Licensing Examination (USMLE), according to recent research papers. The papers discussed different methods of using large language models to take the USMLE, which includes three exams: Step 1, Step 2 CK, and Step 3. ChatGPT, developed by OpenAI, is a language AI model that generates human-like text based on prompts from users. It has gained popularity for its potential use in clinical practice, but results have been mixed.

How did AI perform on USMLE?

In a December medRxiv paper, researchers from Ansible Health in California evaluated ChatGPT's performance on the USMLE without any additional training or preparation. ChatGPT scored above 50% accuracy on all three exams and reached 60% in most of the analyses. The authors noted that while the passing threshold for the USMLE varies from year to year, it is typically around 60%.

"ChatGPT performed at or near the passing threshold for all three exams without any specialized training or reinforcement," said the report, adding that the AI model demonstrated "a high level of concordance and insight in its explanations."

"These results suggest that large language models may have the potential to assist with medical education, and potentially, clinical decision-making," said the report.

Flan-PaLM also scored well on the USMLE

In a separate December arXiv paper, another large language model, Flan-PaLM, was evaluated on the USMLE. The key difference from the model in the first paper was that Flan-PaLM was heavily modified using a medical question-answering database called MultiMedQA before taking the exams, said the researchers, who include AI researcher Vivek Natarajan. The model achieved 67.6% accuracy in answering USMLE questions, about 17 percentage points higher than the previous best performance, set by PubMedGPT.

Should AI tools be used in the medical field? 

According to Natarajan and his team, large language models "present a significant opportunity to rethink the development of medical AI and make it easier, safer and more equitable to use."

Recently, ChatGPT and other AI models have been listed as authors of papers published on PubMed that discuss the various applications of such technology in medicine. However, not everyone is convinced that this is a good idea.

One concern about using AI programs in research is whether they can truly make meaningful contributions to a paper. Another is that AI tools cannot consent to being listed as a co-author. The editor of one of the papers that listed ChatGPT as an author said that the listing was a mistake and would be corrected, according to an article in Nature. Despite this, researchers have published multiple papers showcasing the potential use of these AI programs in medical education, research, and clinical decision-making.

Natarajan and his team disagree with the skeptics. They believe that AI tools can contribute significantly to the medical field, and hope that their findings will help "spark further conversations and collaborations between patients, consumers, AI researchers, clinicians, social scientists, ethicists, policymakers and other interested people in order to responsibly translate these early research findings to improve healthcare."

Kajoli Anand Puri

Kajoli is a tech enthusiast with a soft spot for smart kitchen and home appliances. She loves exploring gadgets and gizmos that are designed to make life simpler, but also secretly fears a world run by AI. Oh wait, we’re already there.

Digit.in