Google Unveils Advanced Medical AI Models, Claimed to Surpass GPT-4
Google launched a new family of artificial intelligence (AI) models last month targeting the medical domain. These AI models, known as Med-Gemini, are not yet accessible for public use, but the tech behemoth has issued a pre-print of its research paper outlining their capabilities and methods. The company claims that the AI models outperform GPT-4 in benchmark testing. One of the distinguishing qualities of these AI models is their long-context capability, which allows them to process and interpret lengthy health records and research publications. The research is currently at the pre-print stage and has been published on arXiv, an open-access online repository for scholarly papers.
“I am very excited about the possibilities of these models to help clinicians provide better care, as well as help patients better understand their medical conditions. In my opinion, AI for healthcare will be one of the most impactful application domains for AI,” Jeff Dean, Chief Scientist at Google DeepMind and Google Research, said in a post on X. The Gemini 1.0 and Gemini 1.5 LLMs serve as the foundation for the Med-Gemini AI models. There are four versions in total: Med-Gemini-L 1.0, Med-Gemini-M 1.0, Med-Gemini-M 1.5 and Med-Gemini-S 1.0. Each model is multimodal, able to process text, images and video.
The models are coupled with web search, enhanced through self-training, to make their results “more factually accurate, reliable and nuanced” on complex clinical reasoning tasks. The company further states that the models have been optimised for faster long-context processing. High-quality long-context processing should allow a chatbot to provide more accurate and precise answers when a question is incompletely specified or when it must work through a large number of medical records.
According to Google's benchmarks, the Med-Gemini AI models performed better than OpenAI's GPT-4 models on text-based reasoning tasks in the GeneTuring dataset. Med-Gemini-L 1.0 also achieved 91.1 percent accuracy on MedQA (USMLE), beating its predecessor Med-PaLM 2 by 4.5 percent. Notably, the models are not yet available to the general public, even for beta testing. The company is likely to make further improvements before introducing them publicly.