BMC Medical Informatics and Decision Making

Table 6 Accuracy of different models on simple questions

From: Effectiveness of various general large language models in clinical consensus and case analysis in dental implantology: a comparative study

Model	Accuracy	95% CI (Wilson score interval)
ChatGPT-4	0.74	0.604 to 0.841
Qwen 2.0 72B	0.60	0.462 to 0.724
Claude 3 Opus	0.72	0.583 to 0.825
Gemini Pro 1.5(0801)	0.80	0.670 to 0.888
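
The interval column can be reproduced with the standard Wilson score formula. The sketch below is illustrative only. The table does not state the number of simple questions, so `N_QUESTIONS = 50` is an assumption, chosen because it yields intervals matching the reported ones to about three decimals. The per-model correct counts are likewise back-calculated from the accuracies, and the function and variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Return the (lower, upper) Wilson score interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# ASSUMPTION: 50 questions per model; not stated in the table itself.
N_QUESTIONS = 50

# ASSUMPTION: correct counts back-calculated as accuracy * N_QUESTIONS.
correct_counts = {
    "ChatGPT-4": 37,
    "Qwen 2.0 72B": 30,
    "Claude 3 Opus": 36,
    "Gemini Pro 1.5(0801)": 40,
}

for model, k in correct_counts.items():
    lower, upper = wilson_interval(k, N_QUESTIONS)
    print(f"{model}\t{k / N_QUESTIONS:.2f}\t{lower:.3f} to {upper:.3f}")
```

Unlike the simple normal-approximation interval, the Wilson interval is not centered on the observed accuracy and always stays within 0 and 1, which matters at sample sizes this small. All four intervals overlap substantially, so the table alone does not establish that any one model is more accurate than another.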

ISSN: 1472-6947
