The ChatGPT artificial intelligence (AI) has become good enough to fool trained scientists into thinking they are reading text written by a human.
A team of researchers used the AI to generate fake research paper abstracts to test whether other scientists could spot them.
Abstracts are neat summaries added to the top of research papers to give an overall picture of what’s being studied. ChatGPT was tasked with writing 50 medical research abstracts after being ‘trained’ on a selection from the likes of The British Medical Journal (BMJ) and Nature Medicine.
The chatbot, which has taken the internet by storm since being released to the public in November, didn’t disappoint.
Not only did the computer’s text pass successfully through an anti-plagiarism detector, but the actual scientists couldn’t reliably spot the fakes. The human reviewers correctly identified only 68 per cent of ChatGPT’s abstracts and 86 per cent of the authentic ones.
The group of medical researchers believed that 32 per cent of the AI-generated abstracts were real.
‘I am very worried,’ said Sandra Wachter, who studies technology and regulation at the University of Oxford.
Professor Wachter was not involved in the research but told nature.com: ‘If we’re now in a situation where the experts are not able to determine what’s true or not, we lose the middleman that we desperately need to guide us through complicated topics.’
The researchers who conducted the test, led by Catherine Gao at Northwestern University in Chicago, Illinois, said the ethical boundaries of this new tool have yet to be determined.
‘ChatGPT writes believable scientific abstracts, though with completely generated data,’ they explained in the pre-print write-up of their study.
‘These are original without any plagiarism detected but are often identifiable using an AI output detector and skeptical human reviewers.
‘Abstract evaluation for journals and medical conferences must adapt policy and practice to maintain rigorous scientific standards; we suggest inclusion of AI output detectors in the editorial process and clear disclosure if these technologies are used.
‘The boundaries of ethical and acceptable use of large language models to help scientific writing remain to be determined.’