Scientists have developed a tool that identifies AI-generated scientific text with more than 99% accuracy, specifically targeting ChatGPT and similar systems. Built from a small training dataset combined with human insight, the detector is designed to distinguish AI-generated content from human writing, with particular emphasis on the scientific literature found in peer-reviewed journals. By focusing on the narrow domain of scientific writing, it surpasses the accuracy of general-purpose detectors.
Heather Desaire, a chemist at the University of Kansas who specializes in applying machine learning to biomedical research, devised the tool, which discerns scientific text generated by ChatGPT with exceptional accuracy.
A recent study, published in the peer-reviewed journal Cell Reports Physical Science, demonstrated the effectiveness of Desaire's AI-detection method and provided the source code needed for others to replicate the tool.
According to Desaire, who holds the Keith D. Wilner Chair in Chemistry at KU, precise AI-detection tools are urgently needed to safeguard scientific integrity.
“ChatGPT and other AI text generators have the ability to fabricate information,” she explained. “In the realm of academic science publishing, which deals with groundbreaking discoveries and the forefront of human knowledge, we simply cannot allow the contamination of the literature with plausible-sounding falsehoods. If AI text generators become commonplace, these fabrications will inevitably find their way into publications. To my knowledge, there is currently no foolproof automated method to identify these ‘hallucinations,’ as they are known. When authentic scientific facts are combined with perfectly believable AI-generated nonsense, the trustworthiness and value of such publications will undoubtedly diminish.”
Desaire emphasized that the success of her detection method lies in narrowing the scope of scrutiny to the scientific writing commonly found in peer-reviewed journals. This specialized approach significantly enhances accuracy compared to existing AI-detection tools such as the RoBERTa detector, which aim to identify AI-generated text in writing of all kinds.
“While it is relatively easy to create a highly accurate method to distinguish human writing from ChatGPT-generated writing, it necessitates restricting the analysis to a particular group of humans who write in a specific manner,” Desaire elaborated. “Existing AI detectors are generally designed as versatile tools applicable to any form of writing. While they serve their intended purpose, they cannot match the precision of a tool developed for a specific and narrow objective when it comes to a particular type of writing.”
Desaire highlighted the crucial need for precise AI detection tools in academia, urging university instructors, grant-giving entities, and publishers to have a reliable means of distinguishing AI-generated output from genuine human-authored work.
“When considering the issue of ‘AI plagiarism,’ an accuracy rate of 90% is insufficient,” Desaire emphasized. “Accusing individuals of surreptitiously employing AI and frequently being mistaken in those accusations is unacceptable. Accuracy is of utmost importance. However, achieving accuracy often comes at the expense of generalizability.”
Desaire’s coauthors, all members of her KU research group, include Romana Jarosova, a research assistant professor of chemistry; David Hua, an information systems analyst; and graduate students Aleesa E. Chua and Madeline Isom.
The success of Desaire and her team in detecting AI-generated text can be attributed to the significant human insight involved in their coding process, as opposed to relying solely on machine-learning pattern detection.
“We utilized a much smaller dataset and relied heavily on human intervention to identify key distinctions that our detector could focus on,” Desaire explained. “To be precise, we developed our strategy using only 64 human-written documents and 128 AI-generated documents as training data. This is roughly 100,000 times smaller than the dataset sizes used by other detectors. People often underestimate the significance of numbers. However, a difference of 100,000 times is akin to the difference in cost between a cup of coffee and a house. Thus, our dataset was small enough to be processed rapidly, and every document could be thoroughly examined by humans. We utilized our cognitive abilities to identify valuable differences in the document sets, rather than relying solely on previously developed strategies for distinguishing between humans and AI.”
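The workflow Desaire describes — humans reading a small document set, picking out distinguishing features by hand, and feeding those features to off-the-shelf machine learning tools — can be sketched in a few lines. The specific features below (sentence-length variability, a discourse marker, the presence of digits) and the threshold rule are illustrative assumptions for this sketch, not the published feature set or model:

```python
import re
import statistics

def extract_features(text: str) -> dict:
    """Hand-picked stylistic features of the kind a human reviewer might
    notice when comparing scientist-written and ChatGPT-generated text.
    (Illustrative only; not the feature set from the published paper.)"""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return {
        # Human scientific prose tends to vary sentence length more
        "sentence_length_stdev": statistics.stdev(lengths) if len(lengths) > 1 else 0.0,
        # Rate of a hedging discourse marker, per sentence
        "however_rate": text.lower().count("however") / max(len(sentences), 1),
        # Scientific writing usually reports numbers (measurements, statistics)
        "contains_digit": any(ch.isdigit() for ch in text),
    }

def looks_human_written(features: dict) -> bool:
    """Placeholder decision rule: with only ~200 training documents, every
    document can be read by a person and simple cutoffs on a few features
    can serve as the classifier."""
    return features["sentence_length_stdev"] > 4.0 and features["contains_digit"]
```

In real use the cutoffs (or an off-the-shelf classifier such as a decision tree) would be fit to the labeled training set of 64 human-written and 128 AI-generated documents; the thresholds above are placeholders.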
Indeed, the research team at KU designed their approach without consulting the existing literature on AI text detection until after they had already developed their own functional tool.
“I must admit, it’s a little embarrassing, but we didn’t even refer to the literature on AI text detection until after we had a functional tool of our own,” Desaire admitted. “Our approach was not based on the way computer scientists typically think about text detection. Instead, we relied on our intuition regarding what would be effective.”
Additionally, Desaire and her team took a different approach compared to previous researchers when it came to designing AI-detection methods.
“We did not prioritize AI-generated text when identifying key features,” she clarified. “Our focus was on human-generated text. Most researchers developing AI detectors tend to ask themselves, ‘What does AI-generated text look like?’ We, on the other hand, asked, ‘What does this distinctive group of human writing look like, and how does it differ from AI-generated text?’ Ultimately, AI-generated writing is still human writing since AI generators are constructed using vast repositories of human-authored text. However, AI-generated writing, particularly from ChatGPT, is generalized human writing sourced from various origins.
“Scientific writing, on the other hand, is not generalized human writing; it is scientists’ writing. We scientists form a highly unique group.”
Desaire has made the code for her team’s AI-detection tool fully accessible to other researchers interested in expanding upon their work. She hopes that more individuals will recognize that AI and AI detection are within reach for those who may not currently consider themselves computer programmers.
“ChatGPT represents a monumental advancement, and its rapid adoption by countless individuals signifies a pivotal moment in our reliance on AI,” Desaire remarked. “However, the truth is, with the right guidance and effort, even a high school student could achieve what we have accomplished.
“There are immense opportunities for people to engage in AI, even without a computer science degree. None of the authors of our manuscript hold degrees in computer science. One of the outcomes I hope to see from this work is that individuals interested in AI will realize that the barriers to developing real and valuable products, like ours, are not insurmountable. With a modest amount of knowledge and a dash of creativity, numerous individuals can contribute to this field.”
Reference: Desaire, H., Chua, A. E., Isom, M., Jarosova, R., & Hua, D. (2023). Distinguishing academic science writing from humans or ChatGPT with over 99% accuracy using off-the-shelf machine learning tools. Cell Reports Physical Science. DOI: 10.1016/j.xcrp.2023.101426
Frequently Asked Questions (FAQs) about AI-detection
What is the purpose of the AI-detection tool mentioned in the text?
The purpose of the AI-detection tool is to identify scientific text generated by AI, specifically targeting systems like ChatGPT, with more than 99% accuracy. It aims to distinguish between AI-generated content and human writing, particularly in the context of academic science publishing.
How accurate is the AI-detection tool?
The AI-detection tool achieves an accuracy rate of more than 99% in identifying AI-generated scientific text. This high accuracy helps keep AI-generated falsehoods out of publications, preserving the trustworthiness and value of the scientific literature.
How does the AI-detection tool differ from existing detectors?
Unlike general-purpose detectors, the AI-detection tool described in the text focuses specifically on scientific writing found in peer-reviewed journals. By narrowing its scope to this domain, it offers greater accuracy in distinguishing AI-generated content from human-authored text compared to more generalized detectors.
What is the significance of the smaller dataset used in developing the tool?
The researchers utilized a smaller dataset consisting of 64 human-written documents and 128 AI-generated documents for training the AI-detection tool. This smaller dataset, combined with significant human intervention, allowed for a more thorough examination and identification of key distinctions, contributing to the tool’s high accuracy.
Can the AI-detection tool be utilized by researchers and others interested in AI?
Yes, the researchers have made the AI-detection code fully accessible to others who are interested in building upon their work. They hope to encourage more individuals, even those without a computer science background, to engage in AI research and contribute to the field of AI detection.
More about AI-detection
- “Distinguishing academic science writing from humans or ChatGPT with over 99% accuracy using off-the-shelf machine learning tools” (research paper)
- Cell Reports Physical Science (journal)
- University of Kansas Department of Chemistry