Picture a lecturer, plagued by suspicion, casting a wary gaze upon student essays that he or she suspects were written by ChatGPT. An unsettling question then emerges: how can the lecturer prove it? Put simply, if ChatGPT writes like humans, how can he or she tell whether a text was written by a human or by ChatGPT? In this precarious situation, the burden of proof, in my view, should rest on the lecturer, who has the arduous task of proving that the student did not write the work. This is because, in the present state of academia, the distinction between human and machine is increasingly blurred, and human and AI writing closely resemble each other. As academia undergoes transformative changes, and because accusations of plagiarism can lead to severe consequences such as failing grades or expulsion, it is crucial not to unfairly place the burden of proof on the student author.
Regrettably, many believe that the greatest threat ChatGPT poses to academia lies in its ability to write like humans. However, this author believes that the greater threat lies in the inverse: that humans write indistinguishably from ChatGPT.1 Although this may appear to be a mere reversal of the same statement, these are two different problems, and the distinction is significant. In the former scenario, ChatGPT emulates human-written text in its output. The danger is that humans could claim works generated by ChatGPT as their own when in fact they did not write them.2 In the latter scenario, humans naturally write in a way that resembles ChatGPT's output. The danger is that genuinely human-written work can be falsely said to have been written by ChatGPT.3 This danger is all the greater because it goes largely unnoticed.
In the previous blog, the focus was on ChatGPT’s ability to mimic human writing, which presents a significant challenge in academia in terms of plagiarism and related concerns. In this blog, the focus shifts to the challenge of detecting or proving the use of ChatGPT in academic work. This blog begins by revisiting the issue of authorship, focusing on proving the use of ChatGPT in written work, and then discusses why the current methods of detecting ChatGPT’s involvement in academic writing are not only inaccurate but will likely become redundant as AI continues to advance and humans continue to find creative new ways to use it undetected.
Authorship Dilemma: Who Wrote It? Prove It?
In this blog, the “authorship” dilemma concerns the delicate task of ascertaining true authorship in written works, and it is discussed in three limbs: first, where the written work is generated purely by ChatGPT; secondly, where the work is generated through both human and AI effort; and thirdly, where the work is generated purely by humans. If work is generated purely by ChatGPT, authorship rights can be argued to belong to the human author, since AI-generated text is not protected by copyright law and an AI is not recognized as an author.4 Thus, human authors may claim authorship of such works and may choose whether to disclose the use of ChatGPT, or even list it as a co-author, as has happened in some cases.5 It does, however, become problematic where authors do not disclose the use of ChatGPT in settings where its use is disallowed or where disclosure is required. This opens the floodgates for offences like plagiarism, which involves reusing ideas without acknowledgement, and copyright infringement, which involves reusing ideas without permission.6 In such instances, a need arises to prove “true” authorship when humans wrongfully present ChatGPT-generated work as their own.
Where work is generated purely by ChatGPT, it may be easier to detect the use of ChatGPT with AI detector tools, though detection is not guaranteed since these tools are not always accurate.7 If the human author tweaks ChatGPT's response appropriately, however, the text may evade AI detectors.8 Similarly, a practice familiar from plagiarism tools can be applied to AI-written work: the student author can run the text through AI detector tools in advance and make the necessary adjustments before submitting.9 Additionally, if the human author gives ChatGPT a very specific and unusual prompt, chances are high that the response will be unique, and therefore harder to detect using AI detector tools.10 For instance, if someone asks ChatGPT to ‘craft an essay, in the style of Bridgerton’s Lady Whistledown, on “The rise of pop culture” in the Kenyan context,’ the inclusion of ‘in the style of Bridgerton’s Lady Whistledown’ and ‘in the Kenyan context’ makes the prompt ‘unique’. With such specific elements in the prompt, the response is likely to be more distinctive, making it harder to discern whether it was authored by AI and thus more difficult to prove.
What if more than one person inputs the same data into ChatGPT? Identical answers are unlikely; the responses may still be unique, since ChatGPT prides itself on giving “unique” responses.11 However, in some cases, ChatGPT might produce content that is extremely similar to, or the same as, material it encountered during training.12 The more unique the text is, the harder it will be to detect the use of AI.13 What if the human author keeps ChatGPT's output intact but adjusts the writing style? Or what if the human author keeps their own human-written text but relies on ChatGPT to adjust the writing style? An interesting question arises: what is the standard of “unique”? Is it writing style? Sentence structure? Content? In the academic context, all of these matter.14
Where work is the result of collaboration between a human and AI, detectors may not easily detect the use of ChatGPT. For example, someone may ask ChatGPT to ‘rewrite their work in the style of Michelle Obama’, or to rewrite someone else’s work in their preferred context or writing style, or to add content on a certain subject matter. Consequently, the reader may not be able to tell who the true author is, since the human author and the AI collaborated in the writing.15 In these scenarios, the distinction between human and AI input becomes less apparent, making it harder to detect the AI’s specific influence.16
Prove GPT? AI Detectors et al
The use of AI detection tools in such cases runs the risk of unfairly incriminating individuals for plagiarism because of the tools' inaccuracies.17 Acting solely on suspicion can be equally troublesome, as exemplified by the unfortunate incident at Texas A&M University-Commerce.18 In a peculiar turn of events, an entire class found themselves entangled in accusations of plagiarism, leading to the temporary withholding of their hard-earned diplomas. The root cause? A professor’s misguided use of ChatGPT itself as a means to ascertain whether the students had employed artificial intelligence to compose their final assignments.
Various AI detector tools have been invented to detect AI-generated work.19 To begin with, OpenAI, the creator of ChatGPT, unveiled a “classifier” that seeks to assess the likelihood that a text is of computer or human origin.20 Yet let us not be swayed by the allure of this creation, for it confesses its own limitations.21 By way of illustration, the tool correctly applies the ‘likely AI-written’ label to only 26% of AI-written text, which means that roughly 74% of AI-written text escapes that label.22 Furthermore, people can deceive it by editing computer-generated text, and it may fail to recognize AI-generated text on topics that were missing from its training data.23 It currently reads only a maximum of 250 words at a time,24 and it is likely to get things wrong on text written by children25 and on text not in English.26 The ever-evolving nature of AI technology compounds these problems and presents a constant challenge. Most of these shortcomings are present in other AI detectors as well.27 In another case, a professor submitted suspect text to the Classifier to determine whether the written response originated from artificial intelligence.28 The tool reported a 99.9% likelihood that it did. However, unlike traditional plagiarism detection software, the AI detector software provided no citations.29 This makes the credibility of the detector questionable: it could falsely flag human-written text as AI-generated, and the verdict cannot be checked because no matching sources are offered.
On the other hand, GPTZero, an innovative AI detection tool, employs an intriguing approach by pitting ChatGPT against itself. It assesses how far an AI system participated in creating a particular text, discerning whether AI involvement is entirely absent or substantial.30 In its quest to determine the authorship of a text, GPTZero employs a clever duo of metrics: perplexity and burstiness. Perplexity measures how random, or unpredictable, the wording of a sentence is, while burstiness captures how much that randomness varies across the text as a whole.31 The tool operates on the understanding that human writing typically mixes sentences of varying complexity, whereas AI-generated text often displays consistently low complexity throughout, which makes it distinguishable.32 This is problematic in my view, since it assumes that all humans write in a similar way. A human author whose writing style naturally happens to have low complexity throughout (similar to AI) may be wrongly flagged by GPTZero as having produced AI-written text. In addition, some human authors already creatively combat GPTZero by simply instructing ChatGPT to “increase perplexity” or “increase burstiness.”33
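To make the two metrics concrete, here is a minimal sketch in Python. It is not GPTZero's actual implementation: the `UnigramModel` class is a hypothetical toy stand-in for a real language model, and the helper names (`tokenize`, `split_sentences`, `perplexity`, `burstiness`) are assumptions made for illustration. Perplexity is computed as the exponential of the average negative log-probability of the words in a sentence, and burstiness is taken, as one plausible reading, to be the spread (standard deviation) of per-sentence perplexity.

```python
import math
import re
from collections import Counter
from statistics import pstdev


def tokenize(text):
    """Lowercase the text and return its words."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


class UnigramModel:
    """Hypothetical toy language model: word frequencies with add-one smoothing."""

    def __init__(self, reference_text):
        tokens = tokenize(reference_text)
        self.counts = Counter(tokens)
        self.total = len(tokens)
        self.vocab = len(self.counts) + 1  # one extra slot for unseen words

    def prob(self, word):
        return (self.counts[word] + 1) / (self.total + self.vocab)


def perplexity(model, sentence):
    """exp of the average negative log-probability per word (higher = less predictable)."""
    words = tokenize(sentence)
    if not words:
        raise ValueError("sentence contains no words")
    avg_neg_log = -sum(math.log(model.prob(w)) for w in words) / len(words)
    return math.exp(avg_neg_log)


def burstiness(model, text):
    """Spread of per-sentence perplexity across the text (assumed definition)."""
    scores = [perplexity(model, s) for s in split_sentences(text) if tokenize(s)]
    return pstdev(scores)


# Usage: build the toy model from some reference text, then score a passage.
model = UnigramModel("the cat sat on the mat. the dog sat on the rug.")
print(perplexity(model, "the cat sat."))
print(burstiness(model, "the cat sat. a zebra pondered quantum gravity."))
```

A text whose sentences all score about the same has burstiness near zero, which is exactly the pattern a naturally plain human writer could also produce.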
GPTZero also demonstrated its lack of accuracy in a study in which it was fed sixteen pieces of written content, eight composed by humans and eight created by AI. It correctly recognized the human writing in only six of the eight cases and the AI text in only seven of the eight.34 These results suggest that if a professor employed this AI detector tool to catch students doing coursework with AI tools like ChatGPT, almost 20 per cent of submissions would be misjudged, and, on these figures, a quarter of the students who wrote their own work would be wrongfully accused of academic dishonesty.35 Where allegations of plagiarism are concerned, and the consequences are profound, that is not good enough. Even though technological advancements can aid in the detection of AI involvement, they have proven not to be fully accurate and are thus unreliable as proof. Some AI detection tools will even flag human-written content as AI-written because it includes overused terms like “it is important to consider both sides of the argument”, “Firstly”, “Secondly”, etc.36
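The figures from the sixteen-sample study can be worked through directly. The short Python sketch below uses a hypothetical helper, `detector_rates`, to separate the overall error rate from the rate that matters most for a student: how often genuinely human work is flagged as AI-written. The only inputs are the counts reported in the study; a sample of sixteen is small, so the percentages are illustrative rather than a reliable estimate.

```python
def detector_rates(human_total, human_correct, ai_total, ai_correct):
    """Summarise a detector's mistakes from simple counts."""
    false_positives = human_total - human_correct  # human work flagged as AI
    false_negatives = ai_total - ai_correct        # AI work passed as human
    total = human_total + ai_total
    return {
        "overall_error_rate": (false_positives + false_negatives) / total,
        "false_positive_rate": false_positives / human_total,
        "false_negative_rate": false_negatives / ai_total,
    }


# Counts from the study: 6 of 8 human texts and 7 of 8 AI texts judged correctly.
rates = detector_rates(human_total=8, human_correct=6, ai_total=8, ai_correct=7)
for name, value in rates.items():
    print(f"{name}: {value:.2%}")
# overall_error_rate: 18.75%
# false_positive_rate: 25.00%
# false_negative_rate: 12.50%
```

The overall error rate of 18.75% is the "almost 20 per cent" figure, while the false positive rate shows that, in this sample, one in four human-written pieces was wrongly attributed to AI.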
Prove By Writing Style? Or Mere Suspicion?
Attempting to substantiate the use of ChatGPT on the basis of ‘mere suspicion’ is not only comically absurd but also riddled with significant problems. For instance, ChatGPT often overuses certain phrases like ‘it is important to consider both sides of the argument’.37 If a lecturer sets an essay assignment and most students unknowingly use the phrase ‘it is important to consider both sides of the argument’, the lecturer may furrow his or her brow in suspicion that the students used artificial intelligence, and fail them. The students, through no fault of their own, would then face unjust retribution. Such an action by the lecturer is unethical in my view. We are well past the point where we can say that a particular piece of content was generated by a computer judging by its fluency.38 Additionally, considering that ChatGPT keeps advancing, its errors will become less and less obvious.39 However, lecturers should not assume that ChatGPT possesses a mastery of human expression surpassing that of its human creators. So, how can we detect the subtle presence of ChatGPT, the digital chameleon, infiltrating the realm of academia, when AI detector tools fail us and the consequences of “plagiarism” are too profound to rely on ‘writing style’ as judged by human suspicion?
Brace yourselves, for academia will never be the same again. Fighting the use of ChatGPT, as concluded in my previous blog, will likely lead you to a dead end. Proving the use of ChatGPT, as I have demonstrated in this blog, is also a challenge that will have to be confronted. Humans can skillfully evade ChatGPT detectors, concealing the AI's presence through the artful mastery of manoeuvres like the enigmatic “rewrite”. Additionally, the unreliability of AI detectors does not aid the matter. Consequently, the ability to differentiate between genuine human contributions and AI-generated content erodes, casting doubt upon the credibility of academic discourse. The very essence of scholarly work, an embodiment of intellectual integrity, teeters on the edge of ruin, vulnerable to the insidious forces of manipulation and deceit. This unforeseen consequence unveils a treacherous landscape where claims of originality, authorship and ownership become muddled. As the dangers become more evident, authorities will find they have an important role as custodians of the truth.40 For if the very fabric of trust, woven from the words we read, watch, see, and hear, is mercilessly torn asunder, the very essence of our existence stands threatened. This is a battle that will be fought beyond academia. Our ability to navigate the convoluted paths of life, from politics to science, rests on the foundation of informed decisions. Should that foundation crumble, we shall wander aimlessly.41
image by www.pexels.com
1 Rosalsky Greg, et al., ‘This 22-year-old is trying to save us from ChatGPT before it changes writing forever’ <https://www.npr.org/sections/money/2023/01/17/1149206188/this-22-year-old-is-trying-to-save-us-from-chatgpt-before-it-changes-writing-for>
2 Mark, ‘Does Chat Gpt Plagiarize? Is it Plagiarism Free?’ <https://www.mlyearning.org/does-chat-gpt-plagiarize/?expand_article=1>
3 Justin Gluska, ‘Falsely Accused or Caught Using ChatGPT? Here’s What To Do’ <https://goldpenguin.org/blog/falsely-accused-of-using-chatgpt/>
4 ‘ChatGPT and Copyright: What You Need to Know’
5 Marche Stephen, ‘The College Essay Is Dead’ <https://flipboard.com/topic/educationtechnology/the-college-essay-is-dead/a-_vEGJWblQTqI4M2l2WgpsA%3Aa%3A84975642-9fd6da3e3b%2Ftheatlantic.com>
6 Murray L. J. (n7)
8 Jonathan Gillham, ‘How To Avoid AI Detection As A Writer’ <https://originality.ai/blog/how-to-avoid-ai-detection-as-a-writer>
9 Uzair Khan, ‘How To Pass Turnitin AI Detection and Plagiarism’ <https://www.linkedin.com/pulse/how-pass-turnitin-ai-detection-plagiarism-uzair-khan/>
11 Charles Ross, ‘Does ChatGPT Give the Same Answer to Everyone?’ <https://medium.com/@charles-ross/does-chatgpt-give-the-same-answer-to-everyone-521e3e9355a4>
12 ‘Chat GPT in the Academic World’ <https://medium.com/@Keenious/chat-gpt-in-the-academic-world-c21ecc6888fe>
15 Tom Comitta, ‘Death of an Author Prophesies the Future of AI Novels’ <https://www.wired.com/story/death-of-an-author-ai-book-review/>
18 Miles Klee, ‘Professor Flunks All His Students After ChatGPT Falsely Claims It Wrote Their Papers’ <https://www.rollingstone.com/culture/culture-features/texas-am-chatgpt-ai-professor-flunks-students-false-claims-1234736601/>
19 ‘Best AI Content Detection Tools: Free ChatGPT Output Detector’ <https://www.outlookindia.com/outlook-spotlight/best-ai-content-detection-tools-free-chatgpt-output-detector-news-256773>
20 Jan Hendrik Kirchner, Lama Ahmad, et al., ‘New AI classifier for indicating AI-written text’ <https://openai.com/blog/new-ai-classifier-for-indicating-ai-written-text>
27 Brainard Jeffrey, ‘As scientists explore AI-written text, journals hammer out policies’
28 Mitchell Alex, ‘Professor catches student cheating with ChatGPT: “I feel abject terror”’ <“Students using ChatGPT to cheat, professor warns”>
30 Rosalsky Greg, et al. (n1)
31 Deborah Nas, ‘Think you’re getting away with AI-generated text?’ <https://deborahnas.medium.com/think-youre-getting-away-with-ai-generated-text-311dfe673539>
32 Tony Ho, ‘A College Kid Built an App That Sniffs Out Text Penned by AI’
33 Jake Akins, ‘How to Make Chat GPT Content Undetectable’
36 Nicole Levine, ‘Check if Something Was Written by ChatGPT: AI Detection Guide’ <https://www.wikihow.com/Check-if-Something-Was-Written-by-Chat-Gpt>
38 Cain Sian, ‘“This song sucks”: Nick Cave responds to ChatGPT song written in the style of Nick Cave’
39 Mitchell Alex, (n21).
40 Bernard Marr, ‘How To Detect If Content Was Created By ChatGPT And Other AIs’
Centre for Intellectual Property and Information Technology Law