
Today is my second online ESL (English as a Second Language) class for the ’25-’26 school year. I assist a more experienced teacher once a week and have been doing so for nearly ten years. One task she likes to give me is going over the writing homework that students post on an edublog.

Lately, it feels like these otherwise highly motivated adults may not be learning much about writing English. Often they seem to have copied from Google Translate or another AI program. What I want to see is a few mistakes in their answers, since errors are a sign that the work is their own. At the same time, I am wary of accusing anyone of not doing their own work.

Today’s article didn’t give me a clear answer to my ESL situation, but I was intrigued to learn about programs that help identify who the real writer of a book was or whether AI was used in a journal article.

Roger J. Kreuz, associate dean and professor of psychology, University of Memphis, writes at the Conversation that although it’s common to use chatbots “to write computer code, summarize articles and books, or solicit advice … chatbots are also employed to quickly generate text from scratch, with some users passing off the words as their own.

“This has, not surprisingly, created headaches for teachers tasked with evaluating their students’ written work. It’s also created issues for people seeking advice on forums like Reddit, or consulting product reviews before making a purchase.

“Over the past few years, researchers have been exploring whether it’s even possible to distinguish human writing from artificial intelligence-generated text. … Research participants recruited for a 2021 online study, for example, were unable to distinguish between human- and ChatGPT-generated stories, news articles and recipes.

“Language experts fare no better. In a 2023 study, editorial board members for top linguistics journals were unable to determine which article abstracts had been written by humans and which were generated by ChatGPT. And a 2024 study found that 94% of undergraduate exams written by ChatGPT went undetected by graders at a British university. …

“A commonly held belief is that rare or unusual words can serve as ‘tells’ regarding authorship, just as a poker player might somehow give away that they hold a winning hand.

“Researchers have, in fact, documented a dramatic increase in relatively uncommon words, such as ‘delves’ or ‘crucial,’ in articles published in scientific journals over the past couple of years. This suggests that unusual terms could serve as tells that generative AI has been used. It also implies that some researchers are actively using bots to write or edit parts of their submissions to academic journals. …

“In another study, researchers asked people about characteristics they associate with chatbot-generated text. Many participants pointed to the excessive use of em dashes – an elongated dash used to set off text or serve as a break in thought – as one marker of computer-generated output. But even in this study, the participants’ rate of AI detection was only marginally better than chance.

“Given such poor performance, why do so many people believe that em dashes are a clear tell for chatbots? Perhaps it’s because this form of punctuation is primarily employed by experienced writers. In other words, people may believe that writing that is ‘too good’ must be artificially generated.

“But if people can’t intuitively tell the difference, perhaps there are other methods for determining human versus artificial authorship.

“Some answers may be found in the field of stylometry, in which researchers employ statistical methods to detect variations in the writing styles of authors.

“I’m a cognitive scientist who authored a book on the history of stylometric techniques. In it, I document how researchers developed methods to establish authorship in contested cases, or to determine who may have written anonymous texts.

“One tool for determining authorship was proposed by the Australian scholar John Burrows. He developed Burrows’ Delta, a computerized technique that examines the relative frequency of common words, as opposed to rare ones, that appear in different texts.

“It may seem counterintuitive to think that someone’s use of words like ‘the,’ ‘and’ or ‘to’ can determine authorship, but the technique has been impressively effective.

“Burrows’ Delta, for example, was used to establish that Ruth Plumly Thompson, L. Frank Baum’s successor, was the author of a disputed book in the Wizard of Oz series. It was also used to determine that love letters attributed to Confederate Gen. George Pickett were actually the inventions of his widow, LaSalle Corbell Pickett.

“A major drawback of Burrows’ Delta and similar techniques is that they require a fairly large amount of text to reliably distinguish between authors. A 2016 study found that at least 1,000 words from each author may be required. A relatively short student essay, therefore, wouldn’t provide enough input for a statistical technique to work its attribution magic.
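As an aside of my own, not from Kreuz’s article: here is a rough toy sketch in Python of the Burrows’ Delta idea described above. The texts, the word list, and the resulting numbers are invented placeholders, and a real analysis would need the 1,000-plus words per author the article mentions.

```python
from collections import Counter

# Toy illustration of Burrows' Delta. The texts below are made-up placeholders;
# real attribution work needs far longer samples and a longer word list.
texts = {
    "author_A": "the cat sat on the mat and the dog slept in the sun by the door",
    "author_B": "a fox ran to a field and a bird flew to a tree in the dark of night",
    "disputed": "the dog ran to the mat and the cat slept in the sun by the tree",
}

# Burrows' method looks at very common "function" words, not rare ones.
common_words = ["the", "and", "to", "a", "in", "of"]

def relative_freqs(text):
    """Relative frequency of each common word in a text."""
    words = text.lower().split()
    counts = Counter(words)
    return {w: counts[w] / len(words) for w in common_words}

freqs = {name: relative_freqs(t) for name, t in texts.items()}

def z_scores(name):
    """Standardize one text's word frequencies against the whole corpus."""
    zs = {}
    for w in common_words:
        vals = [freqs[n][w] for n in texts]
        mean = sum(vals) / len(vals)
        std = (sum((v - mean) ** 2 for v in vals) / len(vals)) ** 0.5 or 1e-9
        zs[w] = (freqs[name][w] - mean) / std
    return zs

z = {name: z_scores(name) for name in texts}

def delta(a, b):
    """Mean absolute difference of z-scores: lower means more similar style."""
    return sum(abs(z[a][w] - z[b][w]) for w in common_words) / len(common_words)

for candidate in ("author_A", "author_B"):
    print(f"Delta(disputed, {candidate}) = {delta('disputed', candidate):.3f}")
```

The candidate with the lower Delta would be the closer stylistic match, at least on this toy scale.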

“More recent work has made use of what are known as BERT language models, which are trained on large amounts of human- and chatbot-generated text. The models learn the patterns that are common in each type of writing, and they can be much more discriminating than people: The best ones are between 80% and 98% accurate.

“However, these machine-learning models are ‘black boxes’ – that is, we don’t really know which features of texts are responsible for their impressive abilities. Researchers are actively trying to find ways to make sense of them, but for now, it isn’t clear whether the models are detecting specific, reliable signals that humans can look for on their own.
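Another aside that is not from the article: if a fine-tuned BERT-style detector were available, querying it might look roughly like the sketch below. The model name is a made-up placeholder rather than a real published detector, and any accuracy would depend entirely on how such a model was trained.

```python
from transformers import pipeline

# Hypothetical example: "my-org/bert-ai-text-detector" is a placeholder name,
# not a real checkpoint. Any BERT-style sequence-classification model
# fine-tuned on human vs. AI-generated text could be substituted here.
detector = pipeline("text-classification", model="my-org/bert-ai-text-detector")

sample = (
    "The implications of this approach are significant and notable, "
    "offering a polished, neutral overview of the complexity involved."
)

result = detector(sample)[0]  # e.g. {"label": "AI", "score": 0.97}, labels vary by model
print(f"label: {result['label']}, score: {result['score']:.2f}")
```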

“Another challenge for identifying bot-generated text is that the models themselves are constantly changing – sometimes in major ways.

“Early in 2025, for example, users began to express concerns that ChatGPT had become overly obsequious, with mundane queries deemed ‘amazing’ or ‘fantastic.’ OpenAI addressed the issue by rolling back some changes it had made.

“Of course, the writing style of a human author may change over time as well, but it typically does so more gradually.

“At some point, I wondered what the bots had to say for themselves. I asked ChatGPT-4o: ‘How can I tell if some prose was generated by ChatGPT? Does it have any “tells,” such as characteristic word choice or punctuation?’

“[It provided] me with a 10-item list, replete with examples. These included the use of hedges – words like ‘often’ and ‘generally’ – as well as redundancy, an overreliance on lists and a ‘polished, neutral tone.’ It did mention ‘predictable vocabulary,’ which included certain adjectives such as ‘significant’ and ‘notable,’ along with academic terms like ‘implication’ and ‘complexity.’ However, though it noted that these features of chatbot-generated text are common, it concluded that ‘none are definitive on their own.’ ” More at the Conversation, here.

If I were in the room with students, I could more or less stand over them and see how they go about writing. But these are adults, after all, and they want to learn, so the goal is to persuade them that learning is more likely to happen when they do the writing themselves. Let me know if you have ideas that could help me.


Photo: Wikimedia.
Above, the portrait of William Shakespeare that was long thought to be the only one with any claim to have been painted from life — until the Cobbe portrait was revealed in 2009.

A week ago I saw that the Globe Magazine had a cover story on a newly discovered source for Shakespeare, but I didn’t think I was interested. Like others who have read theories about Shakespeare, I thought, “Here we go again.” And I have an extra reason to roll my eyes. A great uncle I never met was known for trying to prove that Francis Bacon was Shakespeare. His theory was put to rest by his own codebreakers.

But then blogger Carol got in touch to tell me the article was about her brother-in-law, and I got interested.

Michael Blanding, a Boston-based journalist, has written a book about self-taught Shakespeare researcher Dennis McCarthy and his quest to uncover a possible Shakespeare source. The Globe article was an excerpt.

It seems that McCarthy, a polymath with no academic credentials but with expertise in deep internet searches, has identified a 16th-century writer named Thomas North as the source of many of Shakespeare’s themes and even some phrases. North was already known as a writer, but his plays no longer survive. Nevertheless, 16th-century references to his work are a treasure trove if you know what you’re looking for. No one else has done McCarthy’s deep dive into North. Blanding’s aim seems to be not only to cover the new ground but also to make a sort of scandal out of it by using words like “plagiarism.”

Blanding recounts his first reaction to McCarthy: “Oh, he is one of those, I thought to myself — a conspiracy theorist who thought Shakespeare didn’t write Shakespeare. But McCarthy hurriedly added that in fact he believed the Bard of Avon wrote every word attributed to him during his lifetime. He also believed, however, that Shakespeare had used the earlier plays written by Thomas North for his ideas, his language, and even some of his most famous soliloquies.” Blanding is eventually persuaded.

My reaction: Sure, why not? If the guy has proof that Shakespeare used language similar to North’s, so what? Proof is proof. What matters is that the finding is new. Blanding’s emphasis on McCarthy’s — and Darwin’s — lack of standard credentials strikes me as irrelevant.

After all, this is what writers do. They build on previous writers.

Look. Here is T.S. Eliot writing “Ash Wednesday”:

“Because I do not hope to turn again
“Because I do not hope
“Because I do not hope to turn
“Desiring this man’s gift and that man’s scope …”

And here’s Eliot’s source, Shakespeare’s Sonnet 29: “Desiring this man’s art, and that man’s scope.”

Would anyone accuse Eliot of plagiarism for that? Just because so many centuries have passed and manuscripts have been lost, does that mean Shakespeare was hiding a deep, dark secret? And just because McCarthy has no PhD or scholarly cred, does that mean he can’t notice things?

I admit I don’t have a PhD either and I’m often accused of being gullible, but I have no problem with research into a possible inspiration for some of Shakespeare’s art, especially as no one is saying he didn’t write the plays and poetry himself. For me, the only problem McCarthy and Blanding could run into is over-hyping and using words like “plagiarism.” I really wish them success getting the word out, though.

More at the Globe, here.
