Liam Dugan
Department of Computer and Information Science
University of Pennsylvania
Stylistic Signatures of LLMs and How to Detect Them
Many believe that text output by Large Language Models (LLMs) passes the “Turing Test” (i.e., that it is indistinguishable from human-written text). However, recent work has shown that automatic classifiers and human experts who frequently use LLMs can identify AI-generated writing with over 99% accuracy. This suggests that certain LLMs have a highly distinctive stylistic signature. What are the linguistic and statistical characteristics of this signature? How can we robustly identify it? And to what extent is it present across contexts? In this talk, I will discuss recent work on quantifying and detecting the differences between human-written and model-generated text, and how those differences relate to core human questions (creativity, voice, etc.). In addition, I will discuss the practical application of AI detectors in real-world contexts and how, despite our progress, we may still be a long way from robust and deployable AI detectors. Finally, I will outline future directions in so-called “interpretable” classifiers and how they may help us better understand our models.