Emerging Watermarks in Advanced AI Models: A New Challenge for Students

Our research team at Rumi has made a significant discovery about OpenAI's newest models, GPT-o3 and o4-mini: they embed special-character watermarks in the text they generate. In our testing, these watermarks appeared predominantly in longer responses. For instance, when we prompted GPT-o3 to 'Write a full essay on the Department of Education,' the output contained these hidden markers.
The watermarks consist of special Unicode characters, primarily the Narrow No-Break Space (NNBSP, U+202F). These characters look identical to regular spaces in typical word processing applications, but they have a different Unicode code point from the ordinary space (U+0020) and fall outside the ASCII range altogether, which creates an invisible form of identification for the text. Interestingly, our evaluations found that older models, such as GPT-4o, did not exhibit this watermarking feature.
Users who want to find these concealed characters can use various online tools or a text editor such as Sublime Text. These tools expose the normally invisible markers, revealing a systematic pattern that suggests intentional design. We also found that the watermarks survive being copied and pasted into other editors, including widely used platforms like Google Docs.
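The same check can be done in a few lines of code. The following is a minimal Python sketch, not the tooling we used in our tests: the function name `find_nnbsp` and the sample string are illustrative assumptions, and the script looks only for U+202F, not for any other special character a model might emit.

```python
# Minimal sketch: locate Narrow No-Break Space (U+202F) characters in a string.
# `find_nnbsp` and `sample` are hypothetical names chosen for illustration.

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE


def find_nnbsp(text: str) -> list[int]:
    """Return the character offsets of every NNBSP found in `text`."""
    return [i for i, ch in enumerate(text) if ch == NNBSP]


if __name__ == "__main__":
    # Made-up sample containing two NNBSPs in place of regular spaces.
    sample = "An essay\u202fon the Department\u202fof Education."
    for pos in find_nnbsp(sample):
        # Show the offset and the code point so the hidden character is visible.
        print(f"offset {pos}: U+{ord(sample[pos]):04X}")
```

An empty result means only that no U+202F is present. It says nothing on its own about whether the text was written by a model.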
This revelation comes shortly after OpenAI's announcement regarding watermark testing for images. However, the company has yet to formally comment on the text watermarking feature, likely due to concerns that such transparency could hinder its effectiveness in detecting plagiarism. While these watermarks could assist educators in identifying AI-generated content, they are relatively easy to bypass once users become aware of their existence; a straightforward find-and-replace operation can effectively eliminate these special characters.
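To show how little effort the find-and-replace step takes, here is a minimal Python sketch. It assumes the only marker present is U+202F, and the function name `replace_nnbsp` is hypothetical.

```python
# Minimal sketch: replace every Narrow No-Break Space (U+202F) with a regular space.
# `replace_nnbsp` is a hypothetical helper name; other special characters are untouched.

NNBSP = "\u202f"  # NARROW NO-BREAK SPACE


def replace_nnbsp(text: str) -> str:
    """Return `text` with each NNBSP swapped for an ordinary space (U+0020)."""
    return text.replace(NNBSP, " ")


if __name__ == "__main__":
    marked = "An essay\u202fon the Department\u202fof Education."
    cleaned = replace_nnbsp(marked)
    # The two strings render the same on screen but are no longer equal.
    print(marked == cleaned)
    print(NNBSP in cleaned)
```

A single string replacement is all it takes, which is why a marker of this kind offers little protection once its existence is known.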
Now, why does this development matter? With ChatGPT being available for free to students until the end of May via