Why AI Always Uses Em Dashes: The Involuntary AI Watermark

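📄 Summary (translated from Chinese)

The em dash (—) was once an ordinary punctuation mark in English typography, but with the spread of AI it has become a telltale sign of AI-generated text. Whether in emails, LinkedIn posts, cover letters, or recipes, AI-generated text uses an em dash almost every other line. The pattern reads as mechanical and tiresome, because in everyday writing people tend to reach for commas, periods, and parentheses instead. AI uses em dashes this way mainly because its training data contains a large number of text samples that use the mark frequently, so the model imitates that style when it generates text.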

📄 English Summary

Why the f*** does AI always use em dashes — the involuntary AI watermark

The em dash (—) has been a staple of English typography for centuries, but it has now become a hallmark of AI-generated text. Whether the output is an email, a LinkedIn post, a cover letter, or a pasta recipe, it tends to feature an em dash every few lines. The pattern is systematic and mechanical, and it frustrates readers because it departs from typical human writing, which favors commas, periods, and parentheses. The habit can be attributed to training data: the text that models learn from uses em dashes heavily, so models reproduce the style in their own output.

