
Polite Replies Signal AI Bots, Study Shows

Ars Technica

University Collaboration Uncovers AI Tell‑Tale

Researchers from four institutions – the University of Zurich, the University of Amsterdam, Duke University and New York University – conducted a systematic analysis of large language models (LLMs) operating on popular social‑media platforms. Their goal was to determine how closely AI‑generated replies resemble authentic human comments and to identify reliable markers that distinguish the two.

Computational Turing Test Framework

The team introduced a “computational Turing test,” an automated classification framework that replaces subjective human judgment with objective linguistic analysis. Feeding real‑world posts from Twitter/X, Bluesky and Reddit to nine open‑weight models, the researchers generated candidate replies and then used automated classifiers to test whether those replies could be distinguished from genuine human responses.
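
The summary does not specify which classifier architecture the researchers used, so the sketch below is only a minimal, hypothetical illustration of the idea: extract surface linguistic features from replies and train a binary human‑versus‑AI classifier. The toy data and feature choices are assumptions, not the study’s actual pipeline.

```python
# Hypothetical sketch of a "computational Turing test": a binary classifier
# that labels replies as AI-generated (1) or human-written (0) from simple
# linguistic features. The tiny corpus below is invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

replies = [
    "Thank you so much for sharing this, I really appreciate it!",  # AI-like
    "lol no way that's real",                                       # human-like
    "That's a great point, and I completely agree with you!",       # AI-like
    "ugh this timeline is exhausting today",                        # human-like
]
labels = [1, 0, 1, 0]

# Word n-grams capture stylistic cues (politeness markers, slang), not just topic.
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
X = vectorizer.fit_transform(replies)

# Stratify so each split contains both classes, even in this tiny example.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.5, random_state=0, stratify=labels
)
clf = LogisticRegression().fit(X_train, y_train)

# The study reports 70-80 percent accuracy for classifiers of this general kind.
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```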

Models Evaluated and Accuracy Results

The study examined a diverse set of models, including Llama 3.1 (8B, 8B Instruct, 70B), Mistral 7B (v0.1, Instruct v0.2), Qwen 2.5 7B Instruct, Gemma 3 4B Instruct, DeepSeek‑R1‑Distill‑Llama‑8B and Apertus‑8B‑2509. Across all platforms, the classifiers identified AI‑generated replies with an accuracy ranging from 70 percent to 80 percent.

Emotional Tone as a Persistent Indicator

Analysis revealed that the most consistent differentiator was affective tone. AI outputs tended to be overly polite, friendly and emotionally restrained, contrasting sharply with the casual negativity and spontaneous emotional expression typical of human users. This “politeness” signal persisted even after the researchers applied various optimization strategies, such as providing writing examples, fine‑tuning, or contextual retrieval.
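
The paper’s exact tone metric is not named in this summary. As a rough, assumed illustration of how affective tone can be quantified, the snippet below scores replies with NLTK’s off‑the‑shelf VADER sentiment analyzer.

```python
# Illustrative tone comparison using NLTK's VADER sentiment analyzer
# (an assumed stand-in for the study's actual affect metric).
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

ai_reply = "What a wonderful perspective! Thanks so much for sharing it."
human_reply = "meh, this take is kinda lazy tbh"

# "compound" ranges from -1 (very negative) to +1 (very positive); the study
# found AI replies skew consistently toward the polite, positive end.
print("AI:", sia.polarity_scores(ai_reply)["compound"])
print("human:", sia.polarity_scores(human_reply)["compound"])
```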

Lower Toxicity Scores in AI Replies

In addition to tone, the study measured toxicity—a metric of hostile or harmful language. AI‑generated replies consistently scored lower on toxicity than authentic human comments, indicating a reluctance of current models to produce the more abrasive language often found in everyday social‑media discourse.
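
The summary does not name the toxicity scorer the researchers used. One widely available open‑source option is the Detoxify model, shown here purely as an assumed illustration of how such a comparison could be run.

```python
# Illustrative toxicity comparison with the open-source Detoxify model
# (an assumption; the study's actual toxicity metric is not specified here).
from detoxify import Detoxify

model = Detoxify("original")  # loads a pretrained toxicity classifier

ai_reply = "I respectfully disagree, but I appreciate your viewpoint."
human_reply = "that is the dumbest thing i've read all week"

# predict() returns scores in [0, 1]; the study found AI replies score
# consistently lower on toxicity than genuine human comments.
print("AI toxicity:", model.predict(ai_reply)["toxicity"])
print("human toxicity:", model.predict(human_reply)["toxicity"])
```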

Optimization Attempts and Limits

The research team experimented with several calibration techniques aimed at reducing structural differences like sentence length or word count. While these adjustments narrowed some gaps, the emotional‑tone disparity remained robust. The authors concluded that simply making models larger or more finely tuned does not automatically yield human‑like emotional expression.

Implications for Detection and Trust

These findings suggest that platforms and users can rely on affective cues—especially an unusually polite or friendly tone—to flag potential AI‑generated content. The study challenges the assumption that advanced optimization will erase all detectable signatures of machine‑authored text, underscoring the need for continued development of detection tools.


Source: Ars Technica
