Researchers from Stanford and Princeton compared the responses of several Chinese and American large language models to politically sensitive questions. The study found that Chinese models refuse to answer a significantly larger share of these queries, provide shorter replies, and sometimes deliver inaccurate information. The authors suggest that manual fine‑tuning, rather than censored training data, drives much of this behavior. Additional work shows that extracting hidden instructions from Chinese models is difficult, highlighting the challenges of studying AI‑driven censorship in real time.
Leer más →