Study Finds 35% of New Websites Use AI, Driving an Overly Cheerful Tone Online

Researchers from Imperial College London, Stanford University and the Internet Archive have published a preprint that quantifies the spread of AI-generated content across the public web. By sampling snapshots from the Wayback Machine, the team estimated that about 35 percent of sites created from 2022 through 2025 were either fully AI-generated or heavily assisted by large language models.

To reach that figure, the investigators tested four detection approaches before settling on a tool from Pangram Labs, which delivered the most consistent results despite acknowledged imperfections. The sample, drawn from the Internet Archive’s massive repository, was intended to represent the broader ecosystem of new web pages.

The study’s most striking finding is a surge in positive language. Sentiment analysis shows that AI-written pages score roughly 107 percent higher on positive sentiment than their human-written counterparts, which is a little more than double. The researchers describe the effect as an "artificial cheerfulness" that makes the overall tenor of online writing feel saccharine.
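
For readers unsure what "107 percent higher" means in practice, the arithmetic can be sketched as below. This is only an illustration: the function name and the two scores are hypothetical, and the article does not give the study's raw sentiment values or say how they were scaled.

```python
def percent_higher(ai_score: float, human_score: float) -> float:
    """Return how much higher ai_score is than human_score, in percent."""
    if human_score <= 0:
        raise ValueError("human_score must be positive")
    return (ai_score - human_score) / human_score * 100.0


# Hypothetical average positive-sentiment scores, NOT taken from the study.
human_avg = 0.30
ai_avg = 0.621

print(round(percent_higher(ai_avg, human_avg)))  # 107
```

A result of 107 percent corresponds to an AI score about 2.07 times the human score.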

Beyond tone, the analysis suggests a narrowing of viewpoints. Using semantic similarity metrics, the team found that AI‑driven sites are about 33 percent more alike in content than human‑written sites, indicating a shrinkage in ideological diversity across the web.
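
One way such a similarity comparison could work is sketched below, assuming a plain bag-of-words cosine similarity averaged over every pair of pages in a group. The article does not say which metric or text representation the researchers used, so the functions and the toy pages here are illustrative stand-ins, not the study's method.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def mean_pairwise_similarity(docs: list[str]) -> float:
    """Average cosine similarity over every pair of documents in a group."""
    vectors = [Counter(d.lower().split()) for d in docs]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two documents")
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


# Toy pages invented for illustration only.
group_a = ["our amazing product helps you", "our amazing service helps you"]
group_b = ["notes on restoring old bicycles", "a recipe for sourdough bread"]

# A higher mean means the pages in a group resemble each other more.
print(mean_pairwise_similarity(group_a) > mean_pairwise_similarity(group_b))
```

A real analysis would more likely compare embedding vectors than raw word counts, but the idea of averaging pairwise similarity within each group is the same.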

Three hypotheses the researchers entered the study with proved false. Contrary to popular belief, the data did not show a spike in misinformation linked to AI content. Likewise, AI-generated pages were just as likely to include outbound links as human-written ones, and the writing style did not flatten into a generic voice.

"Everyone on the team expected that to be true," said Stanford researcher Maty Bohacek, describing the team's surprise at the lack of evidence for stylistic homogenization. "But we just don’t have significant evidence for that." The unexpected findings show how assumptions about large language models can outpace the evidence.

Before the technical work began, the team commissioned a public poll on attitudes toward AI-written content. Respondents largely anticipated a rise in fake news, a decline in external linking and a uniform, bland writing style. The study ultimately confirmed none of these outcomes, a mismatch between perception and measurement that points to a broader gap in public understanding of AI’s real-world effects.

The authors stress that this research is an early step, not a final verdict, on how AI reshapes the internet. They hope the data will spur deeper investigations into the nuanced ways large language models influence both the tone and the diversity of online discourse.

Source: Wired AI
