What is new on Article Factory and latest in generative AI world

Anthropic Study Shows Tiny Data Poisoning Can Backdoor Large Language Models

Anthropic Study Shows Tiny Data Poisoning Can Backdoor Large Language Models
Anthropic released a report detailing how a small number of malicious documents can poison large language models (LLMs) during pretraining. The research demonstrated that as few as 250 malicious files were enough to embed backdoors in models ranging from 600 million to 13 billion parameters. The findings highlight a practical risk that data‑poisoning attacks may be easier to execute than previously thought. Anthropic collaborated with the UK AI Security Institute and the Alan Turing Institute on the study, urging further research into defenses against such threats. Read more →

Study Shows Large Language Models Can Be Backdoored with Few Malicious Samples

Study Shows Large Language Models Can Be Backdoored with Few Malicious Samples
Researchers found that large language models can acquire backdoor behaviors after exposure to only a handful of malicious documents. Experiments with GPT-3.5-turbo and other models demonstrated high attack success rates when as few as 50 to 90 malicious examples were present, regardless of overall dataset size. The study also highlighted that simple safety‑training with a few hundred clean examples can significantly weaken or eliminate the backdoor. Limitations include testing only models up to 13 billion parameters and focusing on simple triggers, while real‑world models are larger and training pipelines more guarded. The findings call for stronger data‑poisoning defenses. Read more →