Former Cloudflare executive John Graham-Cumming has launched lowbackgroundsteel.ai, a catalog of human-generated content created before 2022, when AI contamination became widespread. The name refers to the “low-background steel” that scientists once salvaged from pre-nuclear shipwrecks to avoid radiation contamination, drawing a parallel between nuclear fallout and AI-generated content polluting the internet.
The big picture: The project treats pre-AI content as a precious commodity, recognizing that distinguishing between human and machine-generated material has become increasingly difficult since ChatGPT’s November 2022 launch.
Why this matters: AI contamination has already forced at least one major research project to stop updating: wordfreq, a Python library that tracked word frequency across 40+ languages, announced in September 2024 that it would no longer be updated because “the Web at large is full of slop generated by large language models, written by no one to communicate nothing.”
What’s included: The archive points to several major repositories of verified pre-AI content that researchers and developers can trust.
Model collapse concerns: Some researchers worry about AI models training on their own outputs, potentially degrading quality over time, though recent evidence suggests this fear may be overblown under certain conditions.
The backstory: Graham-Cumming created the website in March 2023 but only recently announced it publicly, having kept it as a quiet clearinghouse for uncontaminated online resources.
Looking ahead: Graham-Cumming emphasizes that the project documents human creativity rather than opposing AI itself. The steel analogy has its own coda: low-background steel eventually became unnecessary as atmospheric nuclear testing ended and radiation levels normalized.