Tuesday, May 19, 2026

Flooded Zones Part 2

This is the promised follow-on to Flooded Zones Part 1, which discussed the Distributed Denial of Service (DDoS) attack being mounted by AI against the scholarly publication system. By reducing the cost of generating and submitting a paper or a review, AI has caused a massive increase in the quantity and a significant decrease in the quality of submissions to a system that was already vastly overloaded.

Below the fold I look at AI-enabled DDoS attacks against two other, even more important, areas: software security and political discourse (as shown in the overview image).

Software Security

Last month, Raffi Krikorian's New York Times op-ed announced It’s the End of the Internet as We Know It:
Last week, Anthropic announced that its newest artificial intelligence model, Claude Mythos Preview, would not be released to the public, after the company learned it was capable of finding and exploiting vulnerabilities that have gone undetected in critical software systems for decades. Instead, Anthropic gave access to Mythos — and $100 million in credits to use it — to more than 50 of the world’s largest organizations, including Amazon, Apple, Microsoft, Google and JPMorgan Chase, as part of a defensive cybersecurity initiative called Project Glasswing.
It sounded like a double-edged sword, helping both the attackers and the defenders, with Anthropic claiming kudos for favoring the defenders. It is true that, once the maintainers of all the software in the world have used these tools and incorporated them into their build process, the world will be a safer place. Daniel Stenberg, who maintains curl, is among the maintainers who really care about security and were already using similar tools. In MYTHOS FINDS A CURL VULNERABILITY he reported that:
Back in April 2026 Anthropic caused a lot of media noise when they concluded that their new AI model Mythos is dangerously good at finding security flaws in source code. Apparently Mythos was so good at this that Anthropic would not release this model to the public yet but instead trickle it out to a selected few companies for a while to allow a few good ones(?) to get a head start and fix the most pressing problems first, before the general populace would get their hands on it.

The whole world seemed to lose its marbles. Is this the end of the world as we know it? An amazingly successful marketing stunt for sure.
Stenberg got access to Mythos' report on his codebase. It had found five "confirmed security vulnerabilities":
Five issues felt like nothing as we had expected an extensive list. Once my curl security team fellows and I had poked on this short list for a number of hours and dug into the details, we had trimmed the list down and were left with one confirmed vulnerability. The other four were three false positives (they highlighted shortcomings that are documented in API documentation) and the fourth we deemed “just a bug”.

The single confirmed vulnerability is going to end up a severity low CVE planned to get published in sync with our pending next curl release 8.21.0 in late June. The flaw is not going to make anyone grasp for breath. All details of that vulnerability will of course not get public before then, so you need to hold out for details on that.
...
My personal conclusion can however not end up with anything else than that the big hype around this model so far was primarily marketing. I see no evidence that this setup finds issues to any particular higher or more advanced degree than the other tools have done before Mythos. Maybe this model is a little bit better, but even if it is, it is not better to a degree that seems to make a significant dent in code analyzing.
It seems Mythos isn't as revolutionary as Anthropic would like the world to believe as they head for an IPO. Nevertheless, Stenberg stresses that using tools like Mythos is an essential security practice.

To understand the DDoS facing software maintainers you need to understand how security vulnerabilities are found and fixed:
  • An anomaly is discovered using techniques such as code analysis or fuzzing.
  • The anomaly is analyzed to distinguish between bugs and exploitable vulnerabilities.
  • A proof-of-concept exploit is developed to confirm that this is an exploitable vulnerability.
  • A fix is developed.
  • A report is generated and submitted to the maintainers.
LLMs have made this part of the process much cheaper and quicker, so the flow of vulnerabilities reaching this point has greatly increased. Now the maintainer has a report of a vulnerability, hopefully a proposed fix and maybe exploit code confirming that it is real. What happens next?
  • The maintainers go through the process Stenberg described to evaluate the claimed vulnerability, the exploit and the proposed fix.
  • The maintainers accept the fix or develop a better one.
  • The maintainers test the fix and package it as a patch to each of the supported versions of their software.
  • The maintainers test the effect of patching each of the supported versions.
  • The maintainers release the patch(es).
  • The various products that use the software in question test the patch.
  • The products add the patch to their software update mechanism.
  • Sysadmins for critical systems test the patched products before releasing them for production use.
  • The patched product replaces the vulnerable version.
LLMs don't help with any of these steps, so the zones of the humans who have to perform them are being flooded. Much of the flood is shit. To take perhaps the most critical example, Simon Sharwood reports that Linus Torvalds says AI-powered bug hunters have made Linux security mailing list ‘almost entirely unmanageable’:
“So just to make it really clear: If you found a bug using AI tools, the chances are somebody else found it too. If you actually want to add value, read the documentation, create a patch too, and add some real value on *top* of what the AI did. Don't be the drive-by ‘send a random report with no real understanding’ kind of person. OK?”
Jamie John provides another example in Bug bounty businesses bombarded with AI slop:
Businesses that run “bug bounty” schemes have long relied on independent security researchers to spot vulnerabilities. But the rise of AI tools is now overwhelming them with spurious submissions.

Bugcrowd, whose customers include OpenAI, T-Mobile, and Motorola, said the number of reports it received more than quadrupled over a three-week period in March, with most proving to be false.

Curl, a widely used tool to transfer data across the Internet, suspended its paid bug bounty program in January, citing an “explosion in AI slop reports” and lower-quality submissions.

Cyber security experts say advances in generative AI are reshaping the economics of bug bounty programs. While the tools allow experienced researchers to find flaws more quickly, they are also lowering the barrier to entry, triggering a flood of automated or erroneous submissions that companies must sift through.
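The overload described above can be illustrated with a toy backlog model. This is only a sketch under stated assumptions: the function name `simulate_backlog` and all the numbers are hypothetical, chosen to show the shape of the problem rather than to describe any real project. It assumes a fixed number of reports arrive each week and that the maintainers can triage a fixed number per week.

```python
def simulate_backlog(arrivals_per_week, triage_capacity_per_week, weeks):
    """Return the number of untriaged reports left at the end of each week."""
    backlog = 0
    history = []
    for _ in range(weeks):
        backlog += arrivals_per_week                         # new reports land
        backlog -= min(backlog, triage_capacity_per_week)    # humans triage what they can
        history.append(backlog)
    return history


# Hypothetical numbers: the maintainers can evaluate 10 reports a week.
before = simulate_backlog(arrivals_per_week=8, triage_capacity_per_week=10, weeks=12)
after = simulate_backlog(arrivals_per_week=40, triage_capacity_per_week=10, weeks=12)

print(before[-1])  # 0
print(after[-1])   # 360
```

While arrivals stay below capacity the backlog stays at zero. Once cheap report generation pushes arrivals above capacity, the backlog grows without bound, because nothing in the model makes the human steps any faster.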
The problems caused by DDoS-ing the patch develop-test-release-install cycle are vividly illustrated by recent vulnerabilities in the Linux kernel:
  • Copy Fail (CVE-2026-31431): described in Thomas Claburn's Linux cryptographic code flaw offers fast route to root:
    The newly disclosed LPE, dubbed Copy Fail (CVE-2026-31431), comes from a vulnerability in the Linux kernel's authencesn cryptographic template.

    "An unprivileged local user can write four controlled bytes into the page cache of any readable file on a Linux system, and use that to gain root," the writeup from security biz Theori explains.

    The kernel reads the page cache when it loads a binary, so modifying the cached copy amounts to altering the binary for the purpose of program execution. But doing so doesn't trigger any defenses focused on file system events like inotify.

    The proof of concept exploit is a 10-line, 732-byte Python script capable of editing a setuid binary to gain root on almost all Linux distributions released since 2017.

    Copy Fail is similar to other LPE bugs such as Dirty Cow and Dirty Pipe, but its finders claim it doesn't require winning a race condition and it's more broadly applicable.
  • Dirty Frag (CVE-2026-43284, CVE-2026-43500): described in Carly Page's Dirty Frag:
    A fresh Linux privilege escalation bug dubbed "Dirty Frag" has dropped into the wild with no patches, no CVE, and a public exploit that hands attackers root access across major distributions.

    Security researcher Hyunwoo Kim disclosed the local privilege escalation flaw on Friday after what he said was a broken embargo forced the issue into the open.

    Kim described Dirty Frag as a "universal LPE" affecting "all major distributions" and warned that it delivers the same kind of immediate root access as the recent CopyFail mess – only this time, defenders do not even have patches to throw at the problem.

    "As with the previous Copy Fail vulnerability, Dirty Frag likewise allows immediate root privilege escalation on all major distributions," Kim said. "Because the responsible disclosure schedule and embargo have been broken, no patches exist for any distribution."
  • Fragnesia (CVE-2026-46300): described in Carly Page's Dirty Frag gets a sequel as Fragnesia hands Linux attackers root-level access:
    According to Google-owned Wiz, the flaw sits in the Linux kernel's XFRM subsystem, specifically ESP-in-TCP processing tied to IPsec support. By carefully triggering the bug, attackers can modify protected file data in memory without changing the original files stored on disk.

    Wiz describes Fragnesia as part of the broader "Dirty Frag" bug family rather than a completely separate class of issue. Dirty Frag itself only surfaced days ago and was already attracting attention thanks to public exploit code, incomplete patch coverage, and unusually reliable privilege escalation.

Note Hyunwoo Kim's assessment that Fragnesia was the result of rushing the patch process:
According to researcher Hyunwoo Kim, who uncovered Dirty Frag, "Fragnesia" emerged as an unintended side effect of patches shipped to fix the original Dirty Frag vulnerabilities, adding yet another entry to the long tradition of security fixes accidentally creating new security problems.

As The Register previously reported, Dirty Frag followed hot on the heels of Copy Fail, another Linux kernel privilege escalation flaw that abused page cache handling to overwrite supposedly read-only files.
It doesn't appear that LLMs found any of these vulnerabilities, showing that even humans can overload the patch process to the point of failure. But the advent of LLMs means we can expect more and worse fiascos.

Political Discourse

How malicious AI swarms can threaten democracy is a paper in Science by Daniel Thilo Schroeder and 21 co-authors from last January (preprint). They set out the problem thus:
Advances in AI offer the prospect of manipulating beliefs and behaviors on a population-wide level. Large language models (LLMs) and autonomous agents now let influence campaigns reach unprecedented scale and precision. Generative tools can expand propaganda output without sacrificing credibility and inexpensively create falsehoods that are rated as more human-like than those written by humans. Techniques meant to refine AI reasoning, such as chain-of-thought prompting, can just as effectively be used to generate more convincing falsehoods. Enabled by these capabilities, a disruptive threat is emerging: swarms of collaborative, malicious AI agents. Fusing LLM reasoning with multi-agent architectures, these systems are capable of coordinating autonomously, infiltrating communities, and fabricating consensus efficiently. By adaptively mimicking human social dynamics, they threaten democracy. Because the resulting harms stem from design, commercial incentives, and governance, we prioritize interventions at multiple leverage points, focusing on pragmatic mechanisms over voluntary compliance.

This risk compounds long-standing vulnerabilities in democratic information ecosystems, already weakened by erosion of rational-critical discourse and a lack of shared reality among citizens. AI swarms are a potent accelerant in this trajectory, though their ultimate impact is not predetermined. Their effects will be shaped by platform design, market incentives, media institutions, and political actors. Here, we distinguish documented trends from projections, indicate where uncertainty remains, and note countervailing dynamics, such as growing public skepticism toward unverified content and a renewed interest in institutional demand for accountable journalism
David Gilbert covered it for Wired in AI-Powered Disinformation Swarms Are Coming for Democracy, and some of the authors discuss the paper in AI bot swarms threaten to undermine democracy on Gary Marcus' Substack:
The unique danger of a swarm is that it acts less like a megaphone and more like a coordinated social organism. Earlier botnets were simple-minded, mostly just copying and pasting messages at scale—and in well-studied cases (including Russia’s 2016 IRA effort on Twitter), their direct persuasive effects were hard to detect. Today’s swarms, now emerging, can coordinate fleets of synthetic personas—sometimes with persistent identities—and move in ways that are hard to distinguish from real communities. This is not hypothetical: in July 2024, the U.S. Department of Justice said it disrupted a Russia-linked, AI-enhanced bot farm tied to 968 X accounts impersonating Americans. And bots already make up a measurable slice of public conversation: a 2025 peer-reviewed analysis of major events estimated roughly one in five accounts/posts in those conversations were automated. Swarms don’t just broadcast propaganda; they can infiltrate communities by mimicking local slang and tone, build credibility over time, and then adapt in real time to audience reactions—testing variations at machine speed to discover what persuades.
This is precisely the AI-based version of Steve Bannon's "flooding the zone with shit".

Unlike the case of scholarly publication, I have no idea how to mitigate the shit that is flooding this zone. Unlike the oligopoly publishers, who can act as partially effective gatekeepers and are somewhat motivated to improve things, in the political space there are no longer any effective gatekeepers. This entire zone is driven by measures of "engagement", and fanning the flames of outrage is the best way to drive up the numbers.

The authors propose five "pragmatic mechanisms", but I have issues with each of them:
  1. social media platforms must move away from the “whack-a-mole” approach they currently use:
    Right now, companies rely on episodic takedowns—waiting until a disinformation campaign has already gone viral and done its damage before purging thousands of accounts in a single wave. This is too slow. Instead, we need continuous monitoring that looks for statistically unlikely coordination. Because AI can now generate unique text for every single post, looking for copy-pasted content no longer works. We must look at network behavior instead: a thousand users might be tweeting different things, but if they exhibit statistically improbable correlations in their semantic trajectories or propagate narratives with a synchronized efficiency that defies organic human diffusion.
    Platforms are always going to be reluctant to kill off the users on whom their finances depend. When forced to, they would rather do it in large blocks.
  2. we need to stop waiting for attackers to invent new tactics before we build defenses:
    A defense that only reacts to yesterday’s tricks is destined to fail. We should instead proactively stress-test our defenses using agent-based simulations. Think of this like a digital fire drill or a vaccine trial: researchers can build a “synthetic” social network populated by AI agents, and then release their own test-swarms into that isolated environment. By watching how these test-bots try to manipulate the system, we can see which safeguards crumble and which hold up, allowing us to patch vulnerabilities before bad actors act on them in the real world.
    This is a very good idea, but it would need funding (see below).
  3. we must make it expensive to be a fake person:
    Policymakers need to incentivize cryptographic attestations and reputation standards to strengthen provenance. This doesn’t mean forcing every user to hand over their ID card to a tech giant—that would be dangerous for whistleblowers and dissidents living under authoritarian regimes. Instead, we need “verified-yet-anonymous” credentialing. Imagine a digital stamp that proves you are a unique human being without revealing which human you are. If we require this kind of “proof-of-human” for high-reach interactions, we make it mathematically difficult and financially ruinous for one operator to secretly run ten thousand accounts.
    This is the same problem that has bedevilled computer-mediated communication since the advent of spam. Cynthia Dwork and Moni Naor's Pricing via Processing or Combatting Junk Mail proposed making it costly to send bulk e-mail, but signally failed to stem the tide. It did, however, lead to Satoshi Nakamoto's solution to the Sybil problem, making it expensive to mine Bitcoin. The problem here is that making it "expensive to be a fake person" effectively means making it somewhat expensive to be a person. The overhead and cost of obtaining such a "digital stamp" would disincentivize participation, so the platforms wouldn't like it. See The Permissionless Catch-22.
  4. we need mandated transparency through free data access for researchers:
    We cannot defend society if the battlefield is hidden behind proprietary walls. Currently, platforms restrict access to the data needed to detect these swarms, leaving independent experts blind. Legislation must guarantee vetted academic and civil society researchers free, privacy-preserving access to platform data. Without a guaranteed “right to study,” we are forced to trust the self-reporting of the very corporations that profit from the engagement these swarms generate.
    The platforms depend upon monetizing the data they collect on users' behavior. So they are always going to be reluctant to give outsiders access to their key asset. And, in any case, any data to which they grant access will effectively be "self-reporting".
  5. we need to end the era of plausible deniability with an AI Influence Observatory:
    Crucially, this cannot be a government-run “Ministry of Truth.” Instead, it must be a distributed ecosystem of independent academic groups and NGOs. Their mandate is not to police content or decide who is right, but strictly to detect when the “public” is actually a coordinated swarm. By standardizing how evidence of bot-like networking is collected and publishing verified reports, this independent watchdog network would prevent the paralysis of “we can’t prove anything,” establishing a shared, factual record of when our public discourse is being engineered.
    Each of the "independent academic groups and NGOs" will need significant funding if they are to process information at the scale required. Where would this funding come from? Taxing the platforms to fund this is one answer, but it wouldn't motivate them to cooperate.
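The cost argument under mechanism 3 can be made concrete with a hashcash-style proof-of-work stamp. This is a minimal sketch under loud assumptions: it is neither the pricing function from Dwork and Naor's paper nor the "proof-of-human" credential the authors propose, just a familiar hash puzzle used for illustration. The names `mint_stamp`, `verify_stamp` and `expected_hashes`, the identity string, and the difficulty of 12 bits are all hypothetical.

```python
import hashlib
from itertools import count


def mint_stamp(identity: str, bits: int) -> int:
    """Search for a nonce so that SHA-256(identity:nonce) has `bits` leading zero bits."""
    target = 1 << (256 - bits)
    for nonce in count():
        digest = hashlib.sha256(f"{identity}:{nonce}".encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce


def verify_stamp(identity: str, nonce: int, bits: int) -> bool:
    """Checking a stamp costs one hash, however hard it was to mint."""
    digest = hashlib.sha256(f"{identity}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - bits))


def expected_hashes(bits: int, accounts: int) -> int:
    """Expected number of hash evaluations to mint one stamp per account."""
    return accounts * 2 ** bits


nonce = mint_stamp("alice@example.org", bits=12)
print(verify_stamp("alice@example.org", nonce, bits=12))  # True
print(expected_hashes(bits=12, accounts=1))                # 4096
print(expected_hashes(bits=12, accounts=10_000))           # 40960000
```

The operator of ten thousand accounts pays ten thousand times the price, but the honest user with one account still pays the per-account price. Raising the difficulty enough to deter a well-funded swarm raises it for everyone.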
One of the most effective ways to use these disinformation swarms is to amplify pre-existing stereotypes, exploiting confirmation bias:
In social media, confirmation bias is amplified by the use of filter bubbles and "algorithmic editing", which display to individuals only information they are likely to agree with, while excluding opposing views.
Adam Kucharski shows how easy it is for AIs to build on such stereotypes in Real signals or artificial stereotypes?. He asked Copilot to analyze data that should have generated a null result:
First, I’d created 2000 free-text responses and labelled them ‘UK’. Then I copied and pasted the exact same 2000 responses but labelled these ‘US’. Finally, I combined them to create a dataset of 4000 total responses, and jumbled them up.

Despite the responses being identical for the UK and US, Copilot produced a rich, detailed summary of how US and UK respondents differed.
Copilot's output
Note how confident and detailed Copilot was, driven not by anything in the data but only by the stereotypes in its training data. There are two problems when applying this to real data:
  • The stereotypes in the training data will impact the results, albeit probably less.
  • Although presumably prompts could be devised to eliminate the effect of stereotypes in the training data, in practice almost no-one would remember to use them.
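For contrast, a deterministic comparison of a dataset built the way Kucharski describes can only report a null result. The sketch below is an assumption-laden stand-in: the four response strings are made-up placeholders, not his data, and the helper `word_counts` is hypothetical. It builds 2000 responses, labels one copy 'UK' and an identical copy 'US', shuffles them, and compares word frequencies by label.

```python
import random
from collections import Counter

# Hypothetical placeholder responses; 4 strings x 500 = 2000 responses.
base_responses = [
    "the service was quick and friendly",
    "too expensive for what you get",
    "would recommend to a friend",
    "delivery took far too long",
] * 500

# Label one copy 'UK' and an identical copy 'US', then jumble them up.
dataset = [("UK", text) for text in base_responses] + [("US", text) for text in base_responses]
random.Random(0).shuffle(dataset)


def word_counts(rows, label):
    """Count word occurrences across all responses carrying the given label."""
    counts = Counter()
    for row_label, text in rows:
        if row_label == label:
            counts.update(text.split())
    return counts


uk = word_counts(dataset, "UK")
us = word_counts(dataset, "US")
differing_words = {word for word in uk.keys() | us.keys() if uk[word] != us[word]}

print(len(dataset))     # 4000
print(differing_words)  # set()
```

Any difference a summarizer reports between the two labels in such a dataset must come from somewhere other than the data.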
The defense against DDoS at the network level is services like Cloudflare interposed between the bots and their target. There doesn't seem to be any way to replicate this at higher levels like these three zones. It is really hard to be optimistic about their future.
