Thursday, May 11, 2023

Flooding The Zone With Shit

Tom Cowap
CC-BY-SA 4.0
Much of the discussion occupying the Web recently has been triggered by the advent of Large Language Models (LLMs). Much of that has been hyping the vast improvements in human productivity they promise, and glossing over the resulting unemployment among the chattering and coding classes. But the smaller negative coverage, while acknowledging the job losses, has concentrated on the risk of "The Singularity", the idea that these AIs will go HAL 9000 on us, and render humanity obsolete[0].

My immediate reaction to the news of ChatGPT was to tell friends "at last, we have solved the Fermi Paradox"[1]. It wasn't that I feared being told "This mission is too important for me to allow you to jeopardize it", but rather that I assumed that civilizations across the galaxy evolved to be able to implement ChatGPT-like systems, which proceeded to irretrievably pollute their information environment, preventing any further progress.

Below the fold I explain why my on-line experience, starting from Usenet in the early 80s, leads me to believe that humanity's existential threat from these AIs comes from Steve Bannon and his ilk flooding the zone with shit[2].

The Economist's April 22nd edition has a leader, a thoughtful essay and the whole of the Science and Technology section on AI.

In How generative models could go wrong they acknowledge the HAL 9000 problem:
In August 2022, AI Impacts, an American research group, published a survey that asked more than 700 machine-learning researchers about their predictions for both progress in AI and the risks the technology might pose. The typical respondent reckoned there was a 5% probability of advanced AI causing an “extremely bad” outcome, such as human extinction (see chart). Fei-Fei Li, an AI luminary at Stanford University, talks of a “civilisational moment” for AI. Asked by an American tv network if AI could wipe out humanity, Geoff Hinton of the University of Toronto, another AI bigwig, replied that it was “not inconceivable”.
The essay correctly points out that this risk is beyond the current state of the art:
But in the specific context of GPT-4, the LLM du jour, and its generative ilk, talk of existential risks seems rather absurd. They produce prose, poetry and code; they generate images, sound and video; they make predictions based on patterns. It is easy to see that those capabilities bring with them a huge capacity for mischief. It is hard to imagine them underpinning “the power to control civilisation”, or to “replace us”, as hyperbolic critics warn.
I agree with the AI experts that the HAL 9000 problem is an existential risk worth considering. In order for humanity to encounter it, society needs not just to survive a set of other existential risks which have much shorter fuses, but to do so with its ability to make rapid technological progress unimpaired. And the problem in this cursory paragraph will make these risks much worse:
The most immediate risk is that LLMs could amplify the sort of quotidian harms that can be perpetrated on the internet today. A text-generation engine that can convincingly imitate a variety of styles is ideal for spreading misinformation, scamming people out of their money or convincing employees to click on dodgy links in emails, infecting their company’s computers with malware. Chatbots have also been used to cheat at school.
These more urgent existential risks include climate change, pandemics, and the rise of warmongering authoritarian governments. A key technique used by those exacerbating them is "spreading misinformation". Brian Stelter explains Bannon's 2018 confession to the acclaimed writer Michael Lewis:
“The Democrats don’t matter,” Bannon told Lewis. “The real opposition is the media. And the way to deal with them is to flood the zone with shit.”

That’s the Bannon business model: Flood the zone. Stink up the joint. As Jonathan Rauch once said, citing Bannon’s infamous quote, “This is not about persuasion: This is about disorientation.”
Bannon's quote is related to a 2004 quote Ron Suskind attributed to a "senior adviser to Bush":
The aide said that guys like me were "in what we call the reality-based community," which he defined as people who "believe that solutions emerge from your judicious study of discernible reality." ... "That's not the way the world really works anymore," he continued. "We're an empire now, and when we act, we create our own reality. And while you're studying that reality -- judiciously, as you will -- we'll act again, creating other new realities, which you can study too, and that's how things will sort out. We're history's actors . . . and you, all of you, will be left to just study what we do."
The essay acknowledges the problem:
In many applications a tendency to spout plausible lies is a bug. For some it may prove a feature. Deep fakes and fabricated videos which traduce politicians are only the beginning. Expect the models to be used to set up malicious influence networks on demand, complete with fake websites, Twitter bots, Facebook pages, TikTok feeds and much more. The supply of disinformation, Renée DiResta of the Stanford Internet Observatory has warned, “will soon be infinite”.

This threat to the very possibility of public debate may not be an existential one; but it is deeply troubling. It brings to mind the “Library of Babel”, a short story by Jorge Luis Borges. The library contains all the books that have ever been written, but also all the books which were never written, books that are wrong, books that are nonsense. Everything that matters is there, but it cannot be found because of everything else; the librarians are driven to madness and despair.
I disagree that the "threat to the very possibility of public debate" is not existential. Informed public debate is a necessary but not sufficient condition for society to survive the existential threats it faces. An example is the continuing fiasco of the US' response to the COVID pandemic[3]. In Covid is still a leading cause of death as the virus recedes Dan Diamond writes [my emphasis]:
Federal health officials say that covid-19 remains one of the leading causes of death in the United States, tied to about 250 deaths daily, on average, mostly among the old and immunocompromised.

Few Americans are treating it as a leading killer, however — in part because they are not hearing about those numbers, don’t trust them or don’t see them as relevant to their own lives.
...
The actual toll exacted by the virus remains a subject of sharp debate. Since the earliest days of the pandemic, skeptics have argued that physicians and families had incentives to overcount virus deaths, and pointed to errors by the Centers for Disease Control and Prevention in how it has reported a wide array of covid data. Those arguments were bolstered earlier this year by a Washington Post op-ed by Leana Wen that argued the nation’s recent covid toll is inflated by including people dying with covid, as well as from covid — for instance, gunshot victims who also test positive for the virus — a conclusion echoed by critics of the pandemic response and amplified on conservative networks.
The zone was flooded with shit about hydroxychloroquine, bleach, vaccine-caused "sudden death", and so on and on[4].

The argument of the essay starts by describing the effects of a much earlier technological revolution:
Johannes Gutenberg’s development of movable type has been awarded responsibility, at some time or other, for almost every facet of life that grew up in the centuries which followed. It changed relations between God and man, man and woman, past and present. It allowed the mass distribution of opinions, the systematisation of bureaucracy, the accumulation of knowledge. It brought into being the notion of intellectual property and the possibility of its piracy. But that very breadth makes comparison almost unavoidable. As Bradford DeLong, an economic historian at the University of California, Berkeley puts it, “It’s the one real thing we have in which the price of creating information falls by an order of magnitude.”
Much commentary on the effects of Gutenberg, including the essay, emphasizes books. But the greater effect on society came from propagandistic pamphlets, which, being cheaper to produce, had a much wider circulation. The economics of the movable type revolution greatly impacted the production of high-quality content, but it impacted the production of lower-quality content much more. The explanation is simple: the raw costs of publication and distribution form a much greater proportion of the total cost of disseminating lower-quality content. Higher-quality content has much greater human and other costs in creating the content before it is published and distributed.
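
The arithmetic behind that explanation can be sketched in a few lines of Python. This is only an illustration: the cost figures are invented for the example (they are not historical data), and the function name is hypothetical. It compares how much the total cost of disseminating a work falls when only the publication and distribution cost drops by an order of magnitude.

```python
def saving_from_cheaper_distribution(creation_cost, old_distribution, new_distribution):
    """Fractional drop in total cost when only the distribution cost changes.

    All inputs are in the same arbitrary cost units.
    """
    old_total = creation_cost + old_distribution
    new_total = creation_cost + new_distribution
    return 1 - new_total / old_total


# Hypothetical figures, chosen only to illustrate the argument.
# High-quality work: creation dominates the total cost.
book_saving = saving_from_cheaper_distribution(
    creation_cost=900, old_distribution=100, new_distribution=10
)

# Low-quality work: distribution dominates the total cost.
pamphlet_saving = saving_from_cheaper_distribution(
    creation_cost=10, old_distribution=100, new_distribution=10
)

print(f"high-quality total cost falls by {book_saving:.0%}")
print(f"low-quality total cost falls by {pamphlet_saving:.0%}")
```

Under these assumed numbers the same tenfold fall in distribution cost trims the total for the high-quality work only modestly, while the total for the low-quality work falls by most of its value. That is the sense in which a cheaper medium favors lower-quality content.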

Initially when a new, more cost-effective medium becomes available, content quality is high because the early adopters value the new experience and put effort into using it. But as the low cost becomes more widely known, quality begins to degrade. We have seen this effect in action several times. My first was with Usenet newsgroups. I was an early Usenet adopter, and when I was working on the X Window System I found the "xpert" newsgroup a valuable resource to communicate with the system's early adopters. But as the volume grew the signal-to-noise ratio dropped rapidly, and it became a waste of time. Usenet actually pioneered commercial spam, which migrated to e-mail where it provided yet another example of rapidly decaying signal-to-noise ratio.

Exactly the same phenomenon has degraded academic publishing. When Stanford's HighWire Press pioneered the transition of academic journals from paper to the Web in 1995 the cost of distribution was practically eliminated, but the cost of peer-review, copy-editing, graphics and so on was pretty much untouched. In Who pays the piper calls the tune Jim O'Donnell writes:
The model of scientific and scholarly publishing is, arguably, undergoing a fundamental change. Once upon a time, the business model was simple: publish high quality articles and convince as many people as possible to subscribe to the journals in which they appear and raise the prices as high as the market will bear. We all know pretty well how that works.

But now an alternate model appears: charge people to publish their articles and give them away for free. The fundamental change that implies is that revenue enhancement will still come from charging whatever the market will bear, but now the search is not for more subscribers but for more authors. Of course peer review intrudes into this model, but if you could, for example, double the number of articles passing peer review for a journal you publish, you could double your gross revenue. That was mostly not the case before except where the publisher had room to increase the subscription price proportionately. There's a slippery slope here. Predatory journals have already gone over the edge on that slope and are in a smoldering heap at the bottom of the hill, but the footing can get dicey for the best of them.
Open access with "author processing charges" out-competed the subscription model. Because the Web eliminated the article rate limit imposed by page counts and printing schedules, it enabled the predatory open access journal business model. So now it is hard for people "doing their own research" to tell whether something that looks like a journal and claims to be "peer-reviewed" is real, or a pay-for-play shit-flooder[5]. The result, as Bannon explains in his context, is disorientation, confusion, and an increased space for bad actors to exploit.

Governments' response to AI's "threat to the very possibility of public debate" (and to their control of their population's information environment) is to propose regulation. Here are the proposals from the EU and the Chinese government[6]. In a blog post entitled The Luring Test: AI and the engineering of consumer trust Michael Atleson of the FTC made threatening noises:
Many commercial actors are interested in these generative AI tools and their built-in advantage of tapping into unearned human trust. Concern about their malicious use goes well beyond FTC jurisdiction. But a key FTC concern is firms using them in ways that, deliberately or not, steer people unfairly or deceptively into harmful decisions in areas such as finances, health, education, housing, and employment. Companies thinking about novel uses of generative AI, such as customizing ads to specific people or groups, should know that design elements that trick people into making harmful choices are a common element in FTC cases, such as recent actions relating to financial offers, in-game purchases, and attempts to cancel services.
The Biden administration met with tech CEOs, a sure way to ensure no effective regulation is imposed. These companies are reacting to perceived competitive threats by deploying not-ready-for-prime-time systems and laying off their AI ethics teams. We should be very worried that Geoffrey Hinton quit Google because:
Until last year, he said, Google acted as a “proper steward” for the technology, careful not to release something that might cause harm. But now that Microsoft has augmented its Bing search engine with a chatbot — challenging Google’s core business — Google is racing to deploy the same kind of technology. The tech giants are locked in a competition that might be impossible to stop, Dr. Hinton said.

His immediate concern is that the internet will be flooded with false photos, videos and text, and the average person will “not be able to know what is true anymore.”
In other words Google abandoned their previously responsible approach at the first hint of competition. Companies in this state of panic are not going to pay attention to gentle suggestions from governments.

Part of the problem is the analogy to the massive bubbles in cryptocurrencies, non-fungible tokens and Web3. Just as with these technologies, AI is resistant to regulation. Just as with cryptocurrencies, this is the VCs' "next big thing", so deployment will be lavishly funded and have vast lobbying resources. Just as with cryptocurrencies, the bad guys will be much quicker than the regulators[7].

And now, a fascinating leak from inside Google suggests that it simply won't matter what governments, VCs or even the tech giants do. The must-read We Have No Moat: And neither does OpenAI by Dylan Patel and Afzal Ahmad posts the anonymous document:
While our models still hold a slight edge in terms of quality, the gap is closing astonishingly quickly. Open-source models are faster, more customizable, more private, and pound-for-pound more capable. They are doing things with $100 and 13B params that we struggle with at $10M and 540B. And they are doing so in weeks, not months. This has profound implications for us:
  • We have no secret sauce. Our best hope is to learn from and collaborate with what others are doing outside Google. We should prioritize enabling 3P integrations.
  • People will not pay for a restricted model when free, unrestricted alternatives are comparable in quality. We should consider where our value add really is.
  • Giant models are slowing us down. In the long run, the best models are the ones which can be iterated upon quickly. We should make small variants more than an afterthought, now that we know what is possible in the <20B parameter regime.

...
Most importantly, they have solved the scaling problem to the extent that anyone can tinker. Many of the new ideas are from ordinary people. The barrier to entry for training and experimentation has dropped from the total output of a major research organization to one person, an evening, and a beefy laptop.
The purpose of the writer was to warn Google of the competitive threat from open source AI, thus increasing the panic level significantly. The writer argues that open source AI was accelerated by Facebook's release of LLaMA and leak of its data, and it has made very rapid progress since. Hardware and software resources in reach of individuals can achieve results close to those of ChatGPT and Bard, which require cloud-level investments.

Companies can argue that their AI will be better on some axes, for example in producing fewer "hallucinations", but they can't generate a return on their massive investments (Microsoft invested $10B in OpenAI in January) if the competition is nearly as good and almost free. It seems unlikely that their customers care so much about "hallucinations" that they would be willing to pay a whole lot more to get fewer of them. The tech giants clearly don't care enough to delay deploying systems that frequently "hallucinate", so why should their customers?

That threat is what the tech giants are worried about. What I'm worried about is that these developments place good-enough AI in the hands of everyone, rendering governments' attempts to strong-arm companies into preventing bad guys using it futile. After all, the bad guys don't care about additional "hallucinations"; for them that's a feature not a bug. They enhance the disorientation and confusion they aim for.

A world in which "a guy sitting on their bed who weighs 400 pounds" with a laptop can flood the zone with convincing shit on any topic he chooses is definitely at existential risk.

Notes

  1. My favorite science fiction on this theme was published in 1954. I had probably read it by 1957 in one of the yellow-jacketed Gollancz SF collections. In Fredric Brown's (very) short story Answer Dwar Ev switches on the gigantic computer:
    There was a mighty hum, the surge of power from ninety-six billion planets. Lights flashed and quieted along the miles-long panel.

    Dwar Ev stepped back and drew a deep breath. “The honor of asking the first question is yours, Dwar Reyn.”

    “Thank you,” said Dwar Reyn. “It shall be a question that no single cybernetics machine has been able to answer.”

    He turned to face the machine. “Is there a God?”

    The mighty voice answered without hesitation, without the clicking of a single relay.

    “Yes, now there is a God.”

    Sudden fear flashed on the face of Dwar Ev. He leaped to grab the switch.

    A bolt of lightning from the cloudless sky struck him down and fused the switch shut.
    That is the second half of the entire story.
  2. For the record, my favored resolution of the Fermi Paradox is the Rare Earth hypothesis. This is the idea that, although there are an enormous number of stars in the galaxy, each likely with a retinue of planets, the existence of humanity on Earth has depended upon a long series of extremely unlikely events, starting from the fact that, despite its location in the outskirts of the Milky Way, the Sun is a rare high-metallicity G-type star with low luminosity variation, through the Earth's stable orbit and large moon, to its plate tectonics and magnetosphere, and on and on. Thus the number of alien civilizations in the galaxy is likely to be small, possibly one.
  3. This isn't to minimize the many less-than-existential threats. The FTC is making a list and checking it twice:
    Generative AI and synthetic media are colloquial terms used to refer to chatbots developed from large language models and to technology that simulates human activity, such as software that creates deepfake videos and voice clones. Evidence already exists that fraudsters can use these tools to generate realistic but fake content quickly and cheaply, disseminating it to large groups or targeting certain communities or specific individuals. They can use chatbots to generate spear-phishing emails, fake websites, fake posts, fake profiles, and fake consumer reviews, or to help create malware, ransomware, and prompt injection attacks. They can use deepfakes and voice clones to facilitate imposter scams, extortion, and financial fraud. And that’s very much a non-exhaustive list.
    And Dan Patterson interviews Christopher Ahlberg, CEO of threat intelligence platform Recorded Future, in ChatGPT and the new AI are wreaking havoc on cybersecurity in exciting and frightening ways:
    Generative AI has helped bad actors innovate and develop new attack strategies, enabling them to stay one step ahead of cybersecurity defenses. AI helps cybercriminals automate attacks, scan attack surfaces, and generate content that resonates with various geographic regions and demographics, allowing them to target a broader range of potential victims across different countries. Cybercriminals adopted the technology to create convincing phishing emails. AI-generated text helps attackers produce highly personalized emails and text messages more likely to deceive targets.
  4. Use The Economist's interactive tool to compare US excess deaths to those of, for example, New Zealand, Taiwan, or Australia.
  5. Another recent example of the zone flooding problem is the Texas state government's stubborn defense of its fossil fuel industry via decades of misinformation from the industry and its media allies. The Texas Senate has a suite of bipartisan bills that purport to fix the recent grid failures, but:
    These are bills meant to boost fossil fuels and crowd out renewables. S.B. 1287 requires energy companies to cover more of the costs of connecting to the grid depending on distance—in what amounts to an added tax on renewable generators that often operate farther away from the central source and depend on lengthy transmission lines. Then there’s S.B. 2012, which would “incentivize the construction of dispatchable generation” and “require electric companies to pay generators to produce power in times of shortage.” Definition: more gas buildout, and more levies on electricity providers instead of gas producers. S.B. 2014 “eliminates Renewable Energy Credits” so as to “level the playing field” with gas sources, never mind the generous tax breaks that already benefit fossil fuel producers. S.B. 2015 “creates a goal of 50% dispatchable energy” for the central grid, essentially mandating that gas sources provide at least half of Texas’ electricity at all times. Senate Joint Resolution 1 hopes to enshrine S.B. 6’s gas backup program in the state constitution as a new amendment.
    Graham Readfearn's Climate scientists first laughed at a ‘bizarre’ campaign against the BoM – then came the harassment provides another recent example:
    For more than a decade, climate science deniers, rightwing politicians and sections of the Murdoch media have waged a campaign to undermine the legitimacy of the Bureau of Meteorology’s temperature records.
    ...
    “This has frankly been a concerted campaign,” says climate scientist Dr Ailie Gallant, of Monash University. “But this is not about genuine scepticism. It is harassment and blatant misinformation that has been perpetuated.”

    Despite multiple reviews, reports, advisory panels and peer-reviewed studies rejecting claims that its temperature record was biased or flawed, Gallant says the “harassment” of the bureau has continued.
    ...
    One former executive, who for eight years was responsible for the bureau’s main climate record, says the constant criticism has affected the health of scientists over many years, who were diverted from real research to repeatedly answer the same questions.
    Note the scientist's comment:
    Dr Greg Ayers, a former director of the bureau and leading CSIRO atmospheric scientist, has written four peer-reviewed papers testing claims made by sceptics.

    “There’s a lot of assertion [from sceptics] but I haven’t seen much science,” said Ayers. “If you are going to make claims then we need to do peer-reviewed science, not just assertion.”
    A climate denier with a laptop could easily ask ChatGPT to write a paper based on the Murdoch papers' articles, complete with invented citations, and pay the "author processing charge" to a predatory journal. Then Dr. Ayers would be reduced to arguing about the quality of the journal.
  6. Will Oremus' He wrote a book on a rare subject. Then a ChatGPT replica appeared on Amazon shows that AI shit-flooding is already rampant in Amazon books and clickbait Web sites:
    Experts say those books are likely just the tip of a fast-growing iceberg of AI-written content spreading across the web as new language software allows anyone to rapidly generate reams of prose on almost any topic. From product reviews to recipes to blog posts and press releases, human authorship of online material is on track to become the exception rather than the norm.
    ...
    What that may mean for consumers is more hyper-specific and personalized articles — but also more misinformation and more manipulation, about politics, products they may want to buy and much more.

    As AI writes more and more of what we read, vast, unvetted pools of online data may not be grounded in reality, warns Margaret Mitchell, chief ethics scientist at the AI start-up Hugging Face. “The main issue is losing track of what truth is,” she said. “Without grounding, the system can make stuff up. And if it’s that same made-up thing all over the world, how do you trace it back to what reality is?”
  7. The Chinese government's attempt to have it both ways by leading in the AI race but ensuring their AIs stick to the party line isn't going well. Glyn Moody summarizes the state of play in How Will China Answer The Hardest AI Question Of All?:
    Chinese regulators have just released draft rules designed to head off this threat. Material generated by AI systems “needs to reflect the core values of socialism and should not subvert state power” according to a story published by CNBC. The results of applying that approach can already be seen in the current crop of Chinese chatbot systems. Bloomberg’s Sarah Zheng tried out several of them, with rather unsatisfactory results:
    In Chinese, I had a strained WeChat conversation with Robot, a made-in-China bot built atop OpenAI’s GPT. It literally blocked me from asking innocuous questions like naming the leaders of China and the US, and the simple, albeit politically contentious, “What is Taiwan?” Even typing “Xi Jinping” was impossible.

    In English, after a prolonged discussion, Robot revealed to me that it was programmed to avoid discussing “politically sensitive content about the Chinese government or Communist Party of China.” Asked what those topics were, it listed out issues including China’s strict internet censorship and even the 1989 Tiananmen Square protests, which it described as being “violently suppressed by the Chinese government.” This sort of information has long been inaccessible on the domestic internet.
    One Chinese chatbot began by warning: “Please note that I will avoid answering political questions related to China’s Xinjiang, Taiwan, or Hong Kong.” Another simply refused to respond to questions touching on sensitive topics such as human rights or Taiwanese politics.
    On the other hand, Low De Wei reports that China Arrests ChatGPT User Who Faked Deadly Train Crash Story:
    Chinese authorities have detained a man for using ChatGPT to write fake news articles, in what appears to be one of the first instances of an arrest related to misuse of artificial intelligence in the nation.
    ...
    The alleged offense came to light after police discovered a fake article about a train crash that left nine people dead, which had been posted to multiple accounts on Baidu Inc.’s blog-like platform Baijiahao. The article was viewed over 15,000 times before being removed.

    Further investigations revealed that Hong was using the chatbot technology — which is not available in China but can be accessed via VPN networks — to modify viral news articles which he would then repost. He told investigators that friends on WeChat had showed him how to generate cash for clicks.
  8. Why Disinformation and Misinformation Are More Dangerous Than Malware by Kim Key reports on a panel at the RSA Conference:
    "The overwhelming majority of people who are ever going to see a piece of misinformation on the internet are likely to see it before anybody has a chance to do anything about it," according to Yoel Roth, the former head of Trust and Safety at Twitter.

    When he was at Twitter, Roth observed that over 90% of the impressions on posts were generated within the first three hours. That’s not much time for an intervention, which is why it's important for the cybersecurity community to develop content moderation technology that "can give truth time to wake up in the morning," he says.
    It is this short window in time that makes flooding the zone with shit so powerful. It is a DDOS attack on content moderation, which already cannot keep up.

28 comments:

Geoff said...

See also Steve Yegge's take on this: https://steve-yegge.medium.com/were-gonna-need-a-bigger-moat-478a8df6a0d2

David. said...

Jeffrey Brainard's Fake scientific papers are alarmingly common reports on Fake Publications in Biomedical Science: Red-flagging Method Indicates Mass Production by Bernhard A. Sabel et al. Brainard writes:

"When neuropsychologist Bernhard Sabel put his new fake-paper detector to work, he was “shocked” by what it found. After screening some 5000 papers, he estimates up to 34% of neuroscience papers published in 2020 were likely made up or plagiarized; in medicine, the figure was 24%. Both numbers, which he and colleagues report in a medRxiv preprint posted on 8 May, are well above levels they calculated for 2010—and far larger than the 2% baseline estimated in a 2022 publishers’ group report.

“It is just too hard to believe” at first, says Sabel of Otto von Guericke University Magdeburg and editor-in-chief of Restorative Neurology and Neuroscience. It’s as if “somebody tells you 30% of what you eat is toxic.”

His findings underscore what was widely suspected: Journals are awash in a rising tide of scientific manuscripts from paper mills—secretive businesses that allow researchers to pad their publication records by paying for fake papers or undeserved authorship. “Paper mills have made a fortune by basically attacking a system that has had no idea how to cope with this stuff,” says Dorothy Bishop, a University of Oxford psychologist who studies fraudulent publishing practices."

This is before the impact of generative AI hits the academic publishing system.

David. said...

Another instance of shit-flooding in Digby's Trumpers celebrate Dear Leader’s lies:

"So, what did CNN do wrong on Wednesday? The most important thing was they did it live. The first rule of covering Trump is that it absolutely has to be on tape or you cannot competently fact check him. Trump’s adviser Steve Bannon famously told author Michael Lewis, “the real opposition is the media. And the way to deal with them is to flood the zone with shit.” And that is exactly what Donald Trump did in prime time.

The moderator Kaitlin Collins was well prepared and corrected him repeatedly but in those situations Trump just behaves as if he hasn’t heard the other person and the truth doesn’t matter. His tsunami of lies just crashed over her head. Any respectable news organization that has to cover him should always do it with the ability to contextualize what he says and you can only do with a taped interview."

David. said...

From the "closing the stable door" department comes Anna Edgerton and Oma Seddiq's OpenAI, IBM Urge Senate to Act on AI Regulation After Past Tech Failures:

"The creator of ChatGPT and the privacy chief of International Business Machines Corp. both called on US senators during a hearing Tuesday to more heavily regulate artificial intelligence technologies that are raising ethical, legal and national security concerns.

Speaking to a Senate Judiciary subcommittee, OpenAI Chief Executive Officer Sam Altman praised the potential of the new technology, which he said could solve humanity’s biggest problems. But he also warned that artificial intelligence is powerful enough to change society in unpredictable ways, and “regulatory intervention by governments will be critical to mitigate the risks.”

“My worst fear is that we, the technology industry, cause significant harm to the world,” Altman said. “If this technology goes wrong, it can go quite wrong.”

IBM’s Chief Privacy and Trust Officer Christina Montgomery focused on a risk-based approach and called for “precision regulation” on how AI tools are used, rather than how they’re developed."

The tech giants still believe that they have a monopoly on this technology. They haven't come to terms with the fact that anyone with a laptop can use it.

David. said...

Two notes on the response to generative AI.

1) Benj Edwards reports that Poll: 61% of Americans say AI threatens humanity’s future:

"A majority of Americans believe that the rise of artificial intelligence technology could put humanity's future in jeopardy, according to a Reuters/Ipsos poll published on Wednesday. The poll found that over two-thirds of respondents are anxious about the adverse effects of AI, while 61 percent consider it a potential threat to civilization.

The online poll, conducted from May 9 to May 15, sampled the opinions of 4,415 US adults. It has a credibility interval (a measure of accuracy) of plus or minus two percentage points."

2) Delos Prime reports that EU AI Act To Target US Open Source Software:

"In a bold stroke, the EU’s amended AI Act would ban American companies such as OpenAI, Amazon, Google, and IBM from providing API access to generative AI models. The amended act, voted out of committee on Thursday, would sanction American open-source developers and software distributors, such as GitHub, if unlicensed generative models became available in Europe. While the act includes open source exceptions for traditional machine learning models, it expressly forbids safe-harbor provisions for open source generative systems."

David. said...

More scope for generative AI shit-flooding in ProPublica's The Newest College Admissions Ploy: Paying to Make Your Teen a “Peer-Reviewed” Author:

"Sophia was entering her sophomore year in prep school, but her parents were already thinking ahead to college. They paid to enroll her in an online service called Scholar Launch, whose programs start at $3,500. Scholar Launch, which started in 2019, connects high school students with mentors who work with them on research papers that can be published and enhance their college applications.

Publication “is the objective,” Scholar Launch says on its website. “We have numerous publication partners, all are peer-reviewed journals.”

The prospect appealed to Sophia. “Nowadays, having a publication is kind of a given” for college applicants, she said."

David. said...

Mark Sumner reports on a little taste of what shit-flooding can do in AI is no joke when it causes $500 billion in market losses:

"On Monday, a pair of AI-generated images appeared on social media platforms Twitter and Telegram. One of these showed what was reportedly a large explosion at the Pentagon. The second, posted a few minutes later, showed what was reported to be a separate explosion at the White House. Both of these images were swiftly reposted thousands of times on both platforms.

Notably, they were shared on Twitter by new “gold check” accounts belonging to what Twitter now considers an “official” business. In this case, it was the Russian state-owned media outlet RT that retweeted the images. Shortly afterward, a blue-checkmark account reportedly belonging to Bloomberg News did the same.

Within a few minutes, The Street reports the S&P stock index lost more than $500 billion. Most of that value gradually returned over the next few minutes as it became clear the pictures were fake. They had been generated by an AI art program. The blue-check Bloomberg account (as well as several other blue-check accounts with authoritative names) was also a fake.
...
With half a day’s coding or less, it would be perfectly possible to create a crisis bot that would sift through the current news, order up images of a plausible disaster, mount them on social media, boost them with thousands or tens of thousands of retweets and links, push them with apparently authoritative accounts, and pitch them in a way tailor-made to trigger a response from the bots that operate the stock market, the bond market, the commodities market, or just about any other aspect of the economy.

They could do it regularly, randomly, or on targeted occasions. They could do it much more convincingly than these two images—and in ways that were much more difficult to refute. Whether what happened on Monday was a trial balloon, cyber warfare, or someone just farting around, we should be taking the results of that action very, very seriously."

David. said...

Two more dire predictions of the effects of shit-flooding in Lionel Laurent's The AI Gold Rush Will Take Humanity to Some Dark Places:

"In the same way as techno-solutionists offer AI companions as the answer to loneliness wrought by social media, this proposed fix to the side-effects of AI tools including ChatGPT, the generative software unleashed on the world by Altman’s OpenAI company, threatens to create even more unintended consequences."

And Simon Hattenstone's Tech guru Jaron Lanier: ‘The danger isn’t that AI destroys us. It’s that it drives us insane’:

"the danger isn’t that a new alien entity will speak through our technology and take over and destroy us. To me the danger is that we’ll use our technology to become mutually unintelligible or to become insane if you like, in a way that we aren’t acting with enough understanding and self-interest to survive, and we die through insanity, essentially."

David. said...

Emine Yücel's Will A Storm Of AI-Generated Misinfo Flood The 2024 Election? A Few Dems Seek To Get Ahead Of It shows the issue is getting attention in Congress:

"In early May, Clarke introduced the The REAL Political Ads Act, legislation that would expand the current disclosure requirements, mandating that AI-generated content be identified in political ads.

The New York Democrat is particularly concerned about the spread of misinformation around elections, coupled with the fact that a growing number of people can deploy the powerful technology rapidly and with minimal cost.
...
The existence of AI-generated content in and of itself is already having an effect on how people consume and trust that the information they’re absorbing is real.

“The truth is that because the effect of generative AI is to make people doubt whether or not anything they see is real, it’s in no one’s interest when it comes to a democracy,” Imran Ahmed, CEO of the Center for Countering Digital Hate, told TPM."

And at the White House, as Katyanna Quach reports in Get ready for Team America: AI Police:

"The US Office of Science and Technology Policy (OSTP) has updated its National AI R&D Strategic Plan for the first time since 2019, without making enormous changes.
...
there's the new strategy: "Establish a principled and coordinated approach to international collaboration in AI research."

International collaboration, with the USA convening and driving debate, is a signature tactic for president Biden. In this case he appears to be using it to drive debate on concerns about how AI impacts data privacy and safety, and to address the issue of biases in generative AI."

David. said...

Two articles from the mainstream media make important points about shit-flooding and the industry's response.

Stuart A. Thompson's A.I.-Generated Content Discovered on News Sites, Content Farms and Product Reviews quotes Steven Brill, the chief executive of NewsGuard:

“News consumers trust news sources less and less in part because of how hard it has become to tell a generally reliable source from a generally unreliable source. This new wave of A.I.-created sites will only make it harder for consumers to know who is feeding them the news, further reducing trust.”

Samantha Floreani's Yes, you should be worried about AI – but Matrix analogies hide a more insidious threat describes how the industry is using fear of AGI to its own advantage:

"The problem with pushing people to be afraid of AGI while calling for intervention is that it enables firms like OpenAI to position themselves as the responsible tech shepherds – the benevolent experts here to save us from hypothetical harms, as long as they retain the power, money and market dominance to do so. Notably, OpenAI’s position on AI governance focuses not on current AI but on some arbitrary point in the future. They welcome regulation, as long as it doesn’t get in the way of anything they’re currently doing."

David. said...

In Big Tech Isn’t Prepared for A.I.’s Next Chapter, Bruce Schneier and Jim Waldo make the same argument I did:

"We have entered an era of LLM democratization. By showing that smaller models can be highly effective, enabling easy experimentation, diversifying control, and providing incentives that are not profit motivated, open-source initiatives are moving us into a more dynamic and inclusive A.I. landscape. This doesn’t mean that some of these models won’t be biased, or wrong, or used to generate disinformation or abuse. But it does mean that controlling this technology is going to take an entirely different approach than regulating the large players."

David. said...

Robert McMillan's How North Korea’s Hacker Army Stole $3 Billion in Crypto, Funding Nuclear Program reports that:

"Ultimately they stole more than $600 million—mostly from players of Sky Mavis’s digital pets game, Axie Infinity.

It was the country’s biggest haul in five years of digital heists that have netted more than $3 billion for the North Koreans, according to the blockchain analytics firm Chainalysis. That money is being used to fund about 50% of North Korea’s ballistic missile program, U.S. officials say, which has been developed in tandem with its nuclear weapons. Defense accounts for an enormous portion of North Korea’s overall spending; the State Department estimated in 2019 Pyongyang spent about $4 billion on defense, accounting for 26 percent of its overall economy."

Matt Levine comments:

"Venture capitalists have largely pivoted from crypto to artificial intelligence, and while the popular view is that AI has a higher probability of wiping out humanity than crypto does, “crypto funds the North Korean missile program” would be a funny way for crypto to kill us all before a rogue AI can."

David. said...

Matt Levine is also looking at the potential for shit-flooding:

"And so at the market-microstructure level, it is easy to imagine letting an artificial intelligence model loose on the stock market and telling it “learn how to trade profitably,” and the model coming back and saying “it seems like the stock market is dominated by simple market-making algorithms that respond to order-book information, and actually the way to trade profitably is to do a lot of spoofing and market manipulation to trick them.”
...
And at the level of writing fake press releases, generative AI is probably better at writing fake press releases (and illustrating them with convincing fake photos) than, you know, the average market manipulator is."

David. said...

Mia Sato reports on one kind of shit-flooding in A storefront for robots - AI written SEO garbage:

"It’s a universal experience for small business owners who’ve come to rely on Google as a major source of traffic and customers. But it’s also led to the degradation of Google’s biggest product, Search, over time. The problem is poised to only continue to spiral as business owners, publishers, and other search-reliant businesses increasingly use artificial intelligence tools to do the search-related busywork. It’s already happening in digital media — outlets like CNET and Men’s Journal have begun using generative AI tools to produce SEO-bait articles en masse. Now, online shoppers will increasingly encounter computer-generated text and images, likely without any indication of AI tools.
...
AI companies offer tools that generate entire websites using automation tools, filling sites with business names, fake customer testimonials, and images for less than the price of lunch.

The result is SEO chum produced at scale, faster and cheaper than ever before. The internet looks the way it does largely to feed an ever-changing, opaque Google Search algorithm. Now, as the company itself builds AI search bots, the business as it stands is poised to eat itself."

David. said...

From the "no-one could have predicted" department comes Rhiannon Williams' The people paid to train AI are outsourcing their work… to AI:

"A significant proportion of people paid to train AI models may be themselves outsourcing that work to AI, a new study has found.

It takes an incredible amount of data to train AI systems to perform specific tasks accurately and reliably. Many companies pay gig workers on platforms like Mechanical Turk to complete tasks that are typically hard to automate, such as solving CAPTCHAs, labeling data and annotating text. This data is then fed into AI models to train them. The workers are poorly paid and are often expected to complete lots of tasks very quickly.
...
a team of researchers from the Swiss Federal Institute of Technology (EPFL) hired 44 people on the gig work platform Amazon Mechanical Turk to summarize 16 extracts from medical research papers. ...

They estimated that somewhere between 33% and 46% of the workers had used AI models like OpenAI’s ChatGPT."

The paper is Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks by Veniamin Veselovsky et al.

David. said...

Connie Loizos' Get a clue, says panel about buzzy AI tech: It’s being ‘deployed as surveillance’ reports on a recent Bloomberg conference:

"Featuring Meredith Whittaker, the president of the secure messaging app Signal; Credo AI co-founder and CEO Navrina Singh; and Alex Hanna, the director of Research at the Distributed AI Research Institute, the three had a unified message for the audience, which was: Don’t get so distracted by the promise and threats associated with the future of AI. It is not magic, it’s not fully automated and — per Whittaker — it’s already intrusive beyond anything that most Americans seemingly comprehend."

David. said...

More on flooding the academic zone with shit in In a Tipster’s Note, a View of Science Publishing’s Achilles Heel by Jonathan Moens, Undark & Retraction Watch:

"Publishers have to initiate investigations, which often involves looking into articles on a case-by-case basis — a process that can take months, if not years, to complete. When publishers do make the retraction, they often provide little information about the nature of the problem, making it difficult for journals to learn from each other’s lapses. All in all, said Bishop, the system just isn’t built to deal with the gargantuan size of the problem.

“This is a system that’s set up for the occasional bad apple,” Bishop said. “But it’s not set up to deal with this tsunami of complete rubbish that is being pumped into these journals at scale.”

A recent effort by publishers and the International Association of Scientific, Technical and Medical Publishers, an international trade group, however, aims to provide editors with tools to check articles for evidence of paper mill involvement and simultaneous submission to multiple journals, among other issues."

David. said...

The theme of AI-enabled shit-flooding is all over the Web these days:

1) Ben Quinn and Dan Milmo's Time running out for UK electoral system to keep up with AI, say regulators:

"Time is running out to enact wholesale changes to ensure Britain’s electoral system keeps pace with advances in artificial intelligence before the next general election, regulators fear.

New laws will not come in time for the election, which will take place no later than January 2025, and the watchdog that regulates election finance and sets standards for how elections should be run is appealing to campaigners and political parties to behave responsibly.

There are concerns in the UK and US that their next elections could be the first in which AI could wreak havoc by generating convincing fake videos and images. Technology of this type is in the hands of not only political and technology experts but increasingly the wider public."

Good luck with Nigel Farage "behaving responsibly".

2) James Vincent's AI is killing the old web, and the new web struggles to be born:

"Given money and compute, AI systems — particularly the generative models currently in vogue — scale effortlessly. They produce text and images in abundance, and soon, music and video, too. Their output can potentially overrun or outcompete the platforms we rely on for news, information, and entertainment. But the quality of these systems is often poor, and they’re built in a way that is parasitical on the web today. These models are trained on strata of data laid down during the last web-age, which they recreate imperfectly. Companies scrape information from the open web and refine it into machine-generated content that’s cheap to generate but less reliable. This product then competes for attention with the platforms and people that came before them."

3) Anil Dash's Today's AI is unreasonable:

"Today's highly-hyped generative AI systems (most famously OpenAI) are designed to generate bullshit by design. To be clear, bullshit can sometimes be useful, and even accidentally correct, but that doesn't keep it from being bullshit. Worse, these systems are not meant to generate consistent bullshit — you can get different bullshit answers from the same prompts. You can put garbage in and get... bullshit out, but the same quality bullshit that you get from non-garbage inputs! And enthusiasts are current mistaking the fact that the bullshit is consistently wrapped in the same envelope as meaning that the bullshit inside is consistent, laundering the unreasonable-ness into appearing reasonable.

Now we have billions of dollars being invested into technologies where it is impossible to make falsifiable assertions. A system that you cannot debug through a logical, socratic process is a vulnerability that exploitative tech tycoons will use to do what they always do, undermine the vulnerable."

David. said...

Will Knight reports on another demonstration of the problem in Researcher builds anti-Russia AI disinformation machine for $400:

"Russian criticism of the US is far from unusual, but CounterCloud’s material pushing back was: The tweets, the articles, and even the journalists and news sites were crafted entirely by artificial intelligence algorithms, according to the person behind the project, who goes by the name Nea Paw and says it is designed to highlight the danger of mass-produced AI disinformation. Paw did not post the CounterCloud tweets and articles publicly but provided them to WIRED and also produced a video outlining the project.
...
Paw says the project shows that widely available generative AI tools make it much easier to create sophisticated information campaigns pushing state-backed propaganda.

“I don't think there is a silver bullet for this, much in the same way there is no silver bullet for phishing attacks, spam, or social engineering,” Paw says in an email. Mitigations are possible, such as educating users to be watchful for manipulative AI-generated content, making generative AI systems try to block misuse, or equipping browsers with AI-detection tools. “But I think none of these things are really elegant or cheap or particularly effective,” Paw says."

David. said...

Gemma Conroy reports that Scientific sleuths spot dishonest ChatGPT use in papers:

"On 9 August, the journal Physica Scripta published a paper that aimed to uncover new solutions to a complex mathematical equation. It seemed genuine, but scientific sleuth Guillaume Cabanac spotted an odd phrase on the manuscript’s third page: ‘Regenerate response’.
...
Since April, Cabanac has flagged more than a dozen journal articles that contain the telltale ChatGPT phrases ‘Regenerate response’ or ‘As an AI language model, I …’ and posted them on PubPeer."

David. said...

Julia Angwin takes the story to the New York Times op-ed page with The Internet Is About to Get Much Worse:

"Greg Marston, a British voice actor, recently came across “Connor” online — an A.I.-generated clone of his voice, trained on a recording Mr. Marston had made in 2003. It was his voice uttering things he had never said.

Back then, he had recorded a session for IBM and later signed a release form allowing the recording to be used in many ways. Of course, at that time, Mr. Marston couldn’t envision that IBM would use anything more than the exact utterances he had recorded. Thanks to artificial intelligence, however, IBM was able to sell Mr. Marston’s decades-old sample to websites that are using it to build a synthetic voice that could say anything."

David. said...

In Malicious ad served inside Bing's AI chatbot Jérôme Segura describes one type of shit with which the zone is being flooded:

"Ads can be inserted into a Bing Chat conversation in various ways. One of those is when a user hovers over a link and an ad is displayed first before the organic result. In the example below, we asked where we could download a program called Advanced IP Scanner used by network administrators. When we place our cursor over the first sentence, a dialog appears showing an ad and the official website for this program right below it:
...
Upon clicking the first link, users are taken to a website (mynetfoldersip[.]cfd) whose purpose is to filter traffic and separate real victims from bots, sandboxes, or security researchers. It does that by checking your IP address, time zone, and various other system settings such as web rendering that identifies virtual machines.

Real humans are redirected to a fake site (advenced-ip-scanner[.]com) that mimics the official one while others are sent to a decoy page. The next step is for victims to download the supposed installer and run it."

David. said...

One barrier against the flood is debunked in Researchers Tested AI Watermarks—and Broke All of Them by Kate Knibbs:

"Soheil Feizi considers himself an optimistic person. But the University of Maryland computer science professor is blunt when he sums up the current state of watermarking AI images. “We don’t have any reliable watermarking at this point,” he says. “We broke all of them.”

For one of the two types of AI watermarking he tested for a new study—“low perturbation” watermarks, which are invisible to the naked eye—he’s even more direct: “There’s no hope.”

Feizi and his coauthors looked at how easy it is for bad actors to evade watermarking attempts. (He calls it “washing out” the watermark.) In addition to demonstrating how attackers might remove watermarks, the study shows how it’s possible to add watermarks to human-generated images, triggering false positives."

David. said...

From the "no-one could have predicted" department comes Emanuel Maiberg's 4chan Uses Bing to Flood the Internet With Racist Images:

"4chan users are coordinating a posting campaign where they use Bing’s AI text-to-image generator to create racist images that they can then post across the internet. The news shows how users are able to manipulate free to access, easy to use AI tools to quickly flood the internet with racist garbage, even when those tools are allegedly strictly moderated.

“We’re making propaganda for fun. Join us, it’s comfy,” the 4chan thread instructs. “MAKE, EDIT, SHARE.”

A visual guide hosted on Imgur that’s linked in that post instructs users to use AI image generators, edit them to add captions that make them seem like political campaigns, and post them to social media sites, specifically Telegram, Twitter, and Instagram. 404 Media has also seen these images shared on a TikTok account that has since been removed."

David. said...

Katyanna Quach's AI girlfriend encouraged man to attempt crossbow assassination of Queen reports:

"Jaswant Singh Chail, 21, made headlines when he broke into Windsor Castle on Christmas Day in 2021 brandishing a loaded crossbow. He later admitted to police he had come to assassinate Queen Elizabeth II.

This week he was sentenced to nine years behind bars for treason, though he will be kept at a psychiatric hospital until he's ready to serve his time in the clink. He had also pleaded guilty to making threats to kill and being in possession of an offensive weapon.

It's said Chail wanted to slay the Queen as revenge for the Jallianwala Bagh massacre in 1919, when the British Army opened fire on a crowd peacefully protesting the Rowlatt Act, a controversial piece of legislation aimed at cracking down on Indian nationalists fighting for independence. It is estimated that up to over 1,500 protesters in Punjab, British India, were killed.

Investigators discovered Chail, who lived in a village just outside Southampton, had been conversing with an AI chatbot, created by the startup Replika, almost every night from December 8 to 22, exchanging over 5,000 messages. The virtual relationship reportedly developed into a romantic and sexual one with Chail declaring his love for the bot he named Sarai."

David. said...

Tom Di Fonzo's What You Need to Know About Generative AI’s Emerging Role in Political Campaigns isn't encouraging:

"Increased accessibility to this technology could allow more people to leverage it for their own purposes, both good and bad. Individual hobbyists today can generate hyper-targeted political messages and deep fakes that previously required significant resources, technical skills, and institutional access. “Before you needed [to] run a building full of Russians in St. Petersburg to [spread disinformation],” said cybersecurity expert Bruce Schneier. “Now hobbyists can do what it took…in 2016.”
...
Ben Winters, senior counsel at EPIC, a research organization focused on emerging privacy and civil liberties issues related to new technologies, pointed to a recent case of two men using robocalls to target Black voters with disinformation as an example of small groups engaging in large-scale manipulation. The men were sentenced after using robocalls to disseminate false information about voting by mail in Ohio. Winters worries similar groups could potentially utilize generative AI “to do that in a less trackable” way. With AI tools like ChatGPT, bad actors can easily “write me a text message” containing whatever fabricated message suits their aims, he noted. He is concerned that generative AI allows for deception “in a much sneakier way” while evading oversight."

David. said...

Cade Metz reports that Chatbots May ‘Hallucinate’ More Often Than Many Realize:

"When summarizing facts, ChatGPT technology makes things up about 3 percent of the time, according to research from a new start-up. A Google system’s rate was 27 percent.
...
Now a new start-up called Vectara, founded by former Google employees, is trying to figure out how often chatbots veer from the truth. The company’s research estimates that even in situations designed to prevent it from happening, chatbots invent information at least 3 percent of the time — and as high as 27 percent."

David. said...

Faked audio of Sadiq Khan dismissing Armistice Day shared among far-right groups by Dan Sabbagh is the latest example of the problem:

"Faked audio of Sadiq Khan dismissing the importance of Armistice Day events this weekend is circulating among extreme right groups, prompting a police investigation, according to the London mayor’s office.

One of the simulated audio clips circulating on TikTok begins: “I don’t give a flying shit about the Remembrance weekend,” with the anonymous poster going on to ask “is this for real or AI?” in an effort to provoke debate."