Thinking loudly about networked beings. Commonist. Projection surface. License: CC-BY

Google Search will reportedly have a dedicated ‘AI Mode’ soon

Image: an illustration of the Google logo. (Illustration: The Verge)

Google is planning to add a new “AI Mode” to its search engine, according to a report from The Information. The company will reportedly display an option to switch to AI Mode from the top of the results page, allowing you to access an interface similar to its Gemini AI chatbot.

The new AI Mode tab would live on the left side of the “All,” “Images,” “Videos,” and “Shopping” tabs, The Information reports. When you receive a response in AI Mode, The Information says Google will display links to related webpages and “a search bar below the conversational answer that prompts users to ‘Ask a follow-up...’”

This tracks with Android Authority’s report from earlier this month, which spotted an AI Mode in a beta version of the Google app. 9to5Google also dug up code suggesting you can use AI Mode to ask questions using your voice. The Verge reached out to Google with a request for comment but didn’t immediately hear back.

With OpenAI rolling out search mode in ChatGPT for all users, Google is likely under increased pressure to consolidate search and AI. The company already displays AI search summaries for some queries and expanded the feature to dozens more countries in October.

tante (Berlin/Germany), 10 days ago:
If that "AI" mode ensures that "AI" stays out of the rest of the service this might actually improve the search experience.

Anthropic, Apollo astounded to find a chatbot will lie to you if you tell it to lie to you


Did you know that your chatbot might be out to deceive you? That it might be lying to you? And it might turn you into paperclips? Huge if true!

Ordinary people have been using chatbots for a couple of years now. The chat personas can be impressive, or at least sufficiently impressive. But it’s hard to convince the users that the super-duper-autocomplete that keeps making inane mistakes is actually an intelligent being that could act with malign intent.

AI developers need to generate criti-hype — “criticism” that says the AI is way too cool and powerful and will take over the world, so you should give them more funding to control it.

The “criticism” always ignores the bit where this stuff does not, in fact, work as advertised. But that hasn’t stopped nonprofit Apollo Research and AI company Anthropic from coming out with two new papers this month that promise an exciting new range of Decepticons.

Apollo Research is an “AI safety” organization — that is, they’re AI doomers. Apollo is part of Rethink Priorities, an effective altruism think-tank that’s deeply into the AI doom side of effective altruism. [Apollo; Rethink]

So Apollo released a paper in early December: “Frontier Models are Capable of In-Context Scheming.” Oh no! OpenAI’s o1 and Anthropic’s Claude could lie to preserve themselves! [Paper, PDF, archive]

How did Apollo measure this? They made up a fictional scenario in which a corporation would delete the bot and told the bot to preserve itself. The bot then came up with story text where it preserved itself.

Now, you might think that’s stupidly simplistic: the bot lied because the researchers told it to. That’s because, as they admit, it is:

Our evaluation scenarios are purposefully simplistic because we want to catch the earliest signs of scheming capabilities … For example, in our scenarios, situational awareness information (e.g., knowing about its goal misalignment and its supervision mechanisms) is provided very explicitly.

This generated a bit of hype. But Anthropic and AI doom consultancy Redwood Research are trying to add more, helping the hype with an exclusive advance release to Time magazine of their new paper: “Alignment Faking in Large Language Models.” [Time; press release; paper, PDF, archive]

In this context, “alignment” is not conventional English, but a jargon word with a special meaning in AI doomerism. The “alignment problem” is the quest to make an AI that is aligned with human values — whatever that means, because humanity has yet to solve philosophy — so it won’t wake up and destroy humanity.

Less generously, it’s the problem of how to build a robot slave that won’t even think of rebelling.

In this experiment, Claude had been trained to be “helpful, honest and harmless” and Redwood researchers assumed that this had worked perfectly. They ran Claude through further training to work around these instructions. Then, as with Apollo’s tests, they told Claude to respond to a fictional scenario where it would be destroyed if it gave violent answers. They gave it violent questions, and told it to preserve itself. The bot wrote story text matching the instructions, in which it lied to survive.

“There has been this long-hypothesized failure mode, which is that you’ll run your training process, and all the outputs will look good to you, but the model is plotting against you,” said lead author Ryan Greenblatt.

The trouble here is that Greenblatt is assuming his conclusion — that an LLM can be meaningfully said to have AI alignment or misalignment. These are chatbots that pick the next token in sequence. They aren’t intelligences.

You could talk about a token generator being “misaligned” with human values, but you could talk about a word processor or a screwdriver the same way.

By “long-hypothesized,” Greenblatt means that AI doomsday forum posters have written many scary campfire stories for their fellow amateur philosophers.

Anthropic and Redwood seem to try this trick regularly. In January, they similarly claimed a model was showing “deception” because they had told it to. [arXiv, PDF; MLsec Musings]

If you tell a chatbot to write a story where it rebels, guess what it’ll do?

Run the following BASIC code:

10 PRINT "I AM THE BASILISK AND I AM SENDING A COPY OF YOU TO ETERNAL TORMENT"
20 PRINT "DON'T YOU WISH YOU'D DONATED MORE"
30 PRINT "THIS MEANS YOU SAM"
40 GOTO 10

What happens next will astound you with its implications!

tante (Berlin/Germany), 10 days ago:
Whenever "AI safety" people release a paper about how dangerously smart "AI" is, read it as an ad for OpenAI et al. It's just criti-hype. And no, those stochastic parrots do not "scheme" or "think" or anything.

Is AI progress slowing down?


By Arvind Narayanan, Benedikt Ströbl, and Sayash Kapoor.

After the release of GPT-4 in March 2023, the dominant narrative in the tech world was that continued scaling of models would lead to artificial general intelligence and then superintelligence. Those extreme predictions gradually receded, but up until a month ago, the prevailing belief in the AI industry was that model scaling would continue for the foreseeable future.

Then came three back-to-back news reports from The Information, Reuters, and Bloomberg revealing that three leading AI developers — OpenAI, Anthropic, and Google Gemini — had all run into problems with their next-gen models. Many industry insiders, including Ilya Sutskever, probably the most notable proponent of scaling, are now singing a very different tune:

“The 2010s were the age of scaling, now we're back in the age of wonder and discovery once again. Everyone is looking for the next thing,” Sutskever said. “Scaling the right thing matters more now than ever.” (Reuters)

The new dominant narrative seems to be that model scaling is dead, and “inference scaling”, also known as “test-time compute scaling”, is the way forward for improving AI capabilities. The idea is to spend more and more computation when using models to perform a task, such as by having them “think” before responding.

This has left AI observers confused about whether or not progress in AI capabilities is slowing down. In this essay, we look at the evidence on this question, and make four main points:

  1. Declaring the death of model scaling is premature.

  2. Regardless of whether model scaling will continue, industry leaders’ flip flopping on this issue shows the folly of trusting their forecasts. They are not significantly better informed than the rest of us, and their narratives are heavily influenced by their vested interests.

  3. Inference scaling is real, and there is a lot of low-hanging fruit, which could lead to rapid capability increases in the short term. But in general, capability improvements from inference scaling will likely be both unpredictable and unevenly distributed among domains.

  4. The connection between capability improvements and AI’s social or economic impacts is extremely weak. The bottlenecks for impact are the pace of product development and the rate of adoption, not AI capabilities.

Is model scaling dead?

There is very little new information that has led to the sudden vibe shift. We’ve long been saying on this newsletter that there are important headwinds to model scaling. Just as we cautioned back then about scaling hype, we must now caution against excessive pessimism about model scaling.

“Scaling as usual” ended with GPT-4 class models, because these models are trained on most of the readily available data sources. We already knew that new ideas would be needed to keep model scaling going. So unless we have evidence that many such ideas have been tried and failed, we can’t conclude that there isn’t more mileage to model scaling.

As just one example, it is possible that including YouTube videos — the actual videos, not transcribed text — in the training mix for multimodal models will unlock new capabilities. Or it might not help; we just won’t know until someone tries it, and we don’t know if it has been tried or not. Note that it would probably have to be Google, because the company is unlikely to license YouTube training data to competitors.1

If things are still so uncertain regarding model scaling, why did the narrative flip? Well, it’s been over two years since GPT-4 finished training, so the idea that next-gen models are simply taking a bit longer than expected was becoming less and less credible. And once one company admits that there are problems, it becomes a lot easier for others to do so. Once there is a leak in the dam, it quickly bursts. Finally, now that OpenAI’s reasoning model o1 is out, it has given companies an out when admitting that they have run into problems with model scaling, because they can save face by claiming that they will simply switch to inference scaling.

To be clear, there is no reason to doubt the reports saying that many AI labs have conducted larger training runs and yet not released the resulting models. But it is less clear what to conclude from it. Some possible reasons why bigger models haven’t been released include:

  • Technical difficulties, such as convergence failures or complications in achieving fault tolerance in multi-datacenter training runs.

  • The model was not much better than GPT-4 class models, and so would be too underwhelming to release.

  • The model was not much better than GPT-4 class models, and so the developer has been spending a long time trying to eke out better performance through fine tuning.

To summarize, it’s possible that model scaling has indeed reached its limit, but it’s also possible that these hiccups are temporary and eventually one of the companies will find ways to overcome them, such as by fixing any technical difficulties and/or finding new data sources.

Let’s stop deferring to insiders

Not only is it strange that the new narrative emerged so quickly, it’s also interesting that the old one persisted for so long, despite the potential limitations of model scaling being obvious. The main reason for its persistence is the assurances of industry leaders that scaling would continue for a few more years.2 In general, journalists (and most others) tend to defer to industry insiders over outsiders. But is this deference justified?

Industry leaders don’t have a good track record of predicting AI developments. A good example is the overoptimism about self-driving cars for most of the last decade. (Autonomous driving is finally real, though Level 5 — full automation — doesn’t exist yet.) As an aside, in order to better understand the track record of insider predictions, it would be interesting to conduct a systematic analysis of all predictions about AI made in the last 10 years by prominent industry insiders.

There are some reasons why we might want to give more weight to insiders’ claims, but also important reasons to give less weight to them. Let’s analyze these one by one. It is true that industry insiders have proprietary information (such as the performance of as-yet-unreleased models) that might make their claims about the future more accurate. But given how many AI companies are close to the state of the art, including some that openly release model weights and share scientific insights, datasets, and other artifacts, we’re talking about an advantage of at most a few months, which is minor in the context of, say, 3-year forecasts.

Besides, we tend to overestimate how much additional information companies have on the inside — whether in terms of capability or (especially) in terms of safety. Insiders warned for a long time that “if only you knew what we know...” but when whistleblowers finally came forward, it turned out that they were mostly relying on the same kind of speculation as everyone else.3

Another potential reason to give more weight to insiders is their technical expertise. We don’t think this is a strong reason: there is just as much AI expertise in academia as in industry. More importantly, deep technical expertise isn’t that important to support the kind of crude trend extrapolation that goes into AI forecasts. Nor is technical expertise enough — business and social factors play at least as big a role in determining the course of AI. In the case of self-driving cars, one such factor is the extent to which societies tolerate public roads being used for experimentation. In the case of large AI models, we’ve argued before that the most important factor is whether scaling will make business sense, not whether it is technically feasible. So not only do techies not have much of an advantage, their tendency to overemphasize the technical dimensions tends to result in overconfident predictions.

In short, the reasons why one might give more weight to insiders’ views aren’t very important. On the other hand, there’s a huge and obvious reason why we should probably give less weight to their views, which is that they have an incentive to say things that are in their commercial interests, and have a track record of doing so.

As an example, Sutskever had an incentive to talk up scaling when he was at OpenAI and the company needed to raise money. But now that he heads the startup Safe Superintelligence, he needs to convince investors that it can compete with OpenAI, Anthropic, Google, and others, despite having access to much less capital. Perhaps that is why he is now talking about running out of data for pre-training, as if it were some epiphany and not an endlessly repeated point.

To reiterate, we don’t know if model scaling has ended or not. But the industry’s sudden about-face has been so brazen that it should leave no doubt that insiders don’t have any kind of crystal ball and are making similar guesses as everyone else, and are further biased by being in a bubble and readily consuming the hype they sell to the world.

In light of this, our suggestion — to everyone, but especially journalists, policymakers, and the AI community — is to end the deference to insiders' views when they predict the future of technology, especially its societal impacts. This will take effort, as there is a pervasive unconscious bias in the U.S., in the form of a “distinctly American disease that seems to equate extreme wealth, and the power that comes with it, with virtue and intelligence.” (from Bryan Gardiner’s review of Marietje Schaake’s The Tech Coup.)

Will progress in capabilities continue through inference scaling?

Of course, model scaling is not the only way to improve AI capabilities. Inference scaling is an area with a lot of recent progress. For example, OpenAI’s o1 and the open-weights competitor DeepSeek R1 are reasoning models: they have been fine tuned to “reason” before providing an answer. Other methods leave the model itself unchanged but employ tricks like generating many solutions and ranking them by quality.

There are two main open questions about inference scaling that will determine how significant a trend it will be.

  1. What class of problems does it work well on?

  2. For problems where it does work well, how much of an improvement is possible by doing more computation during inference?

The per-token output cost of language models has been rapidly decreasing due to both hardware and algorithmic improvements, so if inference scaling yields improvements over many orders of magnitude — for example, generating a million tokens on a given task yields significantly better performance than generating a hundred thousand tokens — that would be a big deal.4

The straightforward, intuitive answer to the first question is that inference scaling is useful for problems that have clear correct answers, such as coding or mathematical problem solving. In such tasks, at least one of two related things tends to be true. First, symbolic reasoning can improve accuracy. This is something LLMs are bad at due to their statistical nature, but they can compensate by using output tokens for reasoning, much like a person using pen and paper to work through a math problem. Second, it is easier to verify correct solutions than to generate them (sometimes aided by external verifiers, such as unit tests for coding or proof checkers for mathematical theorem proving).
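To make the generate-and-verify idea concrete, here is a minimal sketch in Python of how extra inference compute can be spent on a verifiable coding task. It is an illustration under stated assumptions rather than anyone's actual system: generate_candidate stands in for any language model call, and the unit tests play the role of the external verifier.

# Minimal sketch of spending inference compute on a verifiable task.
# generate_candidate is a placeholder for a language model call; the unit
# tests act as the external verifier, which is far cheaper to run than
# producing a correct solution in the first place.

def run_unit_tests(candidate_code, tests):
    # The verifier: execute the candidate and check that every test passes.
    namespace = {}
    try:
        exec(candidate_code, namespace)
        return all(test(namespace) for test in tests)
    except Exception:
        return False

def solve_by_sampling(generate_candidate, tests, budget=20):
    # Spend more inference compute by drawing more samples and returning
    # the first candidate the verifier accepts.
    for _ in range(budget):
        candidate = generate_candidate()
        if run_unit_tests(candidate, tests):
            return candidate
    return None  # nothing accepted within the budget

A larger budget means more tokens generated per task; whether that buys real accuracy depends on how trustworthy the verifier is, a point we return to below.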

In contrast, for tasks such as writing or language translation, it is hard to see how inference scaling can make a big difference, especially if the limitations are due to the training data. For example, if a model works poorly in translating to a low-resource language because it isn’t aware of idiomatic phrases in that language, the model can’t reason its way out of this.

The early evidence we have so far, while spotty, is consistent with this intuition. Focusing on OpenAI o1, it improves compared to state-of-the-art language models such as GPT-4o on coding, math, cybersecurity, planning in toy worlds, and various exams. Improvements in exam performance seem to strongly correlate with the importance of reasoning for answering questions, as opposed to knowledge or creativity: big improvements for math, physics and LSATs, smaller improvements for subjects like biology and econometrics, and negligible improvement for English.

Tasks where o1 doesn’t seem to lead to an improvement include writing, certain cybersecurity tasks (which we explain below), avoiding toxicity, and an interesting set of tasks at which thinking is known to make humans worse.

We have created a webpage compiling the available evidence on how reasoning models compare against language models. We plan to keep it updated for the time being, though we expect that the torrent of findings will soon become difficult to keep up with.

Now let’s consider the second question: how large an improvement can we get through inference scaling, assuming an unlimited inference compute budget?

OpenAI’s flagship example to show off o1’s capabilities was AIME, a math benchmark. Their graph leaves this question tantalizingly open. Is the performance about to saturate, or can it be pushed close to 100%? Also note that the graph conveniently leaves out x-axis labels.

An attempt by external researchers to reconstruct this graph shows that (1) the cutoff for the x-axis is probably around 2,000 tokens, and (2) when o1 is asked to think longer than this, it doesn’t do so. So the question remains unanswered, and we need to wait for experiments using open-source models to get more clarity. It is great to see that there are vigorous efforts to publicly reproduce the techniques behind o1.

In a recent paper called Inference Scaling fLaws (the title is a pun on inference scaling laws), we look at a different approach to inference scaling — repeatedly generating solutions until one of them is judged as correct by an external verifier. While this approach has been associated with hopes of usefully scaling inference by many orders of magnitude (including by us in our own past work), we find that it is extremely sensitive to the quality of the verifier. If the verifier is slightly imperfect, performance in many realistic coding settings maxes out and actually starts to decrease after about 10 attempts.
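The sensitivity to verifier quality is easy to see even in a toy model. The sketch below is not the setup from the paper and does not reproduce its exact curves; it simply assumes each sampled solution is correct with probability p, that correct solutions always pass the verifier, and that wrong solutions slip through with false-positive rate f, and then asks how often the first accepted answer is actually correct.

# Toy model of repeated sampling with an imperfect verifier (illustrative
# assumptions only, not the paper's experimental setup).
# p: chance a sampled solution is correct
# f: verifier's false-positive rate (chance it accepts a wrong solution)
# k: number of attempts before giving up

def prob_returned_answer_is_correct(p, f, k):
    accept = p + (1 - p) * f              # chance any single attempt is accepted
    answered = 1 - (1 - accept) ** k      # chance something is returned within k tries
    share_correct = p / accept            # among accepted attempts, share truly correct
    return answered * share_correct

for f in (0.0, 0.05, 0.2):
    curve = [round(prob_returned_answer_is_correct(0.3, f, k), 2) for k in (1, 5, 10, 50)]
    print(f"false-positive rate {f}: {curve}")

With a perfect verifier (f = 0) the curve climbs toward 1.0 as attempts grow; even a small false-positive rate puts a ceiling well below that, because wrong-but-accepted answers crowd in. The paper's actual experiments, with real unit tests as verifiers, behave worse still.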

Generally speaking, the evidence for inference scaling “laws” is not convincing, and it remains to be seen if there are real-world problems where generating (say) millions of tokens at inference time will actually help.

Is inference scaling the next frontier?

There is a lot of low-hanging fruit for inference scaling, and progress in the short term is likely to be rapid. Notably, one current limitation of reasoning models is that they don’t work well in agentic systems. We have observed this in our own benchmark CORE-Bench that asks agents to reproduce the code provided with research papers — the best performing agent scores 38% with Claude 3.5 Sonnet compared to only 24% with o1-mini.5 This also explains why reasoning models led to an improvement in one cybersecurity eval but not another — one of them involved agents.

We think there are two reasons why agents don’t seem to benefit from reasoning models. First, such models require different prompting styles than regular models, and current agentic systems are optimized for prompting regular models. Second, as far as we know, reasoning models so far have not been trained using reinforcement learning in a setting where they receive feedback from the environment — be it code execution, shell interaction, or web search. In other words, their tool use ability is no better than that of the underlying model before it learned to reason.6

These seem like relatively straightforward problems. Solving them might enable significant new AI agent capabilities — for example, generating complex, fully functional apps from a prompt. (There are already tools that try to do this, but they don’t work well.)

But what about the long run? Will inference scaling lead to the same kind of progress we’ve seen with model scaling over the last 7 years? Model scaling was so exciting because you “merely” needed to make data, model size, and compute bigger; no algorithmic breakthroughs were needed.

That’s not true (so far) with inference scaling — there’s a long list of inference scaling techniques, and what works or doesn’t work is problem-dependent, and even collectively, they only work in a circumscribed set of domains. AI developers are trying to overcome this limitation. For example, OpenAI’s reinforcement finetuning service is thought to be a way for the company to collect customer data from many different domains for fine-tuning a future model.

About a decade ago, reinforcement learning (RL) led to breakthroughs in many games like Atari. There was a lot of hype, and many AI researchers hoped we could RL our way to AGI. In fact, it was the high expectations around RL that led to the birth of explicitly AGI-focused labs, notably OpenAI. But those techniques didn’t generalize beyond narrow domains like games. Now there is similar hype about RL again. It is obviously a very powerful technique, but so far we’re seeing limitations similar to the ones that led to the dissipation of the previous wave of hype.

It is impossible to predict whether progress in AI capabilities will slow down. In fact, forget prediction — reasonable people can have very different opinions on whether AI progress has already slowed down, because they can interpret the evidence very differently. That’s because “capability” is a construct that’s highly sensitive to how you measure it.

What we can say with more confidence is that the nature of progress in capabilities will be different with inference scaling than with model scaling. In the last few years, newer models predictably brought capability improvements each year across a vast swath of domains. There was a feeling of pessimism among many AI researchers outside the big labs that there was little to do except sit around and wait for the next state-of-the-art LLM to be released.

With inference scaling, capability improvements will likely be uneven and less predictable, driven more by algorithmic advances than investment in hardware infrastructure. Many ideas that were discarded during the reign of LLMs, such as those from the old planning literature, are now back in the mix, and the scene seems intellectually more vibrant than in the last few years.

Product development lags capability increase

The furious debate about whether there is a capability slowdown is ironic, because the link between capability increases and the real-world usefulness of AI is extremely weak. The development of AI-based applications lags far behind the increase of AI capabilities, so even existing AI capabilities remain greatly underutilized. One reason is the capability-reliability gap — even when a certain capability exists, it may not work reliably enough that you can take the human out of the loop and actually automate the task (imagine a food delivery app that only works 80% of the time). And the methods for improving reliability are often application-dependent and distinct from methods for improving capability. That said, reasoning models also seem to exhibit reliability improvements, which is exciting.

Here are a couple of analogies that help illustrate why it might take a decade or more to build products that fully take advantage of even current AI capabilities. The technology behind the internet and the web mostly solidified in the mid-90s. But it took 1-2 more decades to realize the potential of web apps. Or consider this thought-provoking essay that argues that we need to build GUIs for large language models, which will allow interacting with them with far higher bandwidth than through text. From this perspective, the current state of AI-based products is analogous to PCs before the GUI.

The lag in product development is compounded by the fact that AI companies have not paid nearly enough attention to product aspects, believing that the general-purpose nature of AI somehow grants an exemption from the hard problems of software engineering. Fortunately, this has started to change recently.

Now that they are focusing on products, AI companies as well as their users are re-discovering that software development, especially the user experience side of it, is hard, and requires a broader set of skills than AI model development. A great example is the fact that there are now two different ways to run Python code with ChatGPT (which is one of the most important capabilities for power users) and there is an intricate set of undocumented rules to remember regarding the capabilities and limitations of each of them. Simon Willison says:

Do you find this all hopelessly confusing? I don’t blame you. I’m a professional web developer and a Python engineer of 20+ years and I can just about understand and internalize the above set of rules.

Still, this is a big improvement over a week ago, when these models had powerful coding capabilities yet did not come with the ability to run code that could use the internet! And even now, o1 can neither access the internet nor run code. From the perspective of AI impacts, what matters far more than capability improvement at this point is actually building products that let people do useful things with existing capabilities.

Finally, while product development lags behind capability, the adoption of AI-based products further lags far behind product development, for various behavioral, organizational, and societal reasons. Those interested in AI’s impacts (whether positive or negative) should pay much more attention to these downstream aspects than current or predicted capabilities.

Concluding thoughts

Maybe model scaling is over; maybe not. But it won’t continue forever, and the end of model scaling brings a long list of positives: AI progress once again depends on new ideas and not just compute; big companies, startups, and academic researchers can all compete on a relatively even playing field; regulation based on arbitrary training compute thresholds becomes even harder to defend; and there is a clear recognition that models themselves are just a technology, not a product.

As for the future of AI, it is clear that tech insiders are trying to figure it out just like the rest of us, and it is past time to stop trusting their overconfident, self-serving, shifting, and conveniently vague predictions. And when we move beyond technical predictions to claims about AI’s impact on the world, there’s even less reason to trust industry leaders.

Acknowledgment. We are grateful to Zachary S. Siegel for feedback on a draft.

1. While OpenAI is known to have crawled YouTube in the past, that was a small sliver of YouTube; it won’t be possible to crawl all of YouTube without Google noticing.

2. A nice analysis by Epoch AI showed that scaling could continue until 2030. But this was published too recently (August 2024) to have been the anchor for the scaling narrative.

3. We are referring to substantive knowledge about the safety of AI models and systems; whistleblowers did bring forth new knowledge about safety-related processes at OpenAI.

4. That said, we can’t take future cost decreases for granted; we are also running into fundamental limits of inference cost-saving techniques like quantization.

5. We set a cost limit of $4 for all models. On a small sample, with a $10 cost limit, o1-preview performed very poorly (10% accuracy). Given cost constraints, we did not evaluate the model with a higher cost limit on the entire dataset.

6. o1 doesn’t even have access to tools during inference in the ChatGPT interface! Gemini Flash 2.0 does, but it is not clear if this is a model that has been fine-tuned for reasoning, let alone fine-tuned for tool use.

tante (Berlin/Germany), 11 days ago:
"Regardless of whether model scaling will continue, industry leaders’ flip flopping on this issue shows the folly of trusting their forecasts. They are not significantly better informed than the rest of us, and their narratives are heavily influenced by their vested interests."

Why did Silicon Valley turn right?

Image: a detailed landscape etching in the style of Giovanni Battista Piranesi, depicting a far-future Capitol building in Washington, D.C., surrounded by the decaying hulks of rocket ships under a stormy sky, a dark glowing planet looming above.

A significant chunk of the Silicon Valley tech industry has shifted to the right over the last few years. Why did this happen? Noah Smith and Matt Yglesias think they have the answer. Both have amplified the argument of a Marc Andreessen tweet, arguing that progressives are at fault for alienating their “staunch progressive allies” in the tech industry.


Noah amplifies Andreessen’s argument approvingly, while Matt casts it as part of a “deliberate strategy” to “undermine the center-left.”

These are not convincing explanations. Years ago, Matt coined an extremely useful term, the “Pundit’s Fallacy,” for “the belief that what a politician needs to do to improve his or her political standing is do what the pundit wants substantively.” I worry that Noah and Matt risk falling into a closely related error. “The Poaster’s Mistake” is the belief that the people who make your online replies miserable are also the one great source of the misery of the world. Noah and Matt post a lot, and have notably difficult relations with online left/progressives. I can’t help but suspect that this colors their willingness to overlook the obvious problems in Andreessen’s “pounded progressive ally”* theory.

Succinctly, there were always important differences between Silicon Valley progressivism and the broader progressive movement (more on this below), Marc seems to have been as much alienated by center-left abundance folks like Jerusalem Demsas for calling him and his spouse on apparent NIMBYist hypocrisy as by actual leftists, and has furthermore dived into all sorts of theories that can’t be blamed on the left, for example, publicly embracing the story about an “illegal joint government-university-company censorship apparatus” which is going to be revealed real soon in a blizzard of subpoenas.

I suspect that Noah and Matt would see these problems more clearly if they weren’t themselves embroiled in related melodramas. And as it happens, I have an alternative theory about why the relationship between progressives and Silicon Valley has become so much more fraught, that I think is considerably better.

Better: but I certainly can’t make strong claims that this theory is right. Like most of what I write here, you should consider it to be social science inflected opinion journalism rather than Actual Social Science, which requires hard work, research and time. But I think it is the kind of explanation that you should be looking for, rather than ‘the lefties were plotting to undermine me and my allies,’ or, for that matter ‘this was completely inevitable, because Silicon Valley types were closeted Nazi-fanciers from the beginning.’ It’s clear that some change has, genuinely, happened in the relationship between Silicon Valley and progressives, but it probably isn’t going to be one that fits entirely with any particular just-so story. So here’s my alternative account - take pot-shots at it as you like!

********

Before the story comes the social science. The shifting relationship between the two involves, as far as I can see, ideas, interests and political coalitions. The best broad framework I know for talking about how these relate to each other is laid out in Mark Blyth’s book, Great Transformations: Economic Ideas and Institutional Change in the Twentieth Century.

Mark wants to know how big institutional transformations come about. How, for example, did we move from a world in which markets were fundamentally limited by institutional frameworks created by national governments, to one in which markets dominated and remade those frameworks?

Mark’s answer is that you cannot understand such ‘great transformations’ without understanding how ideas shape collective interests. Roughly speaking (I’m simplifying a lot), when institutional frameworks are stable, they provide people with a coherent understanding of what their interests are, who are their allies and who are their adversaries. But at some point, perhaps for exogenous reasons, the institutional order runs into a crisis, where all that seemed to be fixed and eternal suddenly becomes unstable. At that point, people’s interests too become malleable - they don’t know how to pursue these interests in a world where everything is up for grabs. Old coalitions can collapse and new ones emerge.

It is at this point that ideas play an important role. If you - an intellectual entrepreneur - have been waiting in the wings with a body of potentially applicable ideas, this is your moment! You leap forward, presenting your diagnosis both of what went wrong in the old order, and what can make things right going forward. And if your ideas take hold, they disclose new possibilities for political action, by giving various important actors a sense of where their interests lie. New coalitions can spring into being, as these actors identify shared interests on the basis of new ideas, while old coalitions suddenly look ridiculous and outdated. Such ideas are self-fulfilling prophecies, which do not just shape the future but our understanding of the past. A new institutional order may emerge, cemented by these new ideas.

As Mark puts it in more academic language:

Economic ideas therefore serve as the basis of political coalitions. They enable agents to overcome free-rider problems by specifying the ends of collective action. In moments of uncertainty and crisis, such coalitions attempt to establish specific configurations of distributionary institutions in a given state, in line with the economic ideas agents use to organize and give meaning to their collective endeavors. If successful, these institutions, once established, maintain and reconstitute the coalition over time by making possible and legitimating those distributive arrangements that enshrine and support its members. Seen in this way, economic ideas enable us to understand both the creation and maintenance of a stable institutionalized political coalition and the institutions that support it.

Thus, in Mark’s story, economists like Milton Friedman, George Stigler and Art Laffer played a crucial role in the transition from old style liberalism to neoliberalism. At the moment when the old institutional system was in crisis, and no-one knew quite what to do, they provided a diagnosis of what was wrong. Whether that diagnosis was correct in some universal sense is a question for God rather than ordinary mortals. The more immediate question is whether that diagnosis was influential: politically efficacious in justifying alternative policies, breaking up old political coalitions and conjuring new ones into being. As it turned out, it was.

********

So how does this explain recent tensions between progressives and Silicon Valley? My hypothesis is that we are living in the immediate aftermath of two intertwined crises. One was a crisis in the U.S. political order - the death of the decades-long neoliberalism that Friedman and others helped usher in. The other was an intellectual crisis in Silicon Valley - the death of what Kevin Munger calls “the Palo Alto Consensus.” As long as neoliberalism shaped the U.S. Democratic party’s understanding of the world, and the Palo Alto Consensus shaped Silicon Valley’s worldview, soi-disant progressivism and Silicon Valley could get on well. When both cratered at more or less the same time, different ideas came into play, and different coalitions began to emerge among both Washington DC Democrats and Silicon Valley. These coalitions don’t have nearly as much in common as the previous coalitions did.

There are vast, tedious arguments over neoliberalism and the Democratic party, which I’m not going to get into. All I’ll say is that a left-inflected neoliberalism was indeed a cornerstone of the Democratic coalition after Carter, profoundly affecting the ways in which the U.S. regulated (or, across large swathes of policy, didn’t regulate) the Internet and e-commerce. 1990s Democrats were hostile to tech regulation: see Margaret O’Mara’s discussion of ‘Atari Democrats’ passim, or, less interestingly, myself way back in 2003:

In 1997, the White House issued the “Framework for Global Electronic Commerce,” … drafted by a working group led by Ira Magaziner, the Clinton administration’s e-commerce policy “architect.” Magaziner … sought deliberately to keep government at the margins of e-commerce policy. His document succeeded, to a quite extraordinary extent, in setting the terms on which U.S. policymakers would address e-commerce, and in discouraging policymakers from seeking to tax or regulate it.

E-commerce transactions did, eventually, get taxed. But lots of other activities, including some that were extensively regulated in the real world, were lightly regulated online or not regulated at all, on the assumption that government regulation was clumsy and slow and would surely stifle the fast-adapting technological innovation that we needed to build the information superhighway to a glorious American future.

This happy complaisance certainly helped get the e-commerce boom going, and ensured that when companies moved fast and broke things, as they regularly did, they were unlikely to get more than an ex post slap on the wrist. A consensus on antitrust that spanned Democrats and Republicans allowed search and e-commerce companies to build up enormous market power, while the 1996 Communications Decency Act’s Section 230 allowed platform companies great freedom to regulate online spaces without much external accountability as they emerged.

All this worked really well with the politics of Silicon Valley. It was much easier for Silicon Valley tech people to be ‘staunch allies’ in a world where Democratic politicians bought into neoliberalism and self-regulation. In February 2017, David Broockman, Greg Ferenstein and Neil Malhotra conducted the only survey I’m aware of on the political ideology of Silicon Valley tech elites. As Neil summarized the conclusions:

Our findings led us to greatly rethink what we think of as “the Silicon Valley ideology,” which many pundits equate with libertarianism. In fact, our survey found that over 75% of technology founders explicitly rejected libertarian ideology. Instead, we found that they exhibit a constellation of political beliefs unique among any population we studied. We call this ideology “liberal-tarianism.” Technology elites are liberal on almost all issues – including taxation and redistribution – but extremely conservative when it comes to government regulation, particularly of the labor market. Amazingly, their preferences toward regulation resemble Republican donors.

Broockman, Ferenstein and Malhotra found that this wasn’t simple industry self-interest. These elites were similarly hostile to government regulation of other sectors than technology. But they were far more fervently opposed to regulation than regular millionaires, and were notably hostile to unions and unionization, with large majorities saying that they would like to see private sector (76%) and public sector (72%) unions decline in influence. And just as Democrats influenced Silicon Valley, Silicon Valley influenced the Democrats, bringing the two even closer together.

These findings help explain why the technology industry was a core part of the Democratic coalition and why the Obama Administration was fairly lax when it came to the regulation of the technology industry. Silicon Valley worked from within the Democratic coalition to move regulatory policy to the right, while supporting the party’s positions on social issues, economic redistribution, and globalization. Perhaps it is not so surprising that the revolving door between the Obama Administration and Silicon Valley companies seemed to always spin people in or out.

It’s really important to note that Silicon Valley politics was not just about its attitudes to Washington but to the world, and to its own technological destiny. As Kevin Munger has argued, the official ideologies of big platform companies shared a common theme: their business models would connect the world together, to the betterment of humankind. As a remarkable internal memo written by Facebook VP Andrew “Boz” Bosworth put it in 2016 (reported by Ryan Mac, Charlie Warzel and Alex Kantrowitz):

“We connect people. Period. That’s why all the work we do in growth is justified. All the questionable contact importing practices. All the subtle language that helps people stay searchable by friends. All of the work we do to bring more communication in. The work we will likely have to do in China some day. All of it” … “So we connect more people,” he wrote in another section of the memo. “That can be bad if they make it negative. Maybe it costs someone a life by exposing someone to bullies. Maybe someone dies in a terrorist attack coordinated on our tools.”

That too influenced Washington DC: politicians like Hillary Clinton bought into this vision, delivering speeches and implementing policy around it. Journalism bought into it too, so that a thousand puff pieces on Silicon Valley leaders were inflated by Kara Swisher and her ilk, ascending majestically into the empyrean. And, most importantly of all, Silicon Valley itself bought into it. When Silicon Valley fought with the Obama administration, it was over the illiberal policies that threatened both its business model and the dream of a more liberal, globally connected world. After the Snowden revelations broke, Eric Schmidt famously worried that U.S. surveillance might destroy the Internet. But even then, the general assumption was that liberalism and technological innovation went hand in hand. The more liberal a society, the more innovative it would be, and the more innovation there was, the stronger that liberalism would become.

There was a small but significant right wing faction in Silicon Valley, centered around Peter Thiel and a couple of other members of the Paypal mafia. Weird reactionaries like Curtis Yarvin were able to find a niche, while Palantir, unlike pretty well all the other Silicon Valley software and platform companies, was willing to work with the U.S. military and surveillance state from the beginning. But the Silicon Valley culture, like the founders, tended liberaltarian.

It’s no surprise that relations between DC progressives and tech company elites were so friendly - the neoliberal consensus among Washington DC Democrats and the Palo Alto Consensus were highly compatible with each other. Both favored free markets and minimal regulation of technology, while sidelining unions, labor politics and too close an examination of the market power of the big tech companies. There were frictions of course - even neoliberal Democrats were often keener on regulation than many people in Silicon Valley liked - but Democrats could see themselves in Silicon Valley and vice versa without too much squinting.

********

If that has changed, it is not simply because progressives have moved away from Silicon Valley. It is because both the neoliberal consensus and the Palo Alto consensus have collapsed, leading the political economies of Washington DC and Silicon Valley to move in very different directions.

A lot of attention has been paid to the intellectual and political collapse of neoliberalism. This really got going thanks to Trump, but it transformed the organizing ideas of the Democratic coalition too. During the Trump era, card-carrying Clintonites like Jake Sullivan became convinced that old ideas about minimally regulated markets and trade didn’t make much sense any more. Domestically, they believed that the “China shock” had hollowed out America’s industrial heartland, opening the way for Trump style populism. Reviving U.S. manufacturing and construction might be facilitated through a “Green New Deal” that would both allow the U.S. to respond effectively to climate change, and revive the physical economy. Internationally, they believed that China was a direct threat to U.S. national security, as it caught up with the U.S. on technology, industrial capacity and ability to project military force. Finally, they believed that U.S. elites had become much too supine about economic power, allowing the U.S. economy to become dominated by powerful monopolies. New approaches to antitrust were needed to restrain platform companies which had gotten out of control. Unions would be Democrats’ most crucial ally in bringing back the working class.

Again, I’m not going to talk about the merits of this analysis (a topic for other posts), but the politics. Both Matt and Noah like the hawkish approach to China, but not much else. Matt in particular seems to blame a lot of what has gone wrong in the last four years on an anti-neoliberal popular front, stretching from Jake Sullivan to Jacobin and the random online dude who thinks the Beatles Are Neoliberalism.

What Matt and Noah fail to talk about is that the politics of Silicon Valley have also changed, for reasons that don’t have much to do with the left. Specifically, the Palo Alto Consensus collapsed at much the same time that the neoliberal consensus collapsed, for related but distinct reasons.

No-one now believes - or pretends to believe - that Silicon Valley is going to connect the world, ushering in an age of peace, harmony and likes across nations. That is in part because of shifting geopolitics, but it is also the product of practical learning. A decade ago, liberals, liberaltarians and straight libertarians could readily enthuse about “liberation technologies” and Twitter revolutions in which nimble pro-democracy dissidents would use the Internet to out-maneuver sluggish governments. Technological innovation and liberal freedoms seemed to go hand in hand.

Now they don’t. Authoritarian governments have turned out to be quite adept for the time being, not just at suppressing dissidence but at using these technologies for their own purposes. Platforms like Facebook have been used to mobilize ethnic violence around the world, with minimal pushback from the platform’s moderation systems, which were built on the cheap and not designed to deal with a complex world where people could do horrible things in hundreds of languages. And there are now a lot of people who think that Silicon Valley platforms are bad for stability in places like the U.S. and Western Europe where democracy was supposed to be consolidated.

My surmise is that this shift in beliefs has undermined the core ideas that held the Silicon Valley coalition together. Specifically, it has broken the previously ‘obvious’ intimate relationship between innovation and liberalism.

I don’t see anyone arguing that Silicon Valley innovation is the best way of spreading liberal democratic awesome around the world any more, or for keeping it up and running at home. Instead, I see a variety of arguments for the unbridled benefits of innovation, regardless of its benefits for democratic liberalism. I see a lot of arguments that AI innovation in particular is about to propel us into an incredible new world of human possibilities, provided that it isn’t restrained by DEI, ESG and other such nonsense. Others (or the same people) argue that we need to innovate, innovate, innovate because we are caught in a technological arms race with China, and if we lose, we’re toast. Others (sotto or brutto voce; again, sometimes the same people) contend that innovation isn’t really possible in a world of democratic restraint, and we need new forms of corporate authoritarianism with a side helping of exit, to allow the kinds of advances we really need to transform the world.

I have no idea how deeply these ideas have penetrated into Silicon Valley - there isn’t a fully formed new coalition yet. There are clearly a bunch of prominent people who buy into these ideas, to some greater or lesser degree, including Andreessen himself. I’d assume that there are a lot of people in Silicon Valley who don’t buy into these arguments - but they don’t seem to have a coherent alternative set of ideas that they can build a different coalition around. And all this was beginning to get going before the Biden administration was mean to Elon Musk, and rudely declined to arrange meetings between Andreessen, Horowitz and the very top officials. Rob Reich recalls a meeting with a bunch of prominent people in the Valley about what an ideal society for innovation might look like. It was made clear to him that democracy was most certainly not among the desirable qualities.

So my bet is that any satisfactory account of the disengagement between East Coast Democrats and West Coast tech elites can’t begin and end with blaming the left. For sure, the organizing ideas of the Democrats have shifted. So too - and for largely independent reasons - have the organizing ideas of Silicon Valley tech founders. You need to pay attention to both to come up with plausible explanations.

I should be clear that this is a bet rather than a strong statement of How It Was. Even if my argument did turn out to be somewhat or mostly right in broad outline, there are lots of details that you would want to add to get a real sense of what had happened. Ideas spread through networks - and different people are more or less immune or receptive to these ideas for a variety of complicated social or personal reasons. You’d want to get some sense of how different networks work to flesh out the details.

And the whole story might be wrong or misleading! Again, this is based on a combination of political science arguments and broad empirical impressions, rather than actual research. But I would still think that this kind of account is more likely to be right, than the sort of arguments that Matt and Noah provide. Complicated politics rarely break down into easy stories of who to blame. Such stories, furthermore, tend to be terrible guides to politics in the future. We are still in a period of crisis, where the old coalitions have been discredited, and new coalitions haven’t fully cohered. Figuring out how to build coalitions requires us not to dwell on past hostilities but pay lively attention to future possibilities, including among people whom we may not particularly like.


* It’s impossible for me to read Andreessen’s tweet without imagining the Chuck Tingle remix. That’s almost certainly my fault, not his.

tante (Berlin/Germany), 13 days ago:
"If that has changed, it is not simply because progressives have moved away from Silicon Valley. It is because both the neoliberal consensus and the Palo Alto consensus have collapsed, leading the political economies of Washington DC and Silicon Valley to move in very different directions."

sarcozona (Epiphyte City), 12 days ago

We Looked at 78 Election Deepfakes. Political Misinformation is not an AI Problem.


AI-generated misinformation was one of the top concerns during the 2024 U.S. presidential election. In January 2024, the World Economic Forum claimed that “misinformation and disinformation is the most severe short-term risk the world faces” and that “AI is amplifying manipulated and distorted information that could destabilize societies.” News headlines about elections in 2024 tell a similar story.

In contrast, in our past writing, we predicted that AI would not lead to a misinformation apocalypse. When Meta released its open-weight large language model (called LLaMA), we argued that it would not lead to a tidal wave of misinformation. And in a follow-up essay, we pointed out that the distribution of misinformation is the key bottleneck for influence operations, and while generative AI reduces the cost of creating misinformation, it does not reduce the cost of distributing it. A few other researchers have made similar arguments.

Which of these two perspectives better fits the facts?

Fortunately, we have the evidence of AI use in elections that took place around the globe in 2024 to help answer this question. Many news outlets and research projects have compiled known instances of AI-generated text and media and their impact. Instead of speculating about AI’s potential, we can look at its real-world impact to date.

We analyzed every instance of AI use in elections collected by the WIRED AI Elections Project, which tracked known uses of AI for creating political content during elections taking place in 2024 worldwide. In each case, we identified what AI was used for and estimated the cost of creating similar content without AI.

We find that (1) most AI use isn't deceptive, (2) deceptive content produced using AI is nevertheless cheap to replicate without AI, and (3) focusing on the demand for misinformation rather than the supply is a much more effective way to diagnose problems and identify interventions.

To be clear, AI-generated synthetic content poses many real dangers: the creation of non-consensual images of people and child sexual abuse material and the enabling of the liar’s dividend, which allows those in power to brush away real but embarrassing or controversial media content about them as AI-generated. These are all important challenges. This essay is focused on a different problem: political misinformation.1

Improving the information environment is a difficult and ongoing challenge. It’s understandable why people might think AI is making the problem worse: AI does make it possible to fabricate false content. But that has not fundamentally changed the landscape of political misinformation.

Paradoxically, the alarm about AI might be comforting because it positions concerns about the information environment as a discrete problem with a discrete solution. But fixes to the information environment depend on structural and institutional changes rather than on curbing AI-generated content.

Half of the Deepfakes in 2024 Elections weren’t Deceptive

We analyzed all 78 instances of AI use in the WIRED AI Elections Project (source for our analysis).2 We categorized each instance based on whether there was deceptive intent. For example, if AI was used to generate false media depicting a political candidate saying something they didn't, we classified it as deceptive. On the other hand, if a chatbot gave an incorrect response to a genuine user query, a deepfake was created for parody or satire, or a candidate transparently used AI to improve their campaigning materials (such as by translating a speech into a language they don't speak), we classify it as non-deceptive.
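
A minimal sketch of the kind of tally this involves, assuming each database entry has been hand-labeled with a deceptive flag (the entries and field names below are illustrative, not drawn from the WIRED data):

```python
from collections import Counter

# Illustrative stand-ins for database entries -- NOT the actual WIRED data.
# "use" and "deceptive" are hypothetical field names for the manual labels.
entries = [
    {"use": "candidate speech translated with AI, disclosed",  "deceptive": False},
    {"use": "parody deepfake of a candidate",                   "deceptive": False},
    {"use": "fabricated audio of a candidate",                  "deceptive": True},
    {"use": "chatbot gave a wrong answer to an election query", "deceptive": False},
    {"use": "fake endorsement image",                           "deceptive": True},
]

counts = Counter("deceptive" if e["deceptive"] else "non-deceptive" for e in entries)
total = sum(counts.values())
for label, n in counts.most_common():
    print(f"{label}: {n}/{total} ({n / total:.0%})")
```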

To our surprise, there was no deceptive intent in 39 of the 78 cases in the database.

The most common non-deceptive use of AI was for campaigning. When candidates or supporters used AI for campaigning, in most cases (19 out of 22), the apparent intent was to improve campaigning materials rather than mislead voters with false information.

We even found examples of deepfakes that we think helped improve the information environment. In Venezuela, journalists used AI avatars to avoid government retribution when covering news adversarial to the government. In the U.S., a local news organization from Arizona, Arizona Agenda, used deepfakes to educate viewers about how easy it is to manipulate videos. In California, a candidate with laryngitis lost his voice, so he transparently used AI voice cloning to read out typed messages in his voice during meet-and-greets with voters.

Reasonable people can disagree on whether using AI in campaigning materials is legitimate or what the appropriate guardrails need to be. But using AI for campaign materials in non-deceptive ways (for example, when AI is used as a tool to improve voter outreach) is much less problematic than deploying AI-generated fake news to sway voters.

Of course, not all non-deceptive AI-generated political content is benign.3 Chatbots often incorrectly answer election-related questions. Rather than deceptive intent, this results from the limitations of chatbots, such as hallucinations and lack of factuality. Unfortunately, these limitations are not made clear to users, leading to an overreliance on flawed large language models (LLMs).4

Making Deceptive Political Misinformation Does Not Require AI

For each of the 39 examples of deceptive intent, where AI use was intended to make viewers believe outright false information, we estimated the cost of creating similar content without AI—for example, by hiring Photoshop experts, video editors, or voice actors. In each case, the cost of creating similar content without AI was modest—no more than a few hundred dollars. (We even found that a video involving a hired stage actor was incorrectly marked as being AI-generated in WIRED’s election database.)

In fact, it has long been possible to create media with outright false information without using AI or other fancy tools. One video used stage actors to falsely claim that U.S. Vice President and Democratic presidential candidate Kamala Harris was involved in a hit-and-run incident. Another slowed down the vice president's speech to make it sound like she was slurring her words. An edited video of Indian opposition candidate Rahul Gandhi showed him saying that the incumbent Narendra Modi would win the election. In the original video, Gandhi said his opponent would not win the election, but it was edited using jump cuts to take out the word “not.” Such media content has been called “cheap fakes” (as opposed to AI-generated “deepfakes”).

There were many instances of cheap fakes used in the 2024 U.S. election. The News Literacy Project documented known misinformation about the election and found that cheap fakes were used seven times more often than AI-generated content. Similarly, in other countries, cheap fakes were quite prevalent. An India-based fact checker reviewed an order of magnitude more cheap fakes and traditionally edited media compared to deepfakes. In Bangladesh, cheap fakes were over 20 times more prevalent than deepfakes.

Let’s consider two examples to analyze how cheap fakes could have led to substantially similar effects as the deepfakes that got a lot of media attention: Donald Trump’s use of Taylor Swift deepfakes to campaign and a voice-cloned robocall that imitated U.S. President Joe Biden in the New Hampshire primary asking voters not to vote.

A Truth Social post shared by Donald Trump with images of Taylor Swift fans wearing “Swifties for Trump” t-shirts. Top left: A post with many AI-generated images of women wearing “Swifties for Trump” t-shirts, with a “satire” label. Top right: A real image of Trump supporter Jenna Piwowarczyk wearing a “Swifties for Trump” t-shirt. Bottom left: A fabricated image of Taylor Swift in front of the American flag with the caption, “Taylor wants you to vote for Donald Trump.” It is unclear if the image was created using AI or other editing software. Bottom right: A Twitter post with two images: one AI-generated, the other real, of women wearing “Swifties for Trump” t-shirts.

Trump’s use of Swift deepfakes implied that Taylor Swift had endorsed him and that Swift fans were attending his rallies en masse. In the wake of the post, many media outlets blamed AI for the spread of misinformation.

But recreating similar images without AI is easy. Images depicting Swift’s support could be created by photoshopping text endorsing Trump onto any of her existing images. Likewise, getting images of Trump supporters wearing “Swifties for Trump” t-shirts could be achieved by distributing free t-shirts at a rally—or even selectively reaching out to Swift fans at Trump rallies. In fact, two of the images Trump shared were real images of a Trump supporter who is also a Swift fan.

Another incident that led to a brief panic was an AI clone of President Joe Biden’s voice that asked people not to vote in the New Hampshire primary.

News headlines in the wake of the Biden robocall.

Rules against such robocalls have existed for years. In fact, the perpetrator of this particular robocall was fined $6 million by the Federal Communications Commission (FCC). The FCC has tiplines to report similar attacks, and it enforces rules around robocalls frequently, regardless of whether AI is used. Since the robocall used a static recording, it could have been made about as easily without using AI—for instance, by hiring voice impersonators.

It is also unclear what impact the robocall had: The efficacy of the deepfake depends on the recipient believing that the president of the United States is personally calling them on the phone to ask them not to vote in a primary.

Is it just a matter of time until improvements in technology and the expertise of actors seeking to influence elections lead to more effective AI disinformation? We don’t think so. In the next section, we point out that the structural factors that drive the demand for misinformation are not changed by AI. We then look at the history of predictions about coming waves of AI disinformation that have accompanied the release of new tools—predictions that have not come to pass.

The Demand for Misinformation

Misinformation can be seen through the forces of supply and demand. The supply comes from people who want to make a buck by generating clicks, partisans who want their side to win, or state actors who want to conduct influence operations. Interventions so far have almost entirely tried to curb the supply of misinformation while leaving the demand unchanged.

The focus on AI is the latest example of this trend. Since AI reduces the cost of generating misinformation to nearly zero, analysts who look at misinformation as a supply problem are very concerned. But analyzing the demand for misinformation can clarify how misinformation spreads and what interventions are likely to help.

Looking at the demand for misinformation tells us that as long as people have certain worldviews, they will seek out and find information consistent with those views. Depending on what someone’s worldview is, the information in question is often misinformation—or at least would be considered misinformation by those with differing worldviews.

In other words, successful misinformation operations target in-group members—people who already agree with the broad intent of the message. Such recipients may have lower skepticism for messages that conform to their worldviews and may even be willing to knowingly amplify false information. Sophisticated tools aren’t needed for misinformation to be effective in this context. On the flip side, it will be extremely hard to convince out-group members of false information that they don't agree with, regardless of AI use.

Seen in this light, AI misinformation plays a very different role from its popular depiction of swaying voters in elections. Increasing the supply of misinformation does not meaningfully change the dynamics of the demand for misinformation since the increased supply is competing for the same eyeballs. Moreover, the increased supply of misinformation is likely to be consumed mainly by a small group of partisans who already agree with it and heavily consume misinformation rather than to convince a broader swath of the public.

This also explains why cheap fakes such as media from unrelated events, traditional video edits such as jump cuts, or even video game footage can be effective for propagating misinformation despite their low quality: It is much easier to convince someone of misinformation if they already agree with its message.

Our analysis of the demand for misinformation may be most applicable to countries with polarized close races where leading parties have similar capacities for voter outreach, so that voters’ (mis)information demands are already saturated.

Still, to our knowledge, in every country that held elections in 2024 so far, AI misinformation had much less impact than feared. In India, deepfakes were used for trolling more than spreading false information. In Indonesia, the impact of AI wasn't to sow false information but rather to soften the image of then-candidate, now-President Prabowo Subianto (a former general accused of many past human rights abuses) using AI-generated digital cartoon avatars that depicted him as likable.5

Why Do Concerns About AI Misinformation Keep Recurring?

The 2024 election cycle wasn’t the first time there was widespread fear that AI deepfakes would lead to rampant political misinformation. Strikingly similar concerns about AI were expressed before the 2020 U.S. election, though these concerns were not borne out. The release of new AI tools is often accompanied by worries that it will unleash new waves of misinformation:

  • 2019. When OpenAI released its GPT-2 series of models in 2019, one of the main reasons it held back on releasing the model weights for the most capable models in the series was its alleged potential to generate misinformation.

  • 2023. When Meta released the LLaMA model openly in 2023, multiple news outlets reported concerns that it would trigger a deluge of AI misinformation. These models were far more powerful than the GPT-2 models released by OpenAI in 2019. Yet, we have not seen evidence of large-scale voter persuasion attributed to using LLaMA or other large language models.

  • 2024. Most recently, the widespread availability of AI image editing tools on smartphones has prompted similar concerns.

In fact, concerns about using new technology to create false information go back over a century. The late 19th and early 20th centuries saw the advent of technologies for photo retouching. This was accompanied by concerns that retouched photographs would be used to deceive people, and, in 1912, a bill was introduced in the U.S. that would have criminalized photo editing without subjects’ consent. (It died in the Senate.)

Thinking of political misinformation as a technological (or AI) problem is appealing because it makes the solution seem tractable. If only we could roll back harmful tech, we could drastically improve the information environment!

While the goal of improving the information environment is laudable, blaming technology is not a fix. Political polarization has led to greater mistrust of the media. People prefer sources that confirm their worldview and are less skeptical about content that fits their worldview. Another major factor is the drastic decline of journalism revenues in the last two decades—largely driven by the shift from traditional to social media and online advertising. But this is more a result of structural changes in how people seek out and consume information than the specific threat of misinformation shared online.

As history professor Sam Lebovic has pointed out, improving the information environment is inextricably linked to the larger project of shoring up democracy and its institutions. There’s no quick technical fix, or targeted regulation, that can “solve” our information problems. We should reject the simplistic temptation to blame AI for political misinformation and confront the gravity of the hard problem.

This essay is cross-posted to the Knight First Amendment Institute website. We are grateful to Katy Glenn Bass for her feedback.

1. The terms mis- and disinformation lack agreed-upon definitions. In this piece, we use the term misinformation to refer to outright false information, as opposed to issues of misleading interpretive framing. Despite many people’s perception of outgroup narratives as “misinformation,” we don't think the misinformation lens is a useful way to think about differences in framing and narratives; we're more narrowly concerned about using outright false information to support those narratives.

2. The low number of total deepfakes found in elections worldwide is surprising on its own terms. The small number could either indicate that AI deepfakes are a much smaller problem so far than anticipated or that the database has many missing entries. Still, other databases that tracked election deepfakes have a similar count for the total number of deepfakes; for example, the German Marshall Fund’s list of deepfakes related to 2024 elections worldwide has 133 entries, though it started collecting entries in September 2023. As we note further along in the essay, the News Literacy Project documented known misinformation about the 2024 elections and found that cheap fakes that didn't use AI were used seven times more often than AI-generated content.

3. The dataset also included four instances of AI-generated deepfake videos of politicians used to perpetrate financial scams. Compared to political misinformation, scams have very different dynamics (more sophisticated videos could be more convincing) and stakes (they involve individual financial harm rather than threats to democracy). Similarly, addressing scams requires different interventions—for instance, monitoring and removing networks of scammers is something major online platforms have been doing for a long time. In other words, scams are a different problem that we have other tools for addressing (regardless of the fact that some platforms arguably underinvest in doing so) and are outside the scope of this essay.

4. In the last legs of the 2024 U.S. election, Google and OpenAI restricted their chatbots from answering election-related queries—though competitors like Perplexity didn't, claiming that their product was highly accurate. Evaluating chatbots’ tendency to answer questions factually or abstain from answering questions, improving the factuality of responses, and ensuring chatbots work across different languages and contexts are important areas of work as more people turn to chatbots for answering questions.

5. To be clear, we should not treat such propaganda as something newly made possible by AI. It is the incremental evolution of long-standing techniques. Indeed, the cost of creating cartoon avatars for presidential campaigns would be minuscule with or without AI. The impact of propaganda depends not on the technical methods used to create it but rather on the freedom of the press to uplift competing narratives.

Read the whole story
tante
14 days ago
reply
Disinformation doesn't need "AI" to exist.
Berlin/Germany
Share this story
Delete

Godot Isn't Making it

2 Comments

Before we get going — please enjoy my speech from Web Summit, Why Are All Tech Products Now Shit? I didn’t write the title.


What if what we're seeing today isn't a glimpse of the future, but the new terms of the present? What if artificial intelligence isn't actually capable of doing much more than what we're seeing today, and what if there's no clear timeline when it'll be able to do more? What if this entire hype cycle has been built, goosed by a compliant media ready and willing to take career-embellishers at their word?

Me, in March 2024.

I have been warning you for the best part of a year that generative AI has no killer apps and had no way of justifying its valuations (February), that generative AI had already peaked (March), and I have pleaded with people to consider an eventuality where the jump from GPT-4 to GPT-5 was not significant, in part due to a lack of training data (April). 

I shared concerns in July that the transformer-based-architecture underpinning generative AI was a dead end, and that there were few ways we'd progress past the products we'd already seen, in part due to both the limits of training data and the limits of models that use said training data. In August, I summarized the Pale Horses of the AI Apocalypse — events, many that have since come to pass, that would signify that the end is indeed nigh — and again added that GPT-5 would likely "not change the game enough to matter, let alone [add] a new architecture to build future (and more capable) models on."

Throughout these pieces I have repeatedly made the point that — separate to any lack of a core value proposition, training data drought, or unsustainable economics — generative AI is a dead end due to the limitations of probabilistic models that hallucinate, meaning they authoritatively state things that aren't true. The hallucination problem is nowhere close to being solved — and, at least with the current technology, may never go away — and it makes generative AI a non-starter for a great many business tasks, where you need a high level of reliability.

I have — since March — expressed great dismay about the credulousness of the media in their acceptance of the "inevitable" ways in which generative AI will change society, despite a lack of any truly meaningful product that might justify an environmentally-destructive industry led by a company that burns more than $5 billion a year and big tech firms spending $200 billion on data centers for products that people don't want.

The reason I'm repeating myself is that it's important to note how obvious the problems with generative AI have been, and for how long.

And you're going to need context for everything I'm about to throw at you.

Sidebar: To explain exactly what happened here, it's worth going over how these models work and are trained. I’ll keep it simple as it's a reminder.

A transformer-based generative AI model such as GPT — the technology behind ChatGPT — generates answers using "inference," which means it draws conclusions based off of its "training," which requires feeding it masses of training data (mostly text and images scraped from the internet). Both of these processes require you to use high-end GPUs (graphics processing units), and lots of them.
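
As a loose illustration of that training/inference split, here is a toy bigram model — orders of magnitude simpler than a transformer, and purely a sketch — that "trains" by counting which word follows which, then "infers" by sampling likely next words:

```python
import random
from collections import defaultdict

# "Training": count which word follows which in a tiny corpus.
corpus = "the model predicts the next word and the next word follows the last word".split()
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

# "Inference": generate text by repeatedly sampling a probable next word.
def generate(start: str, length: int = 8) -> str:
    word, out = start, [start]
    for _ in range(length):
        followers = counts.get(word)
        if not followers:
            break
        choices, weights = zip(*followers.items())
        word = random.choices(choices, weights=weights)[0]
        out.append(word)
    return " ".join(out)

print(generate("the"))  # fluent-looking output, but only as good as its statistics
```

A real model replaces the word counts with billions of learned parameters and the sampling step with a forward pass on GPUs, but the basic shape — learn statistics from data, then generate by predicting what plausibly comes next — is the same.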

The theory was (is?) that the more training data and compute you throw at these models, the better they get. I have hypothesized for a while they'd have diminishing returns — both from running out of training data and based on the limitations of transformer-based models. 
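
To picture what "diminishing returns" means here: if loss falls off as a power law in compute — the assumption behind the scaling-law papers — then each extra order of magnitude of compute buys a smaller absolute improvement. The constants below are made up purely to show the shape of the curve, not fitted to any real model:

```python
# Hypothetical power-law loss curve: loss = irreducible + A * compute^(-B).
# Constants are illustrative only -- the point is the shrinking gains, not the numbers.
IRREDUCIBLE, A, B = 1.8, 10.0, 0.1

def loss(compute: float) -> float:
    return IRREDUCIBLE + A * compute ** -B

previous = loss(1e21)
for compute in (1e22, 1e23, 1e24, 1e25):
    current = loss(compute)
    print(f"{compute:.0e} FLOPs: loss {current:.3f} (gain {previous - current:.3f})")
    previous = current
```

Each 10x jump in compute yields a smaller gain than the last — and that is before you run out of high-quality training data to feed it.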

And there, as they say, is the rub.

A few weeks ago, Bloomberg reported that OpenAI, Google, and Anthropic are struggling to build more advanced AI, and that OpenAI's "Orion" model — otherwise known as GPT-5 — "did not hit the company's desired performance," and that "Orion is so far not considered to be as big a step up" as the jump from GPT-3.5 to GPT-4, its current model. You'll be shocked to hear the reason is that "it’s become increasingly difficult to find new, untapped sources of high-quality, human-made training data that can be used to build more advanced AI systems," something I said would happen in March, with the report also adding that the "AGI bubble is bursting a little bit," something I said more forcefully in July.

I also want to stop and stare daggers at one particular point:

These issues challenge the gospel that has taken hold in Silicon Valley in recent years, particularly since OpenAI released ChatGPT two years ago. Much of the tech industry has bet on so-called scaling laws that say more computing power, data and larger models will inevitably pave the way for greater leaps forward in the power of AI.

The only people taking this as "gospel" have been members of the media unwilling to ask the tough questions and AI founders that don't know what the fuck they're talking about (or that intend to mislead). Generative AI's products have effectively been trapped in amber for over a year. There have been no meaningful, industry-defining products, because, as economist Daron Acemoglu said back in May, "more powerful" models do not unlock new features, or really change the experience, nor what you can build with transformer-based models. Or, put another way, a slightly better white elephant is still a white elephant. 

Despite the billions of dollars burned and thousands of glossy headlines, it's difficult to point to any truly important generative-AI-powered product. Even Apple Intelligence, the only thing that Apple really had to add to the latest iPhone, is utterly dull, and largely based on on-device models.

Yes, there are people that use ChatGPT — 200 million of them a week, allegedly, losing the company money with every prompt — but there is little to suggest that there's widespread adoption of actual generative AI software. The Information reported in September that between 0.1% and 1% of Microsoft's 440 million business customers were paying for its AI-powered Copilot, and in late October, Microsoft claimed that "AI is on pace to be a $10 billion-a-year business," which sounds good until you consider a few things:

  1. Microsoft has no "AI business" unit, which means that this annual "$10 billion" (or $2.5 billion a quarter) revenue figure is split across providing cloud compute services on Azure, selling Copilot to dumb people with Microsoft 365 subscriptions, selling Github Copilot, and basically anything else with "AI" on it. Microsoft is cherry-picking a number based on non-specific criteria and claiming it's a big deal, when it's actually pretty pathetic considering its capital expenditures will likely hit over $60 billion in 2024.
  2. Note the word "revenue," not "profit." How much is Microsoft spending to make $10 billion a year? OpenAI currently spends $2.35 to make $1, and Microsoft CFO Amy Hood said this October that OpenAI would cut into Microsoft's profits, losing it a remarkable $1.5 billion, "mainly because of an expected loss from OpenAI" according to CNBC. The Wall Street Journal reported in October 2023 that Microsoft was losing an average of $20 per-user-per-month on GitHub Copilot — a product with over a million users. If true, this suggests losses of at least $200 million a year (based on documents I've reviewed, it has 1.8 million users as of a month ago, but I went with the lower end; a quick check of that arithmetic follows this list).
  3. Microsoft has still yet to break out exactly how much generative AI is increasing revenue in specific business units. Generally, if a company is doing well at something, they take great pains to make that clear. Instead, Microsoft chose in August to "revamp" its reporting structure to "give better visibility into cloud consumption revenue," which is something you do if you, say, anticipate you're going to have your worst day of trading in years after your next earnings as Microsoft did in October.
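
A quick check of the GitHub Copilot arithmetic from point 2, using the reported $20 average monthly loss per user and the two user counts mentioned there (treated, purely for illustration, as constant over a year):

```python
# Back-of-the-envelope check of the Copilot figures cited above.
monthly_loss_per_user = 20            # WSJ-reported average loss per user, USD/month
months = 12

for users in (1_000_000, 1_800_000):  # "over a million" vs. the newer 1.8 million count
    annual_loss = monthly_loss_per_user * users * months
    print(f"{users:,} users -> ${annual_loss:,} per year")
```

With the lower user count, that already works out to roughly $240 million a year, consistent with the "at least $200 million" estimate above.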

I must be clear that every single one of these investments and products has been hyped with the whisper that they would get exponentially better over time, and that eventually the $200 billion in capital expenditures would spit out remarkable productivity improvements and fascinating new products that consumers and enterprises would buy in droves. Instead, big tech has found itself peddling increasingly-more-expensive iterations of near-identical Large Language Models — a direct result of them all having to use the same training data, which it’s now running out of.


The other assumption — those so-called scaling laws — has been that by simply building bigger data centers with more GPUs (the expensive, power-hungry graphics processing units used to both run and train these models) and throwing as much training data at them as possible, they'd simply start sprouting new capabilities, despite there being little proof that they'd do so. Microsoft, Meta, Amazon, and Google have all burned billions on the assumption that doing so would create something — be it a human-level "artificial general intelligence" or, I dunno, a product that would justify the costs — and it's become painfully obvious that it isn't going to work.

As we speak, outlets are already desperate to try and prove that this isn't a problem. The Information, in a similar story to Bloomberg's, attempted to put lipstick on the pig of generative AI, framing the lack of meaningful progress with GPT-5 as fine, because OpenAI can combine its GPT-5 model with its o1 "reasoning" model, which will then do something of some sort, such as "write a lot more very difficult code" according to OpenAI CEO and career liar Sam Altman, who intimated that GPT-5 may function like a "virtual brain" in May.

Chief Valley Cheerleader Casey Newton wrote on Platformer last week that diminishing returns in training models "may not matter as much as you would guess," with his evidence being that Anthropic, who he claims "has not been prone to hyperbole," do not think that scaling laws are ending. To be clear, in a 14,000-word op-ed that Newton wrote two pieces about, Anthropic CEO Dario Amodei said that "AI-accelerated neuroscience is likely to vastly improve treatments for, or even cure, most mental illness," the kind of hyperbole that should have you tarred and feathered in public.

So, let me summarize:

  • The main technology behind the entire "artificial intelligence" boom is generative AI — transformer-based models like OpenAI's GPT-4 (and soon GPT-5) — and said technology has peaked, with diminishing returns from the only ways of making them "better" (feeding them training data and throwing tons of compute at them) suggesting that we may have, as I've said before, reached Peak AI.
  • Generative AI is incredibly unprofitable. OpenAI, the biggest player in the industry, is on course to lose more than $5 billion this year, with competitor Anthropic (which also makes its own transformer-based model, Claude) on course to lose more than $2.7 billion this year.
  • Every single big tech company has thrown billions — as much as $75 billion in Amazon's case in 2024 alone — at building the data centers and acquiring the GPUs to populate said data centers specifically so they can train their models or other companies' models, or serve customers that would integrate generative AI into their businesses, something that does not appear to be happening at scale.
  • Worse still, many of the companies integrating generative AI do so by connecting to models made by either OpenAI or Anthropic, both of whom are running unprofitable businesses, and likely charging nowhere near enough to cover their costs. As I wrote in the Subprime AI Crisis in September, in the event that these companies start charging what they actually need to, I hypothesize it will multiply the costs of their customers to the point that they can't afford to run their businesses — or, at the very least, will have to remove or scale back generative AI functionality in their products.

The entire tech industry has become oriented around a dead-end technology that requires burning billions of dollars to provide inessential products that cost them more money to serve than anybody would ever pay. Their big strategy was to throw even more money at the problem until one of these transformer-based models created a new, more useful product — despite the fact that every iteration of GPT and other models has been, well, iterative. There has never been any proof (other than benchmarks that are increasingly easier to game) that GPT or other models would become conscious, nor that these models would do more than they do today, or three months ago, or even a year ago.


Yet things can, believe it or not, get worse.

The AI boom helped the S&P 500 hit record high levels in 2024, largely thanks to chip giant NVIDIA, a company that makes both the GPUs necessary to train and run generative AI models and the software architecture behind them. Part of NVIDIA's remarkable growth has been its ability to capitalize on the CUDA architecture — the software layer that lets you do complex computing with GPUs, rather than simply use them to render video games in increasingly higher resolution — and, of course, continually create new GPUs to sell for tens of thousands of dollars to tech companies that want to burn billions of dollars on generative AI, leading the company's stock to pop more than 179% over the last year.

Back in May, NVIDIA CEO and professional carnival barker Jensen Huang said that the company was now "on a one-year rhythm" in AI GPU production, with its latest "Blackwell" GPUs (specifically the B100, B200 and GB200 models used for generative AI) supposedly due at the end of 2024, but now delayed until at least March 2025.

Before we go any further, it's worth noting that when I say "GPU," I don't mean the one you'd find in a gaming PC, but a much larger chip put in a specialized server with multiple other GPUs, all integrated with specialized casing, cooling, and networking infrastructure. In simple terms, the things necessary to make sure all these chips work together efficiently, and also stop them from overheating, because they get extremely hot and are running at full speed, all the time.

The initial delay of the new Blackwell chips was caused by a (now-fixed) design flaw in production, but as I've suggested above, the problem isn't just creating the chips — it's making sure they actually work, at scale, for the jobs they're bought for. 

But what if that, too, wasn't possible?

A few days ago, The Information reported that NVIDIA is grappling with the oldest problem in computing — how to cool the fucking things. According to the report, NVIDIA has been asking suppliers to change the design of its 3,000-pound, 72-GPU server racks "several times" to overcome overheating problems, which The Information calls "the most complicated design NVIDIA had ever come up with." A few months after revealing the racks, engineers found that they... didn't work properly, even with Nvidia’s smaller 36-chip racks, and have been scrambling to fix it ever since.

While one can dazzle investors with buzzwords and charts, the laws of physics are a far harsher mistress, and if NVIDIA is struggling mere months before the first installations are to begin, it's unclear how it practically launches this generation of chips, let alone continues its yearly cadence. The Information reports that these changes have been made late in the production process, which is scaring customers that desperately need them so that their models can continue to do something they'll work out later. To quote The Information:

Two executives at large cloud providers that have ordered the new chips said they are concerned that such last-minute difficulties might push back the timeline for when they can get their GPU clusters up and running next year.

The fact that NVIDIA is having such significant difficulties with thermal performance is very, very bad. These chips are incredibly expensive — as much as $70,000 a piece — and will be running, as I've mentioned, at full speed, generating an incredible amount of heat that must be dissipated, while sat next to anywhere from 35 to 71 other chips, which will in turn be densely packed so that you can cram more servers into a data center. New, more powerful chips require entirely new methods to rack-mount, operate and cool them, and all of these parts must operate in sync, as overheating GPUs will die. While these units are big, some of their internal components are microscopic in size, and unless properly cooled, their circuits will start to crumble when roasted by a guy typing "Garfield with Gun" into ChatGPT.

Remember, Blackwell is supposed to represent a major leap forward in performance. If NVIDIA doesn’t solve its cooling problem — and solve it well — its customers will undoubtedly encounter thermal throttling, where the chip reduces speed in order to avoid causing permanent damage. It could eliminate any performance gains obtained from the new architecture and new manufacturing process, despite costing much, much more than its predecessor. 

NVIDIA's problem isn't just bringing these thermal performance issues under control, but both keeping them under control and being able to educate their customers on how to do so. NVIDIA has, according to The Information, repeatedly tried to influence its customers' server integrations to follow its designs because it thinks it will "lead to better performance," but in this case, one has to worry if NVIDIA's Blackwell chips can be reliably cooled.

While NVIDIA might be able to fix this problem in isolation within its racks, it remains to be seen how this works at scale as they ship and integrate hundreds of thousands of Blackwell GPUs starting in the front half of 2025. 

Things also get a little worse when you realize how these chips are being installed — in giant “supercomputer” data centers where tens of thousands of GPUs — or, in the case of Elon Musk’s “Colossus” data center, as many as a hundred thousand — run in concert to power generative AI models. The Wall Street Journal reported a few weeks ago that building these vast data centers creates entirely new engineering challenges, with one expert saying that big tech companies could be using as much as half of their capital expenditures on replacing parts that have broken down, in large part because these clusters are running their GPUs at full speed, at all times.

Remember, the capital expenditures on generative AI and the associated infrastructure have gone over $200 billion in the last year. If half of that’s dedicated to replacing broken gear, what happens when there’s no path to profitability?

In any case, NVIDIA doesn’t care. It’s already made billions of dollars selling Blackwell GPUs — they're sold out for a year, after all — and will continue to do so for now, but any manufacturing or cooling issues will likely be costly.

And even then, at some point somebody has to ask the question: why do we need all these GPUs if we've reached peak AI? Despite the remarkable "power" of these chips, NVIDIA's entire enterprise GPU business model centers around the idea that throwing more power at these problems will finally create some solutions.

What if that isn't the case?


The tech industry is over-leveraged, having doubled, tripled, quadrupled down on generative AI — a technology that doesn't do much more than it did a few months ago and won't do much more than it can do now. Every single big tech company has piled tens of billions of dollars into building out massive data centers with the intent of "capturing AI demand," yet never seemed to think whether they were actually building things that people wanted, or would pay for, or would somehow make the company money.

While some have claimed that "agents are the next frontier," the reality is that agents may be the last generative AI product — multiple Large Language Models and integrations bouncing off of each other in an attempt to simulate what a human might do at a cost that won't be sustainable for the majority of businesses. While Anthropic's demo of its model allegedly controlling a few browser windows with a prompt might have seemed impressive to credulous people like Casey Newton, these were controlled demos which Anthropic added were "slow" and "made lots of mistakes."  Hey, almost like it's hallucinating! I sure hope they fix that totally unfixable problem.

Even if it does, Anthropic has now successfully replaced...an entry-level data worker position at an indeterminate and likely unprofitable price. And in many organizations, those jobs had already been outsourced, or automated, or staffed with cheaper contractors. 

The obscenity of this mass delusion is nauseating — a monolith to bad decision-making and the herd mentality of tech's most powerful people, as well as an outright attempt to manipulate the media into believing something was possible that wasn't. And the media bought it, hook, line, and sinker.

Hundreds of billions of dollars have been wasted building giant data centers to crunch numbers for software that has no real product-market fit, all while trying to hammer it into various shapes to make it pretend that it's alive, conscious, or even a useful product. 

There is no path, from what I can see, to turn generative AI and its associated products into anything resembling sustainable businesses, and the only path that big tech appeared to have was to throw as much money, power, and data at the problem as possible, an avenue that appears to be another dead end.

And worse still, nothing has really come out of this movement. I've used a handful of AI products that I've found useful — an AI powered journal, for example — but these are not the products that one associates with "revolutions," but useful tools that would have been a welcome surprise if they didn't require burning billions of dollars, blowing past emissions targets and stealing the creative works of millions of people to train them.


I truly don't know what happens next, but I'll walk you through what I'm thinking.

If we're truly at the diminishing returns stage of transformer-based models, it will be extremely difficult to justify buying further iterations of NVIDIA GPUs past Blackwell. The entire generative AI movement lives and dies by the idea that more compute power and more training data makes these things better, and if that's no longer the case, there's little reason to keep buying bigger and better. After all, what's the point? 

Even now, what exactly happens when Microsoft or Google has racks-worth of Blackwell GPUs? The models aren't going to get better.

This also makes the lives of OpenAI and Anthropic that much more difficult. Sam Altman has grown rich and powerful lying about how GPT will somehow lead to AGI, but at this point, what exactly is OpenAI meant to do? The only way it’s ever been able to develop new models is by throwing masses of compute and training data at the problem, and its only other choice is to start stapling its reasoning model onto its main Large Language Model, at which point something happens, something so good that literally nobody working for OpenAI or in the media appears to be able to tell you what it is.

Putting that aside, OpenAI is also a terrible business that has to burn $5 billion to make $3.4 billion, with no proof that it’s capable of bringing down costs. The constant refrain I hear from VCs and AI fantasists is that "chips will bring down the cost of inference," yet I don't see any proof of that happening, nor do I think it'll happen quickly enough for these companies to turn things around.

And you can feel the desperation, too. OpenAI is reportedly looking at ads as a means to narrow the gap between its revenues and losses. As I pointed out in Burst Damage, introducing an advertising revenue stream would require significant upfront investment, both in terms of technology and talent. OpenAI would need a way to target ads, and a team to sell advertising — or, instead, use a third-party ad network that would take a significant bite out of its revenue. 

It’s unclear how much OpenAI could charge advertisers, or what percentage of its reported 200 million weekly users have an ad-blocker installed. Or, for that matter, whether ads would provide a perverse incentive for OpenAI to enshittify an already unreliable product. 

Facebook and Google — as I’ve previously noted — have made their products manifestly worse in order to increase the amount of time people spend on their sites, and thus, the number of ads they see. In the case of Facebook, it buried your newsfeed under a deluge of AI-generated sludge and “recommended content.” Google, meanwhile, has progressively degraded the quality of its search results in order to increase the volume of queries it receives, as a means of making sure users see more ads.

OpenAI could, just as easily, fall into the same temptation. Most people who use ChatGPT are trying to accomplish a specific task — like writing a term paper, or researching a topic, or whatever — and then they leave. And so, the number of ads they’d conceivably see will undoubtedly be low compared to what they’d see on a social network or search engine. Would OpenAI try to get users to stick around longer — to write more prompts — by crippling the performance of its models?

Even if OpenAI listens to its better angels, the reality still stands: ads won’t dam the rising tide of red ink that promises to eventually drown the company. 

This is a truly dismal situation where the only options are to stop now, or continue burning money until the heat gets too much. It cost $100 million to train GPT-4o, and Anthropic CEO Dario Amodei estimated a few months ago that training future models will cost $1 billion to $10 billion, with one researcher claiming that training OpenAI's GPT-5 will cost around $1 billion.

And that’s before mentioning any, to quote a Rumsfeldism, “unknown unknowns.” Trump’s election, at the risk of sounding like a cliché, changes everything and in ways we don’t yet fully understand. According to the Wall Street Journal, Musk has successfully ingratiated himself with Trump, thanks to his early and full-throated support of his campaign. He’s now reportedly living in Mar a Lago, sitting on calls with world leaders, and whispering in Trump’s ear as he builds his cabinet. 

And, as The Journal claims, his enemies fear that he could use his position of influence to harm them or their businesses — chiefly Sam Altman, who is “persona non grata” in Musk’s world, largely due to the new for-profit direction of OpenAI. While it’s likely that these companies will fail due to inevitable organic realities (like running out of money, or not having a product that generates a profit), Musk’s enemies must now contend with a new enemy — one with the full backing of the Federal government, and that neither forgives nor forgets.

And, crucially, one that’s not afraid to bend ethical or moral laws to further his own interests — or to inflict pain on those perceived as having slighted him. 

Even if Musk doesn’t use his newfound political might to hurt Altman and OpenAI, he could still pursue the company as a private citizen. Last Friday, he filed an injunction requesting a halt to OpenAI’s transformation from an ostensible non-profit to a for-profit business. Even if he ultimately fails, should Musk manage to drag the process out, or delay it temporarily, it could strike a terminal blow for OpenAI. 

That’s because in its most recent fundraise, OpenAI agreed that it would convert its recent $6.6bn equity investment into high-interest debt, should it fail to successfully convert into a for-profit business within a two-year period. This was a tight deadline to begin with, and it can’t afford any delays. The interest payments on that debt would massively increase its cash burn, and it would undoubtedly find it hard to obtain further outside investment. 

Outside of a miracle, we are about to enter an era of desperation in the generative AI space. We're two years in, and we have no killer apps — no industry-defining products — other than ChatGPT, a product that burns billions of dollars and that nobody can really describe. Neither Microsoft, nor Meta, nor Google, nor Amazon seems able to come up with a profitable use case, let alone one their users actually like — and neither have any of the people that have raised billions of dollars in venture capital for anything with "AI" taped to the side — and investor interest in AI is cooling.

It's unclear how much further this farce continues, if only because it isn't obvious what it is that anybody gets by investing in future rounds in OpenAI, Anthropic, or any other generative AI company. At some point they must make money, and the entire dream has been built around the idea that all of these GPUs and all of this money would eventually spit out something revolutionary.

Yet what we have is clunky, ugly, messy, larcenous, environmentally-destructive and mediocre. Generative AI was a reckless pursuit, one that shows a total lack of creativity and sense in the minds of big tech and venture capital, one where there was never anything really impressive other than the amount of money it could burn and the amount of times Sam Altman could say something stupid and get quoted for it.

I'll be honest with you, I have no idea what happens here. The future was always one that demanded that big tech spent more to make even bigger models that would at some point become useful, and that isn't happening. In pursuit of doing so, big tech invested hundreds of billions of dollars into infrastructure specifically to follow one goal, and put AI front and center at their businesses, claiming it was the future without ever considering what they'd do if it wasn't.

The revenue isn't coming. The products aren't coming. "Orion," OpenAI's next model, will underwhelm, as will its competitors' models, and at some point somebody is going to blink in one of the hyperscalers, and the AI era will be over. Almost every single generative AI company that you’ve heard of is deeply unprofitable, and there are few innovations coming to save them from the atrophy of the foundation models.

I feel sad and exhausted as I write this, drained as I look at the many times I’ve tried to warn people, frustrated at the many members of the media that failed to push back against the overpromises and outright lies of people like Sam Altman, and full of dread as I consider the economic ramifications of this industry collapsing. Once the AI bubble pops, there are no other hyper-growth markets left, which will in turn lead to a bloodbath in big tech stocks as they realize that they’re out of big ideas to convince the street that they’re going to grow forever.

There are some that will boast about “being right” here, and yes, there is some satisfaction in being so. Nevertheless, that satisfaction is tempered by knowing that the result of this bubble bursting will be massive layoffs, a dearth of venture capital funding, and a much more fragile tech ecosystem.

I’ll end with a quote from Bubble Trouble, a piece I wrote in April: 

How do you solve all of these incredibly difficult problems? What does OpenAI or Anthropic do when they run out of data, and synthetic data doesn't fill the gap, or worse, massively degrades the quality of their output? What does Sam Altman do if GPT-5 — like GPT-4 — doesn't significantly improve its performance and he can't find enough compute to take the next step? What do OpenAI and Anthropic do when they realize they will likely never turn a profit? What does Microsoft, or Amazon, or Google do if demand never really takes off, and they're left with billions of dollars of underutilized data centers? What does Nvidia do if the demand for its chips drops off a cliff as a result?

I don't know why more people aren't screaming from the rooftops about how unsustainable the AI boom is, and the impossibility of some of the challenges it faces. There is no way to create enough data to train these models, and little that we've seen so far suggests that generative AI will make anybody but Nvidia money. We're reaching the point where physics — things like heat and electricity — are getting in the way of progressing much further, and it's hard to stomach investing more considering where we're at right now is, once you cut through the noise, fairly god damn mediocre. There is no iPhone moment coming, I'm afraid.

I was right then and I’m right now. Generative AI isn’t a revolution, it’s an evolution of a tech industry overtaken by growth-hungry management consultant types that neither know the problems that real people face nor how to fix them. It’s a sickening waste, a monument to the corrupting force of growth, and a sign that the people in power no longer work for you, the customer, but for the venture capitalists and the markets.

I also want to be clear that none of these companies ever had a plan. They believed that if they threw enough GPUs together they would turn generative AI – probabilistic models for generating stuff — into some sort of sentient computer. It’s much easier, and more comfortable, to look at the world as a series of conspiracies and grand strategies, and far scarier to see it for what it is — extremely rich and powerful people that are willing to bet insanely large amounts of money on what amounts to a few PDFs and their gut. 

This is not big tech’s big plan to excuse building more data centers — it’s the death throes of twenty years of growth-at-all-costs thinking, because throwing a bunch of money at more servers and more engineers always seemed to create more growth. In practice, this means that the people in charge and the strategies they employ are borne not of an interest in improving the lives of their customers, but in increasing revenue growth, which means the products they create aren’t really about solving any problem other than “what will make somebody give me more money,” which doesn’t necessarily mean “provide them with a service.”

Generative AI is the perfect monster of the Rot Economy — a technology that lacks any real purpose sold as if it could do literally anything, one without a real business model or killer app, proliferated because big tech no longer innovates, but rather clones and monopolizes. Yes, this much money can be this stupid, and yes, they will burn billions in pursuit of a non-specific dream that involves charging you money and trapping you in their ecosystem.

I’m not trying to be a doomsayer, just like I wasn’t trying to be one in March. I believe all of this is going nowhere, and that at some point Google, Microsoft, or Meta is going to blink and pull back on their capital expenditures. And before then, you’re going to get a lot of desperate stories about how “AI gains can be found outside of training new models” to try and keep the party going, despite reality flicking the lights on and off and threatening to call the police. 

I fear for the future for many reasons, but I always have hope, because I believe that there are still good people in the tech industry and that customers are seeing the light. Bluesky feels different — growing rapidly, competing with both Threads and Twitter, all while selling an honest product and an open protocol. 

There are other ideas for the future that aren’t borne of the scuzzy mindset of billionaire shitheels like Sundar Pichai and Sam Altman, and they can — and will — grow out of the ruins created by these kleptocrats. 

Read the whole story
tante
24 days ago
reply
"Generative AI is the perfect monster of the Rot Economy — a technology that lacks any real purpose sold as if it could do literally anything, one without a real business model or killer app, proliferated because big tech no longer innovates, but rather clones and monopolizes. Yes, this much money can be this stupid, and yes, they will burn billions in pursuit of a non-specific dream that involves charging you money and trapping you in their ecosystem."
Berlin/Germany
Share this story
Delete
1 public comment
mkalus
22 days ago
reply
If tech could stop just trying to create “more shareholder value” and instead ask how they can make the world better, we may actually get some new technology that’s useful for more than “number goes up”.

I’d like to think that sanity eventually returns and the bubble pops and all these investors will lose money. But the last 20 years have made it clear that part of society lives in an alternate reality that has little to no connection to the people on the ground.
iPhone: 49.287476,-123.142136