Benedict Evans makes a savvy comparison between the current Generative AI moment and the early days of the PC. While the new technology is impressive, it’s not (yet) evident how it fits into the everyday lives or workflows of most people. Basically: what do we do with this thing? For many, ChatGPT and its cousins remain curiosities—fun toys to tinker with, but little more so far.
This wouldn’t matter much (‘man says new tech isn’t for him!’), except that a lot of people in tech look at ChatGPT and LLMs and see a step change in generalisation, towards something that can be universal. A spreadsheet can’t do word processing or graphic design, and a PC can do all of those, but someone needs to write those applications for you first, one use-case at a time. But as these models get better and become multi-modal, the really transformative thesis is that one model can do ‘any’ use-case without anyone having to write the software for that task in particular.
Suppose you want to analyse this month’s customer cancellations, or dispute a parking ticket, or file your taxes - you can ask an LLM, and it will work out what data you need, find the right websites, ask you the right questions, parse a photo of your mortgage statement, fill in the forms and give you the answers. We could move orders of magnitude more manual tasks into software, because you don’t need to write software to do each of those tasks one at a time. This, I think, is why Bill Gates said that this is the biggest thing since the GUI. That’s a lot more than a writing assistant.
It seems to me, though, that there are two kinds of problem with this thesis.
The first problem, Evans says, is that the models are still janky. They trip—all the time—on problems that are moderately complex or just a few degrees left of familiar. That’s a technical problem, and the systems are getting better at a startling clip.
The second problem is more twisty—and it’s less clear how it will resolve: as a culture broadly, and as the tech industry specifically, our imaginations haven’t quite caught up with truly useful applications for LLMs.
It reminds me a little of the early days of Google, when we were so used to hand-crafting our solutions to problems that it took time to realise that you could ‘just Google that’. Indeed, there were even books on how to use Google, just as today there are long essays and videos on how to learn ‘prompt engineering.’ It took time to realise that you could turn this into a general, open-ended search problem, and just type roughly what you want instead of constructing complex logical Boolean queries on vertical databases. This is also, perhaps, matching a classic pattern for the adoption of new technology: you start by making it fit the things you already do, where it’s easy and obvious to see the use-case (if you have one), and then, over time, you change the way you work to fit the new tool.
The arrival of startling new technologies often works this way, as we puzzle over how to shoehorn them into old ways of doing things. In my essay Of Nerve and Imagination, I framed this less as a problem of imagination than of nerve—the cheek to step out of old assumptions of “how things are done” and into a new paradigm. I wrote that essay just as the Apple Watch and other smartwatches were landing, adding yet another device to a busy ecosystem. Here’s what I said then:
The significance of new combinations tends to escape us. When someone embeds a computer inside a watch, it’s all too natural for us to assume that it will be used like either a computer or a watch. A smartphone on your wrist! A failure of nerve prevents us from imagining the entirely new thing that this combination might represent. The habits of the original technology blind us to the potential opportunities of the new.
Today’s combinations are especially hard to parse because they’re no longer about individual instances of technology. The potential of a smartwatch, for example, hinges not only on the combination of its component parts but on its combination with other smart and dumb objects in our lives.
As we weigh the role of the smartwatch, we have to muster the nerve to imagine: How might it talk to other devices? How can it interact with the physical world? What does it mean to wear data? How might the watch signal identity in the digital world as we move through the physical? How might a gesture or flick of the wrist trigger action around me? What becomes possible if smart watches are on millions of wrists? What are the social implications? What new behaviors will the watch channel and shape? How will it change the way I use other devices? How might it knit them together?
As we begin to embed technology into everything—when anything can be an interface—we can no longer judge each new gadget on its own. The success of any new interface depends on how it controls, reflects, shares, or behaves in a growing community of social devices.
Similarly, how do LLMs fit into a growing community of interfaces, services, and indeed other LLMs? As we confront a new and far more transformational technology in Generative AI, it’s up to designers and product folks to summon the nerve to understand not only how it fits into our tech ecosystem, but how it changes the way we work or think or interact.
Easier said than done, of course. And Evans writes that we’re still finding the right level for working with this technology as both users and product makers. Will we interact with these systems directly as general-purpose, “ask me anything” (or “ask me to do anything”) companions? Or will we instead focus on narrower applications, with interfaces wrapped around purpose-built AI to help focus and nail specific tasks? Can the LLMs themselves be responsible for presenting those interfaces, or do we need to imagine and build each application one at a time, as we traditionally have? There’s an ease and clarity to that narrow interface approach, Evans writes, but it diverges from loftier visions for what the AI interface might be.
Evans writes:
A GUI tells the users what they can do, but it also tells the computer everything we already know about the problem. Can the GUI itself be generative? Or do we need another whole generation of [spreadsheet inventor] Dan Bricklins to see the problem, and then turn it into apps, thousands of them, one at a time, each of them with some LLM somewhere under the hood?
On this basis, we would still have an orders-of-magnitude change in how much can be automated, and how many use-cases can be found for LLMs, but they still need to be found and built one by one. The change would be that these new use-cases would be things that are still automated one-at-a-time, but that could not have been automated before, or that would have needed far more software (and capital) to automate. That would make LLMs the new SQL, not the new HAL9000.