Scott Jenson suggests AI is likely to be more useful for “boring” tasks than for fancy outboard brains that can do our thinking for us. With hallucination and faulty reasoning derailing high-order tasks, Scott argues it’s time to right-size the task—and maybe the models, too. “Small language models” (SLMs) are plenty capable of taking on helpful but modest tasks around syntax and language.

These smaller open-source models, while very good, usually don’t score as well as the big foundational models from OpenAI and Google, which makes them feel second-class. That perception is a mistake. I’m not saying they perform better; I’m saying it doesn’t matter. We’re asking them the wrong questions. We don’t need models to take the bar exam.

Instead of relying on language models to be answer machines, Scott suggests that we should lean into their core language understanding for proofreading, summaries, or light rewrites for clarity: “Tiny uses like this flip the script on the large centralized models and favor SLMs which have knock-on benefits: they are easier to ethically train and have much lower running costs. As it gets cheaper and easier to create these custom LLMs, this type of use case could become useful and commonplace.”
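To make that concrete, here’s a minimal sketch of what a light “rewrite for clarity” request might look like against a small model running on your own machine. It assumes the model is served locally through Ollama’s HTTP API; the model name and prompt are illustrative, not a recommendation.

```ts
// Minimal sketch: a light "clarity rewrite" via a small local model.
// Assumes an SLM is served locally by Ollama at its default endpoint;
// the model name and the prompt wording are illustrative assumptions.
async function rewriteForClarity(text: string): Promise<string> {
  const response = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3.2:3b", // any small local model would do here
      stream: false,
      prompt:
        "Rewrite the following text for clarity. Fix grammar and awkward " +
        "phrasing, but keep the meaning and tone. Return only the rewrite.\n\n" +
        text,
    }),
  });
  const data = await response.json();
  return data.response.trim(); // Ollama returns the completion in `response`
}
```

No reasoning, no facts, no answer machine: just a language chore handed to a model that’s good at language.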

This is what we call casual intelligence in Sentient Design, and we recently shared examples of iPhone apps doing exactly what Scott is talking about. It makes tons of sense.

Sentient Design advocates dramatically new experiences that go beyond Scott’s “boring” use cases, but that advocacy actually lines up neatly with what Scott proposes: let’s lean into what language models are really good at. These models may be unreliable at answering questions, but they’re terrific at understanding language and intent.

Some of Sentient Design’s most impressive experience patterns rely on language models to do low-lift tasks that they’re quite good at. The bespoke UI design pattern, for example, creates interfaces that can redesign their own layouts in response to explicit or implicit requests. It’s wild when you first see it go, but under the hood, it’s relatively simple: ask the model to interpret the user’s intent and choose from a small set of design patterns that match the intent. We’ve built a bunch of these, and they’re reliable—because we’re not asking the model to do anything except very simple pattern matching based on language and intent. Sentient Scenes is a fun example of that, and a small, local language model would be more than capable of handling that task.
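Here’s roughly what that pattern-matching step can look like in code. This is a hedged sketch, not how Sentient Scenes is actually built: the scene names, the prompt, and the local model are all illustrative. The point is how little we ask of the model: classify the request, and the app does the rest.

```ts
// Minimal sketch of the "interpret intent, choose a pattern" step behind a
// bespoke UI. Scene names, prompt, and model are illustrative assumptions.
const SCENES = ["calm", "spooky", "party", "focus"] as const;
type Scene = (typeof SCENES)[number];

async function chooseScene(userRequest: string): Promise<Scene> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama3.2:3b", // a small local model is plenty for this
      stream: false,
      prompt:
        `Pick the scene that best matches this request: "${userRequest}". ` +
        `Answer with exactly one word from: ${SCENES.join(", ")}.`,
    }),
  });
  const data = await res.json();
  const answer = data.response.trim().toLowerCase();
  // The model only classifies; the app stays in control and falls back to a default.
  return (SCENES as readonly string[]).includes(answer) ? (answer as Scene) : "calm";
}

// chooseScene("make it feel like a haunted house") → "spooky"; the app then
// applies the matching colors, type, and motion itself.
```

The model never touches the layout, the styles, or the data. It just maps messy human language to one of a handful of options the designer already vetted.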

As Scott says, all of this comes with time and practice as we learn the grain of this new design material. But for now we’ve been asking the models to do more than they can handle:

LLMs are not intelligent and they never will be. We keep asking them to do “intelligent things” and find out a) they really aren’t that good at it, and b) replacing that human task is far more complex than we originally thought. This has made people use LLMs backwards, desperately trying to automate from the top down when they should be augmenting from the bottom up. …

Ultimately, a mature technology doesn’t look like magic; it looks like infrastructure. It gets smaller, more reliable, and much more boring.

We’re here to solve problems, not look cool.

It’s only software, friends.
