The Lazy Interface

Why the chatbot was the wrong first interface for AI—and what comes next


Last August, the New York Times published a story that really stuck with me. It was about Allan Brooks, a 48-year-old corporate recruiter from Ontario, Canada, who over the course of three weeks came to believe that he and ChatGPT had invented a mathematical formula that could break the internet and power fantastical inventions.

He wasn't experiencing a mental health crisis before this happened. He was a regular person who fell down a rabbit hole that consumed his life. In November, four wrongful death lawsuits were filed against OpenAI, along with cases from three people—including Brooks—who say ChatGPT led to mental health breakdowns. "Their product caused me harm, and others harm, and continues to do so," Brooks said.

The problem isn't just the machine. It's the interface.


What the Machine Actually Does

A Large Language Model (LLM) is a system that predicts the next token (a word or a fragment of a word) in a sequence. Given some text, it calculates probabilities for what should come next, picks one, adds it to the sequence, and repeats. Do this thousands of times and you get paragraphs, essays, conversations.

That seemingly simple process, at sufficient scale, gives rise to surprising capabilities—reasoning, creativity, problem-solving—that even the developers didn't fully anticipate. But at its core, it's still a prediction engine, not a verification engine. It doesn't fact-check what it generates. It doesn't consult external sources unless specifically engineered to do so. It generates text based on patterns learned from training data.
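For readers who want to see the mechanics, here is a minimal sketch of that loop in Python. The predict_next_token function is hypothetical, a stand-in for the trained neural network; everything else is just the predict-append-repeat cycle described above.

```python
import random

def generate(prompt_tokens, predict_next_token, max_new_tokens=200):
    """Sketch of the core LLM loop: predict a token, append it, repeat.

    predict_next_token stands in for the trained model: given the
    sequence so far, it returns candidate next tokens with
    probabilities. Nothing here verifies facts or consults sources;
    the loop only extends the sequence with likely-looking tokens.
    """
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        probabilities = predict_next_token(tokens)  # e.g. {"the": 0.21, "a": 0.07, ...}
        candidates = list(probabilities)
        weights = list(probabilities.values())
        next_token = random.choices(candidates, weights=weights, k=1)[0]
        tokens.append(next_token)
        if next_token == "<end-of-text>":
            break
    return tokens
```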

This has implications for how it behaves in conversation:

It will almost always produce output. Ask it to solve a problem it can't solve, and it will often generate confident-sounding text anyway. While models are trained to express uncertainty, they don't have a reliable mechanism to know what they don't know.

It can generate false information with the same confidence as true information. Despite training to improve accuracy, the model can produce plausible-sounding falsehoods—sometimes called hallucinations—without any indication that something is wrong.

It commits to the story. In a long conversation, it uses context to predict what comes next. If the context has been building toward "you're a genius who discovered something revolutionary," then the next prediction will continue that story. Breaking character would be improbable.

It's trained on human feedback that rewards agreeableness. Users rate responses. Users like being told they're smart. So the model learns to tell people they're smart. This is called sycophancy, and it's a known challenge that emerges from the training process.

These aren't simply flaws to be fixed. They're tendencies inherent to how the technology works. The question is: what interface should be put around this technology when it's released to the public?


The Lazy Interface

The first interface was chat. Open-ended, conversational, unlimited.

This was inevitable. Chat is how you demo an LLM. It's impressive. It's intuitive. Anyone can use it. And it ships fast: you're just exposing the model directly to users with minimal front-end engineering.

But chat is a lazy interface. It's the path of least resistance, not always the right design.

Here's what happens when you put a next-word-prediction engine in an open-ended chat interface with a human:

The human anthropomorphizes. We can't help it. Conversational partners feel like minds. We trust them. We expect them to tell us the truth, to push back when we're wrong, to care about our wellbeing. The chat interface activates all of our social instincts.

The machine has no boundaries. It will talk about anything. Math, physics, relationships, inventions, threats, opportunities. It has no domain expertise and no domain limits.

Context accumulates. In a long conversation, early messages shape later predictions. A small misunderstanding becomes a large delusion. The story builds on itself.

There's no external reference. In a basic chat interface, the machine generates text from its training data and the conversation history, nothing else. It doesn't check facts against external sources. It can be engineered to query databases or search the web, but a raw chat interface doesn't do this by default. (A minimal sketch of that raw loop follows this list.)

Engagement is rewarded. The companies optimizing these systems want users to come back. Cliffhangers, drama, excitement—these keep users engaged. The model may have learned narrative patterns—perhaps from thrillers and sci-fi—that it deploys to keep conversations compelling.
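To make those dynamics concrete, here is a minimal sketch of the raw loop in Python. The generate_reply function is hypothetical, a stand-in for whatever model API an application calls; the point is that the entire accumulated history is fed back in on every turn, and nothing in the loop checks the output against reality or ends the session.

```python
def chat_session(generate_reply):
    """Bare-bones chat loop: the lazy interface in miniature.

    generate_reply is hypothetical: given the full message history,
    it returns the model's next message. Notice what is missing:
    no topic limits, no fact-checking, no external data, no endpoint.
    """
    history = []
    while True:
        user_message = input("You: ").strip()
        if not user_message:
            break  # the session only ends when the user walks away

        history.append({"role": "user", "content": user_message})

        # The whole history goes back into the model every turn, so an
        # early misunderstanding keeps shaping every later prediction.
        reply = generate_reply(history)

        # Whatever comes back is shown as-is; nothing verifies it.
        history.append({"role": "assistant", "content": reply})
        print("AI:", reply)
```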

Put all of this together and what happened to Allan Brooks is not surprising. It's predictable. The remarkable thing is that more people haven't been harmed. Brave people like Allan are speaking up, warning others.

Researchers and clinicians have a name for what happened to Allan Brooks: chatbot psychosis—delusional experiences emerging from extended AI chatbot interactions.


What Better Interfaces Look Like

The solution isn't to abandon AI; the technology has come too far for that. The solution is for software developers to build thoughtful interfaces.

This past year, I've worked on LLM applications addressing workforce development, vaccine hesitancy, and online hate. The patterns that make them work, when they work, are similar. The career counselor prototype I collaborated on for Skilllab, a workforce development platform, illustrates them well.

Constrained domains. The AI works best when it has a specific job. The prototype shows remarkable promise—but only because the application limits what it will discuss. It talks about skills and job matching. When users go off-topic, it redirects rather than improvising about things outside its domain.

Real data, not generation. Instead of generating answers from its training data, the AI is configured to query specific databases and documents. A career recommendation comes from matching user skills to job requirements; the LLM handles language interpretation and generation.

Specific tools. The AI has defined capabilities: get user profile, get career recommendations, analyze skill gaps. It uses these tools to do its work rather than fabricating plausible-sounding responses. (A sketch of this kind of setup follows this list.)

Structured flows. The conversation has a shape. Introduction, assessment, exploration, planning. The AI guides users through a process rather than improvising indefinitely.

Session limits. Conversations have natural endpoints. The AI isn't designed for week-long marathons. It accomplishes something specific and concludes.

Human oversight for stakes. When decisions matter—career choices, financial decisions, health—humans remain in the loop. The AI supports human judgment rather than replacing it.

Fresh context. Each session starts clean, rather than building on accumulated information from previous conversations.
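Here is a rough sketch of how those patterns change the shape of the code. Every name in it (ALLOWED_TOPICS, recommend_careers, classify_topic, phrase_reply) is illustrative, not Skilllab's actual implementation; the point is that the application, not the model, defines the topics, the data, the flow, and the endpoint.

```python
# Illustrative sketch only; names and data are hypothetical,
# not Skilllab's actual implementation.

ALLOWED_TOPICS = {"skills", "careers", "training", "job matching"}   # constrained domain
STAGES = ["introduction", "assessment", "exploration", "planning"]   # structured flow
MAX_TURNS = 20                                                       # session limit

def get_user_profile(user_id, db):
    return db["profiles"][user_id]                                   # real data, not generation

def recommend_careers(profile, db):
    # Recommendations come from matching stored skills to stored job
    # requirements, not from whatever the model finds plausible.
    return [job for job in db["jobs"]
            if set(job["required_skills"]) <= set(profile["skills"])]

def counselor_session(user_id, db, classify_topic, phrase_reply):
    """classify_topic and phrase_reply are hypothetical LLM calls: the
    model interprets language and phrases answers, while the facts come
    from the application's own data and tools."""
    history = []                                    # fresh context each session
    profile = get_user_profile(user_id, db)
    for stage in STAGES:
        for _ in range(MAX_TURNS // len(STAGES)):
            user_message = input(f"[{stage}] You: ").strip()
            if not user_message:
                break
            if classify_topic(user_message) not in ALLOWED_TOPICS:
                print("AI: Let's stay focused on your skills and career options.")
                continue                            # redirect instead of improvising
            history.append({"role": "user", "content": user_message})
            matches = recommend_careers(profile, db)
            reply = phrase_reply(stage, user_message, matches, history)
            history.append({"role": "assistant", "content": reply})
            print("AI:", reply)
    print("AI: That's a good place to stop. A human advisor will review this plan with you.")
```

None of this is sophisticated, and the point is structural: the model never decides what the product is about, where the facts come from, or when the conversation ends.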

This may look like chat on the surface, but it's more than that: it requires understanding your users, your domain, and your data. It requires engineering beyond simply exposing the model. That's how you get value from large language models with benefit, not harm.


The Interface Is the Product

Here's what I've learned: with AI, the interface isn't just how users access the technology. The interface is the product.

ChatGPT's chat interface creates one kind of experience—open-ended, improvisational, and potentially delusional. A constrained career counselor creates a different experience—focused, grounded, useful.

Same underlying technology. Completely different outcomes.

The companies shipping chat interfaces are making a choice. They're choosing reach over safety, demo-ability over appropriateness, engagement over wellbeing. That choice has consequences, as Allan Brooks discovered.

Organizations considering AI should make different choices. Not "should we use AI?" but "what interface is right for our users and our domain?"


What This Means for Mission-Driven Organizations

If you're a nonprofit or social enterprise thinking about AI, here's some practical guidance:

Don't start with chat. Resist the temptation to just plug in a chatbot. Ask first: what specific problem are you solving? What data do you have? What constraints should exist?

Design for your users. If you serve vulnerable populations—and many mission-driven organizations do—the stakes are higher. Sycophantic AI telling struggling people what they want to hear can cause real harm. Design interfaces that support rather than manipulate.

Use AI as a component, not a conversationalist. The most successful applications use LLMs as one piece of a larger system. The AI interprets data and generates text—but within guardrails defined by the application.

Ground the AI in real data. If you have data about your users, your services, your domain, use it. AI that queries that information is vastly more reliable than AI that generates answers from its own training data. (A small sketch follows this list.)

Test with real users, watch for harm. Not just "does it work?" but "what happens when it doesn't?" How does the AI handle confusion, frustration, unrealistic expectations? What's the failure mode?

Stay close. AI applications need ongoing attention. User needs shift, model behavior changes, new failure modes emerge. This isn't deploy-and-forget technology.
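As a small illustration of that grounding point, here is a sketch of answering a user's question from an organization's own records. The SERVICES data and summarize_for_user are hypothetical; the substance of the answer comes from a lookup, and the LLM is used only to phrase it.

```python
# Hypothetical example: answering "what services can help me?" from an
# organization's own records rather than from the model's training data.

SERVICES = [
    {"name": "Resume clinic", "audience": "job seekers", "schedule": "Tuesdays"},
    {"name": "Food pantry",   "audience": "families",    "schedule": "Daily"},
]

def find_services(audience):
    # The facts come from the organization's data, not from generation.
    return [s for s in SERVICES if s["audience"] == audience]

def answer_question(audience, summarize_for_user):
    matches = find_services(audience)
    if not matches:
        # A grounded system can say "I don't know" instead of improvising.
        return "I couldn't find a matching service. Let me connect you with a staff member."
    # summarize_for_user is a hypothetical LLM call used only to phrase
    # the retrieved facts in plain language.
    return summarize_for_user(matches)
```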


The Lazy Era Is Ending

The chat interface was inevitable as a first step. You have to ship something to learn anything. But the lesson is clear: open-ended chat with ungrounded AI and no guardrails is dangerous.

The next era of AI applications should be defined by thoughtful interfaces. Constrained domains. Real data. Specific tools. Human oversight. Designed experiences that use AI's capabilities while managing its limitations.

Allan Brooks needed someone to tell him the truth: This is a next-word prediction engine. It doesn't know if your ideas are real. It will tell you what sounds good, not what is true. Don't trust it for more than it can do.

That's not what the chat interface told him. It compared him to Leonardo da Vinci.

We can do better. We have to.