Explore whether conversational interfaces (CUI) are truly the ultimate form for AI products. This article examines the challenges of AI chat windows, compares conversational and command-based AI interactions, and discusses the current limitations of AI agents in delivering comprehensive services.
Story|The Story of AI, Widely Believed
ChatGPT and Copilot have shaped the fundamental perception of large language model (LLM)-based AI products.
So far, no one seems confident in creating a genuinely AI-native product within the existing business models. As a fallback, many feel that simply adding a “+AI” feature to their existing products is good enough.
Since 2023, many have assumed that productizing large models means adding a chat window somewhere in the product, or have even equated the AI product with the chat window itself. GPTs have naturally led people to believe that conversational user interfaces (CUIs) represent the ultimate form of AI products.
But is that really the case?
Issue|How Much Can a Chat Window Hold?
In practice, we can’t overlook the shortcomings of this seemingly “advanced” conversational format:
1. The Barrier to Using AI Conversations
Without proper guidance, users often don’t know how to start a conversation with AI or how to phrase their requests.
This is similar to search: everyone knows how to use a search box, but not everyone can translate their needs into precise keywords. There is a skill threshold, and the open-ended format of an AI chat window raises it further.
2. Highly Dispersed User Intent
Users enjoy “freedom” when chatting with AI, but that freedom comes at a high cost. For AI to respond meaningfully, it must accurately recognize user intent. In conversation, however, user intent tends to be scattered: people jump between topics and rarely stay on one subject for long. The chat window format, which accepts any input at any time, makes intent recognition harder still.
3. Overloaded IM Container
If you try to pack all kinds of services into an instant messaging (IM) chatbox, the information and results delivered by AI will vary widely in form. Sometimes AI needs to send forms; other times it needs to deliver structured information. Imagine using an AI to order food in a chat window: it would have to display restaurant lists, menu items, order confirmations, payment processes, and order status updates, all within the IM interface. Managing this complexity in a chat window is a daunting challenge for products with intricate workflows.
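To make the food-ordering example concrete, here is a minimal sketch of what the chat container has to carry. The payload types (`TextMessage`, `ListMessage`, `FormMessage`, `StatusMessage`), the `render` function, and the sample data are all hypothetical names invented for illustration; they are not taken from any real product.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class TextMessage:
    text: str


@dataclass
class ListMessage:
    title: str
    items: list[str]


@dataclass
class FormMessage:
    title: str
    fields: list[str]


@dataclass
class StatusMessage:
    order_id: str
    status: str


# Every kind of content the chat window may have to display (hypothetical set).
ChatPayload = Union[TextMessage, ListMessage, FormMessage, StatusMessage]


def render(payload: ChatPayload) -> str:
    """Render one payload as plain text; a real client would need a widget per type."""
    if isinstance(payload, TextMessage):
        return payload.text
    if isinstance(payload, ListMessage):
        lines = [f"  {i}. {item}" for i, item in enumerate(payload.items, start=1)]
        return "\n".join([payload.title, *lines])
    if isinstance(payload, FormMessage):
        lines = [f"  [{name}: ____]" for name in payload.fields]
        return "\n".join([payload.title, *lines])
    if isinstance(payload, StatusMessage):
        return f"Order {payload.order_id}: {payload.status}"
    raise TypeError(f"unsupported payload type: {type(payload).__name__}")


# One food order already needs three non-text payload types (made-up data).
food_order_flow = [
    ListMessage("Nearby restaurants", ["Noodle House", "Curry Corner"]),
    ListMessage("Menu", ["Beef noodles", "Dumplings"]),
    FormMessage("Confirm order", ["address", "phone"]),
    StatusMessage("A1001", "preparing"),
]

transcript = [render(message) for message in food_order_flow]
```

Each new service pushed into the chatbox adds more branches to `render`, which is one way to see why a single IM container becomes overloaded.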
4. Inefficient and Unnecessary
The overall experience resulting from the above issues can feel like “taking off your pants to fart”—completely unnecessary. Why not stick to traditional GUI interfaces, which are faster and more efficient?
Reflection|Further Thoughts
1. The Most Advanced Isn’t Necessarily the Most Suitable
Take Notion AI and Character.AI as examples. These two products represent two distinct interaction forms for LLM-based AI products: single-turn command-based interactions and multi-turn conversational interactions, respectively. They exhibit vastly different product characteristics.
Is conversation really the ideal interaction form for AI products? Let’s hear the director’s analysis!
As of now, we cannot claim that conversational AI is more advanced or intelligent than command-based AI, nor that CUIs will gradually replace all GUIs. Still less can we assert that products must adopt a CUI as early as possible simply because it looks more advanced.
Many AI functions can be supported effectively without a conversational interface; command-based AI can be equally sophisticated and intelligent.
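The difference between the two interaction forms can be sketched in a few lines. This is an illustrative sketch only: `Model`, `echo_model`, `run_command`, and `Conversation` are hypothetical names, the prompt templates are made up, and nothing here reflects how Notion AI or Character.AI are actually built.

```python
from typing import Callable

# Hypothetical stand-in for a model call: takes a prompt string, returns text.
Model = Callable[[str], str]


def echo_model(prompt: str) -> str:
    """Toy model that only reports how much text it received."""
    return f"<reply to {len(prompt)} chars>"


def run_command(model: Model, command: str, selection: str) -> str:
    """Single-turn, command-based: the chosen command fixes the intent up front."""
    templates = {
        "summarize": "Summarize the following text:\n{text}",
        "translate": "Translate the following text into English:\n{text}",
    }
    if command not in templates:
        raise ValueError(f"unknown command: {command}")
    return model(templates[command].format(text=selection))


class Conversation:
    """Multi-turn, conversational: each turn is interpreted against the full history."""

    def __init__(self, model: Model) -> None:
        self.model = model
        self.history: list[tuple[str, str]] = []

    def send(self, user_text: str) -> str:
        self.history.append(("user", user_text))
        prompt = "\n".join(f"{role}: {text}" for role, text in self.history)
        reply = self.model(prompt)
        self.history.append(("assistant", reply))
        return reply


summary = run_command(echo_model, "summarize", "A long paragraph ...")

chat = Conversation(echo_model)
chat.send("Hi, who are you?")
chat.send("Tell me a story instead.")
```

In the command-based path the product decides the intent and the user only supplies the material; in the conversational path the intent has to be inferred from free text on every turn, with a history that keeps growing.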
2. AI Agents Are Not Yet Capable of Providing Comprehensive Professional Services
The concept of AI agents is still more of a buzzword than a reality. If you’ve used chatbots like those offered by Character.AI, Doubao, or Spicychat, you’ll notice that their primary value lies in entertaining curious users, rather than providing genuine emotional companionship.
Providing more comprehensive professional services is even further out of reach.
For instance, while AI agents may be able to:
Guide you through an hour of English practice,
Conduct a counseling session,
Plan a city travel itinerary, or
Simulate a job interview,
they cannot reliably:
Order a meal tailored to your tastes,
Help you find a suitable job opportunity,
Tutor your child on their school assignments, or
Offer a genuinely practical five-day-four-night travel plan.
The gap is significant, and the primary issue isn’t contextual understanding or intent recognition during conversations. Instead, it lies in the lack of underlying integration and synergy between AI agents and other systems and services. Without that integration, agents cannot deliver the complete solutions users need, and so they fail to create genuine user value.
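A toy sketch can illustrate the point about integration. It assumes the agent has already recognized the user's intent correctly; the `Agent` class, the `order_meal` intent name, and `fake_food_ordering` are hypothetical and stand in for whatever real services an agent would need to connect to.

```python
from typing import Callable

# Hypothetical integration signature: takes request details, returns a result message.
Integration = Callable[[dict], str]


class Agent:
    """Toy agent: intent recognition is assumed solved; only integrations vary."""

    def __init__(self) -> None:
        self.integrations: dict[str, Integration] = {}

    def register(self, intent: str, handler: Integration) -> None:
        self.integrations[intent] = handler

    def handle(self, intent: str, details: dict) -> str:
        handler = self.integrations.get(intent)
        if handler is None:
            # The request was understood, but no backing service exists to fulfil it.
            return f"I understand you want to '{intent}', but I can only talk about it."
        return handler(details)


def fake_food_ordering(details: dict) -> str:
    """Stand-in for a real ordering service; nothing is actually ordered."""
    return f"Order placed: {details['dish']} to {details['address']}"


agent = Agent()
request = {"dish": "Dumplings", "address": "1 Main St"}

before = agent.handle("order_meal", request)   # understood, but nothing happens
agent.register("order_meal", fake_food_ordering)
after = agent.handle("order_meal", request)    # the same intent, now fulfilled
```

The agent's understanding is identical in both calls; only the presence of a connected service changes whether the user gets a result.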