The most jarring part of talking to a chatbot isn't the hallucinations or the occasional error; it’s the silence. You ask a question, the cursor blinks, and you wait for the server to finish its computation. Then, a wall of text appears all at once.
Sesame, the startup founded by the team behind Oculus, thinks this is the wrong way to build a relationship with a machine. On Thursday, the company released its iOS app, moving beyond a simple research preview to offer a conversational experience that mimics the messy, iterative nature of human speech.
Instead of waiting for a complete thought, Sesame’s agents (Maya, Miles, Simone, and Charlie) begin speaking almost immediately. As they talk, they continue to run parallel searches, weaving new information into their responses in real time. If an agent uncovers a more relevant fact mid-sentence, it can pivot, effectively interrupting itself to refine its point. It is a deliberate attempt to resolve the tension between speed and accuracy that plagues current LLM interfaces.
The Architecture of a Conversation
To pull this off, Sesame has moved away from the standard "request-response" cycle. The company claims to have built a proprietary search and retrieval system that operates asynchronously. While the agent is vocalizing, the backend is still querying, filtering, and prioritizing data.
This isn't just about making the AI sound more "human." It’s about utility. By allowing the agent to "think" while it speaks, Sesame is attempting to bridge the gap between a static chatbot and a true digital assistant. During its earlier beta phase, which saw over a million users, the company added "search cards" that display images and notes to visualize concepts, acknowledging that voice alone is often insufficient for complex tasks.
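To make the speak-while-searching pattern concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Sesame's implementation, which the company describes as proprietary. The `search` and `respond` functions, the simulated delays, and the `[update]` marker are all hypothetical stand-ins. The idea is only that searches start before speech begins, and the speaker checks for finished results between chunks of its draft.

```python
import asyncio


async def search(query: str, delay: float, result: str) -> str:
    """Hypothetical retrieval call; the sleep simulates backend latency."""
    await asyncio.sleep(delay)
    return result


async def respond(
    draft: list[str],
    queries: list[tuple[str, float, str]],
    chunk_time: float = 0.05,
) -> list[str]:
    """Speak the draft chunk by chunk while searches run in the background."""
    # Start every search immediately so they run concurrently with speech.
    tasks = [asyncio.create_task(search(q, d, r)) for q, d, r in queries]
    reported: set[int] = set()
    spoken: list[str] = []

    for chunk in draft:
        spoken.append(chunk)
        await asyncio.sleep(chunk_time)  # simulated time spent vocalizing
        # Between chunks, weave in any search that has finished by now.
        for i, task in enumerate(tasks):
            if i not in reported and task.done():
                reported.add(i)
                spoken.append(f"[update] {task.result()}")

    # The draft is finished; wait for slower searches and append their results.
    for i, task in enumerate(tasks):
        if i not in reported:
            spoken.append(f"[update] {await task}")
    return spoken


if __name__ == "__main__":
    transcript = asyncio.run(
        respond(
            draft=["Here is what I know so far,", "let me add some detail,", "and wrap up."],
            queries=[
                ("fast lookup", 0.01, "a quick fact arrived"),
                ("slow lookup", 0.3, "a slower fact arrived"),
            ],
        )
    )
    for line in transcript:
        print(line)
```

In this sketch the fast result lands after the first chunk, in the middle of the response, while the slow one is appended once the draft is done. A production system would also have to decide when an update is worth interrupting for, which this toy version does not attempt.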
Beyond the Chatbot
While the iOS app is the current focus, the team’s roadmap is significantly more ambitious. The founders, who shepherded Oculus from a Kickstarter project to a $2 billion acquisition by Facebook (now Meta), are eyeing the hardware space once again. The company has signaled plans to launch intelligent eyewear by 2027.
For now, the agents are limited to conversation and information retrieval, but the "agent" moniker is intentional. Sesame is building toward a future where these systems don't just provide answers but execute tasks on behalf of the user. That shift matters because today’s agentic tools often require precise, technical prompting; Sesame’s goal is to let users speak naturally while the AI handles the nuance of the command.
What This Means for Users
For the average user, the app is a free, albeit waitlisted, experiment in a different kind of AI interaction. It includes an incognito mode that allows agents to maintain context for the duration of a session without saving data to long-term memory—a nod to the growing privacy concerns surrounding persistent AI logs.
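One way to picture the incognito behavior is as a session that always keeps short-term context but only writes to a persistent store when incognito is off. The Python sketch below is a guess at the general shape, not a description of how Sesame stores data; `MemoryStore`, `Session`, and their methods are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Hypothetical long-term memory that persists across sessions."""
    facts: list[str] = field(default_factory=list)


@dataclass
class Session:
    """Hypothetical conversation session with an optional incognito flag."""
    store: MemoryStore
    incognito: bool = False
    context: list[str] = field(default_factory=list)

    def remember(self, utterance: str) -> None:
        # Context is always kept for the duration of the session.
        self.context.append(utterance)

    def close(self) -> None:
        # Only non-incognito sessions copy context into long-term memory.
        if not self.incognito:
            self.store.facts.extend(self.context)
        self.context.clear()


if __name__ == "__main__":
    store = MemoryStore()

    private = Session(store, incognito=True)
    private.remember("planning a surprise party")
    private.close()

    regular = Session(store)
    regular.remember("prefers metric units")
    regular.close()

    print(store.facts)
```

Only the regular session's context reaches the store here; the incognito session's context is discarded when it closes.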
Whether this "interruptible" style of speech becomes the new standard for voice interfaces remains to be seen. However, by prioritizing the flow of conversation over the perfection of the output, Sesame is betting that the future of AI isn't just about being right—it's about being responsive.
Key Takeaways
- Real-time pivoting: Sesame’s agents can search for information and update their responses while they are actively speaking, allowing for mid-sentence corrections.
- Hardware ambitions: The iOS app is a stepping stone for the company's long-term goal of launching intelligent, AI-powered eyewear by 2027.
- Agentic focus: The company is moving beyond simple chatbots, aiming to build systems that can eventually take action on behalf of the user without requiring perfect prompts.
As the app rolls out to 39 countries, the next hurdle for Sesame will be scaling its infrastructure to handle the compute-heavy demands of real-time, parallel searching. The company has yet to announce a monetization strategy, keeping the experience free for now. The real test will come when these agents move from answering questions to managing the user's digital life.