ChatGPT's composer teardown
Updated June 15, 2026
The composer starts calm on purpose. Capability shows up in layers: open the + menu, pick a mode, see a chip, maybe grab a starter. Familiarity first, power when you ask for it.
That works when the mode teaches you something before you send. It falls apart when the only feedback is a chip. Think is where we would push hardest.
Calm default

What works
- No tool vocabulary on first load. You can just type.
- Starter pills teach three common jobs without opening the + menu.
What we would push on
- Starter pills and the + menu are two front doors. Some users may never open the + menu.
- Attach and advanced modes need another way in if pills stay the primary path.
Business strategy
OpenAI wants more people to send a first message. An empty bar looks like normal chat, not a pro tool, so newcomers type instead of bouncing. Starter pills teach three jobs without a product tour.
Tradeoff
| Tradeoff | Benefit | Cost |
|---|---|---|
| Calm default (no chips on load) | Familiar, low intimidation | Advanced tools stay hidden until users discover the + menu or pills |
Takeaway
Approachable surface, hidden depth. Fine if you accept the discoverability tradeoff.
Pattern: Tool Switching in Composer
Capability stays off the bar until you ask: progressive disclosure as a growth strategy, not just aesthetics.
Pattern: Prompt Templates
Starter pills teach jobs without opening the + menu, but they compete with it as a second front door.
Tools & modes

What works
- One + menu groups attach, creation, and behavioral modes. You pick intent, not infrastructure.
- Open the + menu and the first screen lists attach, Create image, Thinking, Deep research, and Web search. More and Projects hold overflow.
What we would push on
- Does web search need to be a mode? Most people already ask in plain language ("What is the stock price for SpaceX today?"). A mode earns its place when the output contract changes: citations, live results, slower run, a chip that confirms web use.
- "Look something up" on the starter pill and Web search in the + menu are the same job, two doors.
- Think and Deep research sit side by side with one line each. Different jobs, same visual weight, and the composer only teaches the difference after you pick Deep research.
- Every row looks the same weight. Nothing signals that Deep research takes minutes while Think is a slower single reply.
Business strategy
The + menu is how OpenAI sells heavier modes (Deep research, image gen, web search) inside one chat product instead of spinning up separate apps for each job.
Tradeoff
| Tradeoff | Benefit | Cost |
|---|---|---|
| Equal visual weight for every mode row | Consistent menu design | Slow or costly modes look as cheap as fast ones |
Takeaway
The + menu organizes well. It does not help you choose between modes that look equal but behave very differently.
Pattern: Tool Switching in Composer
Modes are tools with different output contracts; the menu should signal cost and latency, not just intent.
Pattern: Persona Selector
Think and Deep research behave like personas with different runtimes, but share equal visual weight.
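One way to picture the push on equal visual weight is a menu row that carries its own runtime hint. The sketch below is an assumption for illustration only: `ModeRow`, `Latency`, and `menuRowText` are hypothetical names, not anything from ChatGPT, and it covers only the two modes whose runtimes this teardown describes.

```typescript
// Hypothetical model of a + menu row; not ChatGPT's real data structures.
type Latency = "fast" | "slower single reply" | "takes minutes";

interface ModeRow {
  label: string;    // outcome label shown in the menu
  latency: Latency; // assumed runtime class for the mode
}

// Only the two modes whose runtimes the teardown describes.
const thinkingRow: ModeRow = { label: "Thinking", latency: "slower single reply" };
const deepResearchRow: ModeRow = { label: "Deep research", latency: "takes minutes" };

// Append a runtime hint so slow modes stop looking as cheap as fast ones.
function menuRowText(row: ModeRow): string {
  return row.latency === "fast" ? row.label : `${row.label} · ${row.latency}`;
}

console.log(menuRowText(thinkingRow));     // Thinking · slower single reply
console.log(menuRowText(deepResearchRow)); // Deep research · takes minutes
```

Fast rows stay unlabeled, so the menu only gains density where a wait is coming.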
Attachments

What works
- Inline thumbnail with dismiss. You see what ships and can fix mistakes without resetting the thread.
- Attach stacks with mode chips without clearing them. Flexible for power users.
What we would push on
- Attach inside the + menu fights the paperclip habit from email and Slack. Sending a file is not an edge case.
- File + mode chip + custom text makes the composer dense fast.
Business strategy
ChatGPT’s bet is one input for everything (attach, modes, send), so the product can grow into tools and agents without users learning a new surface for each feature.
Tradeoff
| Tradeoff | Benefit | Cost |
|---|---|---|
| Attach inside the + menu | Keeps the default bar calm | Fights the paperclip habit; file send is not an edge case |
Takeaway
Steal the preview. Question hiding attach behind the + menu unless you have another strong discovery path.
Pattern: Context Chip Management
File and mode chips stack in one bar. Plan for crowding when attach, scope, and output controls all show at once.
Deep research

What works
- Chip, placeholder ("Get a detailed report"), credible starters below. You know the job before you type.
- Pre-fill teaches prompting without trapping you. Edit, remove chip, still fine.
- Apps and Sites filters scope the research run inside the composer.
Business strategy
Deep research runs take minutes and cost real compute. Teaching the job before send reduces rage-quits, and makes it easier to justify Plus when users know what they signed up for.
Tradeoff
| Tradeoff | Benefit | Cost |
|---|---|---|
| Pre-send education for slow modes | Fewer surprise waits and abandoned threads | More UI density before the first send |
Takeaway
This is the bar for behavioral modes: name the outcome, suggest a first prompt, keep scope in the bar.
Pattern: Prompt Templates
Pattern: Tool Switching in Composer
Think mode

What works
- Removable chip shows scope before send.
- Placeholder stays editable. No locked template.
What we would push on
- Thinnest mode in the set. Research and Search rewrite placeholder and starters. Think adds a chip. Send looks like default chat.
- Deep research pre-fills "Get a detailed report," adds Apps and Sites, and shows report starters. Think stays on "Ask anything." Same output shape, supposedly smarter reasoning. The product gap is real. The UI gap is invisible.
- The test: can a user explain the tradeoff before send? Think fails. Deep research passes.
Business strategy
Think is a retention bet on “smarter default chat.” If users wait longer and cannot tell what changed, they may not come back, especially after paying for Plus.
Tradeoff
| Tradeoff | Benefit | Cost |
|---|---|---|
| Think as chip only | Simple to ship and remove | No pre-send contract; user learns after waiting |
Takeaway
Steal the chip. Do not steal label-only as differentiation. Behavioral modes need a pre-send contract.
Pattern: Persona Selector
A behavioral mode needs a pre-send contract; chip-only fails when the output shape looks identical to default chat.
Pattern: Tool Switching in Composer
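The pre-send contract can be stated as a small check. This is a sketch under assumptions: `ModeSurface` and `hasPreSendContract` are hypothetical names, the starter strings are placeholders because the teardown does not quote them, and the rule (mode-specific placeholder plus at least one starter) is one reading of the argument above, not a documented ChatGPT rule.

```typescript
// Hypothetical description of what a mode changes in the composer before send.
interface ModeSurface {
  name: string;
  chip: boolean;           // removable chip in the bar
  placeholder: string;     // input placeholder while the mode is active
  starters: string[];      // suggested first prompts shown below the bar
  scopeControls: string[]; // in-bar filters such as Apps and Sites
}

const DEFAULT_PLACEHOLDER = "Ask anything";

const think: ModeSurface = {
  name: "Think",
  chip: true,
  placeholder: "Ask anything", // unchanged from default chat
  starters: [],
  scopeControls: [],
};

const deepResearch: ModeSurface = {
  name: "Deep research",
  chip: true,
  placeholder: "Get a detailed report",
  starters: ["<report starter 1>", "<report starter 2>"], // placeholder text, not real starters
  scopeControls: ["Apps", "Sites"],
};

// Assumed rule: a chip alone is not a contract. The mode must also change
// the placeholder and offer at least one starter.
function hasPreSendContract(mode: ModeSurface): boolean {
  return mode.placeholder !== DEFAULT_PLACEHOLDER && mode.starters.length > 0;
}

console.log(hasPreSendContract(think));        // false
console.log(hasPreSendContract(deepResearch)); // true
```

A check like this could run in design review against every new mode, so chip-only modes get flagged before they ship.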
Web search

What works
- Chip, web-specific placeholder, trending starters. Live web is in scope before send.
- Same rhythm as Deep research. Feels learnable once selected.
What we would push on
- Did you need to pick a mode at all? Default chat handles "search for X" in plain language for many queries.
- Research and Search overlap for anyone who wants fresh info. The UI does not help you choose.
- "Look something up" on the empty state points at the same job without explaining how it differs from Web search here.
Business strategy
Web search burns retrieval and inference on every run. A dedicated mode makes that cost explicit before send, and gives OpenAI a clear signal when users want live data.
Tradeoff
| Tradeoff | Benefit | Cost |
|---|---|---|
| Web search as explicit mode | Clear scope before a costly run | Redundant with plain-language lookup in default chat |
Takeaway
Execution is solid after selection. The open question is whether search should be a mode or just something you say.
Pattern: Tool Switching in Composer
Pattern: Prompt Templates
Image mode

What works
- Scope in the bar, exploration below. You commit to making an image without inventing a prompt from zero.
- "Explore ideas" helps people who want inspiration, not a template to paste.
- Starters swap when the tool changes. Image prompts do not bleed into Search.
Business strategy
Image gen is a high-value mode that keeps users in ChatGPT instead of Midjourney or DALL·E. Starters and chip scope raise completion without a separate creation app.
Tradeoff
| Tradeoff | Benefit | Cost |
|---|---|---|
| Dedicated image mode with starters | Higher completion on generative jobs | Another mode row competing for discovery with research and search |
Takeaway
One of the clearer modes. Chip plus contextual starters teach the job before send.
Pattern: Prompt Templates
Pattern: Tool Switching in Composer
Image controls

What works
- Aspect ratio inline when Image is active. Message-scoped, not global settings.
- Progressive disclosure. The bar earns density only after you pick a generative mode.
What we would push on
- Chip, ratio dropdown, and attach preview can stack. Hide controls when the chip goes away.
Business strategy
Power users iterate on image prompts in-thread. Message-scoped controls keep them inside ChatGPT instead of exporting to another tool or digging through settings.
Tradeoff
| Tradeoff | Benefit | Cost |
|---|---|---|
| Inline output controls while Image is active | Fast iteration on the current prompt | Bar crowding when chip, ratio, and attach stack |
Takeaway
Good pattern for send-time output controls. Watch crowding at the high end.
Pattern: Tool Switching in Composer
Pattern: Context Chip Management
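"Hide controls when the chip goes away" amounts to tying each inline control to the chip that earned it. A minimal sketch, assuming a hypothetical `ComposerState` shape and `removeChip` helper (neither reflects ChatGPT's actual implementation, and the file name is made up):

```typescript
// Hypothetical composer state; names are illustrative only.
interface InlineControl {
  id: string;
  requiresChip: string; // the chip that must be active for this control to show
}

interface ComposerState {
  chips: string[];
  controls: InlineControl[];
  attachments: string[];
}

// Removing a chip also removes every control that depends on it.
// Attachments are independent of mode and stay in place.
function removeChip(state: ComposerState, chip: string): ComposerState {
  return {
    ...state,
    chips: state.chips.filter((c) => c !== chip),
    controls: state.controls.filter((c) => c.requiresChip !== chip),
  };
}

const before: ComposerState = {
  chips: ["Image"],
  controls: [{ id: "aspect-ratio", requiresChip: "Image" }],
  attachments: ["photo.png"], // made-up file name
};

const after = removeChip(before, "Image");
console.log(after.chips.length, after.controls.length, after.attachments.length); // 0 0 1
```

Keeping the dependency on the control itself means the bar sheds density automatically instead of relying on each mode to clean up after itself.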
Voice in the composer

What works
- Mic on the right, + menu on the left. Separate jobs, separate active states.
- Dictation replaces typing instead of stacking on top.
- Waveform plus confirm/cancel gives a clear listening contract.
What we would push on
- Two voice paths exist: inline dictation and full voice session. The bar mic does not tell you the richer mode exists.
Business strategy
A mic on the bar lowers the barrier for mobile and accessibility users (speak, edit, send) without forcing everyone into a separate voice product up front.
Tradeoff
| Tradeoff | Benefit | Cost |
|---|---|---|
| Dictation inline in the composer | Speak once, edit, then send: familiar chat loop | Hides the richer full-session voice product behind the same mic affordance |
Takeaway
Dictation in the composer is well scoped. The product-level voice map is not.
Pattern: Input Mode Toggle
Pattern: Voice Visualizer
Voice session

What works
- Full viewport fits back-and-forth talk. Different job from speak once, edit, send.
- "Start talking" and a visible End button. Clear exit.
What we would push on
- Dictation feeds the composer. Session bypasses it. Naming and placement should make that split impossible to miss.
Business strategy
Full voice sessions target longer, hands-free use (commutes, cooking, workouts) where dictation-to-text is the wrong job. That drives session time OpenAI can monetize.
Tradeoff
| Tradeoff | Benefit | Cost |
|---|---|---|
| Separate full-screen voice session | Clear back-and-forth talk without text-field constraints | Split discovery from the bar mic; users may never find it |
Takeaway
Right surface for conversation mode. Wrong to make users discover it separately from the bar mic.
Pattern: Input Mode Toggle
Pattern: Voice Visualizer
How it fits together
The pattern
- Calm default, + menu for capability, chip for scope, starters when the mode needs them, send.
- Attach and modes share one + menu. Voice and send stay on the right.
Where it breaks
- Research, Search, and Image teach before send. Think does not.
- That inconsistency makes the chip system harder to trust.
- Dual discovery (starter pills and the + menu), hidden attach, two voice paths with no map.
Business strategy
The composer is meant to become ChatGPT’s shell for memory, projects, and agents. Inconsistent mode treatment makes that shell feel unreliable, and hurts trust in paid features built on top of it.
Tradeoff
| Tradeoff | Benefit | Cost |
|---|---|---|
| + menu for all capability | Clean bar, one discovery model | Dual discovery with starter pills and uneven mode contracts |
Takeaway
One of the better composer architectures for hiding power without cluttering the bar. Mode quality is uneven. Steal the structure, not every mode treatment.
Pattern: Tool Switching in Composer
Calm default → + menu → chip → send is a reusable composer architecture; mode quality is where products diverge.
Pattern: Prompt Templates
Steal this
- One + menu for attach, tools, and modes
- Removable chips that show scope before send
- Pre-send contract: placeholder and starters that match the mode
- Outcome labels ("Create image," "Deep research") instead of model names
- Inline output controls only while the relevant chip is on
Skip this
- Modes that are only a chip (Think)
- A dedicated search mode when plain prompts already route to the web
- Starter pills and the + menu as parallel discovery with no bridge between them
- Two voice entry points without explaining dictation vs conversation
- Every menu row looking equally important when some modes are slow or costly
How others design the composer
Same job, different product bets, and what each tradeoff reveals.
Claude exposes model choice on the right and keeps search/style in + flyouts, betting that power users want model control over a calm default.
Perplexity puts Search and Computer in the bar: search is the product, not a hidden mode behind a + menu.
Gemini nests uploads and tools in one + menu but surfaces model and thinking on the right: similar shell, different emphasis on model transparency.
Original gallery pages: Tool Switching in Composer · Dictation Mode