ChatGPT's composer teardown

Updated June 15, 2026

The composer starts calm on purpose. Capability shows up in layers: open the + menu, pick a mode, see a chip, maybe grab a starter. Familiarity first, power when you ask for it.

That works when the mode teaches you something before you send. It falls apart when the only feedback is a chip. Think is where we would push hardest.

Calm default

Empty bar, no chips. Starter pills already hint at three jobs.

What works

  • No tool vocabulary on first load. You can just type.
  • Starter pills teach three common jobs without opening the + menu.

What we would push on

  • Starter pills and the + menu are two front doors. Some users may never open the + menu.
  • Attach and advanced modes need another way in if pills stay the primary path.

Business strategy

OpenAI wants more people to send a first message. An empty bar looks like normal chat, not a pro tool, so newcomers type instead of bouncing. Starter pills teach three jobs without a product tour.

Tradeoff

Tradeoff | Benefit | Cost
Calm default (no chips on load) | Familiar, low intimidation | Advanced tools stay hidden until users discover the + menu or pills

Takeaway

Approachable surface, hidden depth. Fine if you accept the discoverability tradeoff.

Pattern: Tool Switching in Composer. Capability stays off the bar until you ask: progressive disclosure as a growth strategy, not just aesthetics.

Pattern: Prompt Templates. Starter pills teach jobs without opening the + menu, but they compete with it as a second front door.

Tools & modes

Photos, image, Think, research, and web search on the first screen. More is for overflow.

What works

  • One + menu groups attach, creation, and behavioral modes. You pick intent, not infrastructure.
  • Open the + menu and the first screen lists attach, Create image, Thinking, Deep research, and Web search. More and Projects hold overflow.

What we would push on

  • Does web search need to be a mode? Most people already ask in plain language ("What is the stock price for SpaceX today?"). A mode earns its place when the output contract changes: citations, live results, slower run, a chip that confirms web use.
  • "Look something up" on the starter pill and Web search in the + menu are the same job, two doors.
  • Think and Deep research sit side by side with one line each. Different jobs, same visual weight, and the composer only teaches the difference after you pick Deep research.
  • Every row carries the same visual weight. Nothing signals that Deep research takes minutes while Think is a slower single reply.

Business strategy

The + menu is how OpenAI sells heavier modes (Deep research, image gen, web search) inside one chat product instead of spinning up separate apps for each job.

Tradeoff

Tradeoff | Benefit | Cost
Equal visual weight for every mode row | Consistent menu design | Slow or costly modes look as cheap as fast ones

Takeaway

The + menu organizes well. It does not help you choose between modes that look equal but behave very differently.

Pattern: Tool Switching in Composer. Modes are tools with different output contracts. The menu should signal cost and latency, not just intent.

Pattern: Persona Selector. Think and Deep research behave like personas with different runtimes, but share equal visual weight.
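
One way to picture "signal cost and latency, not just intent" is a menu row that carries an optional hint next to its label. This is a minimal sketch, not how ChatGPT builds its menu: `MenuRow`, `latency_hint`, and `render_row` are hypothetical names, and the hint strings only paraphrase the descriptions used in this teardown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MenuRow:
    label: str                          # outcome label shown in the + menu
    latency_hint: Optional[str] = None  # hypothetical secondary text for slow or costly modes


def render_row(row: MenuRow) -> str:
    """Render one menu row; only rows with a hint get extra text."""
    if row.latency_hint is None:
        return row.label
    return f"{row.label} ({row.latency_hint})"


# Labels are the ones listed above; the hints are illustrative assumptions.
ROWS = [
    MenuRow("Create image"),
    MenuRow("Thinking", latency_hint="slower single reply"),
    MenuRow("Deep research", latency_hint="takes minutes"),
    MenuRow("Web search"),
]

for row in ROWS:
    print(render_row(row))
```

Rows without a hint render exactly as they do today, so the menu stays consistent while the slow modes stop looking as cheap as the fast ones.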

Attachments

Thumbnail preview with dismiss. Attach is inside the + menu, not on the bar.

What works

  • Inline thumbnail with dismiss. You see what ships and can fix mistakes without resetting the thread.
  • Attach stacks with mode chips without clearing them. Flexible for power users.

What we would push on

  • Attach inside the + menu fights the paperclip habit from email and Slack. Sending a file is not an edge case.
  • File + mode chip + custom text makes the composer dense fast.

Business strategy

ChatGPT’s bet is one input for everything (attach, modes, send), so the product can grow into tools and agents without users learning a new surface for each feature.

Tradeoff

Tradeoff | Benefit | Cost
Attach inside the + menu | Keeps the default bar calm | Fights the paperclip habit; file send is not an edge case

Takeaway

Steal the preview. Question hiding attach behind the + menu unless you have another strong discovery path.

Pattern: Context Chip Management. File and mode chips stack in one bar. Plan for crowding when attach, scope, and output controls all show at once.
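
The stacking behavior described in this section can be sketched as a small state model: adding a chip never clears the others, and dismissing one leaves the rest of the bar alone. Everything here is illustrative. `Chip`, `Composer`, the `max_chips` threshold, and the file name are assumptions, not anything taken from the product.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chip:
    kind: str   # "file" or "mode"
    label: str


@dataclass
class Composer:
    chips: List[Chip] = field(default_factory=list)
    text: str = ""

    def add_chip(self, chip: Chip) -> None:
        # Adding a chip never clears existing ones: files and modes stack.
        self.chips.append(chip)

    def remove_chip(self, label: str) -> None:
        # Dismissing one chip leaves the other chips and the typed text untouched.
        self.chips = [c for c in self.chips if c.label != label]

    def is_crowded(self, max_chips: int = 2) -> bool:
        # max_chips is an arbitrary illustrative threshold, not a measured limit.
        return len(self.chips) > max_chips


bar = Composer(text="Summarize the attached report")
bar.add_chip(Chip("file", "report.pdf"))      # hypothetical file name
bar.add_chip(Chip("mode", "Deep research"))
bar.remove_chip("report.pdf")                 # fix a wrong attachment without a reset
print([c.label for c in bar.chips], bar.text)
```

A check like `is_crowded` is one place to decide when the bar should collapse chips or move controls elsewhere.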

Deep research

Chip, pre-filled prompt, report starters. This is what a pre-send contract looks like.

What works

  • Chip, placeholder ("Get a detailed report"), credible starters below. You know the job before you type.
  • Pre-fill teaches prompting without trapping you. Edit, remove chip, still fine.
  • Apps and Sites filters scope the research run inside the composer.

Business strategy

Deep research runs take minutes and cost real compute. Teaching the job before send reduces rage-quits and makes Plus easier to justify, because users know what they signed up for.

Tradeoff

Tradeoff | Benefit | Cost
Pre-send education for slow modes | Fewer surprise waits and abandoned threads | More UI density before the first send

Takeaway

This is the bar for behavioral modes: name the outcome, suggest a first prompt, keep scope in the bar.
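
A rough sketch of that contract as state: picking the mode swaps the placeholder, starters, and scope filters, and removing the chip restores the default bar without discarding what the user typed. The strings "Get a detailed report", "Ask anything", "Apps", and "Sites" come from this teardown. The class, the function names, and the starter strings are hypothetical stand-ins, and the default starter pills are left out for brevity.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

DEFAULT_PLACEHOLDER = "Ask anything"


@dataclass(frozen=True)
class Composer:
    chip: Optional[str] = None
    placeholder: str = DEFAULT_PLACEHOLDER
    starters: Tuple[str, ...] = ()
    scope_filters: Tuple[str, ...] = ()
    text: str = ""  # whatever the user has typed so far


def pick_deep_research(c: Composer) -> Composer:
    # Name the outcome, suggest a first prompt, keep scope in the bar.
    return replace(
        c,
        chip="Deep research",
        placeholder="Get a detailed report",
        starters=("<report starter 1>", "<report starter 2>"),  # hypothetical stand-ins
        scope_filters=("Apps", "Sites"),
    )


def remove_chip(c: Composer) -> Composer:
    # Dropping the chip restores the default bar but keeps the typed text.
    return Composer(text=c.text)


bar = pick_deep_research(Composer(text="EV battery market"))
print(bar.placeholder, bar.scope_filters)
print(remove_chip(bar))
```

Keeping `text` separate from the mode fields is what makes the pre-fill teach without trapping: the user can edit or back out at any point.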

Think mode

A chip appears. Nothing else changes. No pre-fill, no starters, no hint about what Think actually does.

What works

  • Removable chip shows scope before send.
  • Placeholder stays editable. No locked template.

What we would push on

  • Thinnest mode in the set. Research and Search rewrite placeholder and starters. Think adds a chip. Send looks like default chat.
  • Deep research pre-fills "Get a detailed report," adds Apps and Sites, and offers report starters below. Think stays on "Ask anything." Same output shape, supposedly smarter reasoning. The product gap is real. The UI gap is invisible.
  • Can you explain the tradeoff before send? Think fails. Deep research passes.

Business strategy

Think is a retention bet on “smarter default chat.” If users wait longer and cannot tell what changed, they may not come back, especially after paying for Plus.

Tradeoff

Tradeoff | Benefit | Cost
Think as chip only | Simple to ship and remove | No pre-send contract; user learns after waiting

Takeaway

Steal the chip. Do not steal label-only as differentiation. Behavioral modes need a pre-send contract.

Pattern: Persona Selector. A behavioral mode needs a pre-send contract. Chip-only fails when the output shape looks identical to default chat.

Pattern: Tool Switching in Composer
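
The "can you explain the tradeoff before send?" test can be written down as a small audit over mode definitions. This is a sketch under assumptions: the `Mode` fields and both function names are hypothetical, and the three entries only encode what this teardown observed about each mode, not an official spec.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Mode:
    chip: str
    placeholder: Optional[str] = None   # None means the default placeholder stays
    has_starters: bool = False
    scope_filters: Tuple[str, ...] = ()


def has_pre_send_contract(mode: Mode) -> bool:
    """A mode teaches before send if it changes anything beyond the chip."""
    return (
        mode.placeholder is not None
        or mode.has_starters
        or bool(mode.scope_filters)
    )


def chip_only(modes: List[Mode]) -> List[str]:
    """Return the chips of modes whose only pre-send feedback is the chip."""
    return [m.chip for m in modes if not has_pre_send_contract(m)]


# Encodes the observations in this teardown, nothing more.
MODES = [
    Mode("Deep research", placeholder="Get a detailed report",
         has_starters=True, scope_filters=("Apps", "Sites")),
    Mode("Image", has_starters=True),
    Mode("Think"),
]

print(chip_only(MODES))  # ['Think']
```

Run as a design review check, any mode that lands in the `chip_only` list needs a placeholder, starters, or scope controls before it ships.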

Image mode

Chip in the bar. Edit, style, and Explore ideas open up below.

What works

  • Scope in the bar, exploration below. You commit to making an image without inventing a prompt from zero.
  • "Explore ideas" helps people who want inspiration, not a template to paste.
  • Starters swap when the tool changes. Image prompts do not bleed into Search.

Business strategy

Image gen is a high-value mode that keeps users in ChatGPT instead of sending them to a standalone image tool such as Midjourney. Starters and chip scope raise completion without a separate creation app.

Tradeoff

Tradeoff | Benefit | Cost
Dedicated image mode with starters | Higher completion on generative jobs | Another mode row competing for discovery with research and search

Takeaway

One of the clearer modes. Chip plus contextual starters teach the job before send.

Image controls

Aspect ratio sits in the bar, only while Image is active.

What works

  • Aspect ratio inline when Image is active. Message-scoped, not global settings.
  • Progressive disclosure. The bar earns density only after you pick a generative mode.

What we would push on

  • Chip, ratio dropdown, and attach preview can stack. Hide controls when the chip goes away.

Business strategy

Power users iterate on image prompts in-thread. Message-scoped controls keep them inside ChatGPT instead of exporting to another tool or digging through settings.

Tradeoff

Tradeoff | Benefit | Cost
Inline output controls while Image is active | Fast iteration on the current prompt | Bar crowding when chip, ratio, and attach stack

Takeaway

Good pattern for send-time output controls. Watch crowding at the high end.
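
A minimal way to express "controls only while the chip is on" is to derive the visible controls from the active chip instead of storing them separately, so they cannot outlive the chip. The mapping and function name below are hypothetical. Only the Image and aspect ratio pairing comes from this teardown.

```python
from typing import Dict, List, Optional

# Hypothetical mapping from a mode chip to the inline controls it unlocks.
CONTROLS_BY_MODE: Dict[str, List[str]] = {"Image": ["aspect_ratio"]}


def visible_controls(active_chip: Optional[str]) -> List[str]:
    """Controls are message-scoped: they exist only while their chip is on."""
    if active_chip is None:
        return []
    # Copy the list so callers cannot mutate the shared mapping.
    return list(CONTROLS_BY_MODE.get(active_chip, []))


print(visible_controls("Image"))  # ['aspect_ratio']
print(visible_controls(None))     # []
```

Because the control list is computed from the chip, removing the chip hides the ratio dropdown with no extra cleanup step.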

Voice in the composer

Dictation replaces the text field. Waveform, cancel, confirm.

What works

  • Mic on the right, + menu on the left. Separate jobs, separate active states.
  • Dictation replaces typing instead of stacking on top.
  • Waveform plus confirm/cancel gives a clear listening contract.

What we would push on

  • Two voice paths exist: inline dictation and full voice session. The bar mic does not tell you the richer mode exists.

Business strategy

A mic on the bar lowers the barrier for mobile and accessibility users (speak, edit, send) without forcing everyone into a separate voice product up front.

Tradeoff

Tradeoff | Benefit | Cost
Dictation inline in the composer | Speak once, edit, then send: familiar chat loop | Hides the richer full-session voice product behind the same mic affordance

Takeaway

Dictation in the composer is well scoped. The product-level voice map is not.

Voice session

Full-screen voice. Conversation mode, not text replacement.

What works

  • Full viewport fits back-and-forth talk. Different job from speak once, edit, send.
  • "Start talking" and a visible End button. Clear exit.

What we would push on

  • Dictation feeds the composer. Session bypasses it. Naming and placement should make that split impossible to miss.

Business strategy

Full voice sessions target longer, hands-free use (commutes, cooking, workouts) where dictation-to-text is the wrong job. That drives session time OpenAI can monetize.

Tradeoff

Tradeoff | Benefit | Cost
Separate full-screen voice session | Clear back-and-forth talk without text-field constraints | Split discovery from the bar mic; users may never find it

Takeaway

Right surface for conversation mode. Wrong discovery path: users have to find it separately from the bar mic.
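
The split between the two voice paths can be sketched as two routes with different destinations: dictation lands in the composer draft, while a session turn goes straight to the thread. The class and method names are hypothetical. Appending the transcript to an existing draft is also an assumption, since this teardown only observed that dictation hands editable text back to the composer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chat:
    draft: str = ""                                   # composer text field
    thread: List[str] = field(default_factory=list)   # messages already sent

    def dictation_confirm(self, transcript: str) -> None:
        # Dictation feeds the composer: text lands in the draft, editable, not sent.
        self.draft = (self.draft + " " + transcript).strip()

    def dictation_cancel(self) -> None:
        # Cancel discards the recording and leaves the draft untouched.
        pass

    def send(self) -> None:
        # Only a non-empty draft is sent; sending clears the composer.
        if self.draft:
            self.thread.append(self.draft)
            self.draft = ""

    def session_turn(self, utterance: str) -> None:
        # A voice session bypasses the composer: each turn goes to the thread.
        self.thread.append(utterance)


chat = Chat()
chat.dictation_confirm("book a table for two")
print(chat.draft, chat.thread)   # draft holds the text, nothing sent yet
chat.send()
chat.session_turn("what's the weather like")
print(chat.draft, chat.thread)
```

Seen this way, the two entry points differ in where speech ends up, which is the distinction the naming and placement need to surface.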

How it fits together

The pattern

  • Calm default, + menu for capability, chip for scope, starters when the mode needs them, send.
  • Attach and modes share one + menu. Voice and send stay on the right.

Where it breaks

  • Research, Search, and Image teach before send. Think does not.
  • That inconsistency makes the chip system harder to trust.
  • Dual discovery (starter pills and the + menu), hidden attach, two voice paths with no map.

Business strategy

The composer is meant to become ChatGPT’s shell for memory, projects, and agents. Inconsistent mode treatment makes that shell feel unreliable and hurts trust in paid features built on top of it.

Tradeoff

Tradeoff | Benefit | Cost
+ menu for all capability | Clean bar, one discovery model | Dual discovery with starter pills and uneven mode contracts

Takeaway

One of the better composer architectures for hiding power without cluttering the bar. Mode quality is uneven. Steal the structure, not every mode treatment.

Pattern: Tool Switching in Composer. Calm default → + menu → chip → send is a reusable composer architecture. Mode quality is where products diverge.

Pattern: Prompt Templates

Steal this

  • One + menu for attach, tools, and modes
  • Removable chips that show scope before send
  • Pre-send contract: placeholder and starters that match the mode
  • Outcome labels ("Create image," "Deep research") instead of model names
  • Inline output controls only while the relevant chip is on

Skip this

  • Modes that are only a chip (Think)
  • A dedicated search mode when plain prompts already route to the web
  • Starter pills and the + menu as parallel discovery with no bridge between them
  • Two voice entry points without explaining dictation vs conversation
  • Every menu row looking equally important when some modes are slow or costly

How others design the composer

Same job, different product bets, and what each tradeoff reveals.

Original gallery pages: Tool Switching in Composer · Dictation Mode