Google Gemini

Multimodal Input

Gemini supports multiple input types including text, images, and files in a unified composer interface, allowing users to attach and interact with various content types simultaneously.

Composer interface with text input and multimodal (+) icon

What's happening

Gemini provides a dedicated multimodal (+) icon in the composer for attaching text, images, and files, as well as importing code from GitHub or external URLs. It draws on the Google ecosystem by integrating directly with Google Workspace (Docs, Drive, Gmail) and by promoting specialized tools like NotebookLM for source-based grounding and research. The (+) icon is separate from the adjacent tools icon, which keeps the multimodal workflow focused.

Patterns

Multimodal (+) icon opens a menu to attach files, images, or import from GitHub/Workspace/NotebookLM

Context Chip Management

Attached items from local storage, Workspace, or NotebookLM shown as preview cards with remove option
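
The chip pattern above can be sketched as a small state model. This is a minimal illustration only, not Gemini's implementation: every name here (`AttachmentSource`, `Attachment`, `ComposerState`, `addAttachment`, `removeAttachment`, `previewLabels`) is hypothetical, and the source list simply mirrors the origins named in this entry.

```typescript
// Hypothetical model of composer attachments. Names are illustrative
// assumptions and do not reflect any real Gemini API.
type AttachmentSource = "local" | "workspace" | "notebooklm" | "github" | "url";

interface Attachment {
  id: string;
  source: AttachmentSource;
  label: string; // text shown on the preview card
}

interface ComposerState {
  text: string;
  attachments: readonly Attachment[];
}

// Add an item, ignoring duplicates so one item never renders as two cards.
function addAttachment(state: ComposerState, item: Attachment): ComposerState {
  if (state.attachments.some((a) => a.id === item.id)) return state;
  return { ...state, attachments: [...state.attachments, item] };
}

// Backs the "remove" control on a preview card.
function removeAttachment(state: ComposerState, id: string): ComposerState {
  return { ...state, attachments: state.attachments.filter((a) => a.id !== id) };
}

// What the user sees before sending: one label per attached item.
function previewLabels(state: ComposerState): string[] {
  return state.attachments.map((a) => `${a.label} (${a.source})`);
}

// Usage: attach two items from different sources, then remove one.
let state: ComposerState = { text: "Summarize these", attachments: [] };
state = addAttachment(state, { id: "1", source: "local", label: "chart.png" });
state = addAttachment(state, { id: "2", source: "workspace", label: "Q3 plan" });
state = removeAttachment(state, "1");
console.log(previewLabels(state)); // ["Q3 plan (workspace)"]
```

Keeping attachments in one list regardless of origin is what lets a single composer show mixed sources side by side, and deriving the preview from that same list means the cards always match what would be sent.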

UX Insights

  • Unified interface for all input types including ecosystem-specific sources
  • Deep integration with Google Workspace leverages existing user data as context
  • Cross-promotion of specialized AI products like NotebookLM for specific use cases like research
  • Visual previews confirm what will be sent
  • Supports multiple attachments simultaneously from varied sources

Design Decisions

The unified composer approach makes multimodal input feel natural and integrated, while the Workspace and NotebookLM integrations demonstrate how a large ecosystem can be leveraged to provide richer, grounded AI context without leaving the chat interface.

Captured: December 29, 2025
Type: desktop
multimodal, image-input, file-upload
