Multimodal Input
Gemini supports multiple input types including text, images, and files in a unified composer interface, allowing users to attach and interact with various content types simultaneously.
Composer interface with text input and multimodal (+) icon
What's happening
Gemini provides a dedicated multimodal (+) icon in the composer for attaching text, images, and files, as well as importing code from GitHub or external URLs. It leverages the Google ecosystem by integrating directly with Google Workspace (Docs, Drive, Gmail) and promoting specialized tools like NotebookLM for source-based grounding and research. The (+) icon is separate from the adjacent tools icon, which keeps the attachment workflow focused on input rather than mixing it with tool selection.
Patterns
The multimodal (+) icon opens a menu to attach files or images, or to import from GitHub, Workspace, or NotebookLM
Items attached from local storage, Workspace, or NotebookLM are shown as preview cards with a remove option
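For readers who want to prototype this pattern, the two behaviors above (attach from several sources, then show removable preview cards) can be sketched as a small state model. This is a minimal illustration, not Gemini's implementation: the `Composer` and `Attachment` classes, the `SOURCES` labels, and the card label format are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical source labels for illustration; not taken from any Gemini API.
SOURCES = {"local", "workspace", "notebooklm", "github", "url"}


@dataclass(frozen=True)
class Attachment:
    id: str
    name: str
    source: str  # expected to be one of SOURCES

    def preview_label(self) -> str:
        # Text a preview card might show so the user can confirm what is sent.
        return f"{self.name} ({self.source})"


@dataclass
class Composer:
    text: str = ""
    attachments: List[Attachment] = field(default_factory=list)

    def attach(self, item: Attachment) -> None:
        # All sources go through the same entry point (the "+" menu).
        if item.source not in SOURCES:
            raise ValueError(f"unknown source: {item.source}")
        self.attachments.append(item)

    def remove(self, attachment_id: str) -> None:
        # Mirrors the remove control on each preview card.
        self.attachments = [a for a in self.attachments if a.id != attachment_id]

    def preview_cards(self) -> List[str]:
        # One card per attachment, in the order they were added.
        return [a.preview_label() for a in self.attachments]


composer = Composer(text="Summarize these")
composer.attach(Attachment("1", "diagram.png", "local"))
composer.attach(Attachment("2", "Q3 plan", "workspace"))
composer.remove("1")
print(composer.preview_cards())  # ['Q3 plan (workspace)']
```

Keeping every source behind one `attach` call is what lets the preview cards look and behave the same regardless of where an item came from.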
UX Insights
- Unified interface for all input types including ecosystem-specific sources
- Deep integration with Google Workspace leverages existing user data as context
- Cross-promotion of specialized AI products like NotebookLM for specific use cases like research
- Visual previews confirm what will be sent
- Supports multiple attachments simultaneously from varied sources
Design Decisions
The unified composer approach makes multimodal input feel natural and integrated, while the Workspace and NotebookLM integrations demonstrate how a large ecosystem can be leveraged to provide richer, grounded AI context without leaving the chat interface.
More from the gallery
Context Chip Management
Claude allows users to attach files, images, and other context sources to conversations, displayed as removable chips.

Multimodal Input
Grok provides comprehensive multimodal input options through a paperclip menu, offering file upload, text content, sketching, cloud storage integration, and voice input capabilities.
Tool Switching in Composer
Grok displays active tools, such as DeepSearch, as dismissible chips above the composer, so users can see which capabilities are enabled and remove any of them with an X button.