AI UX Product and Performance Glossary for Designers
Speed, cost, memory, personalization, and model tuning concerns that shape whether AI features feel ready for production.
11 terms
- Product and performance
AI Evals (Evaluations)
AI evals are automated or human-run tests that measure model accuracy, bias, safety, and task performance before and after you ship.
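As a rough illustration of the automated kind, the sketch below scores a model against a small labeled set and reports accuracy. Everything in it is hypothetical: `EVAL_CASES`, `toy_model`, and `run_eval` are illustrative names, and the keyword rule only stands in for a real model call.

```python
from typing import Callable, List, Tuple

# Hypothetical eval set: (input text, expected label) pairs.
EVAL_CASES: List[Tuple[str, str]] = [
    ("I love this app", "positive"),
    ("This keeps crashing", "negative"),
    ("Great onboarding flow", "positive"),
    ("Checkout is broken again", "negative"),
]


def toy_model(text: str) -> str:
    """Stand-in for a real model call: a keyword rule, not a real classifier."""
    negative_words = ("crash", "broken", "slow")
    if any(word in text.lower() for word in negative_words):
        return "negative"
    return "positive"


def run_eval(model: Callable[[str], str], cases: List[Tuple[str, str]]) -> float:
    """Return the fraction of cases where the model output matches the expected label."""
    passed = sum(1 for text, expected in cases if model(text) == expected)
    return passed / len(cases)


accuracy = run_eval(toy_model, EVAL_CASES)
print(f"accuracy: {accuracy:.0%}")
```

Running the same cases before and after a prompt or model change gives a comparable number, which is the main point of keeping an eval set around.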
- Product and performance
Compute
Compute is the processing power (GPUs, TPUs, cloud instances) used to train models and run inference when users generate, classify, or embed content.
- Product and performance
Fine-Tuning
Fine-tuning adapts a base model to your domain, tone, or task by training on curated examples, beyond what a system prompt alone can reliably enforce.
- Product and performance
GEO (Generative Engine Optimization)
GEO (generative engine optimization) is the practice of shaping content and structure so AI answer engines (ChatGPT, Perplexity, Gemini, Claude) cite and summarize your product accurately.
- Product and performance
Inference
Inference is running a trained model on new inputs to produce outputs: the live “prediction” step users experience as chat, classification, or generation.
- Product and performance
Latency
Latency is the delay between a user action and a usable AI response: time to first token, time to a complete answer, or time to finish an agent run.
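To make the first two measures concrete, here is a minimal sketch that times a streamed response. It assumes the response arrives as an iterable of tokens; `fake_stream` and `measure_latency` are hypothetical names, and the sleep only simulates generation delay.

```python
import time
from typing import Callable, Dict, Iterable, Iterator, Optional


def fake_stream(tokens: Iterable[str], delay_s: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model response."""
    for token in tokens:
        time.sleep(delay_s)  # simulated generation delay per token
        yield token


def measure_latency(
    stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, Optional[float]]:
    """Consume a stream and record time to first token and total time."""
    start = clock()
    first_token_at: Optional[float] = None
    count = 0
    for _ in stream:
        if first_token_at is None:
            first_token_at = clock()  # the moment the user first sees output
        count += 1
    end = clock()
    return {
        "time_to_first_token_s": None if first_token_at is None else first_token_at - start,
        "total_time_s": end - start,
        "tokens": count,
    }


stats = measure_latency(fake_stream(["Hello", ",", " world"]))
print(stats)
```

Time to first token usually drives how responsive a feature feels, while total time determines when the user can act on the full answer, so it helps to track both.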
- Product and performance
Memory
Memory is how an AI product retains user preferences, facts, or past context across sessions, beyond the single context window.
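One simple way to picture this is a small store that outlives the session and is read back into the prompt next time. The sketch below assumes a JSON file as storage; `MemoryStore` and `build_prompt` are hypothetical names, not any product's API.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict


class MemoryStore:
    """Toy per-user memory persisted as JSON (illustrative only)."""

    def __init__(self, path: Path) -> None:
        self.path = path
        # Load whatever earlier sessions saved, if anything.
        self.data: Dict[str, Dict[str, str]] = (
            json.loads(path.read_text()) if path.exists() else {}
        )

    def remember(self, user_id: str, key: str, value: str) -> None:
        self.data.setdefault(user_id, {})[key] = value
        self.path.write_text(json.dumps(self.data))

    def recall(self, user_id: str) -> Dict[str, str]:
        return dict(self.data.get(user_id, {}))


def build_prompt(store: MemoryStore, user_id: str, message: str) -> str:
    """Prepend remembered facts so a new session starts with prior context."""
    facts = store.recall(user_id)
    lines = [f"- {key}: {value}" for key, value in sorted(facts.items())]
    preamble = "Known user preferences:\n" + "\n".join(lines) + "\n\n" if lines else ""
    return preamble + "User: " + message


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "memory.json"

    # Session 1: the user states a preference.
    MemoryStore(path).remember("user-1", "tone", "concise")

    # Session 2: a fresh store object reads the saved preference back.
    prompt = build_prompt(MemoryStore(path), "user-1", "Summarize my week")
    print(prompt)
```

Real products typically add controls for viewing, editing, and deleting what is remembered; this sketch leaves those out.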
- Product and performance
Personalization
Personalization tailors AI behavior or content to a user or segment, using memory, history, embeddings, or fine-tuned priors.
- Product and performance
Streaming Response
Streaming is when the AI sends its answer incrementally as tokens are generated, instead of waiting for the full reply.
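The sketch below shows the idea from the interface side: each incoming chunk is appended to what is already on screen. It is a simplification under stated assumptions: `stream_tokens` and `render_incrementally` are hypothetical names, and chunks are whole words here, whereas real model tokens are often word fragments.

```python
from typing import Iterable, Iterator, List


def stream_tokens(answer: str) -> Iterator[str]:
    """Hypothetical stand-in for a streaming API: yields the answer word by word."""
    for index, word in enumerate(answer.split(" ")):
        yield word if index == 0 else " " + word


def render_incrementally(stream: Iterable[str]) -> List[str]:
    """Return every partial text the UI would display as chunks arrive."""
    frames: List[str] = []
    shown = ""
    for chunk in stream:
        shown += chunk        # append the new chunk to the visible text
        frames.append(shown)  # one "frame" per update
    return frames


frames = render_incrementally(stream_tokens("Streaming feels faster"))
for frame in frames:
    print(frame)
```

The final frame equals the full reply; the earlier frames are what let the user start reading before generation finishes.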
- Product and performance
Token Burn Rate
Token burn rate is how fast a product consumes tokens over time: per request, per user session, or per agent run.
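A back-of-the-envelope calculation makes the idea tangible. The sketch below totals tokens for one session and converts them to cost; the `Usage` type, the request sizes, and the per-1,000-token prices are all made-up assumptions, not any provider's real rates.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Usage:
    """Token counts for a single request (hypothetical shape)."""
    input_tokens: int
    output_tokens: int


def session_tokens(requests: List[Usage]) -> int:
    """Total tokens consumed across all requests in a session."""
    return sum(r.input_tokens + r.output_tokens for r in requests)


def session_cost_usd(
    requests: List[Usage],
    input_price_per_1k: float,
    output_price_per_1k: float,
) -> float:
    """Cost of a session, pricing input and output tokens separately."""
    input_total = sum(r.input_tokens for r in requests)
    output_total = sum(r.output_tokens for r in requests)
    return (input_total / 1000) * input_price_per_1k + (output_total / 1000) * output_price_per_1k


# Illustrative three-turn chat; input grows because earlier turns are resent as context.
session = [Usage(1200, 300), Usage(1500, 400), Usage(1800, 500)]

total = session_tokens(session)
per_request = total / len(session)
cost = session_cost_usd(session, input_price_per_1k=0.001, output_price_per_1k=0.002)

print(f"tokens per session: {total}")
print(f"average tokens per request: {per_request:.0f}")
print(f"cost per session (assumed prices): ${cost:.4f}")
```

Multiplying the per-session figure by expected sessions per user per month gives a first estimate of what a feature costs to run at scale.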
- Product and performance
Vibe Coding
Vibe coding is iterative building with AI coding tools (Cursor, Claude Code, Lovable, etc.) where natural-language prompts such as “make it feel calmer” or “add citation chips” steer rapid prototypes.