AI Glossary
Clear, practical definitions of AI and machine learning concepts — from the fundamentals to advanced techniques.
Fundamentals
Context Window
The context window is the maximum amount of text, measured in tokens, that an AI model can consider at once, counting both the prompt and the response. Learn about context limits across models.
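One practical consequence of a fixed context window is that long conversations must be trimmed to fit. A toy sketch of that idea, assuming a rough 4-characters-per-token heuristic (real tokenizers differ, and the function names here are illustrative):

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token in English."""
    return max(1, len(text) // 4)

def trim_to_context(messages: list[str], max_tokens: int) -> list[str]:
    """Drop the oldest messages until the conversation fits the budget."""
    kept: list[str] = []
    total = 0
    for msg in reversed(messages):  # walk newest-first
        cost = estimate_tokens(msg)
        if total + cost > max_tokens:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))  # restore chronological order
```

Production systems use the model's actual tokenizer for the count, but the trimming logic is the same shape.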
Embeddings
Embeddings are numerical representations of text that capture meaning. Learn how vector embeddings power semantic search, RAG, and recommendation systems.
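"Capturing meaning" becomes measurable by comparing embedding vectors, most often with cosine similarity. A minimal sketch using toy vectors (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 means same direction (similar meaning),
    0.0 means unrelated, -1.0 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Semantic search ranks documents by this score against the query's embedding.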
Inference
Inference is the process by which a trained AI model generates predictions or responses from new input. Learn about local vs. cloud inference, speed, and cost.
Large Language Model
A large language model (LLM) is an AI system trained on massive text datasets to understand and generate human language. Learn how LLMs like GPT, Claude, and Gemini work.
Multimodal AI
Multimodal AI models can process and generate multiple types of data — text, images, audio, and video. Learn how multimodal models work.
Temperature
Temperature controls the randomness of AI model outputs. Low temperature gives focused answers; high temperature gives creative ones.
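Under the hood, temperature divides the model's raw scores (logits) before they are turned into probabilities, so low values sharpen the distribution and high values flatten it. A self-contained sketch of that mechanism:

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert logits to probabilities; lower temperature sharpens the
    distribution toward the top token, higher temperature flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

With the same logits, the top token's probability is higher at temperature 0.5 than at 2.0, which is why low temperature feels more deterministic.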
Tokens
A token is a unit of text that AI models process — roughly 3/4 of a word in English. Learn how tokenization works and why token limits matter.
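A toy tokenizer makes the idea concrete. This sketch only splits on words and punctuation; real LLM tokenizers use learned subword schemes like BPE, which split rarer words into multiple pieces:

```python
import re

def naive_tokenize(text: str) -> list[str]:
    """Toy word-level tokenizer: words and punctuation become tokens.
    Real tokenizers (e.g. BPE) split rare words into subword pieces."""
    return re.findall(r"\w+|[^\w\s]", text)
```

Even this naive version shows why token counts exceed word counts: punctuation costs tokens too.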
Top-P
Top-P (nucleus sampling) controls AI output randomness by limiting token selection to the most probable candidates. Learn how it differs from temperature.
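The "nucleus" is the smallest set of top tokens whose probabilities sum to at least P; everything outside it is discarded before sampling. A minimal sketch of that filtering step:

```python
def top_p_filter(probs: dict[str, float], p: float) -> dict[str, float]:
    """Keep the smallest set of tokens whose cumulative probability
    reaches p, then renormalize. Sampling happens only over this set."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    nucleus: dict[str, float] = {}
    cumulative = 0.0
    for token, prob in ranked:
        nucleus[token] = prob
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(nucleus.values())
    return {token: prob / total for token, prob in nucleus.items()}
```

Unlike temperature, which reshapes the whole distribution, Top-P hard-cuts the unlikely tail.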
Transformer
The transformer is the neural network architecture behind virtually all modern LLMs. Learn how attention mechanisms power GPT, Claude, and other AI models.
Models
ChatGPT
ChatGPT is OpenAI's conversational AI product built on GPT models. Learn how it works, its limitations, and alternatives for power users.
Claude
Claude is Anthropic's AI assistant, known for long context windows, safety, and strong coding abilities. Learn about Claude models and how to use them.
Open-Source LLM
Open-source LLMs like Llama, Mistral, and Gemma can be downloaded and run locally for free. Learn about the best open-source AI models.
Techniques
AI Agent
An AI agent is a system that uses language models to autonomously plan and execute multi-step tasks. Learn how agentic AI works.
Fine-Tuning
Fine-tuning adapts a pre-trained AI model to a specific task or domain using custom training data. Learn when and how to fine-tune LLMs.
Function Calling
Function calling lets AI models invoke external tools and APIs. Learn how tool use works in GPT, Claude, and other LLMs.
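In practice the developer describes each tool with a JSON-Schema-style definition, the model emits a structured call, and the application routes it to real code. A hedged sketch (field names follow the common OpenAI-style shape but vary by provider; `get_weather` is a hypothetical tool):

```python
# Hypothetical tool definition in the JSON-Schema style most providers use.
get_weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
        },
        "required": ["city"],
    },
}

def dispatch(tool_call: dict, registry: dict) -> str:
    """Route a model-emitted tool call to the matching local function."""
    fn = registry[tool_call["name"]]
    return fn(**tool_call["arguments"])
```

The model never executes anything itself; it only names the tool and its arguments, and the application performs the call.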
LoRA
LoRA (Low-Rank Adaptation) is an efficient method for fine-tuning AI models by training only a small number of parameters. Learn how LoRA makes model customization accessible.
Prompt Engineering
Prompt engineering is the practice of crafting effective instructions for AI models. Learn techniques like chain-of-thought, few-shot, and system prompts.
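Few-shot prompting, one of the techniques mentioned above, can be sketched as simple string assembly: an instruction, worked examples, then the new query (the `Input:`/`Output:` labels are an illustrative convention, not a required format):

```python
def few_shot_prompt(instruction: str,
                    examples: list[tuple[str, str]],
                    query: str) -> str:
    """Build a few-shot prompt: instruction, worked examples, then the
    query, leaving the final Output for the model to complete."""
    parts = [instruction]
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}\nOutput: {example_output}")
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)
```

The examples show the model the desired pattern, which often works better than describing the pattern in words.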
Quantization
Quantization compresses AI models by reducing numerical precision, making them smaller and faster. Learn about 4-bit, 8-bit, and GGUF formats.
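The core trick is mapping 32-bit floats onto a small integer range plus a scale factor. A minimal sketch of symmetric 8-bit quantization (toy lists standing in for weight tensors; assumes at least one nonzero weight):

```python
def quantize_8bit(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric 8-bit quantization: map floats into [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127  # assumes a nonzero weight
    return [round(w / scale) for w in weights], scale

def dequantize(quantized: list[int], scale: float) -> list[float]:
    """Approximate reconstruction of the original weights."""
    return [q * scale for q in quantized]
```

Each weight shrinks from 4 bytes to 1, at the cost of a small rounding error; 4-bit formats push the same trade-off further.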
RAG
RAG (retrieval-augmented generation) combines AI language models with external knowledge retrieval. Learn how RAG reduces hallucinations and keeps AI answers up to date.
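The retrieve-then-augment loop can be sketched in a few lines. This toy retriever ranks documents by shared-word overlap; real RAG systems use vector embeddings instead, but the prompt-assembly step is the same idea:

```python
def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query.
    Real systems rank by embedding similarity instead."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_rag_prompt(query: str, documents: list[str]) -> str:
    """Augment the prompt with retrieved context before asking the model."""
    context = "\n".join(retrieve(query, documents, k=1))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

Because the model answers from the supplied context rather than memory alone, its answers can stay current without retraining.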
System Prompt
A system prompt sets the behavior, role, and rules for an AI model at the start of a conversation. Learn how to write effective system prompts.
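Most chat APIs express this as a list of role-tagged messages with the system message first. A hedged sketch of that common shape (exact field names vary by provider):

```python
def make_conversation(system_prompt: str, user_message: str) -> list[dict]:
    """Assemble a chat request body in the common role/content format,
    with the system prompt as the first message."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_message},
    ]
```

Because the system message sits outside the user turn, its rules persist across the whole conversation rather than applying to a single reply.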
Infrastructure
API Key
An API key is a credential for accessing AI model services like OpenAI, Claude, and Gemini. Learn how API keys work and how to manage them securely.
Apple Silicon
Apple Silicon (M1-M4) chips enable fast local AI inference on Mac. Learn how the Neural Engine and unified memory make Macs ideal for running LLMs.
MCP
MCP (Model Context Protocol) is a standard for connecting AI models to external tools and data sources. Learn how MCP servers work.
Ollama
Ollama is a tool for running large language models locally on your Mac. Learn how to set up and use Ollama for private, offline AI.
OpenAI API
The OpenAI API provides programmatic access to GPT-4, DALL-E, and other AI models. Learn about pricing, setup, and how to use it in your applications.