# What Is a Context Window in AI?
The context window is the maximum number of tokens a large language model can process in a single request — including both your input and the model’s response.
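Since both the prompt and the response count against the same budget, it helps to estimate token usage before sending a request. A minimal sketch, using the common rule of thumb that English text averages roughly 4 characters per token (real tokenizer libraries give exact counts; this heuristic is an approximation, not any provider's API):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters/token heuristic
    for English text. A real tokenizer gives exact counts."""
    return max(1, len(text) // 4)

prompt = "Summarize the attached report in three bullet points."
print(estimate_tokens(prompt))  # a rough count, on the order of a dozen tokens
```

Remember to budget for the response too: if a model has a 128K window and your input is 120K tokens, only about 8K tokens remain for the output.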
## Why Context Window Size Matters
A larger context window means you can:
- Paste entire documents for summarization
- Have longer conversations without the model “forgetting” earlier messages
- Provide more examples and instructions
- Analyze larger codebases in one shot
## Context Windows by Model
| Model | Context Window | Approx. Pages |
|---|---|---|
| GPT-4o | 128K tokens | ~200 pages |
| Claude 3.5 Sonnet | 200K tokens | ~300 pages |
| Gemini 1.5 Pro | 2M tokens | ~3,000 pages |
| Llama 3 (8B) | 8K tokens | ~12 pages |
| Mistral Large | 128K tokens | ~200 pages |
## Context Window vs. Memory
The context window is not long-term memory. Once a conversation exceeds the window, the oldest messages are dropped and the model loses access to them. Some apps work around this with retrieval-augmented generation (RAG): past messages are stored externally, and only the ones relevant to the current turn are retrieved back into the prompt on demand.
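The simplest form of this trimming is a sliding window: drop messages from the front of the history until the remainder fits the budget. A minimal sketch, where `count_tokens` stands in for whatever tokenizer-backed counting function your stack provides (it is an assumed callable here, not a specific library API):

```python
def fit_to_window(messages: list[str], max_tokens: int, count_tokens) -> list[str]:
    """Drop the oldest messages until the conversation fits within
    max_tokens. count_tokens(msg) returns the token count of one message."""
    kept = list(messages)
    while kept and sum(count_tokens(m) for m in kept) > max_tokens:
        kept.pop(0)  # discard the oldest message first
    return kept

# Example with a crude ~4 chars/token stand-in for a real tokenizer
count = lambda m: len(m) // 4
history = ["x" * 40, "y" * 40, "z" * 40]   # 10 tokens each
print(fit_to_window(history, 20, count))    # keeps only the newest two messages
```

Production systems usually refine this by pinning the system prompt and summarizing dropped turns instead of discarding them outright, but the budget check works the same way.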
## Managing Context in Elvean
Elvean lets you fork conversations into threads, keeping each thread focused and within context limits. You can also @mention different models mid-conversation to switch to one with a larger context window when needed.
Elvean brings all these concepts together in one native Mac app — local models, cloud APIs, agentic tools, and more.
Learn more about Elvean