What Are Tokens in AI?
Tokens are the basic units of text that large language models read and generate. Rather than processing whole words, LLMs break text into smaller pieces called tokens.
How Tokenization Works
A tokenizer splits text into subword units. For example:
- “Hello world” → ["Hello", " world"] (2 tokens)
- “Unbelievable” → ["Un", "believ", "able"] (3 tokens)
- Code like `print("hi")` → ["print", "(\"", "hi", "\")"] (4 tokens)
Rule of thumb: 1 token ≈ 4 characters in English, or roughly 3/4 of a word.
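The rule of thumb above can be turned into a quick estimator. This is a minimal sketch, not a real tokenizer — exact counts depend on the model's actual tokenizer (e.g. OpenAI's `tiktoken` library for GPT models):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule of thumb."""
    return max(1, round(len(text) / 4))

# "Hello world" is 11 characters, so the estimate is ~3 tokens;
# the real tokenizer in the example above produces 2.
print(estimate_tokens("Hello world"))
```

Estimates like this are fine for budgeting, but use the model's own tokenizer when you need exact counts.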
Why Tokens Matter
- Pricing: Cloud AI APIs charge per token (input + output). Fewer tokens = lower cost.
- Context limits: Every model has a maximum context window measured in tokens. GPT-4o supports 128K tokens; Claude 3.5 Sonnet supports up to 200K.
- Speed: Models generate output one token at a time, so more tokens means longer response times.
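Per-token pricing makes cost easy to estimate. The sketch below uses hypothetical example prices — check your provider's pricing page for real numbers, and note that input and output tokens are usually billed at different rates:

```python
# Hypothetical example prices in USD per 1M tokens -- not any provider's real rates.
INPUT_PRICE_PER_M = 2.50
OUTPUT_PRICE_PER_M = 10.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for one request at the example rates above."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# A request with 1,000 input tokens and 500 output tokens:
print(f"${estimate_cost(1_000, 500):.4f}")
```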
Token Counts by Model
| Model | Max Tokens | Approx. Words |
|---|---|---|
| GPT-4o | 128,000 | ~96,000 |
| Claude 3.5 Sonnet | 200,000 | ~150,000 |
| Gemini 1.5 Pro | 2,000,000 | ~1,500,000 |
| Llama 3 | 8,000 | ~6,000 |
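A practical use of the table above is checking whether a prompt fits a model's context window before sending it. A minimal sketch, using the limits from the table:

```python
# Max context windows (tokens) from the table above.
CONTEXT_WINDOWS = {
    "GPT-4o": 128_000,
    "Claude 3.5 Sonnet": 200_000,
    "Gemini 1.5 Pro": 2_000_000,
    "Llama 3": 8_000,
}

def fits_in_context(token_count: int, model: str) -> bool:
    """True if token_count fits within the model's context window."""
    return token_count <= CONTEXT_WINDOWS[model]

print(fits_in_context(150_000, "Claude 3.5 Sonnet"))  # True
print(fits_in_context(150_000, "GPT-4o"))             # False
```

Remember that the window covers input *and* output combined, so leave headroom for the model's response.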
Tokens in Elvean
Elvean shows token usage for each message, helping you track costs and stay within context limits across all your connected models.
Elvean brings all these concepts together in one native Mac app — local models, cloud APIs, agentic tools, and more.
Learn more about Elvean