What Are Tokens in AI?
Tokens are the basic units of text that large language models read and generate. Rather than processing whole words, LLMs break text into smaller pieces called tokens.
How Tokenization Works
A tokenizer splits text into subword units. For example:
- “Hello world” → ["Hello", " world"] (2 tokens)
- “Unbelievable” → ["Un", "believ", "able"] (3 tokens)
- Code like `print("hi")` → ["print", "(\"", "hi", "\")"] (4 tokens)
Rule of thumb: 1 token ≈ 4 characters in English, or roughly 3/4 of a word.
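The rule of thumb above can be turned into a quick estimator. This is a minimal sketch, not a real tokenizer — exact counts depend on the model's actual tokenizer (e.g. OpenAI's `tiktoken` library for GPT models):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters per token rule of thumb."""
    return max(1, round(len(text) / 4))

# "Hello world" is 11 characters, so the estimate is ~3 tokens;
# the real tokenizer in the example above produces 2.
print(estimate_tokens("Hello world"))
```

Estimates like this are fine for budgeting, but use the model's own tokenizer when you need exact counts.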
Why Tokens Matter
- Pricing: Cloud AI APIs charge per token (input + output). Fewer tokens = lower cost.
- Context limits: Every model has a maximum context window measured in tokens. GPT-4o supports 128K tokens; Claude 3.5 Sonnet supports up to 200K.
- Speed: Models generate output one token at a time, so more tokens means longer response times.
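Per-token pricing makes cost easy to estimate. The sketch below uses hypothetical example prices — check your provider's pricing page for real numbers, and note that input and output tokens are usually billed at different rates:

```python
# Hypothetical example prices in USD per 1M tokens -- not any provider's real rates.
INPUT_PRICE_PER_M = 2.50
OUTPUT_PRICE_PER_M = 10.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost for one request at the example rates above."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# A request with 1,000 input tokens and 500 output tokens:
print(f"${estimate_cost(1_000, 500):.4f}")
```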
Token Counts by Model
| Model | Max Tokens | Approx. Words |
|---|---|---|
| GPT-4o | 128,000 | ~96,000 |
| Claude 3.5 Sonnet | 200,000 | ~150,000 |
| Gemini 1.5 Pro | 2,000,000 | ~1,500,000 |
| Llama 3 | 8,000 | ~6,000 |
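A practical use of the table above is checking whether a prompt fits a model's context window before sending it. A minimal sketch, using the limits from the table:

```python
# Max context windows (tokens) from the table above.
CONTEXT_WINDOWS = {
    "GPT-4o": 128_000,
    "Claude 3.5 Sonnet": 200_000,
    "Gemini 1.5 Pro": 2_000_000,
    "Llama 3": 8_000,
}

def fits_in_context(token_count: int, model: str) -> bool:
    """True if token_count fits within the model's context window."""
    return token_count <= CONTEXT_WINDOWS[model]

print(fits_in_context(150_000, "Claude 3.5 Sonnet"))  # True
print(fits_in_context(150_000, "GPT-4o"))             # False
```

Remember that the window covers input *and* output combined, so leave headroom for the model's response.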
Tokens in Elvean
Elvean shows token usage for each message, helping you track costs and stay within context limits across all your connected models.
Elvean brings all these concepts together in one native Mac app — local models, cloud APIs, agentic tools, and more.
Learn more about Elvean