What Is Top-P (Nucleus Sampling) in AI?
Top-P (also called nucleus sampling) is a parameter that controls how many candidate tokens the model considers at each generation step. It’s an alternative to temperature for controlling output randomness.
How Top-P Works
At each step, the model ranks all possible next tokens by probability. Top-P keeps the smallest set of tokens whose combined probability reaches P (the "nucleus") and samples only from that set (see the sketch after this list):
- Top-P = 0.1: Only the most likely tokens (very focused output)
- Top-P = 0.9: A wide range of tokens (more creative output)
- Top-P = 1.0: All tokens considered (maximum randomness)
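Here is a minimal sketch of that selection step, written in Python with NumPy. The five-token distribution is invented for illustration and is far smaller than a real model's vocabulary:

```python
# Minimal nucleus (Top-P) sampling over a toy distribution.
# The probabilities below are made up for illustration.
import numpy as np

def nucleus_sample(probs: np.ndarray, top_p: float, rng: np.random.Generator) -> int:
    """Sample a token index from the smallest set whose cumulative probability reaches top_p."""
    order = np.argsort(probs)[::-1]                   # rank tokens from most to least likely
    sorted_probs = probs[order]
    cumulative = np.cumsum(sorted_probs)
    cutoff = np.searchsorted(cumulative, top_p) + 1   # size of the nucleus
    nucleus = order[:cutoff]
    nucleus_probs = sorted_probs[:cutoff] / sorted_probs[:cutoff].sum()  # renormalize
    return int(rng.choice(nucleus, p=nucleus_probs))

rng = np.random.default_rng(0)
probs = np.array([0.55, 0.25, 0.10, 0.06, 0.04])      # hypothetical next-token probabilities
print(nucleus_sample(probs, top_p=0.9, rng=rng))      # samples only from tokens covering 90% of the mass
```

With top_p=0.9, the long tail of unlikely tokens is never sampled; with top_p=0.1, only the single most likely token survives the cutoff in this toy example.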
Top-P vs. Temperature
Both control randomness, but in different ways:
| Setting | How It Works | Best For |
|---|---|---|
| Temperature | Rescales the whole probability distribution (sharper or flatter) | General randomness control |
| Top-P | Truncates the low-probability tail of the distribution | Preventing nonsensical outputs |
Most AI providers recommend adjusting one or the other — not both simultaneously.
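To see the difference concretely, here is a toy comparison in Python with NumPy; the logits are made up for illustration. Temperature reshapes every probability, while Top-P zeroes out the tail and renormalizes what remains:

```python
# Toy comparison: temperature rescales every probability, Top-P removes the tail.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1, -1.0, -3.0])   # hypothetical raw model scores
base = softmax(logits)

# Temperature: divide logits before the softmax (T < 1 sharpens, T > 1 flattens).
print("T=0.5:", np.round(softmax(logits / 0.5), 3))
print("T=2.0:", np.round(softmax(logits / 2.0), 3))

# Top-P: sort, keep the smallest prefix covering p, renormalize; the tail gets zero weight.
order = np.argsort(base)[::-1]
cutoff = np.searchsorted(np.cumsum(base[order]), 0.9) + 1
kept = np.zeros_like(base)
kept[order[:cutoff]] = base[order[:cutoff]]
print("top_p=0.9:", np.round(kept / kept.sum(), 3))
```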
Recommended Values
| Use Case | Top-P |
|---|---|
| Code generation | 0.1 - 0.3 |
| Factual answers | 0.1 - 0.4 |
| General chat | 0.7 - 0.9 |
| Creative writing | 0.9 - 1.0 |
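As a usage sketch, here is how one of those values might be passed to a provider, assuming the OpenAI Python SDK; the model name and prompt are placeholders:

```python
# Applying the table above, assuming an OpenAI-style chat completions client.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Write a Python function that parses ISO 8601 dates."}],
    top_p=0.2,            # code generation: stay near the top of the distribution
)
print(response.choices[0].message.content)
```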
Elvean brings all these concepts together in one native Mac app — local models, cloud APIs, agentic tools, and more.
Learn more about Elvean