What Is Top-P (Nucleus Sampling) in AI?
Top-P (also called nucleus sampling) is a parameter that controls how many candidate tokens the model considers at each generation step. It’s an alternative to temperature for controlling output randomness.
How Top-P Works
At each step, the model ranks all possible next tokens by probability. Top-P keeps the smallest set of tokens whose combined probability reaches P (the "nucleus") and samples only from that set (see the sketch after this list):
- Top-P = 0.1: Only the most likely tokens (very focused output)
- Top-P = 0.9: A wide range of tokens (more creative output)
- Top-P = 1.0: All tokens considered (maximum randomness)
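Here is a minimal sketch of that selection step, written in Python with NumPy. The five-token distribution is invented for illustration and is far smaller than a real model's vocabulary:

```python
# Minimal nucleus (Top-P) sampling over a toy distribution.
# The probabilities below are made up for illustration.
import numpy as np

def nucleus_sample(probs: np.ndarray, top_p: float, rng: np.random.Generator) -> int:
    """Sample a token index from the smallest set whose cumulative probability reaches top_p."""
    order = np.argsort(probs)[::-1]                   # rank tokens from most to least likely
    sorted_probs = probs[order]
    cumulative = np.cumsum(sorted_probs)
    cutoff = np.searchsorted(cumulative, top_p) + 1   # size of the nucleus
    nucleus = order[:cutoff]
    nucleus_probs = sorted_probs[:cutoff] / sorted_probs[:cutoff].sum()  # renormalize
    return int(rng.choice(nucleus, p=nucleus_probs))

rng = np.random.default_rng(0)
probs = np.array([0.55, 0.25, 0.10, 0.06, 0.04])      # hypothetical next-token probabilities
print(nucleus_sample(probs, top_p=0.9, rng=rng))      # samples only from tokens covering 90% of the mass
```

With top_p=0.9, the long tail of unlikely tokens is never sampled; with top_p=0.1, only the single most likely token survives the cutoff in this toy example.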
Top-P vs. Temperature
Both control randomness, but in different ways:
| Setting | How It Works | Best For |
|---|---|---|
| Temperature | Rescales the whole probability distribution (sharper or flatter) | General randomness control |
| Top-P | Truncates the low-probability tail of the distribution | Preventing nonsensical outputs |
Most AI providers recommend adjusting one or the other — not both simultaneously.
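To see the difference concretely, here is a toy comparison in Python with NumPy; the logits are made up for illustration. Temperature reshapes every probability, while Top-P zeroes out the tail and renormalizes what remains:

```python
# Toy comparison: temperature rescales every probability, Top-P removes the tail.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1, -1.0, -3.0])   # hypothetical raw model scores
base = softmax(logits)

# Temperature: divide logits before the softmax (T < 1 sharpens, T > 1 flattens).
print("T=0.5:", np.round(softmax(logits / 0.5), 3))
print("T=2.0:", np.round(softmax(logits / 2.0), 3))

# Top-P: sort, keep the smallest prefix covering p, renormalize; the tail gets zero weight.
order = np.argsort(base)[::-1]
cutoff = np.searchsorted(np.cumsum(base[order]), 0.9) + 1
kept = np.zeros_like(base)
kept[order[:cutoff]] = base[order[:cutoff]]
print("top_p=0.9:", np.round(kept / kept.sum(), 3))
```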
Recommended Values
| Use Case | Top-P |
|---|---|
| Code generation | 0.1 - 0.3 |
| Factual answers | 0.1 - 0.4 |
| General chat | 0.7 - 0.9 |
| Creative writing | 0.9 - 1.0 |
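As a usage sketch, here is how one of those values might be passed to a provider, assuming the OpenAI Python SDK; the model name and prompt are placeholders:

```python
# Applying the table above, assuming an OpenAI-style chat completions client.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Write a Python function that parses ISO 8601 dates."}],
    top_p=0.2,            # code generation: stay near the top of the distribution
)
print(response.choices[0].message.content)
```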
Elvean brings all these concepts together in one native Mac app — local models, cloud APIs, agentic tools, and more.
Learn more about Elvean