Minimum Probability and Temperature
Updated: 2024/08/18
Tags: #llm #min_p #temperature

LLM Sampling Techniques: Minimum Probability and Temperature

Minimum Probability Sampling

Minimum probability (min_p) sampling is a technique offered by some language model APIs to balance diversity and coherence in the model’s output.

It sets a dynamic threshold for token selection based on the probability of the most likely token.
The threshold is a fraction (determined by the min_p value) of the top token’s probability; tokens whose probability falls below it are excluded from sampling.

Let’s say min_p = 0.1, and we’re generating the next token:

Scenario A:

Scenario B:

When the model is very confident (high top probability), the threshold is higher, limiting options to maintain coherence.
When the model is less certain (lower top probability), the threshold lowers, allowing more diverse options.
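The behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular library's implementation: the function name `min_p_filter` and both token distributions are hypothetical values chosen for the example, not the figures from the original scenarios.

```python
def min_p_filter(probs, min_p):
    """Keep tokens whose probability is at least min_p * (top probability).

    Returns the renormalized surviving tokens and the threshold that was used.
    """
    threshold = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}, threshold


# Hypothetical next-token distributions (illustrative numbers only).
confident = {"the": 0.90, "a": 0.06, "an": 0.03, "this": 0.01}
uncertain = {"cat": 0.30, "dog": 0.25, "bird": 0.20,
             "fish": 0.15, "fox": 0.08, "yak": 0.02}

kept_confident, t_confident = min_p_filter(confident, min_p=0.1)
kept_uncertain, t_uncertain = min_p_filter(uncertain, min_p=0.1)

# Confident case: threshold is about 0.09, so only "the" survives.
print(t_confident, list(kept_confident))
# Uncertain case: threshold is about 0.03, so five of the six tokens survive.
print(t_uncertain, list(kept_uncertain))
```

With the same min_p value, the confident distribution is cut down to a single candidate while the uncertain one keeps most of its options, which is the adaptive effect the text describes.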

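Temperature controls the randomness in token selection during text generation. The model’s logits are divided by the temperature before the softmax, so values above 1 flatten the distribution (more randomness) and values below 1 sharpen it (less randomness).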
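To make the effect of temperature concrete, here is a small sketch of a temperature-scaled softmax. The function name `softmax_with_temperature` and the logit values are assumptions made for illustration only.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # hypothetical logits for three tokens

sharp = softmax_with_temperature(logits, 0.5)    # low temperature
neutral = softmax_with_temperature(logits, 1.0)  # unmodified softmax
flat = softmax_with_temperature(logits, 2.0)     # high temperature

# The top token's share shrinks as the temperature rises.
print(sharp[0], neutral[0], flat[0])
```

Lower temperatures concentrate probability on the top token, while higher temperatures spread it across the alternatives.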

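Combining min_p sampling with higher temperatures allows for more diverse output while the min_p threshold still filters out tokens the model considers unlikely.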

min_p sampling adapts the token selection threshold based on the model’s confidence.
Higher temperatures increase output diversity but risk coherence.
Combining min_p with higher temperatures balances creativity and coherence.
The optimal sampling strategy depends on the specific task and desired outcome.
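
As a rough illustration of combining the two settings, the sketch below applies temperature first and the min_p filter second, then samples from what remains. The order of the two steps is an assumption made here; libraries may apply them in a different order. The function name `sample_min_p` and the logits are hypothetical.

```python
import math
import random


def sample_min_p(logits, temperature=1.0, min_p=0.1, rng=random):
    """Sample a token index using temperature scaling followed by a min_p filter."""
    # Step 1: temperature-scaled softmax (assumed to run before the filter).
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Step 2: zero out tokens below min_p * (top probability).
    threshold = min_p * max(probs)
    weights = [p if p >= threshold else 0.0 for p in probs]

    # random.choices treats the weights as relative, so no renormalization is needed.
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


logits = [5.0, 4.0, 3.0, -2.0]  # hypothetical logits; the last token is very unlikely
rng = random.Random(0)          # seeded so the run is repeatable
samples = [sample_min_p(logits, temperature=1.5, min_p=0.1, rng=rng)
           for _ in range(1000)]

# The raised temperature spreads choices over the plausible tokens,
# while the min_p filter keeps the very unlikely last token out entirely.
print(sorted(set(samples)))
```

Here the higher temperature gives the second and third tokens a real chance of being picked, but the last token never passes the threshold, so it is never sampled.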