Temperature
A parameter that controls the randomness of an LLM's output, where lower values produce more predictable responses and higher values produce more creative ones.
What is temperature?
Temperature is a parameter you set when calling an LLM API that controls how random or deterministic the model's output is. It's a number, typically between 0 and 2, that adjusts the probability distribution the model uses to pick each next token.
At temperature 0, the model always picks the most probable next token. Output is highly deterministic — running the same prompt twice will produce very similar (often identical) responses. The model plays it safe, sticking to the most likely completions.
At temperature 1 (the default for most models), the model samples from the full probability distribution. Less likely tokens have a real chance of being selected, producing more varied and creative output.
At temperatures above 1, the distribution flattens further, making unlikely tokens almost as probable as likely ones. Output becomes increasingly random, creative, and sometimes incoherent.
Technically, temperature works by dividing the model's logits (raw prediction scores) by the temperature value before applying the softmax function. Dividing by a value below 1 widens the gaps between logits, sharpening the distribution toward the top token; dividing by a value above 1 narrows the gaps, flattening the distribution.
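The mechanics can be sketched in a few lines of Python (illustrative only; real samplers operate on tensors of logits inside the model's decoding loop):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply softmax."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
print(softmax_with_temperature(logits, 1.0))  # baseline distribution
print(softmax_with_temperature(logits, 0.5))  # sharper: top token dominates
print(softmax_with_temperature(logits, 2.0))  # flatter: unlikely tokens gain probability
```

Running this with the same three logits at temperatures 0.5, 1.0, and 2.0 shows the top token's probability shrinking as temperature rises, which is exactly the sharpening and flattening described above.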
Why it matters for AI agents
Temperature is one of the most impactful parameters for agent behavior, and getting it wrong causes real problems.
For email agents, temperature should almost always be low — typically between 0 and 0.3. When an agent drafts a customer support reply, you want consistent, accurate, on-topic responses. A high temperature might produce a creative response one time and an off-brand, hallucination-filled response the next. Consistency matters when your agent represents your organization in written communication.
There are specific tasks where slightly higher temperatures help. Generating subject lines, brainstorming email campaign angles, or drafting marketing copy can benefit from temperature 0.5-0.7 to produce more varied options. But the core agent logic — classification, routing, extraction, decision-making — should run at low temperature.
A common pattern for email agents is using different temperatures for different stages of a workflow. The classification step (is this a support request, sales inquiry, or spam?) runs at temperature 0 for maximum reliability. The response drafting step runs at temperature 0.3 for slight variation while staying coherent. The subject line generation step runs at temperature 0.7 for creativity.
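A per-stage setup like this can be sketched as a simple lookup table. The `call_llm` function below is a hypothetical stand-in for your actual model client; the stage names and temperature values mirror the workflow described above:

```python
# Per-stage temperature settings for an email-agent workflow (sketch).
STAGE_TEMPERATURES = {
    "classify": 0.0,       # routing decisions: maximum reliability
    "draft_reply": 0.3,    # slight variation while staying coherent
    "subject_lines": 0.7,  # more varied, creative options
}

def call_llm(prompt: str, temperature: float) -> str:
    # Placeholder: swap in a real API call here.
    return f"[model output for {prompt!r} at T={temperature}]"

def run_stage(stage: str, prompt: str) -> str:
    """Dispatch a prompt with the temperature assigned to its stage."""
    return call_llm(prompt, STAGE_TEMPERATURES[stage])
```

The point of the table is that temperature becomes an explicit, reviewable property of each stage rather than a single global setting buried in client configuration.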
Temperature interacts with other sampling parameters like top_p and top_k. In practice, you should adjust one at a time and test the results rather than changing multiple parameters simultaneously.
Frequently asked questions
What temperature should I use for AI agents?
For most agent tasks — classification, extraction, routing, and structured responses — use temperature 0 to 0.3. For creative tasks like drafting marketing copy or generating alternatives, use 0.5 to 0.7. Avoid temperatures above 1.0 for production agents, as output becomes unpredictable and may include hallucinations.
Does temperature 0 always produce the same output?
Nearly, but not exactly. Temperature 0 is deterministic in theory, but implementation details like floating-point arithmetic, batching, and GPU parallelism can introduce minor variations across runs. For practical purposes, temperature 0 produces highly consistent output, but don't rely on exact character-for-character reproducibility.
What is the difference between temperature and top_p?
Both control randomness, but differently. Temperature adjusts the sharpness of the entire probability distribution. Top_p (nucleus sampling) truncates the distribution instead: only the smallest set of top-ranked tokens whose cumulative probability reaches the threshold p is considered, and everything below the cutoff is discarded. Most practitioners adjust one or the other, not both simultaneously.
What temperature should email agents use for drafting replies?
Email reply drafting works best at temperature 0.2 to 0.4. This provides slight variation to avoid robotic-sounding responses while keeping the tone consistent and on-brand. Lower temperatures risk sounding formulaic; higher temperatures risk off-brand or factually inconsistent replies.
Does higher temperature increase hallucination?
Yes. Higher temperatures make the model more likely to select low-probability tokens, which increases the chance of generating plausible-sounding but incorrect information. For agent tasks that require factual accuracy, like answering customer questions or extracting data from emails, keep temperature low.
Can you use different temperatures for different agent tasks?
Yes, and this is a recommended practice. Use temperature 0 for classification and data extraction, 0.2-0.3 for response drafting, and 0.5-0.7 for creative tasks like subject line generation. This gives you reliability where it matters and variety where it helps.
What happens if you set temperature above 1.0?
Temperatures above 1.0 flatten the probability distribution, making unlikely tokens nearly as probable as likely ones. Output becomes increasingly random, often producing incoherent or nonsensical text. There is almost no production use case for temperatures above 1.0 in agent systems.
How does temperature affect token costs?
Temperature itself doesn't change token costs directly. However, higher temperatures tend to produce longer, more meandering outputs because the model is more likely to go on tangents. Lower temperatures produce more focused, concise responses, which can indirectly reduce output token costs.
Should temperature be set per request or globally for an agent?
It depends on the agent's architecture. If the agent performs a single task type (like email classification), a global setting works fine. If the agent handles multiple task types in a pipeline, set temperature per request based on the specific task. Most agent frameworks support per-call temperature configuration.
What is the default temperature for most AI models?
Most AI APIs default to temperature 1.0, which provides a balance between coherence and variety. For agent applications, this default is typically too high. Developers should explicitly set a lower temperature rather than relying on the default, especially for tasks requiring consistency and accuracy.