PHASE 3 LLM Architecture · Day 28 of 80 · Raschka LLMs From Scratch

Generating Text — Temperature, Top-k, Nucleus Sampling

Master the art of text generation: temperature scaling, top-k filtering, and nucleus (top-p) sampling.

Controlled randomness is the hallmark of sophisticated systems. In markets, it's position sizing with volatility targets. In LLMs, it's temperature and top-k.
— Day 28 Principle

I. Sampling Strategies

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, idx, max_tokens, block_size, temperature=1.0, top_k=None):
    """Autoregressively sample max_tokens new tokens, appending to idx of shape (B, T)."""
    for _ in range(max_tokens):
        idx_cond = idx[:, -block_size:]              # crop to the context window
        logits = model(idx_cond)[0][:, -1, :]        # model returns (logits, loss); take last position
        logits = logits / temperature                # <1 sharpens, >1 flattens the distribution
        if top_k is not None:
            v, _ = torch.topk(logits, top_k)
            logits[logits < v[:, [-1]]] = float('-inf')  # mask everything below the k-th logit
        probs = F.softmax(logits, dim=-1)
        idx_next = torch.multinomial(probs, num_samples=1)  # sample one token per sequence
        idx = torch.cat((idx, idx_next), dim=1)      # append and continue
    return idx
```
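To see what the `temperature` division actually does, here is a small standalone demo on a toy 5-token logit vector (the values are illustrative, not from a real model): low temperature concentrates probability mass on the top token, high temperature spreads it out.

```python
import torch
import torch.nn.functional as F

# Toy logits for a 5-token vocabulary (illustrative values)
logits = torch.tensor([2.0, 1.0, 0.5, 0.2, 0.1])

for temp in (0.5, 1.0, 2.0):
    probs = F.softmax(logits / temp, dim=-1)
    # Lower temperature -> higher peak probability on the argmax token
    print(f"T={temp}: max prob = {probs.max().item():.3f}")
```

The peak probability shrinks monotonically as temperature rises, which is why low temperatures read as "focused" and high temperatures as "creative".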

Temperature × Top-k = Full Control

Temperature controls the sharpness of the distribution (low = focused, high = diverse). Top-k restricts sampling to the k highest-probability tokens. Nucleus (top-p) sampling instead keeps the smallest set of tokens whose cumulative probability reaches p, so the candidate set adapts to the model's confidence. Together these knobs give fine-grained control over generation quality and diversity.
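The `generate` function above implements temperature and top-k but not nucleus sampling. A minimal sketch of top-p filtering on a 1-D logits vector (the function name `top_p_filter` is my own; it is not from the original code):

```python
import torch
import torch.nn.functional as F

def top_p_filter(logits, top_p=0.9):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p;
    set all other logits to -inf so softmax assigns them zero probability."""
    sorted_logits, sorted_idx = torch.sort(logits, descending=True)
    probs = F.softmax(sorted_logits, dim=-1)
    cum = torch.cumsum(probs, dim=-1)
    # A token is kept if the cumulative mass *before* it is still below top_p
    keep = (cum - probs) < top_p
    keep[0] = True  # always keep the single most likely token
    filtered = torch.full_like(logits, float('-inf'))
    filtered[sorted_idx[keep]] = logits[sorted_idx[keep]]
    return filtered
```

To use it inside `generate`, apply it to `logits` just before the `F.softmax` call, in place of (or in addition to) the top-k mask. Unlike top-k's fixed candidate count, the kept set grows when the model is uncertain and shrinks to a handful of tokens when it is confident.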

V. Deliverables

Generation strategies complete. Tomorrow: using LLMs for classification.
— Day 28 Closing