PHASE 3 LLM Architecture · Day 28 of 80 · Raschka LLMs From Scratch

Generating Text — Temperature, Top-k, Nucleus Sampling

Master the art of text generation: temperature scaling, top-k filtering, and nucleus (top-p) sampling.

Controlled randomness is the hallmark of sophisticated systems. In markets, it's position sizing with volatility targets. In LLMs, it's temperature and top-k.
— Day 28 Principle

I. Sampling Strategies

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, idx, max_tokens, block_size, temperature=1.0, top_k=None):
    """Autoregressively sample max_tokens new tokens, appending to idx of shape (B, T)."""
    for _ in range(max_tokens):
        idx_cond = idx[:, -block_size:]              # crop to the context window
        logits = model(idx_cond)[0][:, -1, :]        # model returns (logits, loss); take last position
        logits = logits / temperature                # <1 sharpens, >1 flattens the distribution
        if top_k is not None:
            v, _ = torch.topk(logits, top_k)
            logits[logits < v[:, [-1]]] = float('-inf')  # mask everything below the k-th logit
        probs = F.softmax(logits, dim=-1)
        idx_next = torch.multinomial(probs, num_samples=1)  # sample one token per sequence
        idx = torch.cat((idx, idx_next), dim=1)      # append and continue
    return idx
```
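To see what the `temperature` division actually does, here is a small standalone demo on a toy 5-token logit vector (the values are illustrative, not from a real model): low temperature concentrates probability mass on the top token, high temperature spreads it out.

```python
import torch
import torch.nn.functional as F

# Toy logits for a 5-token vocabulary (illustrative values)
logits = torch.tensor([2.0, 1.0, 0.5, 0.2, 0.1])

for temp in (0.5, 1.0, 2.0):
    probs = F.softmax(logits / temp, dim=-1)
    # Lower temperature -> higher peak probability on the argmax token
    print(f"T={temp}: max prob = {probs.max().item():.3f}")
```

The peak probability shrinks monotonically as temperature rises, which is why low temperatures read as "focused" and high temperatures as "creative".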

Temperature × Top-k = Full Control

Temperature controls the sharpness of the distribution (low = focused, high = diverse). Top-k restricts sampling to the k highest-probability tokens. Nucleus (top-p) sampling instead keeps the smallest set of tokens whose cumulative probability reaches p, so the candidate set adapts to the model's confidence. Together these knobs give fine-grained control over generation quality and diversity.
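The `generate` function above implements temperature and top-k but not nucleus sampling. A minimal sketch of top-p filtering on a 1-D logits vector (the function name `top_p_filter` is my own; it is not from the original code):

```python
import torch
import torch.nn.functional as F

def top_p_filter(logits, top_p=0.9):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p;
    set all other logits to -inf so softmax assigns them zero probability."""
    sorted_logits, sorted_idx = torch.sort(logits, descending=True)
    probs = F.softmax(sorted_logits, dim=-1)
    cum = torch.cumsum(probs, dim=-1)
    # A token is kept if the cumulative mass *before* it is still below top_p
    keep = (cum - probs) < top_p
    keep[0] = True  # always keep the single most likely token
    filtered = torch.full_like(logits, float('-inf'))
    filtered[sorted_idx[keep]] = logits[sorted_idx[keep]]
    return filtered
```

To use it inside `generate`, apply it to `logits` just before the `F.softmax` call, in place of (or in addition to) the top-k mask. Unlike top-k's fixed candidate count, the kept set grows when the model is uncertain and shrinks to a handful of tokens when it is confident.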

V. Deliverables

Generation strategies complete. Tomorrow: using LLMs for classification.
— Day 28 Closing