PHASE 3 LLM Architecture · Day 29 of 80 · Raschka LLMs From Scratch

Text Classification with LLMs

Use a pre-trained LLM for classification: extract the last token's hidden state, add a classification head.

Not every model needs to generate. Sometimes the most valuable output is a single decision: buy or sell, spam or not, positive or negative. — Day 29 Principle

I. Classification Head

Take the last token’s hidden state from the LLM and pass it through a linear layer to produce class logits. Either fine-tune end-to-end, or freeze the backbone and train only the head.

import torch.nn.functional as F

hidden = model.transformer(input_ids)[:, -1, :]  # last token's hidden state: [B, d_model]
logits = classifier_head(hidden)                 # [B, num_classes]
loss = F.cross_entropy(logits, labels)
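The frozen-backbone variant can be sketched end-to-end. This is a minimal illustration, not the book's exact code: `TinyBackbone` is a hypothetical stand-in for a pre-trained transformer, and the sizes are toy values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyBackbone(nn.Module):
    """Hypothetical stand-in for a pre-trained LLM backbone."""
    def __init__(self, vocab_size=100, d_model=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)

    def forward(self, input_ids):
        return self.layer(self.embed(input_ids))  # [B, T, d_model]

backbone = TinyBackbone()
for p in backbone.parameters():   # freeze the backbone
    p.requires_grad = False

num_classes = 2
head = nn.Linear(32, num_classes)  # only the head is trained

input_ids = torch.randint(0, 100, (4, 8))        # dummy batch: [B=4, T=8]
labels = torch.randint(0, num_classes, (4,))

hidden = backbone(input_ids)[:, -1, :]  # last token's hidden state
logits = head(hidden)                   # [B, num_classes]
loss = F.cross_entropy(logits, labels)
loss.backward()                         # gradients flow only into the head
```

Because the backbone's parameters have `requires_grad=False`, `loss.backward()` populates gradients only for the head, so an optimizer over `head.parameters()` trains just the classifier.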

V. Deliverables

LLMs are powerful feature extractors. Tomorrow: instruction fine-tuning. — Day 29 Closing