The difference between a base model and an assistant is instruction tuning. Raw capability becomes usable skill through carefully formatted examples.
— Day 30 Principle
I. Instruction Format
{"instruction": "Summarize the following text",
 "input": "The Federal Reserve announced...",
 "output": "The Fed raised rates by 25bp..."}
Datasets like Alpaca, LIMA, and OpenAssistant follow this pattern. Quality matters more than quantity: LIMA showed that about 1,000 carefully curated examples can rival models tuned on roughly 50,000 noisier ones.
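The record above can be rendered into a single training prompt. Here is a minimal sketch following the Alpaca-style template; `format_example` is a hypothetical helper, and the exact wording of the template should match whatever your dataset documents.

```python
# Sketch: turn an Alpaca-style record into one training prompt string.
# The template follows the common Alpaca convention (an assumption here);
# records without an "input" field use a shorter variant.
def format_example(record: dict) -> str:
    if record.get("input"):
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context.\n\n"
            f"### Instruction:\n{record['instruction']}\n\n"
            f"### Input:\n{record['input']}\n\n"
            f"### Response:\n{record['output']}"
        )
    return (
        "Below is an instruction that describes a task.\n\n"
        f"### Instruction:\n{record['instruction']}\n\n"
        f"### Response:\n{record['output']}"
    )

example = {
    "instruction": "Summarize the following text",
    "input": "The Federal Reserve announced...",
    "output": "The Fed raised rates by 25bp...",
}
print(format_example(example))
```

During training, the loss is typically masked so that only the tokens after "### Response:" contribute; the prompt portion is context, not a target.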
V. Deliverables
- Instruction format
- Chat template
- Data quality filtering
- Train/val split
- Tokenization with special tokens
Good data beats more data. Tomorrow: the SFT training loop.
— Day 30 Closing