The difference between a base model and an assistant is instruction tuning. Raw capability becomes usable skill through carefully formatted examples.
— Day 30 Principle
I. Instruction Format
{"instruction": "Summarize the following text",
 "input": "The Federal Reserve announced...",
 "output": "The Fed raised rates by 25bp..."}
Datasets like Alpaca, LIMA, and OpenAssistant follow this pattern. Quality matters more than quantity: LIMA showed that about 1,000 carefully curated examples can rival models tuned on roughly 50,000 noisier ones.
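The record above can be rendered into a single training prompt. Here is a minimal sketch following the Alpaca-style template; `format_example` is a hypothetical helper, and the exact wording of the template should match whatever your dataset documents.

```python
# Sketch: turn an Alpaca-style record into one training prompt string.
# The template follows the common Alpaca convention (an assumption here);
# records without an "input" field use a shorter variant.
def format_example(record: dict) -> str:
    if record.get("input"):
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context.\n\n"
            f"### Instruction:\n{record['instruction']}\n\n"
            f"### Input:\n{record['input']}\n\n"
            f"### Response:\n{record['output']}"
        )
    return (
        "Below is an instruction that describes a task.\n\n"
        f"### Instruction:\n{record['instruction']}\n\n"
        f"### Response:\n{record['output']}"
    )

example = {
    "instruction": "Summarize the following text",
    "input": "The Federal Reserve announced...",
    "output": "The Fed raised rates by 25bp...",
}
print(format_example(example))
```

During training, the loss is typically masked so that only the tokens after "### Response:" contribute; the prompt portion is context, not a target.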
V. Deliverables
- Instruction format
- Chat template
- Data quality filtering
- Train/val split
- Tokenization with special tokens
Good data beats more data. Tomorrow: the SFT training loop.
— Day 30 Closing