PHASE 1 Foundations · Day 4 of 80 · Neural Networks & Backprop

Building a Neuron & MLP from Scratch

Stack Value objects into neurons, neurons into layers, layers into a multi-layer perceptron. The building blocks of every deep network.

A portfolio is not a random pile of assets — it is a structured hierarchy. Each position has a weight, a bias toward certain outcomes, and a nonlinear response to market conditions. A neural network is the same: weighted inputs, a bias, and a nonlinear activation. Today you build the first “portfolio” that learns. — Day 4 Principle, adapted from the Marks framework

I. Anatomy of a Neuron — w·x + b, Then Squish

A single neuron does three things: (1) multiply each input by a weight, (2) add a bias, (3) pass through an activation function. That’s it. Every neuron in every network — from a 2-layer MLP to GPT-4 — follows this pattern.

Exhibit A — A Single Neuron: Inputs → Weighted Sum → Activation
[Diagram: inputs x₁, x₂, x₃ are multiplied by weights w₁, w₂, w₃ and summed with the bias (Σ = w·x + b); the raw sum passes through tanh, which “squishes” any real number into the range (−1, 1).]
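The three steps can be sketched in plain Python, independent of the Value engine. The inputs, weights, and bias below are made-up illustration values:

```python
import math

# A single neuron by hand: weighted sum, plus bias, then the tanh squish.
x = [1.0, -2.0, 3.0]      # inputs (arbitrary sample values)
w = [0.5, 0.25, -0.1]     # one weight per input
b = 0.2                   # bias

raw = sum(wi * xi for wi, xi in zip(w, x)) + b  # steps 1 + 2: w·x + b
out = math.tanh(raw)                            # step 3: squish into (-1, 1)
print(raw, out)
```

Whatever the raw sum is, the output is guaranteed to land strictly between −1 and 1.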

II. The Code — Neuron, Layer, MLP

Three classes, each building on the previous. A Neuron holds weights and a bias. A Layer holds a list of neurons. An MLP holds a list of layers. This is the entire architecture of a multi-layer perceptron.

```python
import random

# Value is the scalar autograd engine built on Day 3 (micrograd-style).

class Neuron:
    def __init__(self, nin):
        self.w = [Value(random.uniform(-1, 1)) for _ in range(nin)]
        self.b = Value(0)

    def __call__(self, x):
        act = sum((wi * xi for wi, xi in zip(self.w, x)), self.b)
        return act.tanh()

    def parameters(self):
        return self.w + [self.b]

class Layer:
    def __init__(self, nin, nout):
        self.neurons = [Neuron(nin) for _ in range(nout)]

    def __call__(self, x):
        out = [n(x) for n in self.neurons]
        return out[0] if len(out) == 1 else out

    def parameters(self):
        return [p for n in self.neurons for p in n.parameters()]

class MLP:
    def __init__(self, nin, nouts):
        sz = [nin] + nouts
        self.layers = [Layer(sz[i], sz[i+1]) for i in range(len(nouts))]

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

    def parameters(self):
        return [p for layer in self.layers for p in layer.parameters()]

# Example: 3 inputs, two hidden layers of 4, 1 output
model = MLP(3, [4, 4, 1])
print(len(model.parameters()))  # 41 parameters
```

The parameters() Pattern

Every level of the hierarchy exposes a parameters() method that collects all learnable values. This is the same pattern PyTorch uses with nn.Module.parameters(). It lets you write one optimizer loop that updates every weight in the network, regardless of depth.
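The flattening can be seen in isolation. Here is a minimal standalone sketch of the same pattern using plain floats instead of Value objects (a hypothetical simplification, just to show how each level collects its children's parameters into one flat list):

```python
import random

class Neuron:
    def __init__(self, nin):
        # Plain floats stand in for Value objects in this sketch.
        self.w = [random.uniform(-1, 1) for _ in range(nin)]
        self.b = 0.0

    def parameters(self):
        return self.w + [self.b]

class Layer:
    def __init__(self, nin, nout):
        self.neurons = [Neuron(nin) for _ in range(nout)]

    def parameters(self):
        # Flatten: every neuron's parameters into one list.
        return [p for n in self.neurons for p in n.parameters()]

layer = Layer(3, 4)
print(len(layer.parameters()))  # 4 neurons × (3 weights + 1 bias) = 16
```

Because every level exposes the same method, code that consumes parameters never needs to know how deep the hierarchy goes.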

III. Architecture Anatomy — MLP(3, [4, 4, 1])

Exhibit B — Multi-Layer Perceptron: 3 Inputs → 4 → 4 → 1 Output
[Diagram: INPUT (3: x₁, x₂, x₃) → HIDDEN 1 (4) → HIDDEN 2 (4) → OUTPUT (1: y). Total parameters: (3×4+4) + (4×4+4) + (4×1+1) = 41.]
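The arithmetic checks out directly: each layer contributes (fan-in + 1) × fan-out parameters, the +1 being the bias on each neuron.

```python
# Parameter count for MLP(3, [4, 4, 1]): each layer has
# (fan_in + 1) * fan_out parameters (weights plus one bias per neuron).
sizes = [3, 4, 4, 1]
per_layer = [(sizes[i] + 1) * sizes[i + 1] for i in range(len(sizes) - 1)]
print(per_layer, sum(per_layer))  # [16, 20, 5] 41
```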

IV. The Matrix — What Matters Today

The matrix ranks today's tasks along two axes: how quick they are, and whether they build deep intuition or stay surface-level.

🎯 DO FIRST (quick · builds deep intuition) — Implement Neuron, Layer, MLP. Create MLP(3, [4, 4, 1]). Do a forward pass on sample data. Count the parameters.

⏭️ DO IF TIME (quick · surface-level only) — Visualize the computation graph using graphviz. Karpathy’s draw_dot() utility makes this easy and deeply satisfying.
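If graphviz isn't available, a plain-text walk of the graph reveals the same structure. This is a hypothetical stand-in for draw_dot(), assuming Value nodes record their children in _prev and their operation in _op, as in Karpathy's engine (a minimal Value is included so the sketch runs standalone):

```python
# Minimal Value with just enough bookkeeping to trace a graph.
class Value:
    def __init__(self, data, _children=(), _op=''):
        self.data, self._prev, self._op = data, set(_children), _op

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other), '*')

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), '+')

def trace(root, depth=0):
    """Print the graph as an indented tree (text stand-in for draw_dot)."""
    op = f" [{root._op}]" if root._op else ""
    print("  " * depth + f"{root.data:.2f}{op}")
    for child in root._prev:
        trace(child, depth + 1)

y = Value(2.0) * Value(3.0) + Value(1.0)
trace(y)  # root line is "7.00 [+]"; children are indented below it
```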

🖐 DO CAREFULLY (slow but worth it · builds deep intuition) — Call backward() on the output and inspect the gradients of every weight. Understand which weights have large vs. small gradients, and why.
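Why some weights get larger gradients than others can be checked by hand, without the engine: for y = tanh(w·x + b), the chain rule gives ∂y/∂wᵢ = (1 − tanh²(s))·xᵢ, so each weight's gradient scales with the input it multiplies. A sketch with made-up numbers:

```python
import math

# Hand-computed gradients for one neuron: y = tanh(w·x + b).
# dy/dw_i = (1 - tanh(s)^2) * x_i, so weights on large inputs get large gradients.
x = [0.1, 2.0, -3.0]   # sample inputs (arbitrary illustration values)
w = [0.5, -0.4, 0.2]   # sample weights
b = 0.1

s = sum(wi * xi for wi, xi in zip(w, x)) + b  # raw sum w·x + b
t = math.tanh(s)
grads = [(1 - t**2) * xi for xi in x]         # dy/dw_i for each weight
for xi, g in zip(x, grads):
    print(f"input {xi:+.1f} -> grad {g:+.4f}")
```

The weight attached to the input of largest magnitude (−3.0) receives the largest gradient, exactly the pattern backward() should reproduce on the full network.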

🚫 AVOID TODAY (slow · surface-level only) — Writing a training loop or loss function. That’s tomorrow. Today is purely about building and understanding the forward-pass architecture.

V. Today’s Deliverables

A neuron is the atom of intelligence. Alone, it computes a weighted sum and squishes it. But composed into layers, layers into networks, something remarkable emerges: the capacity to approximate any function. Today you built the smallest possible version of this. Tomorrow, you teach it to learn. — Day 4 Closing Principle
Day 4 Notebook — Building a Neuron & MLP from Scratch (runnable Python)

Neuron, Layer, and MLP classes built on the Value engine. Forward pass, parameter inspection, and architecture exploration.