Perceptron: Learning y = x²

Weights (the knobs)

w₀ 2.00

w₁ 1.00

w₂ 0.00

Loss (MSE)

—

Step 0 · LR: · Points:

Show Data Table

x	True y	Predicted ŷ	Error (y − ŷ)	(y − ŷ)²
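The table columns and the MSE readout can be reproduced by hand. The sketch below is only an illustration, not the page's actual code: it assumes the model is the quadratic ŷ = w₀ + w₁x + w₂x² (the page does not say which weight multiplies which term) and uses five made-up sample points. The names `predict` and `data_table` are hypothetical.

```python
# Assumption: y_hat = w0 + w1*x + w2*x**2 and the target is y = x**2.
def predict(w, x):
    w0, w1, w2 = w
    return w0 + w1 * x + w2 * x * x

def data_table(w, xs):
    """Return rows (x, y, y_hat, error, squared error) and the MSE."""
    rows = []
    for x in xs:
        y = x * x                    # true value from y = x^2
        y_hat = predict(w, x)
        err = y - y_hat              # same sign convention as the table: y - y_hat
        rows.append((x, y, y_hat, err, err * err))
    mse = sum(r[4] for r in rows) / len(rows)
    return rows, mse

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]     # hypothetical sample points
rows, mse = data_table((2.0, 1.0, 0.0), xs)   # starting weights shown above
for r in rows:
    print("\t".join(f"{v:.2f}" for v in r))
print(f"MSE = {mse:.2f}")
```

The MSE is just the mean of the last column, so a single badly fitted point can dominate the loss.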

Gradients (the Jacobian)

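The Jacobian (∂L/∂wⱼ) is the gradient of the loss with respect to each weight. Because the loss is a single number, it is simply the list of partial derivatives, one per weight. A gradient step moves each weight by Δwⱼ = −LR × ∂L/∂wⱼ: the minus sign means we walk downhill, and the learning rate (LR) scales how big the step is. A gradient near zero means that weight is already close to its best value along its own axis, with the other weights held fixed.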
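One way to see the gradient step in numbers is the sketch below. It is a hedged illustration rather than the page's implementation: the model form ŷ = w₀ + w₁x + w₂x², the five sample points, and the learning rate 0.05 are all assumptions, and `mse_gradient` and `gradient_step` are hypothetical names.

```python
# Assumption: y_hat = w0 + w1*x + w2*x**2 and the target is y = x**2.
def features(x):
    return (1.0, x, x * x)

def mse_gradient(w, xs):
    """dL/dw_j = -(2/N) * sum((y - y_hat) * feature_j) for the MSE loss."""
    n = len(xs)
    grad = [0.0, 0.0, 0.0]
    for x in xs:
        f = features(x)
        err = x * x - sum(wj * fj for wj, fj in zip(w, f))   # y - y_hat
        for j in range(3):
            grad[j] += -2.0 * err * f[j] / n
    return grad

def gradient_step(w, xs, lr):
    """Return (new weights, per-weight deltas) with delta_j = -lr * dL/dw_j."""
    grad = mse_gradient(w, xs)
    deltas = [-lr * g for g in grad]
    return [wj + d for wj, d in zip(w, deltas)], deltas

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]     # hypothetical sample points
w = [2.0, 1.0, 0.0]                  # starting weights shown on the page
grad = mse_gradient(w, xs)
w_new, deltas = gradient_step(w, xs, lr=0.05)   # lr chosen for illustration
print("gradient:", grad)
print("deltas:  ", deltas)
print("new w:   ", w_new)
```

Under these assumptions ∂L/∂w₀ comes out as zero even though w₀ = 2 is not its final best value: a zero gradient only says that w₀ cannot be improved while w₁ and w₂ stay where they are.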

Gradient	Magnitude (∂L/∂wⱼ)	Direction	Next step Δwⱼ
∂L/∂w₀	—	—	—
∂L/∂w₁	—	—	—
∂L/∂w₂	—	—	—

Drag the weight sliders and watch how the curves, loss, and gradients change!

Gradient Slices (Loss vs each weight)

Each plot shows how the loss changes when we move one weight while keeping the others fixed. The red dot is your current position, the dashed line is the tangent (gradient slope), and the orange step vector goes from the red dot to the blue triangle ▲ (next point) — i.e. w_new = w − LR × ∂L/∂w.
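The slice, its tangent slope, and the next point can be approximated numerically. The sketch below is an illustration under assumptions, not the page's code: it assumes ŷ = w₀ + w₁x + w₂x², five made-up sample points, and LR = 0.05, and it estimates the slope with a central finite difference instead of an analytic gradient. All function names are hypothetical.

```python
# Assumption: y_hat = w0 + w1*x + w2*x**2 and the target is y = x**2.
XS = [-2.0, -1.0, 0.0, 1.0, 2.0]     # hypothetical sample points

def mse(w, xs=XS):
    total = 0.0
    for x in xs:
        y_hat = w[0] + w[1] * x + w[2] * x * x
        total += (x * x - y_hat) ** 2
    return total / len(xs)

def slice_loss(w, j, value):
    """Loss when weight j is set to `value` and the others stay fixed."""
    moved = list(w)
    moved[j] = value
    return mse(moved)

def slice_slope(w, j, h=1e-5):
    """Tangent slope of the slice at the current weight (central difference)."""
    return (slice_loss(w, j, w[j] + h) - slice_loss(w, j, w[j] - h)) / (2 * h)

def next_point(w, j, lr):
    """Blue-triangle position on slice j: w_new = w - LR * dL/dw."""
    w_new = w[j] - lr * slice_slope(w, j)
    return w_new, slice_loss(w, j, w_new)

w = [2.0, 1.0, 0.0]                  # starting weights shown on the page
lr = 0.05                            # illustrative learning rate
for j in range(3):
    red = (w[j], mse(w))             # current position on the slice
    blue = next_point(w, j, lr)      # where one step along this axis lands
    print(f"w{j}: slope={slice_slope(w, j):+.3f}  red={red}  blue={blue}")
```

Because the assumed model is linear in each weight, every slice is a parabola, so the finite-difference slope agrees with the analytic gradient up to rounding.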

Loss vs w₀

Loss vs w₁

Loss vs w₂

Loss Landscape (3D surface)

The red dot marks where you are now. Gradient descent = rolling downhill. Drag to rotate!

Hold fixed: w₁ = 1.00

View:

Loss Isosurface (all 3 weights)

The loss is a function of all three weights, L(w₀, w₁, w₂), so its full graph lives in 4D and cannot be drawn directly. An isosurface shows all weight combinations that produce the same loss. As you lower the threshold, the shell shrinks toward the optimal point; gradient descent moves inward through these nested shells.
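The nesting of the shells can be checked numerically by testing which weight combinations fall at or below each threshold (the isosurface itself is the boundary where the loss equals the threshold). This is a rough sketch under assumptions: the model form ŷ = w₀ + w₁x + w₂x², the sample points, the weight grid, and the thresholds other than 5.0 are all made up for illustration.

```python
# Assumption: y_hat = w0 + w1*x + w2*x**2 and the target is y = x**2.
XS = [-2.0, -1.0, 0.0, 1.0, 2.0]     # hypothetical sample points

def mse(w):
    return sum((x * x - (w[0] + w[1] * x + w[2] * x * x)) ** 2 for x in XS) / len(XS)

def grid(lo=-3.0, hi=3.0, steps=13):
    """Evenly spaced weight values from lo to hi inclusive."""
    return [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]

def inside_shell(threshold):
    """Grid points whose loss is at or below the threshold."""
    axis = grid()
    return {(a, b, c) for a in axis for b in axis for c in axis
            if mse((a, b, c)) <= threshold}

# Lowering the threshold should leave fewer grid points inside the shell.
shells = {t: inside_shell(t) for t in (5.0, 2.0, 0.5)}
for t, pts in shells.items():
    print(f"threshold {t}: {len(pts)} grid points inside")
```

Each smaller shell is a subset of the larger one, and under the assumed model the zero-loss point (w₀, w₁, w₂) = (0, 0, 1) sits inside all of them.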

Loss threshold:

5.0

Nested shells

Loss Over Time

No steps yet
