Problem 1
Let $h_3 = \hat{y}$ for consistency in notation. Assume all vectors are column vectors. First we calculate:
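$$
\frac{\partial \ell}{\partial h_{3,i}} = -\frac{2}{N}\,(y_i - h_{3,i}),
\qquad
\frac{\partial \ell}{\partial h_3} = -\frac{2}{N}\,(y - h_3)^\top =: \delta_3^\top
\quad (\text{shape } 1 \times D_3)
$$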
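As a quick sanity check, the loss gradient above can be compared against central finite differences. This is only a minimal sketch: it assumes the loss is $\ell = \frac{1}{N}\sum_i (y_i - h_{3,i})^2$ (inferred from the gradient, not stated explicitly), and the values of `y`, `h3`, and `N` are made up for illustration.

```python
# Assumption: l = (1/N) * sum_i (y_i - h3_i)^2, with N a fixed normalizing constant.
def loss(y, h3, N):
    return sum((yi - hi) ** 2 for yi, hi in zip(y, h3)) / N

def loss_grad(y, h3, N):
    """Analytic gradient: dl/dh3_i = -(2/N) * (y_i - h3_i)."""
    return [-2.0 / N * (yi - hi) for yi, hi in zip(y, h3)]

def numeric_grad(y, h3, N, eps=1e-6):
    """Central finite differences, one coordinate of h3 at a time."""
    grad = []
    for i in range(len(h3)):
        up = h3[:i] + [h3[i] + eps] + h3[i + 1:]
        down = h3[:i] + [h3[i] - eps] + h3[i + 1:]
        grad.append((loss(y, up, N) - loss(y, down, N)) / (2 * eps))
    return grad

# Hypothetical example values.
y = [1.0, -2.0, 0.5]
h3 = [0.3, 0.1, -0.7]
N = 4

analytic = loss_grad(y, h3, N)
numeric = numeric_grad(y, h3, N)
max_err = max(abs(a - b) for a, b in zip(analytic, numeric))
print(max_err)  # should be close to zero
```

Because the loss is quadratic in $h_3$, the central difference agrees with the analytic gradient up to floating-point rounding.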
Then we consider the tree of computation as in class. Let $a_k = W_k^\top h_{k-1} + b_k$.
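- $\frac{\partial h_3}{\partial W_3^\top} = h_2$ to get $\frac{\partial \ell}{\partial W_3^\top} = h_2 \delta_3^\top$ 🔴
- $\frac{\partial h_3}{\partial b_3} = 1$ to get $\frac{\partial \ell}{\partial b_3} = \delta_3^\top$ 🔴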
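- $\frac{\partial h_3}{\partial h_2} = W_3^\top$ to get $\frac{\partial \ell}{\partial h_2} = \delta_3^\top W_3^\top =: \delta_2^\top$, i.e. $\delta_2 = W_3 \delta_3$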
Next layer:
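- $\frac{\partial h_2}{\partial a_2} = \frac{\partial g(\cdot)}{\partial (\cdot)} = \operatorname{diag}\big(g'(a_2)\big)$, since $h_2 = g(a_2)$ elementwise
- $\frac{\partial a_2}{\partial h_1} = W_2^\top$ to get $\frac{\partial \ell}{\partial h_1} = \big(g'(a_2) \odot \delta_2\big)^\top W_2^\top =: \delta_1^\top$, i.e. $\delta_1 = W_2 \big(g'(a_2) \odot \delta_2\big)$
- $\frac{\partial a_2}{\partial W_2^\top} = h_1$ to get $\frac{\partial \ell}{\partial W_2^\top} = h_1 \big(g'(a_2) \odot \delta_2\big)^\top$ 🔴
- $\frac{\partial a_2}{\partial b_2} = 1$ to get $\frac{\partial \ell}{\partial b_2} = \big(g'(a_2) \odot \delta_2\big)^\top$ 🔴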
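Similarly, with $h_1 = g(a_1)$ and $h_0 = x$:
- get $\frac{\partial \ell}{\partial W_1^\top} = h_0 \big(g'(a_1) \odot \delta_1\big)^\top = x \big(g'(a_1) \odot \delta_1\big)^\top$ 🔴
- get $\frac{\partial \ell}{\partial b_1} = \big(g'(a_1) \odot \delta_1\big)^\top$ 🔴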
Results are marked with 🔴.
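The chain-rule gradients can be checked numerically. The sketch below is not part of the assignment and rests on several assumptions: the network is $h_1 = g(a_1)$, $h_2 = g(a_2)$, $h_3 = a_3$ with $a_k = W_k^\top h_{k-1} + b_k$; the loss is $\ell = \frac{1}{N}\lVert y - h_3\rVert^2$; and $g = \tanh$ is chosen purely for illustration. Here $\delta_k$ denotes $\partial \ell / \partial h_k$ as a column vector, so each hidden layer contributes an elementwise factor $g'(a_k)$ before the outer product with $h_{k-1}$. All function names and dimensions are hypothetical.

```python
import math
import random

def affine(W, b, h):
    """a = W^T h + b, with W stored as a list of rows (shape D_in x D_out)."""
    return [sum(W[i][j] * h[i] for i in range(len(h))) + b[j] for j in range(len(b))]

def forward(params, x):
    (W1, b1), (W2, b2), (W3, b3) = params
    a1 = affine(W1, b1, x)
    h1 = [math.tanh(v) for v in a1]   # assumed activation g = tanh
    a2 = affine(W2, b2, h1)
    h2 = [math.tanh(v) for v in a2]
    h3 = affine(W3, b3, h2)           # assumed linear output layer
    return a1, h1, a2, h2, h3

def loss(params, x, y, N):
    h3 = forward(params, x)[-1]
    return sum((yi - hi) ** 2 for yi, hi in zip(y, h3)) / N

def matvec(W, v):
    """W v for W of shape D_in x D_out and v of length D_out."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in W]

def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]

def backward(params, x, y, N):
    """Return [(dW1, db1), (dW2, db2), (dW3, db3)], each dW_k shaped like W_k."""
    (W1, b1), (W2, b2), (W3, b3) = params
    a1, h1, a2, h2, h3 = forward(params, x)
    d3 = [-2.0 / N * (yi - hi) for yi, hi in zip(y, h3)]        # delta_3
    d2 = matvec(W3, d3)                                          # delta_2 = W3 delta_3
    s2 = [(1 - math.tanh(a) ** 2) * d for a, d in zip(a2, d2)]   # g'(a2) * delta_2
    d1 = matvec(W2, s2)                                          # delta_1 = W2 (g'(a2) * delta_2)
    s1 = [(1 - math.tanh(a) ** 2) * d for a, d in zip(a1, d1)]   # g'(a1) * delta_1
    return [(outer(x, s1), s1), (outer(h1, s2), s2), (outer(h2, d3), d3)]

def finite_diff(params, x, y, N, vec, i, eps=1e-6):
    """Central difference of the loss w.r.t. vec[i], perturbed in place and restored."""
    old = vec[i]
    vec[i] = old + eps
    up = loss(params, x, y, N)
    vec[i] = old - eps
    down = loss(params, x, y, N)
    vec[i] = old
    return (up - down) / (2 * eps)

# Hypothetical sizes D0..D3 and normalizer N.
random.seed(0)
dims = [4, 5, 3, 2]
N = 2
params = [
    ([[random.uniform(-1, 1) for _ in range(d_out)] for _ in range(d_in)],
     [random.uniform(-1, 1) for _ in range(d_out)])
    for d_in, d_out in zip(dims[:-1], dims[1:])
]
x = [random.uniform(-1, 1) for _ in range(dims[0])]
y = [random.uniform(-1, 1) for _ in range(dims[-1])]

grads = backward(params, x, y, N)
max_err = 0.0
for (W, b), (gW, gb) in zip(params, grads):
    for i, row in enumerate(W):
        for j in range(len(row)):
            max_err = max(max_err, abs(gW[i][j] - finite_diff(params, x, y, N, row, j)))
    for j in range(len(b)):
        max_err = max(max_err, abs(gb[j] - finite_diff(params, x, y, N, b, j)))
print(max_err)  # should be close to zero if the analytic gradients are right
```

Dropping either `g'` factor (`s2` or `s1`) makes the check fail for the first two layers, which is a convenient way to see where the activation derivative enters.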