Machine Learning
CMPT 726 / 419
Announcements

Deadline for the academic dishonesty amnesty application is this coming Sunday, March 20th. If you committed an academic offence and do not fill out the form, the standard penalty will apply (at least a zero, i.e. -100%, for a single offence; an F in the course for multiple offences), without exception.

Recap

Nesterov's Accelerated Gradient (NAG)

Gradient Descent with Momentum:
θ⃗(0) ← random vector
Δθ⃗ ← 0⃗
for t = 1, 2, 3, …
    Δθ⃗ ← α Δθ⃗ − γₜ ∂L/∂θ⃗ (θ⃗(t−1))
    θ⃗(t) ← θ⃗(t−1) + Δθ⃗
    if algorithm has converged, return θ⃗(t)

Nesterov's Accelerated Gradient:
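The momentum update above can be sketched in a few lines of NumPy. The quadratic loss below is a hypothetical stand-in (not from the slides), chosen because its gradient is easy to write down: for L(θ) = ½ θᵀAθ, the gradient is Aθ and the minimum is at θ = 0.

```python
import numpy as np

# Hypothetical quadratic loss L(theta) = 0.5 * theta^T A theta (an assumed
# example, not the function from the slides); its gradient is A @ theta.
A = np.array([[1.0, 0.0],
              [0.0, 10.0]])

def grad(theta):
    return A @ theta

def momentum_gd(theta0, alpha=0.9, gamma=0.05, steps=300):
    """Gradient descent with momentum, mirroring the pseudocode above."""
    theta = theta0.astype(float)
    delta = np.zeros_like(theta)          # Δθ, the accumulated "velocity"
    for _ in range(steps):
        # Blend inertia (α Δθ) with the fresh gradient at the CURRENT point.
        delta = alpha * delta - gamma * grad(theta)
        theta = theta + delta
    return theta

theta = momentum_gd(np.array([5.0, 5.0]))
```

With these (assumed) settings the iterate ends up very close to the minimizer at the origin; the velocity term lets successive gradients in the same direction accumulate.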
θ⃗(0) ← random vector
Δθ⃗ ← 0⃗
for t = 1, 2, 3, …
    Δθ⃗ ← α Δθ⃗ − γₜ ∂L/∂θ⃗ (θ⃗(t−1) + α Δθ⃗)
    θ⃗(t) ← θ⃗(t−1) + Δθ⃗
    if algorithm has converged, return θ⃗(t)

Lookahead: account for the inertia when choosing where to compute the gradient.

Adaptive Gradient Methods

Recall the function .
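The only change from plain momentum is where the gradient is evaluated: NAG computes it at the lookahead point θ + αΔθ, the point inertia is about to carry us to. A minimal sketch, reusing the same hypothetical quadratic loss as before (an assumed example, not from the slides):

```python
import numpy as np

# Hypothetical quadratic loss L(theta) = 0.5 * theta^T A theta;
# its gradient is A @ theta.
A = np.array([[1.0, 0.0],
              [0.0, 10.0]])

def grad(theta):
    return A @ theta

def nag(theta0, alpha=0.9, gamma=0.05, steps=300):
    """Nesterov's Accelerated Gradient, mirroring the pseudocode above."""
    theta = theta0.astype(float)
    delta = np.zeros_like(theta)
    for _ in range(steps):
        # Lookahead: evaluate the gradient at theta + alpha * delta,
        # not at theta itself, so the step "corrects" for inertia.
        delta = alpha * delta - gamma * grad(theta + alpha * delta)
        theta = theta + delta
    return theta

theta = nag(np.array([5.0, 5.0]))
```

The lookahead gradient acts like a correction term: if momentum is about to overshoot, the gradient at the lookahead point already points back, damping oscillations compared to plain momentum.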
The eigenvalues of the Hessian (which in this case are the diagonal entries) have very different magnitudes. A single learning rate must be small enough for the steepest direction, which makes progress along the shallow directions very slow; this motivates adapting the step size per parameter.