laGrenouille 4 hours ago
Great visualizations. Really enjoyed having a well-written example where mathematical proofs directly help with understanding a practical application.
I wonder what would happen with this analysis if a momentum term were added to the gradient descent. It seems that it would fix the specific failure modes in the examples, but I wonder if there's a corresponding mathematical way of categorizing what kinds of functions can(not) be quickly optimized with GD + momentum.
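For anyone who wants to poke at this, here is a minimal sketch comparing plain gradient descent with heavy-ball momentum on an ill-conditioned quadratic. Everything in it is an assumption chosen for illustration and not taken from the article: the test function f(x) = 0.5 * (x0^2 + 100 * x1^2), the step size 0.01, the momentum coefficient 0.9, and the helper name `minimize`. It does not answer the categorization question; it only shows the effect on one simple case.

```python
def grad(x, h):
    # Gradient of f(x) = 0.5 * sum(h_i * x_i^2).
    return [hi * xi for hi, xi in zip(h, x)]


def minimize(h, x0, lr, beta=0.0, tol=1e-6, max_iter=100_000):
    """Heavy-ball gradient descent; beta=0.0 reduces to plain GD.

    Returns the number of steps taken until every coordinate is within
    tol of the minimizer at the origin, or max_iter if that never happens.
    """
    x = list(x0)
    v = [0.0] * len(x)  # velocity (accumulated past steps)
    for step in range(max_iter):
        if max(abs(xi) for xi in x) < tol:
            return step
        g = grad(x, h)
        # Velocity update: keep a fraction beta of the previous step.
        v = [beta * vi - lr * gi for vi, gi in zip(v, g)]
        x = [xi + vi for xi, vi in zip(x, v)]
    return max_iter


# Curvatures 1 and 100 give a condition number of 100 (illustrative choice).
h = [1.0, 100.0]
plain_steps = minimize(h, [1.0, 1.0], lr=0.01)
momentum_steps = minimize(h, [1.0, 1.0], lr=0.01, beta=0.9)
print("plain GD steps:   ", plain_steps)
print("momentum GD steps:", momentum_steps)
```

With these settings momentum should need noticeably fewer steps. For strongly convex quadratics the standard result is that well-tuned heavy-ball momentum improves the dependence on the condition number from roughly κ to roughly √κ, which is at least one precise statement in the direction of the question.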
xuzhenpeng 6 hours ago
The animation is very good and makes the article easy to understand.
Guestmodinfo 5 hours ago
We studied this in our preparation for college entrance exams in India, though the article goes into exhaustive detail. I thought this may be common, or almost common, knowledge. We used to call it the sandwich theorem.
thaumasiotes 5 hours ago
The sandwich theorem would normally refer to this one: https://en.wikipedia.org/wiki/Squeeze_theorem
quietbritishjim an hour ago
I immediately thought of the ham sandwich theorem.
CarVac 2 hours ago
Simplex methods can handle those tough situations, though.