One of these is the Adam optimizer performing actual gradient descent. The other is... weeeee
Source: Twitter
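
For what the non-weeeee animation is actually doing, here is a minimal NumPy sketch of the standard Adam update rule (the function name, learning rate, and beta/epsilon values are the usual textbook defaults, not anything specific to the animation):

```python
import numpy as np

def adam_step(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: moment estimates, bias correction, then the parameter step."""
    m = beta1 * m + (1 - beta1) * grads       # exponential moving average of gradients
    v = beta2 * v + (1 - beta2) * grads**2    # exponential moving average of squared gradients
    m_hat = m / (1 - beta1**t)                # bias-correct the first moment
    v_hat = v / (1 - beta2**t)                # bias-correct the second moment
    params = params - lr * m_hat / (np.sqrt(v_hat) + eps)  # scaled gradient step
    return params, m, v

# Toy usage: a few Adam steps on f(x) = x^2, whose gradient is 2x.
x, m, v = np.array([3.0]), np.zeros(1), np.zeros(1)
for t in range(1, 201):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.1)
print(x)  # approaches the minimum at 0
```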