One of these is the Adam optimizer performing actual gradient descent. The other is... weeeee

Source: Twitter