Machine Learning

Gradient Descent

Descend to the lowest point of the loss function so that the parameters of a model are optimized to best fit the data.

How It Works

  1. Calculate the gradient of the loss function, i.e. take the derivative with respect to each of the parameters.

  2. Select random initial values for the parameters.

  3. Plug the parameter values into the derivatives, i.e. the gradient.

  4. Calculate the step size: step size = derivative × learning rate.

  5. Calculate the new parameters: new parameter = old parameter − step size.

  6. Repeat from step 3 until the step size is very small (close to zero) or the maximum number of steps has been reached.
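The steps above can be sketched as a simple linear regression fit. The data, learning rate, starting values, and stopping threshold below are all made-up values for illustration:

```python
# Gradient descent sketch: fit y = intercept + slope * x by minimizing
# the sum of squared residuals. All numbers here are illustrative assumptions.

def gradient_descent(xs, ys, learning_rate=0.01, max_steps=10000, tol=1e-6):
    intercept, slope = 0.0, 1.0  # start from (arbitrary) initial values
    for _ in range(max_steps):   # repeat until steps are tiny or limit is hit
        # gradient of the sum of squared residuals w.r.t. each parameter
        d_intercept = sum(-2 * (y - (intercept + slope * x))
                          for x, y in zip(xs, ys))
        d_slope = sum(-2 * x * (y - (intercept + slope * x))
                      for x, y in zip(xs, ys))
        # step size = derivative * learning rate
        step_i = d_intercept * learning_rate
        step_s = d_slope * learning_rate
        # new parameter = old parameter - step size
        intercept -= step_i
        slope -= step_s
        if max(abs(step_i), abs(step_s)) < tol:  # close to zero: stop
            break
    return intercept, slope
```

For example, `gradient_descent([0.5, 2.3, 2.9], [1.4, 1.9, 3.2])` converges to roughly `intercept ≈ 0.95, slope ≈ 0.64`, matching the least-squares solution for that toy data.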

Beware

When there are millions of data points, every step must evaluate the gradient over all of them, so gradient descent can take a long time. This is where stochastic gradient descent comes in: it estimates the gradient from a random subset of the data (often a single point) at each step.
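A stochastic variant can be sketched by using the gradient of one randomly chosen point per step instead of the whole dataset. Again, the data, learning rate, step count, and seed are made-up values:

```python
import random

# Stochastic gradient descent sketch for y = intercept + slope * x:
# each update uses ONE randomly chosen point, so the cost per step is
# constant regardless of dataset size. Settings here are illustrative.

def sgd(xs, ys, learning_rate=0.01, steps=5000, seed=0):
    rng = random.Random(seed)
    intercept, slope = 0.0, 1.0
    for _ in range(steps):
        i = rng.randrange(len(xs))  # pick a single random data point
        residual = ys[i] - (intercept + slope * xs[i])
        # gradient of that one point's squared residual
        intercept -= learning_rate * (-2 * residual)
        slope -= learning_rate * (-2 * xs[i] * residual)
    return intercept, slope
```

Because each step sees only one point, the parameters bounce around near the minimum rather than settling exactly on it; in practice the learning rate is often decayed over time to tighten the final estimate.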

References

StatQuest: Gradient Descent, Step-by-Step