ML0023 Gradient Descent

What is Gradient Descent in machine learning?

Answer

Gradient descent is an iterative optimization algorithm used to minimize a function, most commonly a cost or loss function in machine learning, by moving step-by-step in the direction of the steepest descent (i.e., opposite to the gradient).
In each iteration, the algorithm computes the gradient of the function with respect to its parameters, then updates the parameters by subtracting a fraction (the learning rate) of this gradient.
This process is repeated until the function converges to a minimum (which, for convex functions, is the global minimum) or until the updates become negligibly small.

The update rule for gradient descent is:

\theta = \theta - \alpha \nabla J(\theta)

where:
\theta represents the parameters being optimized (for example, the weights of a model).
\alpha is the learning rate.
\nabla J(\theta) is the gradient of the cost function J(\theta) with respect to \theta.
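The update rule above can be sketched in a few lines of Python. This is a minimal illustration, not a production optimizer: the objective J(\theta) = (\theta - 3)^2, its gradient 2(\theta - 3), the function name, and the learning rate are all example choices, not from the original text.

```python
def gradient_descent(grad, theta0, alpha=0.1, n_iters=100, tol=1e-8):
    """Minimize a 1-D function given its gradient, via gradient descent."""
    theta = theta0
    for _ in range(n_iters):
        step = alpha * grad(theta)   # fraction (learning rate) of the gradient
        theta = theta - step         # update rule: theta = theta - alpha * grad_J(theta)
        if abs(step) < tol:          # stop once updates become negligibly small
            break
    return theta

# Example: J(theta) = (theta - 3)^2 is convex, so descent reaches the
# global minimum at theta = 3; its gradient is 2 * (theta - 3).
grad_J = lambda theta: 2 * (theta - 3.0)
theta_min = gradient_descent(grad_J, theta0=0.0)
print(round(theta_min, 4))  # → 3.0
```

Because the objective here is convex, any sufficiently small learning rate converges to the same global minimum; for non-convex losses, gradient descent may instead settle in a local minimum.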

