Gradient Descent OldKiwi - Rhea

Gradient descent algorithms are iterative methods for finding the minimum of a function in $\Re^n$ , where n is usually at least 3.

The general form of the algorithm is: $x^{(k+1)} = x^{(k)} - \alpha_k \nabla f(x^{(k)})$

Where (k) represents the current iteration and (k+1) the next. $\alpha_k$ is referred to as the step size and is itself a function that differs between the various gradient descent algorithms. Lastly, $\nabla f(x^{(k)})$ is the gradient_OldKiwi of the function f evaluated at the current x and is referred to as the direction of the algorithm. The algorithm terminates when the gradient becomes equal to a vector of 0's, which indicates that a local minimum has been reached.

For an example of a common $\alpha_k$ , let us examine the steepest descent algorithm, which is simply one of many gradient descent algorithms. If the function is quadratic, $\alpha_k = \frac{\nabla f(x^{(k)})^T \nabla f(x^{(k)})}{\nabla f(x^{(k)})^T F(x^{(k)}) \nabla f(x^{(k)})}$ .

Where F(x) is the Hessian_OldKiwi of f(x).

Gradient Descent OldKiwi - Rhea

Alumni Liaison