Inexact Gradient Descent

Source file

wc_inexact_gradient_descent(L, mu, epsilon, n; solver=Clarabel.Optimizer, verbose=true)

Problem statement

Compute a PEPit worst-case guarantee for wc_inexact_gradient_descent.

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where $f$ is $L$-smooth and $\mu$-strongly convex.

Performance metric

This code computes a worst-case guarantee for the inexact gradient method. That is, it computes the smallest possible $\tau(n, L, \mu, \varepsilon)$ such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L, \mu, \varepsilon) (f(x_0) - f_\star)\]

is valid, where $x_n$ is the output of the inexact gradient method, and where $x_\star$ is the minimizer of $f$. In short, for given values of $n$, $L$, $\mu$ and $\varepsilon$, $\tau(n, L, \mu, \varepsilon)$ is computed as the worst-case value of $f(x_n)-f_\star$ when $f(x_0) - f_\star \leqslant 1$.

Algorithm

\[x_{t+1} = x_t - \gamma d_t\]

with

\[\|d_t - \nabla f(x_t)\| \leqslant \varepsilon \|\nabla f(x_t)\|\]

and

\[\gamma = \frac{2}{L_{\varepsilon} + \mu_{\varepsilon}}\]

where $L_{\varepsilon} = (1 + \varepsilon) L$ and $\mu_{\varepsilon} = (1 - \varepsilon) \mu$.

Theoretical guarantee

The tight worst-case guarantee obtained in [1, Theorem 5.3] or [2, Remark 1.6] is

\[f(x_n) - f_\star \leqslant \left(\frac{L_{\varepsilon}-\mu_{\varepsilon}}{L_{\varepsilon}+\mu_{\varepsilon}}\right)^{2n}(f(x_0) - f_\star),\]

where tightness is achieved on simple quadratic functions.

References

The detailed analyses can be found in [1, 2].

[1] E. De Klerk, F. Glineur, A. Taylor (2020). Worst-case convergence analysis of inexact gradient and Newton methods through semidefinite programming performance estimation. SIAM Journal on Optimization, 30(3), 2053-2082.

[2] O. Gannot (2021). A frequency-domain analysis of inexact gradient methods. Mathematical Programming.

Arguments

L: smoothness or Lipschitz parameter, as used by the modeled class.
mu: strong convexity or monotonicity parameter, as used by the modeled class.
epsilon: level of inaccuracy.
n: number of iterations.
solver: JuMP optimizer constructor used to solve the generated SDP.
verbose: print example and solver progress information when true.

Returns

pepit_tau: worst-case value
theoretical_tau: theoretical value

Julia usage

pepit_tau, theoretical_tau = wc_inexact_gradient_descent(1.0, 0.1, 0.1, 2; solver=Clarabel.Optimizer, verbose=true)
## Returns approximately: (pepit_tau, theoretical_tau) = (0.518917, 0.518917)