Inexact Accelerated Gradient

Source file

wc_inexact_accelerated_gradient(L, epsilon, n; solver=Clarabel.Optimizer, verbose=true)

Problem statement

Compute a PEPit worst-case guarantee for wc_inexact_accelerated_gradient.

Consider the minimization problem

\[f_\star \triangleq \min_x f(x),\]

where $f$ is $L$-smooth and convex.

Performance metric

This code computes a worst-case guarantee for an accelerated gradient method using inexact first-order information. That is, it computes the smallest possible $\tau(n, L, \varepsilon)$ such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L, \varepsilon) \|x_0 - x_\star\|^2\]

is valid, where $x_n$ is the output of inexact accelerated gradient descent and where $x_\star$ is a minimizer of $f$.

The inexact descent direction is assumed to satisfy a relative inaccuracy described by (with $0\leqslant \varepsilon \leqslant 1$)

\[\|\nabla f(y_t) - d_t\| \leqslant \varepsilon \|\nabla f(y_t)\|,\]

where $\nabla f(y_t)$ is the true gradient at $y_t$ and $d_t$ is the approximate descent direction that is used.

Algorithm

The inexact accelerated gradient method of this example is provided by

\[ \begin{aligned} x_{t+1} & = & y_t - \frac{1}{L} d_t\\ y_{k+1} & = & x_{t+1} + \frac{t-1}{t+2} (x_{t+1} - x_t). \end{aligned}\]

Theoretical guarantee

When $\varepsilon=0$, a tight empirical guarantee can be found in [1, Table 1]:

\[f(x_n)-f_\star \leqslant \frac{2L\|x_0-x_\star\|^2}{n^2 + 5 n + 6},\]

which is achieved on some Huber loss functions (when $\varepsilon=0$).

References

[1] A. Taylor, J. Hendrickx, F. Glineur (2017). Exact worst-case performance of first-order methods for composite convex optimization. SIAM Journal on Optimization, 27(3):1283-1313.

Arguments

  • L: smoothness or Lipschitz parameter, as used by the modeled class.
  • epsilon: level of inaccuracy
  • n: number of iterations.
  • solver: JuMP optimizer constructor used to solve the generated SDP.
  • verbose: print example and solver progress information when true.

Returns

  • pepit_tau: worst-case value
  • theoretical_tau: theoretical value

Julia usage

pepit_tau, theoretical_tau = wc_inexact_accelerated_gradient(1.0, 0.1, 5; solver=Clarabel.Optimizer, verbose=true)
## Returns approximately: (pepit_tau, theoretical_tau) = (0.039388, 0.035714)