Accelerated Gradient Convex Simplified

Source file

wc_accelerated_gradient_convex_simplified(mu, L, n; solver=Clarabel.Optimizer, verbose=true)

Problem statement

Compute a PEPit worst-case guarantee for wc_accelerated_gradient_convex_simplified.

Consider the convex minimization problem

\[f_\star \triangleq \min_x f(x),\]

where $f$ is $L$-smooth and $\mu$-strongly convex ($\mu$ is possibly 0).

Performance metric

This code computes a worst-case guarantee for an accelerated gradient method, a.k.a. fast gradient method with a set of classical slightly simplified sets of coefficients compared to the original [1]. That is, the code computes the smallest possible $\tau(n, L, \mu)$ such that the guarantee

\[f(x_n) - f_\star \leqslant \tau(n, L, \mu) \|x_0 - x_\star\|^2\]

is valid, where $x_n$ is the output of the accelerated gradient method below, and where $x_\star$ is the minimizer of $f$. In short, for given values of $n$, $L$ and $\mu$, $\tau(n, L, \mu)$ is computed as the worst-case value of $f(x_n)-f_\star$ when $\|x_0 - x_\star\|^2 \leqslant 1$.

Algorithm

The accelerated gradient method of this example is provided by

\[ \begin{aligned} x_{t+1} & = & y_t - \frac{1}{L} \nabla f(y_t) \\ y_{t+1} & = & x_{t+1} + \frac{t-1}{t+2} (x_{t+1} - x_t). \end{aligned}\]

Theoretical guarantee

When $\mu=0$, a tight empirical guarantee can be found in [2, Table 1]:

\[f(x_n)-f_\star \leqslant \frac{2L\|x_0-x_\star\|^2}{n^2 + 5 n + 6},\]

where tightness is obtained on some Huber loss functions.

References

[1] Y. Nesterov (1983). A method for solving the convex programming problem with convergence rate O(1/k^2). In Dokl. akad. nauk Sssr (Vol. 269, pp. 543-547).

[2] A. Taylor, J. Hendrickx, F. Glineur (2017). Exact worst-case performance of first-order methods for composite convex optimization. SIAM Journal on Optimization, 27(3):1283-1313.

Arguments

  • mu: strong convexity or monotonicity parameter, as used by the modeled class.
  • L: smoothness or Lipschitz parameter, as used by the modeled class.
  • n: number of iterations.
  • solver: JuMP optimizer constructor used to solve the generated SDP.
  • verbose: print example and solver progress information when true.

Returns

  • pepit_tau: worst-case value
  • theoretical_tau: theoretical value

Julia usage

pepit_tau, theoretical_tau = wc_accelerated_gradient_convex_simplified(0, 1, 1; solver=Clarabel.Optimizer, verbose=true)
## Returns approximately: (pepit_tau, theoretical_tau) = (0.166667, 0.166667)