Point Saga

Source file

wc_point_saga(L, mu, n; solver=Clarabel.Optimizer, verbose=true)

Problem statement

Compute a PEPit worst-case guarantee for wc_point_saga.

Consider the finite sum minimization problem

\[F^\star \triangleq \min_x \left\{F(x) \equiv \frac{1}{n} \sum_{i=1}^n f_i(x)\right\},\]

where $f_1, \dots, f_n$ are $L$-smooth and $\mu$-strongly convex, and with proximal operator readily available. In the sequel, we use the notation $\mathbb{E}$ for denoting the expectation over the uniform distribution of the index $i \sim \mathcal{U}\left([|1, n|]\right)$, e.g., $F(x)\equiv \mathbb{E}[f_i(x)]$.

Performance metric

This code computes a tight (one-step) worst-case guarantee using a Lyapunov function for Point SAGA [1]. The Lyapunov (or energy) function at a point $x$ is given in [1, Theorem 5]:

\[V(x) = \frac{1}{L \mu}\frac{1}{n} \sum_{i \leq n} \|\nabla f_i(x) - \nabla f_i(x_\star)\|^2 + \|x - x^\star\|^2,\]

where $x^\star$ denotes the minimizer of $F$. The code computes the smallest possible $\tau(n, L, \mu)$ such that the guarantee (in expectation):

\[\mathbb{E}\left[V\left(x^{(1)}\right)\right] \leqslant \tau(n, L, \mu) V\left(x^{(0)}\right),\]

is valid (note that we use the notation $x^{(0)},x^{(1)}$ to denote two consecutive iterates for convenience; as the bound is valid for all $x^{(0)}$, it is also valid for any pair of consecutive iterates of the algorithm).

In short, for given values of $n$, $L$, and $\mu$, $\tau(n, L, \mu)$ is computed as the worst-case value of $\mathbb{E}\left[V\left(x^{(1)}\right)\right]$ when $V\left(x^{(0)}\right) \leqslant 1$.

Algorithm

Point SAGA is described by

\[\begin{aligned} \text{Set }\gamma & = & \frac{\sqrt{(n - 1)^2 + 4n\frac{L}{\mu}}}{2Ln} - \frac{\left(1 - \frac{1}{n}\right)}{2L} \\ \text{Pick random }j & \sim & \mathcal{U}\left([|1, n|]\right) \\ z^{(t)} & = & x_t + \gamma \left(g_j^{(t)} - \frac{1}{n} \sum_{i\leq n}g_i^{(t)} \right), \\ x^{(t+1)} & = & \mathrm{prox}_{\gamma f_j}(z^{(t)})\triangleq \arg\min_x\left\{ \gamma f_j(x)+\frac{1}{2} \|x-z^{(t)}\|^2 \right\}, \\ g_j^{(t+1)} & = & \frac{1}{\gamma}(z^{(t)} - x^{(t+1)}). \end{aligned}\]

L: smoothness or Lipschitz parameter, as used by the modeled class.
mu: strong convexity or monotonicity parameter, as used by the modeled class.
n: number of iterations.
solver: JuMP optimizer constructor used to solve the generated SDP.
verbose: print example and solver progress information when true.

Returns

pepit_tau: worst-case value
theoretical_tau: theoretical value

Julia usage

pepit_tau, theoretical_tau = wc_point_saga(1.0, 0.01, 10; solver=Clarabel.Optimizer, verbose=true)
## Returns approximately: (pepit_tau, theoretical_tau) = (0.971405, 0.973292)

Point Saga

Problem statement

Performance metric

Algorithm

Theoretical guarantee

References

Arguments

Returns

Julia usage