Saga

Source file

wc_saga(L, mu, n; solver=Clarabel.Optimizer, verbose=true)

Problem statement

Compute a PEPit worst-case guarantee for wc_saga.

Consider the finite sum convex minimization problem

\[F_\star \triangleq \min_x \left\{F(x) \equiv h(x) + \frac{1}{n} \sum_{i=1}^{n} f_i(x)\right\},\]

where the functions $f_i$ are assumed to be $L$-smooth $\mu$-strongly convex, and $h$ is closed, proper, and convex with a proximal operator readily available. In the sequel, we use the notation $\mathbb{E}$ for denoting the expectation over the uniform distribution of the index $i \sim \mathcal{U}\left([|1, n|]\right)$, e.g., $F(x)\equiv h(x)+\mathbb{E}[f_i(x)]$.

Performance metric

This code computes the exact rate for a Lyapunov (or energy) function for SAGA [1]. That is, it computes the smallest possible $\tau(n,L,\mu)$ such this Lyapunov function decreases geometrically

\[\mathbb{E}[V^{(1)}] \leqslant \tau(n, L, \mu) V^{(0)},\]

where the value of the Lyapunov function at iteration $t$ is denoted by $V^{(t)}$ and is defined as

\[V^{(t)} \triangleq \frac{1}{n} \sum_{i=1}^n \left(f_i(\phi_i^{(t)}) - f_i(x^\star) - \langle \nabla f_i(x^\star); \phi_i^{(t)} - x^\star\rangle\right) + \frac{1}{2 n \gamma (1-\mu \gamma)} \|x^{(t)} - x^\star\|^2,\]

with $\gamma = \frac{1}{2(\mu n+L)}$ (this Lyapunov function was proposed in [1, Theorem 1]). We consider the case $t=0$ in the code below, without loss of generality.

In short, for given values of $n$, $L$, and $\mu$, $\tau(n, L, \mu)$ is computed as the worst-case value of $\mathbb{E}[V^{(1)}]$ when $V(x^{(0)}) \leqslant 1$.

Algorithm

One iteration of SAGA [1] is described as follows: at iteration $t$, pick $j\in\{1,\ldots,n\}$ uniformely at random and set:

\[ \begin{aligned} \phi_j^{(t+1)} & = & x^{(t)} \\ w^{(t+1)} & = & x^{(t)} - \gamma \left[ \nabla f_j (\phi_j^{(t+1)}) - \nabla f_j(\phi_j^{(t)}) + \frac{1}{n} \sum_{i=1}^n(\nabla f_i(\phi^{(t)}))\right] \\ x^{(t+1)} & = & \mathrm{prox}_{\gamma h} (w^{(t+1)})\triangleq \arg\min_x \left\{ \gamma h(x)+\frac{1}{2}\|x-w^{(t+1)}\|^2\right\} \end{aligned}\]

Theoretical guarantee

The following upper bound (empirically tight) can be found in [1, Theorem 1]:

\[\mathbb{E}[V^{(t+1)}] \leqslant \left(1-\gamma\mu \right)V^{(t)}\]

References

[1] A. Defazio, F. Bach, S. Lacoste-Julien (2014). SAGA: A fast incremental gradient method with support for non-strongly convex composite objectives. In Advances in Neural Information Processing Systems (NIPS).

Arguments

  • L: smoothness or Lipschitz parameter, as used by the modeled class.
  • mu: strong convexity or monotonicity parameter, as used by the modeled class.
  • n: number of iterations.
  • solver: JuMP optimizer constructor used to solve the generated SDP.
  • verbose: print example and solver progress information when true.

Returns

  • pepit_tau: worst-case value
  • theoretical_tau: theoretical value

Julia usage

pepit_tau, theoretical_tau = wc_saga(1.0, 0.1, 5; solver=Clarabel.Optimizer, verbose=true)
## Returns approximately: (pepit_tau, theoretical_tau) = (0.966667, 0.966667)