Point Saga
wc_point_saga(L, mu, n; solver=Clarabel.Optimizer, verbose=true)Problem statement
Compute a PEPit worst-case guarantee for wc_point_saga.
Consider the finite sum minimization problem
\[F^\star \triangleq \min_x \left\{F(x) \equiv \frac{1}{n} \sum_{i=1}^n f_i(x)\right\},\]
where $f_1, \dots, f_n$ are $L$-smooth and $\mu$-strongly convex, and with proximal operator readily available. In the sequel, we use the notation $\mathbb{E}$ for denoting the expectation over the uniform distribution of the index $i \sim \mathcal{U}\left([|1, n|]\right)$, e.g., $F(x)\equiv \mathbb{E}[f_i(x)]$.
Performance metric
This code computes a tight (one-step) worst-case guarantee using a Lyapunov function for Point SAGA [1]. The Lyapunov (or energy) function at a point $x$ is given in [1, Theorem 5]:
\[V(x) = \frac{1}{L \mu}\frac{1}{n} \sum_{i \leq n} \|\nabla f_i(x) - \nabla f_i(x_\star)\|^2 + \|x - x^\star\|^2,\]
where $x^\star$ denotes the minimizer of $F$. The code computes the smallest possible $\tau(n, L, \mu)$ such that the guarantee (in expectation):
\[\mathbb{E}\left[V\left(x^{(1)}\right)\right] \leqslant \tau(n, L, \mu) V\left(x^{(0)}\right),\]
is valid (note that we use the notation $x^{(0)},x^{(1)}$ to denote two consecutive iterates for convenience; as the bound is valid for all $x^{(0)}$, it is also valid for any pair of consecutive iterates of the algorithm).
In short, for given values of $n$, $L$, and $\mu$, $\tau(n, L, \mu)$ is computed as the worst-case value of $\mathbb{E}\left[V\left(x^{(1)}\right)\right]$ when $V\left(x^{(0)}\right) \leqslant 1$.
Algorithm
Point SAGA is described by
\[\begin{aligned} \text{Set }\gamma & = & \frac{\sqrt{(n - 1)^2 + 4n\frac{L}{\mu}}}{2Ln} - \frac{\left(1 - \frac{1}{n}\right)}{2L} \\ \text{Pick random }j & \sim & \mathcal{U}\left([|1, n|]\right) \\ z^{(t)} & = & x_t + \gamma \left(g_j^{(t)} - \frac{1}{n} \sum_{i\leq n}g_i^{(t)} \right), \\ x^{(t+1)} & = & \mathrm{prox}_{\gamma f_j}(z^{(t)})\triangleq \arg\min_x\left\{ \gamma f_j(x)+\frac{1}{2} \|x-z^{(t)}\|^2 \right\}, \\ g_j^{(t+1)} & = & \frac{1}{\gamma}(z^{(t)} - x^{(t+1)}). \end{aligned}\]
Theoretical guarantee
A theoretical upper bound is given in [1, Theorem 5].
\[\mathbb{E}\left[V\left(x^{(t+1)}\right)\right] \leqslant \frac{1}{1 + \mu\gamma} V\left(x^{(t)}\right)\]
References
Arguments
L: smoothness or Lipschitz parameter, as used by the modeled class.mu: strong convexity or monotonicity parameter, as used by the modeled class.n: number of iterations.solver: JuMP optimizer constructor used to solve the generated SDP.verbose: print example and solver progress information when true.
Returns
pepit_tau: worst-case valuetheoretical_tau: theoretical value
Julia usage
pepit_tau, theoretical_tau = wc_point_saga(1.0, 0.01, 10; solver=Clarabel.Optimizer, verbose=true)
## Returns approximately: (pepit_tau, theoretical_tau) = (0.971405, 0.973292)