Polyak Steps In Function Value
wc_polyak_steps_in_function_value(L, mu, gamma; verbose=true)Problem statement
Compute a PEPit worst-case guarantee for wc_polyak_steps_in_function_value.
Consider the minimization problem
\[f_\star \triangleq \min_x f(x),\]
where $f$ is $L$-smooth and $\mu$-strongly convex, and $x_\star=\arg\min_x f(x)$.
Performance metric
This code computes a worst-case guarantee for a variant of a gradient method relying on Polyak step-sizes. That is, it computes the smallest possible $\tau(L, \mu, \gamma)$ such that the guarantee
\[f(x_{t+1}) - f_\star \leqslant \tau(L, \mu, \gamma) (f(x_t) - f_\star)\]
is valid, where $x_t$ is the output of the gradient method with PS and $\gamma$ is the effective value of the step-size of the gradient method.
In short, for given values of $L$, $\mu$, and $\gamma$, $\tau(L, \mu, \gamma)$ is computed as the worst-case value of $f(x_{t+1})-f_\star$ when $f(x_t)-f_\star \leqslant 1$.
Algorithm
Gradient descent is described by
\[x_{t+1} = x_t - \gamma \nabla f(x_t),\]
where $\gamma$ is a step-size. The Polyak step-size rule under consideration here corresponds to choosing of $\gamma$ satisfying:
\[\|\nabla f(x_t)\|^2 = 2 L (2 - L \gamma) (f(x_t) - f_\star).\]
Theoretical guarantee
The gradient method with the variant of Polyak step-sizes under consideration enjoys the tight theoretical guarantee [1, Proposition 2]:
\[f(x_{t+1})-f_\star \leqslant \tau(L,\mu,\gamma) (f(x_{t})-f_\star),\]
where $\gamma$ is the effective step-size used at iteration $t$ and
\[ \begin{aligned} \tau(L,\mu,\gamma) & = & \left\{\begin{array}{ll} (\gamma L - 1) (L \gamma (3 - \gamma (L + \mu)) - 1) & \text{if } \gamma\in[\tfrac{1}{L},\tfrac{2L-\mu}{L^2}],\\ 0 & \text{otherwise.} \end{array}\right. \end{aligned}\]
References
Arguments
L: smoothness or Lipschitz parameter, as used by the modeled class.mu: strong convexity or monotonicity parameter, as used by the modeled class.gamma: step-size parameter.verbose: print example and solver progress information when true.
Returns
pepit_tau: worst-case valuetheoretical_tau: theoretical value
Julia usage
pepit_tau, theoretical_tau = wc_polyak_steps_in_function_value(1.0, 0.1, 2 / (1 + 0.1); verbose=true)
## Returns approximately: (pepit_tau, theoretical_tau) = (0.669421, 0.669421)