This is a companion piece to the publication:

@article{cg_variants_convergence_rates, doi = {10.1137/20m1346249}, year = {2021}, month = jul, publisher = {Society for Industrial {\&} Applied Mathematics (SIAM)}, pages = {S496--S515}, author = {Anne Greenbaum and Hexuan Liu and Tyler Chen}, title = {On the Convergence Rate of Variants of the Conjugate Gradient Algorithm in Finite Precision Arithmetic}, journal = {SIAM Journal on Scientific Computing}, eprint = {1905.05874}, archivePrefix = {arXiv}, primaryClass = {cs.NA}, }

If you are not familiar with the Conjugate Gradient method, it may be worth reading through my introduction here.

The behaviour of the conjugate gradient algorithm in finite precision is very different than what is predicted by exact arithmetic theory. In this sense, the algorithm could be considered unstable. However, the conjugate gradient algorithm is widely used in practice, so it is important to understand its behaviour in finite precision.

One effect is that in finite precision the “rate of convergence” (how many iterations it takes to reach a given level of accuracy) is reduced compared to exact arithmetic. Previously Anne Greenbaum showed that a “good” implementation of the CG algorithm, when run in finite precision, will behave like exact CG applied to a larger matrix, and that this larger matrix has eigenvalues very near to those of the original matrix. I’ve written more about this result here. A natural question is whether any of the high performance variants satisfy the conditions for Greenbaum’s analysis to apply. Specifically, the analysis requires that (i) the three term Lanczos recurrence close to satisfied, and (ii) that successive residual vectors are nearly orthogonal. Unfortunately, nobody has been able to prove this for any of the high performance variants, or even the standard implementation.

In this paper we show (numerically) why on some problems certain variants of the conjugate gradient algorithm converge more slowly than others, but on some problems all variants behave the same. To do this we first analyze how closely different variants satisfy the three term Lanczos recurrence. It turns out that the standard implementation, and one due to Chronopoulos and Gear satisfy the three term recurrence to within local rounding errors. However, the pipelined variant due to Ghysels and Vanroose, which is more parallel, has a larger deviation from a three term recurrence. While the paper does not prove that this leads to worse convergence, it does provide some intuition for why the rates of convergence observed differ on some problems.