Sparse LSQR solver

LSQR is a popular algorithm for finding the least-squares solution to a large, sparse, linear system of equations. The system does not have to be positive definite, symmetric or even square. It can also be rank-deficient.

ALGLIB, a free and commercial open source numerical library, provides its own implementation of the LSQR algorithm in C/C++ and C#, with wrappers for several other programming languages.

Contents

    1 Sparse LLS overview
           Sparse linear least squares
           LSQR algorithm
           Constrained problems
    2 LSQR solver workflow
    3 C/C++ and C# performance
    4 Downloads section

Sparse LLS overview

Sparse linear least squares

"Sparse LLS" is another name for a rectangular sparse system of linear equations which can be overdetermined (does not have a solution which reduces error to zero) and/or underdetermined (there are many solutions which result in exactly the same error). Such systems are usually solved in a least squares sense, often with regularization:

From the performance and robustness points of view, the best situation is when A is symmetric (square) positive definite. Thus, it is tempting to reduce the problem to a symmetric positive definite one: (A'·A)·x=A'·b (so called normal equations approach). However, such operation often results in a catastrophic loss of precision because the system condition number is squared in the process.

The better way is to use a specialized algorithm which can work directly with the original rectangular matrix. For dense problems, it is usually done with QR or SVD-based solvers. For sparse matrices, LSQR is the best solution.

LSQR algorithm

LSQR is an iterative algorithm developed by Chris Paige and Michael Saunders which computes implicit bidiagonal decomposition of the system matrix (some similarity to dense SVD-based solvers can be seen). Such representation allows us to easily find a least squares solution to the original system.

The algorithm has modest memory requirements (O(max(N,M)) in addition to the matrix itself). Unlike normal equations mentioned above, it does not increase matrix condition number (one more similarity to SVD-based solvers). Thus, it provides the best precision possible with iterative solvers.

Constrained problems

LSQR is the best solution for an unconstrained LLS problem. However, it can't handle constrained problems - even merely box-constrained ones. Such problems can be solved with other algorithms from ALGLIB optimization and fitting packages.

LSQR solver workflow

LSQR solver is provided by linlsqr subpackage of ALGLIB (the link above points to ALGLIB Reference Manual Entry for the solver). A typical use case is outlined below:

  1. Create solver object with linlsqrcreate method.
  2. Set stopping criteria (step size and/or iterations count) with linlsqrsetcond call. Default settings will be used if this step is omitted.
  3. Set regularization term with linlsqrsetlambdai (also optional).
  4. Start solution process with linlsqrsolvesparse call and get results with linlsqrresults

An example (source code) can be found in ALGLIB Reference Manual: linlsqr_d_1.

C/C++ and C# performance

LSQR is an iterative matrix-free algorithm that accesses the system matrix only by evaluating matrix-vector products Ax and ATx. Its running time is usually dominated by these products, with the solver itself introducing only negligible overhead. In turn, efficiency of matrix-vector products greatly depends on the matrix-size/CPU-cache ratio.

For small tasks (ones that fit into L1/L2 CPU caches) the C/C++ version of LSQR provides roughly 2x better performance than the C# implementation. The difference in performance is explained by the fact that C/C++ is consistently faster at memory operations than C# due to lack of runtime checks. However, as soon as the matrix size exceeds that of the CPU cache, both versions of LSQR show comparable performance, with the C++ implementation being merely 20% faster than the C# one. The reason is that the algorithm spends most of its time waiting for the memory bus.

Finally, we would like to touch on the question of low level optimizations. Unfortunately, sparse matrix-vector products are especially hard to accelerate with either SIMD or multithreading. Thus, one should not expect any improvements from activating SIMD/SMP support in ALGLIB when using LSQR.

This article is licensed for personal use only.

Download ALGLIB for C++ / C# / Java / Python / ...

ALGLIB Project offers you two editions of ALGLIB:

ALGLIB Free Edition:
+delivered for free
+offers full set of numerical functionality
+extensive algorithmic optimizations
-no multithreading
-non-commercial license

ALGLIB Commercial Edition:
+flexible pricing
+offers full set of numerical functionality
+extensive algorithmic optimizations
+high performance (SMP, SIMD)
+commercial license with support plan

Links to download sections for Free and Commercial editions can be found below:

ALGLIB 4.04.0 for C++

C++ library.
Delivered with sources.
Monolithic design.
Extreme portability.
Editions:   FREE   COMMERCIAL

ALGLIB 4.04.0 for C#

C# library with native kernels.
Delivered with sources.
VB.NET and IronPython wrappers.
Extreme portability.
Editions:   FREE   COMMERCIAL

ALGLIB 4.04.0 for Java

Java wrapper around HPC core.
Delivered with sources.
Seamless integration with Java.
Editions:   FREE   COMMERCIAL

ALGLIB 4.04.0 for Delphi

Delphi wrapper around C core.
Delivered as precompiled binary.
Compatible with FreePascal.
Editions:   FREE   COMMERCIAL

ALGLIB 4.04.0 for CPython

CPython wrapper around C core.
Delivered as precompiled binary.
Editions:   FREE   COMMERCIAL