Eigen is a great linear algebra library

October 2013

I only recently tried the Eigen library, because I needed a linear solver and the TAUCS website was down. Most of my previous experience with linear algebra libraries was with TAUCS and LAPACK. After using those libraries for a few years, Eigen is a breath of fresh air.

I will start by highlighting some of the problems I encountered with TAUCS, and then explain how Eigen addresses them. TAUCS is a fast solver, but it is somewhat limited in what systems it can solve. TAUCS performs Cholesky decomposition, so matrices need to be Hermitian positive-definite. The interface to TAUCS is also quite clunky, as you have to hand-build matrices in compressed column format (illustrated below). I also feel that the most recent development efforts in TAUCS are misdirected. TAUCS was rewritten to use multiple processors via the Cilk language, but TAUCS is the first and last place I have encountered Cilk, so I doubt its usefulness.
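
For the unfamiliar, compressed column storage packs a matrix into three parallel arrays, which you have to assemble by hand for TAUCS. A small illustration of the format in general (not TAUCS's exact structs):

// The 3x3 matrix
//   [1 0 2]
//   [0 3 0]
//   [4 0 5]
// in compressed column storage:
double values[]  = {1, 4, 3, 2, 5}; // non-zeros, column by column
int row_index[]  = {0, 2, 1, 0, 2}; // row of each stored value
int col_ptr[]    = {0, 2, 3, 5};    // offset where each column starts in values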

Building TAUCS is a pain in the ass, because the authors configured it to build on Windows with a statically linked runtime, which is incompatible with the standard practice of using a dynamically linked runtime. If you want to use TAUCS with any other libraries, you will need to change the TAUCS build rules. TAUCS comes with precompiled binaries of its dependencies, but those are all statically linked too, and therefore useless. This means you must find and compile every dependency against a dynamic runtime. Just to make things fun, you can't use the current versions of all the dependent libraries either; if I recall correctly, you need an old version of Metis. I was planning to write a guide on how to build TAUCS, as a public service and so that I don't forget the steps myself, but my new advice is simpler: use Eigen instead.

Eigen is a well-designed C++ template library with routines for solving both sparse and dense linear systems. Whatever sort of system you have, there is probably a solver in Eigen specific to your problem. For example, if you have a least squares problem $\min_x \|Ax-b\|^2$, you can solve it directly from $A$ and $b$, whereas in TAUCS you would have to form and solve the normal equations $A^TAx=A^Tb$. Using a solver tailored to your problem can pay off in both speed and precision. In the previous example, forming $A^TA$ to get a symmetric, positive-definite system squares the condition number, which roughly halves the number of accurate digits in the solution.
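
A minimal dense sketch of the two routes, assuming a MatrixXd A and VectorXd b that are already filled in (the names are placeholders):

// Solve min_x ||Ax - b||^2 directly with a rank-revealing QR decomposition.
VectorXd x = A.colPivHouseholderQr().solve(b);

// The normal-equations route also works, but squares the condition number of A.
VectorXd x2 = (A.transpose() * A).ldlt().solve(A.transpose() * b);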

In terms of speed, it is difficult to find reliable benchmarks comparing different libraries, but one benchmark put out by the Eigen folks shows good performance. At the very least, it shows that the Eigen developers are actively optimizing for speed and are aware of the competition. This means that, at worst, the speed of Eigen is on par with other libraries, and speed can probably be disregarded as a deciding factor.

The key selling point of Eigen for me is ease of use. For example, installation is a snap. The entire library is contained in header files, so there is literally no compilation or linking step for the library itself. All you do is add Eigen to your include path and it magically works. Because Eigen is written in C++, operator overloading makes expressions easy to both write and read. What is great is that the more complex the expression you write, the better the chance that Eigen's expression templates will generate well-optimized code. The code you write is generally so simple and easy to read compared to other libraries that I can hardly imagine going back. In the code snippet below, I have a complete example of a program that builds, solves, and prints the solution to a sparse linear system of equations.

#include <iostream>
#include <vector>
#include <Eigen/Dense>
#include <Eigen/Sparse>
#include <Eigen/SparseQR>

using namespace std;
using namespace Eigen;

int main()
{
    // Collect the non-zero entries as (row, column, value) triplets.
    vector<Triplet<double>> trips;
    trips.reserve(4);
    trips.push_back(Triplet<double>(0, 0, .435));
    trips.push_back(Triplet<double>(1, 1, .435));
    trips.push_back(Triplet<double>(2, 2, .435));
    trips.push_back(Triplet<double>(3, 3, .435));

    // Build the sparse matrix from the triplets and factor it.
    SparseMatrix<double> A(4, 4);
    A.setFromTriplets(trips.begin(), trips.end());
    SparseQR<SparseMatrix<double>, COLAMDOrdering<int>> solverA;
    solverA.compute(A);

    // The right-hand side.
    VectorXd B(4);
    B << .435, .435, .435, .435;

    VectorXd X = solverA.solve(B);

    for (int i = 0; i < 4; i++)
        cout << X(i) << endl;
}
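
As an aside, SparseQR handles general, even rectangular, systems; if you know your matrix is symmetric positive-definite, as the diagonal A above happens to be, Eigen's sparse Cholesky solver is the more natural fit. A minimal sketch reusing A and B from the example:

SimplicialLDLT<SparseMatrix<double>> chol(A); // sparse Cholesky (LDL^T) factorization
VectorXd X2 = chol.solve(B);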

The code below is a quick reference for how to save and load sparse matrices. Again, the code is almost simple enough to be self-explanatory. The first function saves the matrix into a binary file with a header containing the dimensions of the matrix and the number of non-zero entries. It then iterates over the matrix, writing each non-zero's row, column, and value. Notice that this code works for both column-major and row-major matrices.

The loading code is slightly less obvious. The non-zero entries must first be read into a vector; once the vector is fully loaded, the matrix is built from it with the setFromTriplets function. Constructing a compressed column storage matrix essentially amounts to sorting an array of non-zero entries by their index, which takes $O(n \log n)$ time when all entries are given at once, whereas inserting entries one at a time can cost $O(n^2)$ overall (a sketch of the incremental alternative follows the listing below).

#include <string>  // Needed in addition to the
#include <cstdio>  // previous include statements.
#include <cstdint>

bool save_triplets_bin(const SparseMatrix<double> &mat, string fn)
{
    FILE *f = fopen(fn.c_str(), "wb");
    if (!f)
        return false;

    // Header: rows, columns, and number of non-zero entries.
    int32_t xyn[3] = {(int32_t)mat.rows(), (int32_t)mat.cols(),
                      (int32_t)mat.nonZeros()};
    fwrite(xyn, sizeof(int32_t), 3, f);

    // Walk the non-zeros; this works for both storage orders.
    for (int k = 0; k < mat.outerSize(); ++k)
    {
        for (SparseMatrix<double>::InnerIterator it(mat, k); it; ++it)
        {
            int32_t rc[2] = {(int32_t)it.row(), (int32_t)it.col()};
            fwrite(rc, sizeof(int32_t), 2, f);
            double v = it.value();
            fwrite(&v, sizeof(double), 1, f);
        }
    }

    fclose(f);
    return true;
}

bool load_triplets_bin(SparseMatrix<double> &a, string fn)
{
    FILE *f = fopen(fn.c_str(), "rb");
    if (!f)
        return false;

    // Read the header and size the matrix accordingly.
    int32_t xyn[3];
    if (fread(xyn, sizeof(int32_t), 3, f) != 3)
    {
        fclose(f);
        return false;
    }
    a.resize(xyn[0], xyn[1]);
    vector<Triplet<double>> trips(xyn[2]);

    for (size_t k = 0; k < trips.size(); ++k)
    {
        int32_t rc[2];
        fread(rc, sizeof(int32_t), 2, f);
        double v;
        fread(&v, sizeof(double), 1, f);

        trips[k] = Triplet<double>(rc[0], rc[1], v);
    }

    fclose(f);

    a.setFromTriplets(trips.begin(), trips.end());

    return true;
}
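
For contrast, here is what the incremental insertion path mentioned above looks like; this is a sketch of the slower alternative, with made-up contents:

SparseMatrix<double> M(4, 4);
M.reserve(VectorXi::Constant(4, 1)); // budget one non-zero per column up front
for (int i = 0; i < 4; ++i)
    M.insert(i, i) = .435;           // each insert may shift existing entries
M.makeCompressed();                  // squeeze out the reserved slack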

One final note is that expression templates allow Eigen to generate efficient code by analyzing expressions as a whole. In the first example below, Eigen will not actually create an intermediate transposed copy of A; it will simply change the order in which it loops over the entries of A. This means that breaking up expressions, as in the second example, is not equivalent and will hurt performance. Finally, if you want to force Eigen to evaluate a subexpression into a temporary, you can do so with the eval function.

SparseMatrix<double> AA = A.transpose() * A;        // good: evaluated as one expression

SparseMatrix<double> At = A.transpose();            // bad: materializes the transpose
SparseMatrix<double> AA = At * A;

SparseMatrix<double> AA = A.transpose().eval() * A; // force a temporary explicitly