An introduction to the LLL-algorithm

Just a note in case you are here to evaluate this blogpost for my GSoC-proposal: This is my old website, I am currently building a new one here (which should be finished before the summer starts). When it is done I will move this blogpost. So that’s just an explanation for why the website might seem a bit outdated.


In this blogpost we will take a closer look at what exactly the LLL-algorithm does. We will not describe the internal workings of the algorith, but rather try to explain what it tries to achieve.

The case of regular vector spaces

Given a set of vectors \(v_1,\cdots,v_n\), the vector (sub)space generated by these vectors is defined as \(V = \{\lambda_1 v_1 + \lambda_2 v_2 + \cdots + \lambda_n v_n | \lambda_i \in \mathbb{R}\}\), i.e. all linear combinations of these basis vectors. Note that, given such a space \(V\), we can pick any set of n vectors as a basis, as long as they are linearly independent. However, it quickly becomes clear that some bases are just ‘nicer’ than others, consider the following bases of \(\mathbb{R}^3\): \[ (1, 0, 0)^T, (0, 0, 1)^T, (0, 1, 0)^T \] or \[ (1, 2, 3)^T, (23, 54, 6254)^T, (\pi, \mathbf{e}, -1)^T \] While the second set of vectors is indeed linearly independent, it is quite clear that working with this basis would be quite a pain. Why? For two reasons: 1) the vectors are not orthogonal, and 2) the vectors do not have length 1.

A ‘nice’ basis

While not having an orthonormal basis is not an insurmountable disaster, there are a lot of things that become a lot easier when your basis is orthonormal. For example, given that we have some vector that is a linear combination of the basis vectors described as \(\lambda_1, \lambda_2,\cdots, \lambda_n\), what is the projection of this vector on the i’th basisvector? If the basis is orthonormal, this is clearly \(\lambda_i\), whereas with a non-‘nice’ basis you would have to calculate the projection. In other words, we can define a ‘nice’ basis for a vector (sub)space as a basis that is orthonormal. But then the next question is: What if we only have an ‘ugly’ basis, can we get to a ‘nice’ one quickly? It turns out: we can! The Gram-Schmidt-process is a simple algorithm that orthonormalizes a given set of linearly independent vectors in \(\mathcal{O}(n^2)\) such that the generated space is the same as before. The algorithm essentially works like this: for each basis vector \(v_i\), normalize it and project all \(v_j\) (\(j > i\)) into \(v_i^\perp\) (the orthagonal complement of \(v_i\)). We can observe that this projection is a non-trivial linear combination of \(v_i\) and \(v_j\), and thus preserves the generated vector space. An implementation I made of this algorithm can be found here.

Lattices

We move on to lattices. The definition of a lattice is quite simple: we are again given a set of  basis vectors \(v_1, \cdots, v_n\), and look at its lattice: \[ \mathcal{L}(v) = \{\lambda_1 v_1 + \lambda_2 v_2 + \cdots + \lambda_n v_n | \lambda_i \in \mathbb{Z} \} \] What has changed? We only allow integer multiples of the basis! It turns out, this complicates things considerably. As a start, we cannot just replace our \(n\) basis vectors by \(n\) other linearly independent vectors from the same space. As an example, consider \((1, 0)\) and \((0, 1)\), which generate \(\mathbb{Z}^2\), whereas \((2, 0)\) and \((0, 2)\) generate \( 2\mathbb{Z}^2 \) (all tuples of even numbers). This implies that our Gram-Schmidt-process won’t work anymore either: it would convert the second basis in our example to the first one, which does not lie in the lattice. This is because \((1, 0)\) is not an integer multiple of \((2, 0)\). As another example, if our basis is \((1, 0)\), and the same vector rotated 60 degrees. This results in the following lattice:

Taken from wikimedia

After playing with this basis a bit on paper, it becomes clear that there is no orthonormal basis for this lattice.

Can we still have nice bases?

So, when we cannot find a nice basis anymore, what do we do? We redefine ‘nice’ of course! On a more serious note, it is not entirely clear what a ‘nice’ basis for a lattice is. Ideally we would like the vectors in our basis to be short and as orthogonal as possible. In their 1982 paper “Factoring polynomials with rational coefficients”, Arjen Lenstra, Hendrik Lenstra and László Lovász introduced both a concrete notion of ‘a nice lattice basis’ and an algorithm to compute one: the LLL-algorithm. So what is the new ‘nice’? Given our basis \(v_1,\cdots, v_n\), let \(v_1^*,\cdots,v_n^*\) be the orthonormalization of this basis (i.e. the ‘nice’ basis if we forgot for once second that we were in a lattice). Then an LLL-reduced basis satisfies the following conditions:

  • The basis is size-reduced: \( |({v_i \cdot v_j^*}) / ({v_j^* \cdot v_j^*})| \leq \frac{1}{2} \) for \(1\leq j < i \leq n\) where \(\cdot\) denotes the inproduct. What does this mean? Note that the value of the left hand side is the (absolute value of the) projection of \(v_i\) onto \(v_j^*\). If this was the projection of \(v_i^*\), this value would be zero as \(v_i^*\) and \(v_j^*\) are orthagonal. By requiring the value to be smaller than \(\frac{1}{2}\), we are asking \(v_i\) to simultaneously be short and sufficiently orthagonal to \(v_j^*\) (note that since \(j < i\), \(v_j^* \) is not based on \(v_i\). For \(j > i\), this condition trivially holds.
  • The Lovász condition: \(|v_i^* + \mu_{i i-1} v_{i – 1}^*|^2 \geq \delta |v_{i – 1}^*|^2 \) for \(\frac{1}{4} < \delta < 1\) and \(\mu_{i i – 1} = |v_i \cdot v_{i – 1}^*| / |v_{i – 1}^* \cdot v_{i – 1}^*|\). Note that the left hand side of this expression is the projection of \(v_i\) into the orthagonal complement of the first i – 2 vectors, and the right hand side the projection of \(v_{i – 1}\) into the first i – 2 vectors. In other words, \(v_i\) should be ‘further into this complement’ than \(v_{i – 1}\).

It turns out that using these conditions, we can compute a ‘nice’ basis for a lattice in polynomial time! In particular, the LLL-algorithm does this in \(\mathcal{O}(d^5 \cdot n \cdot \log_3 B)\) for \(B\) the length of the largest of the \(v_i\) (under the euclidean norm). We will look deeper into this algorithm later, but if you can’t wait I would recommend either the original paper (Factoring Polynomials with Rational Coefficients) or the book ‘A Course in Computational Algebraic Number Theory’ by Henri Cohen.