Back in 2014 I wrote a master's thesis implementing an exhaustive search algorithm for curves with many rational points over finite fields. The algorithm worked — it found some genuinely interesting curves — but I always had a nagging feeling that it was slower than it needed to be. The main loop of the algorithm had a redundancy that nobody had removed (even though quite a few people looked at the implementation), and I kept thinking someone would publish a cleaner version before I got around to it.
Recently I sat down and fixed it. The main fix is elementary in hindsight, as good fixes usually are. I also explored a further optimization idea using Fourier analysis on the class group, but it turned out to be slower in practice, so I decided to keep it for later.
Why Curves with Many Points?
Let \( \mathbb{F}_q \) be a finite field with \( q \) elements — say \( q = 7 \) or \( q = 2^{128} \). A smooth projective curve \( X \) over \( \mathbb{F}_q \) of genus \( g \) has at most
\[
\#X(\mathbb{F}_q) \leq q + 1 + 2g\sqrt{q}
\]
rational points. This is the Hasse–Weil bound, proved in full generality by Weil in 1948 after partial results by Hasse for elliptic curves (\( g = 1 \)) in the 1930s.
Curves that come close to this bound are useful in coding theory. A curve \( X \) with many points over \( \mathbb{F}_q \) gives a Goppa code (also called an algebraic-geometry code) with good parameters. Specifically, a curve of genus \( g \) with \( N \) rational points and a divisor of degree \( k \) (with \( g \leq k < N \)) gives a linear code of length \( N \), dimension at least \( k - g + 1 \), and minimum distance at least \( N - k \). To make a good code you want \( N \) large relative to \( g \), which is exactly what Hasse–Weil measures.
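These Riemann–Roch bounds are easy to sanity-check numerically. A minimal Python sketch (illustrative only, not part of the Magma workflow; the function name is my own):

```python
# Guaranteed parameters of an algebraic-geometry (Goppa) code built from a
# curve of genus g with N rational points and a divisor of degree k,
# assuming g <= k < N.  Illustrative sketch; the function name is made up.
def ag_code_params(N, g, k):
    assert g <= k < N, "bounds require g <= k < N"
    length = N             # one coordinate per rational point
    dimension = k - g + 1  # Riemann-Roch lower bound on the dimension
    distance = N - k       # designed minimum distance
    return length, dimension, distance

# A genus-3 curve with 24 points and a degree-10 divisor:
print(ag_code_params(24, 3, 10))  # -> (24, 8, 14)
```

Increasing \( N \) at fixed genus improves the distance guarantee for the same dimension, which is why curves near the Hasse–Weil bound are the interesting ones.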
The question is: how do you find them systematically? One approach, going back to Serre, is to search through families of curves. A generalisation of this approach uses the structure of unramified abelian coverings, which turns the search into an explicit computation in a finite abelian group; the results can be checked directly against manypoints.org.
Coverings and Class Field Theory
This section uses some algebraic geometry — class field theory, the Jacobian, the Artin map. These form the theoretical backbone, and I will sketch them. But understanding the actual optimization only requires knowing how to count elements in cosets of a finite abelian group. If the geometry is unfamiliar, feel free to skim ahead to the Setup section.
The key idea is that we can build a new curve with many points from one we already have.
Let \( X \) be a genus-\( g \) curve over \( \mathbb{F}_q \) with \( N \) rational points. If \( Y \to X \) is an unramified covering of degree \( d \), then by Riemann–Hurwitz the genus of \( Y \) is \( g_Y = d(g-1) + 1 \).
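As a quick arithmetic check of the Riemann–Hurwitz formula (plain Python, purely illustrative):

```python
# Genus of an unramified degree-d covering of a genus-g curve,
# by Riemann-Hurwitz: g_Y = d*(g - 1) + 1.
def covering_genus(g, d):
    return d * (g - 1) + 1

# From a genus-3 base curve, covering degrees 1..5 give:
print([covering_genus(3, d) for d in range(1, 6)])  # -> [3, 5, 7, 9, 11]
```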
Now, by class field theory, unramified abelian coverings of \( X \) are classified by subgroups of the Jacobian \( G = \mathrm{Pic}^0(X)(\mathbb{F}_q) \). Specifically, a subgroup \( H \leq G \) of index \( d \) corresponds to a degree-\( d \) covering. This is the arithmetic analogue of the classical correspondence between subgroups of \( \pi_1 \) and covering spaces. The Artin map makes this explicit: the Frobenius at a rational point \( P \in X(\mathbb{F}_q) \) acts on the covering by the class \( [P - P_j] \in G/H \) (for a chosen base point \( P_j \)). A point \( P_i \) splits completely in the covering — i.e., gives \( d \) rational points of the covering curve — if and only if this Frobenius is trivial, meaning \( [P_i - P_j] \in H \).
So the number of rational points on the covering \( Y_{H,j} \) corresponding to subgroup \( H \) with base point \( P_j \) is exactly
\[
\#Y_{H,j}(\mathbb{F}_q) = d \cdot \bigl|\{P_i \in X(\mathbb{F}_q) : [P_i - P_j] \in H\}\bigr|.
\]
This is beautiful: a question about covering curves reduces entirely to a counting problem in a finite abelian group. And it is completely computable in a system like Magma, which handles Jacobians, class groups, and subgroup enumeration explicitly.
The Setup in Magma
Magma is a computer algebra system specialised for number theory and algebraic geometry. If you do not have a local installation, you can run all the examples in this post for free at the Magma online calculator.
Fix a genus-\( g \) curve \( X \) over \( \mathbb{F}_q \) with \( N \) rational points \( P_1, \ldots, P_N \). Let \( G = \mathrm{Pic}^0(X)(\mathbb{F}_q) \) be the Jacobian and let \( \{H_1, \ldots, H_M\} \) be the complete list of subgroups of \( G \).
For a chosen base point \( P_j \), define the image set
\[
A_j = \{ [P_i - P_j] \in G : i = 1, \ldots, N \}.
\]
Then \( \#Y_{H,j}(\mathbb{F}_q) = [G:H] \cdot |A_j \cap H| \). The goal is to find, for each possible genus \( d(g-1)+1 \), the maximum value of \( [G:H] \cdot |A_j \cap H| \) over all subgroups \( H \) and base points \( j \).
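Before the Magma version, the shape of the search can be modelled in a few lines of Python on a cyclic group, where subgroups of \( \mathbb{Z}/n \) correspond to divisors of \( n \). Everything here (the group, the point images, the genus) is made-up toy data, not the actual curve:

```python
# Toy model of the search in G = Z/n.  The subgroup of index m (for each
# divisor m of n) is H = {0, m, 2m, ...}, and the "points" are modelled as
# elements x_1..x_N of G, so [P_i - P_j] becomes x_i - x_j.  All data is
# invented for illustration.
n = 12
xs = [0, 1, 3, 4, 9]   # images of the N points in G = Z/12
g = 3                  # genus of the base curve in this toy model
best = {}              # genus of covering -> max point count found
for m in [d for d in range(1, n + 1) if n % d == 0]:  # m = index [G:H]
    genus = m * (g - 1) + 1
    for j in range(len(xs)):                          # base point P_j
        # count points P_i with [P_i - P_j] in H, i.e. (x_i - x_j) % m == 0
        k = sum(1 for x in xs if (x - xs[j]) % m == 0)
        best[genus] = max(best.get(genus, 0), m * k)
print(best)  # -> {3: 5, 5: 6, 7: 9, 9: 8, 13: 12, 25: 12}
```

This is exactly the naive double loop described next, just over \( \mathbb{Z}/12 \) instead of a Jacobian.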
I will use \( y^2 = x^7 + x \) over \( \mathbb{F}_7 \) as a running example throughout. This is a hyperelliptic curve of genus 3. Its Jacobian is \( G \cong \mathbb{Z}/8 \times \mathbb{Z}/8 \times \mathbb{Z}/8 \), with \( |G| = 512 \). There are \( M = 802 \) subgroups, \( N = 8 \) rational points, and 10 distinct genera to search over.
Setting this up in Magma:
F := GF(7);
R<x> := PolynomialRing(F);
C := HyperellipticCurve(x^7 + x);
g := Genus(C); // = 3
Cl, _, phi := ClassGroup(C);
Tors := TorsionSubgroup(Cl); // G = Pic^0(C)(F_7)
H_list := Subgroups(Tors); // all 802 subgroups
pts := RationalPoints(C); // 8 rational points
The phi map sends a divisor to its class in \( G \), so phi(Divisor(P) - Divisor(Q)) computes \( [P - Q] \in G \). Everything else follows from that.
Algorithm 1: The Naive Approach — \( O(M \cdot N^2) \)
The original 2014 algorithm is direct: for every base point \( P_j \) and every subgroup \( H \), count how many \( P_i \) satisfy \( [P_i – P_j] \in H \).
for f in [1..N] do
Pf := pts[f];
DivPts := [Divisor(pts[i]) - Divisor(Pf) : i in [1..N]];
for j in [1..M] do
H := H_list[j]`subgroup;
Ind := Index(Tors, H);
d := Ind * (g - 1) + 1;
k := #{i : i in [1..N] | phi(DivPts[i]) in H};
val := k * Ind;
// update best[d] if val is larger ...
end for;
end for;
The outer loop iterates over base points (\( N \) choices), the middle loop over subgroups (\( M \) choices), and the inner membership test checks all \( N \) points. That is \( O(M \cdot N^2) \) operations in total.
Why is this wasteful? Because for a fixed subgroup \( H \), we recompute the membership count from scratch for each of the \( N \) base points, even though these counts are intimately related to each other. We are throwing away information on every iteration of the outer loop.
On the \( \mathbb{F}_7 \) example: 1.15 seconds. Reasonable for 8 points and 802 subgroups, but notice how this scales: over \( \mathbb{F}_{13} \) the same curve has 24030 subgroups, and the naive loop takes 36 seconds.
Optimization: The Translation Lemma — \( O(M \cdot N) \)
Here is the key observation, and I think it is genuinely lovely.
Lemma. For any base point index \( j \), with \( a_k = [P_k - P_1] \in G \):
\[
A_j = A_1 - a_j \qquad \text{and therefore} \qquad |A_j \cap H| = |A_1 \cap (a_j + H)|.
\]
Proof. Simply compute: \( [P_i - P_j] = [P_i - P_1] - [P_j - P_1] = a_i - a_j \). So \( A_j = \{a_i - a_j : i = 1,\ldots,N\} = A_1 - a_j \). Then \( a_i - a_j \in H \) iff \( a_i \in a_j + H \), so \( |A_j \cap H| = |A_1 \cap (a_j + H)| \). ∎
What does this buy us? The question “how many points of \( A_j \) lie in \( H \)?” has become “how many points of \( A_1 \) lie in the coset \( a_j + H \)?” For a fixed \( H \), all \( N \) base-point queries are answered by a single histogram of \( A_1 \) modulo \( H \).
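Since the lemma is pure group theory, it can be checked mechanically in any finite abelian group. A small Python sketch, using \( \mathbb{Z}/12 \) and invented sample data:

```python
# Verify |A_j ∩ H| == |A_1 ∩ (a_j + H)| in G = Z/12 for every subgroup H
# and every base point.  The values a_k are made-up sample data.
n = 12
a = [0, 1, 3, 4, 9]  # a_k = [P_k - P_1], so a[0] = 0
for m in [d for d in range(1, n + 1) if n % d == 0]:  # m = index of H
    H = {h for h in range(0, n, m)}                   # H = <m> inside Z/12
    for aj in a:
        A_j = [(ai - aj) % n for ai in a]             # A_j = A_1 - a_j
        lhs = sum(1 for x in A_j if x in H)           # |A_j ∩ H|
        coset = {(aj + h) % n for h in H}             # the coset a_j + H
        rhs = sum(1 for x in a if x in coset)         # |A_1 ∩ (a_j + H)|
        assert lhs == rhs
print("translation lemma verified on all subgroups and base points")
```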
Concretely: for each subgroup \( H \), use Magma’s quotient map \( \pi: G \to G/H \) to map every element of \( A_1 \) to its coset, and count multiplicities. Then for each base point \( P_j \), the count \( |A_j \cap H| \) is just the histogram value at coset \( \pi(a_j) \).
// Compute A_1 once — the outer loop over base points disappears entirely
P1 := pts[1];
A1 := [phi(Divisor(pts[i]) - Divisor(P1)) : i in [1..N]];
for j in [1..M] do
H := H_list[j]`subgroup;
Ind := Index(Tors, H);
d := Ind * (g - 1) + 1;
Q, qmap := quo<Tors | H>; // quotient map pi: G -> G/H
// Build histogram over G/H in O(N)
hist := AssociativeArray(Q);
for a in A1 do
c := qmap(a);
hist[c] := IsDefined(hist, c) select hist[c] + 1 else 1;
end for;
// Answer all N base-point queries in O(1) each
for f in [1..N] do
c := qmap(A1[f]);
k := IsDefined(hist, c) select hist[c] else 0;
val := k * Ind;
// update best[d] ...
end for;
end for;
The outer loop over base points is gone. For each of the \( M \) subgroups we do \( O(N) \) work — total \( O(M \cdot N) \).
A small example. Suppose \( G = \mathbb{Z}/4 \) and \( A_1 = \{0, 1, 1, 3\} \) (with multiplicity — two points map to element 1). Take \( H = \{0, 2\} \leq G \), so \( G/H = \{H,\, 1+H\} \). The histogram is: coset \( H \) gets count 1 (only element 0), coset \( 1+H \) gets count 3 (two copies of 1 and one copy of 3, since \( 3 \in 1 + H \)). Now for base point \( P_j \) with \( a_j = 1 \): we look up coset \( 1 + H \) in the histogram and read off 3 immediately. No recomputation from \( A_j \) needed.
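The same bookkeeping, spelled out in Python for the toy numbers above (illustrative only; the real computation uses Magma's quotient map instead of arithmetic mod 2):

```python
# G = Z/4, A_1 = [0, 1, 1, 3] with multiplicity, H = {0, 2} of index 2.
# The quotient map G -> G/H = Z/2 is reduction mod 2.
A1 = [0, 1, 1, 3]
hist = {}
for x in A1:
    c = x % 2                  # coset of x in G/H
    hist[c] = hist.get(c, 0) + 1
print(hist)                    # -> {0: 1, 1: 3}

# Base point with a_j = 1: look up the coset 1 + H in the histogram.
print(hist[1 % 2])             # -> 3
```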
On \( y^2 = x^7 + x \) over \( \mathbb{F}_7 \): 1.15 s naive, 0.10 s with the Lemma — about a 12× speedup. The table below extends this to the same curve over larger fields; the speedup roughly tracks \( N \), consistent with the \( O(M \cdot N) \) analysis.
| Curve | N | M | Naive | Lemma | Speedup |
|---|---|---|---|---|---|
| \( y^2 = x^7+x \) / \( \mathbb{F}_7 \) | 8 | 802 | 1.15s | 0.10s | ~12× |
| \( y^2 = x^7+x \) / \( \mathbb{F}_{11} \) | 12 | 3612 | 12.08s | 0.57s | ~21× |
| \( y^2 = x^7+x \) / \( \mathbb{F}_{13} \) | 8 | 24030 | 36.52s | 3.04s | ~12× |
It might seem surprising that this was not done in 2014. In hindsight, the identity \( A_j = A_1 - a_j \) is one line. But the original code was written thinking of the outer loop as “iterating over base points” rather than “iterating over cosets,” and that framing makes the redundancy invisible.
What the Algorithm Finds
On \( y^2 = x^7 + x \) over \( \mathbb{F}_{13} \) (\( G \cong \mathbb{Z}/2 \times \mathbb{Z}/2 \times \mathbb{Z}/2 \times \mathbb{Z}/2 \times \mathbb{Z}/4 \times \mathbb{Z}/20 \), \( M = 24030 \) subgroups), both algorithms produce identical results. Here are the covering curves with genus up to 50:
| Genus of covering \( Y_{H,j} \) | Max \( \#Y_{H,j}(\mathbb{F}_{13}) \) found | Hasse–Weil bound |
|---|---|---|
| 3 | 8 | 35 |
| 5 | 16 | 50 |
| 9 | 32 | 78 |
| 11 | 40 | 93 |
| 17 | 48 | 136 |
| 21 | 80 | 165 |
| 33 | 80 | 251 |
| 41 | 160 | 309 |
With 24030 subgroups to search, this took 36.52 s for the naive algorithm and 3.04 s for the Translation Lemma — about a 12× speedup.
Comparing the Algorithms in Practice
Comparing implementations directly is essential for verifying that an optimization actually works — and that it gives the same answers. The code below runs both algorithms on the same curve and checks that every result agrees.
// ============================================================
// Comparison: Naive O(M*N^2) vs Translation Lemma O(M*N)
// Curve: y^2 = x^7 + x over F_p (hyperelliptic, genus 3)
// Change p below to run on a different prime.
// ============================================================
p := 7; // <-- set the prime here
F := GF(p);
R<x> := PolynomialRing(F);
ff := x^7 + x;
C := HyperellipticCurve(ff);
g := Genus(C);
Cl, _, phi := ClassGroup(C);
Tors := TorsionSubgroup(Cl);
H_list := Subgroups(Tors);
pts := RationalPoints(C);
N := #pts;
M := #H_list;
printf "Curve: y^2 = x^7 + x over F_%o\n", p;
printf "Genus = %o, invs = %o, |G| = %o\n", g, Invariants(Tors), &*Invariants(Tors);
printf "#RationalPoints = %o, #Subgroups = %o\n\n", N, M;
// ============================================================
// ALGORITHM 1: Naive O(M * N^2)
// ============================================================
printf "=== Algorithm 1: Naive O(M*N^2) ===\n";
t0 := Cputime();
best_naive := AssociativeArray();
for f in [1..N] do
Pf := pts[f];
DivPts := [Divisor(pts[i]) - Divisor(Pf) : i in [1..N]];
for j in [1..M] do
H := H_list[j]`subgroup;
Ind := Index(Tors, H);
d := Ind * (g - 1) + 1;
k := #{i : i in [1..N] | phi(DivPts[i]) in H};
val := k * Ind;
if not IsDefined(best_naive, d) or best_naive[d] lt val then
best_naive[d] := val;
end if;
end for;
end for;
for d in Sort(Setseq(Keys(best_naive))) do
printf "Genus = %o, Max points = %o\n", d, best_naive[d];
end for;
printf "CPU time: %o s\n\n", Cputime() - t0;
// ============================================================
// ALGORITHM 2: Translation Lemma O(M * N)
// ============================================================
printf "=== Algorithm 2: Translation Lemma O(M*N) ===\n";
t0 := Cputime();
P1 := pts[1];
A1 := [phi(Divisor(pts[i]) - Divisor(P1)) : i in [1..N]];
best_lemma := AssociativeArray();
for j in [1..#H_list] do
H := H_list[j]`subgroup;
Ind := Index(Tors, H);
d := Ind * (g - 1) + 1;
Q, qmap := quo<Tors | H>;
hist := AssociativeArray(Q);
for a in A1 do
c := qmap(a);
hist[c] := IsDefined(hist, c) select hist[c] + 1 else 1;
end for;
for f in [1..N] do
c := qmap(A1[f]);
k := IsDefined(hist, c) select hist[c] else 0;
val := k * Ind;
if not IsDefined(best_lemma, d) or best_lemma[d] lt val then
best_lemma[d] := val;
end if;
end for;
end for;
for d in Sort(Setseq(Keys(best_lemma))) do
printf "Genus = %o, Max points = %o\n", d, best_lemma[d];
end for;
printf "CPU time: %o s\n\n", Cputime() - t0;
// ============================================================
// COMPARISON
// ============================================================
printf "=== Comparison ===\n";
all_keys := Keys(best_naive) join Keys(best_lemma);
all_match := true;
for d in Sort(Setseq(all_keys)) do
v1 := IsDefined(best_naive, d) select best_naive[d] else -1;
v2 := IsDefined(best_lemma, d) select best_lemma[d] else -1;
if v1 ne v2 then
printf "MISMATCH at Genus=%o: Naive=%o Lemma=%o\n", d, v1, v2;
all_match := false;
end if;
end for;
if all_match then
printf "Both algorithms agree on all genera. OK\n";
end if;
With p := 7: Naive takes 1.15 s, the Lemma takes 0.10 s, and the comparison block prints Both algorithms agree on all genera. OK. Change p to 11 or 13 to run on a larger example — the speedup grows accordingly.
Bonus: An Alternative Idea Using Spectral Pruning
I also explored a second possible optimization: using the Fourier transform on the class group \( G \) to compute spectral upper bounds on how many points a covering can have. The idea is elegant — most subgroups can be ruled out without full evaluation by sorting them by their “spectral mass,” a quantity derived from the Fourier coefficients of the indicator function of \( A_1 \). In practice, however, this turned out to be slower than the Translation Lemma for hyperelliptic curves, which we use here because they offer the simplest computational model. The setup cost grows as \( O(N \cdot |G|) \approx O(N^4) \) for genus-3 hyperelliptic curves, while the Lemma grows as \( O(M \cdot N) \), and the FFT never catches up despite pruning 88–94% of subgroups. So lets retain this as a direction for future investigation…