Introduction
Given a string s
and a substring sub
, a common problem is counting how often sub
occurs as a substring within s
. While the naive algorithm often performs well in practice, it can be useful to explore other alternatives. Of course, there are many interesting algorithms available, each with its own strengths. For example, for Leetcode challenges I typically use the polynomial rolling-hash algorithm as my go-to approach. However, it does have some limitations. One notable challenge is adapting it to handle pattern matching where a character may match any other non-empty character — something not easily addressed with the rolling-hash method. For instance, if the pattern is "a??"
and the string is "ababa"
, the question marks in the pattern should match any character given the total answer “2”, but the rolling-hash algorithm doesn’t naturally handle this type of wildcard matching.
Of course, the naive algorithm can be easily adjusted to handle this case, but for certain combinations of s
and sub
, its running time complexity can become proportional to the product of the lengths of the strings, which is a bit inefficient. Today, we’ll explore another approach: using polynomial multiplication with FFT (Fast Fourier Transform) for pattern matching. This method provides a more efficient way to solve the problem, especially for larger strings. As with some of our previous posts, we’ll use Magma to implement this approach due to its convenient and powerful capabilities for handling polynomials and number-theoretic computations.
The simplified idea behind the approach
Let’s first consider a simpler case where special characters are not allowed. Imagine we map each character in the strings to an integer value, for example, using their ASCII codes. Now, suppose we want to compare two strings a
and b
of length n
. We can define a function to measure the difference between a
and b
by looking at the squared differences of corresponding characters. Specifically, consider the expression: f(k) = ∑i (ak+i − bi)2.
This sum f(k)
is zero if and only if for each index i
, we have ak+i = bi,
meaning the substring of a starting at position k
in exactly matches b
.
Now let’s expand the expression for f(k). When we expand this, we get:
∑i (ak+i − bi)2 = ∑i (ak+i)2 - 2 ∑ ak+i bi + ∑i (bi)2
Now, the interesting part is how we can efficiently recalculate this expression as we roll over index k
when shifting through the string a
. The third term is a constant because b
is fixed. The a2
term is also straightforward: as we move the window one position to the right, we simply remove the first element of the previous window and add the new element at the end.
The real challenge lies in the middle term, which involves multiplying the corresponding characters from the two strings. This is where convolution and Fast Fourier Transform (FFT) come into play. By using convolution, we can efficiently compute the sum over all i
for each possible position k
in the string. This allows us to calculate the cross term efficiently.
To see how the polynomial multiplication helps, let’s consider the product of two polynomials:
(a0 + a1x + a2x2 + … )(b0 + b1x + b2x2 + … ) =
(∑
k∑
i
ak bi−k)xk
This is essentially the convolution of the coefficients a_i
and b_i
. Each coefficient of the resulting polynomial is obtained by summing the products of corresponding coefficients from the two polynomials. This is similar to what we need in pattern matching, where we calculate sums of products between corresponding parts of the two strings.
So far, this may not seem much simpler, but the key point is that we can compute the product of two polynomials of degree n
in approximately n log(n)
operations, rather than the n2
operations required by the naive approach.
The simple matching algorithm
This leads to the following algorithm:
- Map the string to polynomials. Note that the second string should be reversed to match the desired order of indices.
- Compute their product using FFT (which is the default in Magma).
- Roll over the coefficients and update the sum of squares of differences. If the result is zero, a match is found, and we increase the match count.
Implementation of the simple matching algorithm
Here is the code in Magma to achieve it:
R<x> := PolynomialRing(Rationals());
StringToPolynomial := function(s, reverse)
polynomial := 0;
n := #s;
for i := 1 to #s do
m := StringToCode(s[i]);
if reverse then
term := m * x^(n - i);
else
term := m * x^(i - 1);
end if;
polynomial := polynomial + term;
end for;
return polynomial;
end function;
FFTCount := function(a, b)
b_squared_sum := 0;
window_square_sum := 0;
for i := 1 to #b do
b_squared_sum +:= StringToCode(b[i])^2;
window_square_sum +:= StringToCode(a[i])^2;
end for;
poly_a := StringToPolynomial(a, false);
poly_b := StringToPolynomial(b, true);
poly_prod := poly_a * poly_b;
match_count := 0;
for i := 1 to #a - #b + 1 do
correlation := Real(Coefficient(poly_prod, i + #b - 2));
diff_sum := window_square_sum - 2 * correlation + b_squared_sum;
if diff_sum eq 0 then
match_count +:= 1;
end if;
if i + #b le #a then
old_char := StringToCode(a[i]);
new_char := StringToCode(a[i + #b]);
window_square_sum := window_square_sum - old_char^2 + new_char^2;
end if;
end for;
return match_count;
end function;
As you can see the code is quite elementary. All the computational complexity is hidden inside the polynomial multiplication. Lets also add a “naive” count approach and compare performances:
NaiveCount := function(a, b)
total_matches := 0;
for i := 1 to #a - #b + 1 do
match := true;
for j := 1 to #b do
if a[i + j - 1] ne b[j] then
match := false;
break;
end if;
end for;
if match then
total_matches +:= 1;
end if;
end for;
return total_matches;
end function;
CompareAndTime := function(a, b)
start_time := Cputime();
result1 := FFTCount(a, b);
end_time := Cputime();
print "FFTCount result: ", result1;
print "Execution time for FFTCount: ", end_time - start_time, " seconds";
start_time := Cputime();
result2 := NaiveCount(a, b);
end_time := Cputime();
print "NaiveCount result: ", result2;
print "Execution time for NaiveCount: ", end_time - start_time, " seconds";
return 0;
end function;
a := "aaaaaaaaa"^1000;
b := "aaaa"^1000;
CompareAndTime(a, b);
On small inputs, the naive algorithm naturally outperforms the FFT-based approach. However, when the strings are several thousand characters long and structured in a way that quick escape conditions provide little benefit, the FFT approach shows a significant advantage:
FFT polynomial multiplication time: 1.040 seconds
FFTCount result: 5001
Execution time for FFTCount: 1.090 seconds
NaiveCount result: 5001
Execution time for NaiveCount: 9.400 seconds
The idea behind the pattern matching
A natural question arises: how can this approach be adapted to handle pattern matching with special characters? The first intuitive idea is to replace the coefficient corresponding to ?
in sub
with zero. This adjustment ensures that the contribution of the corresponding ak+i
becomes zero, essentially allowing ak+i
to match any character—exactly as needed. However, the contribution of ak+i2
must still be accounted for in the overall computation.
One possible approach is to map the coefficients not to integers but to powers of an N
-th root of unity, where N
is the size of the alphabet, in our case 128 will be more than enough. The key algebraic idea is that the product of an N
-th root of unity with its conjugate equals 1
, while the product of two non-conjugate, non-trivial roots has a real part strictly less than 1
. This allows the real part of the product of coefficients to serve as our ‘character equality indicator function‘ distinguishing matches from mismatches.
The algorithm
This leads to the following algorithm:
- Map the string to a polynomial where the
i
-th character is represented byζi
, whereζ= ζN
is a primitiveN
-th root of unity. The ‘?
‘ characters are mapped to zero. The second polynomial should be reversed, and its coefficients should be complex conjugated. - Compute their product using FFT.
- Loop over the coefficients of the resulting polynomial and check which have a real part equal to the length of the substring
sub
minus the number of ‘?’ characters occurring in it.
The implementation
Let’s first adjust the code for the initial step. Since Magma provides native support for number theory and algebraic geometry, modifying the code is straightforward. We simply initialize the polynomial ring to have coefficients in the cyclotomic field with N=128
:
R<x> := PolynomialRing(CyclotomicField(128));
RootOfUnityMap := function()
mapping := [];
root := CyclotomicField(128).1;
for i := 0 to 127 do
Append(~mapping, root^i);
end for;
return mapping;
end function;
StringToPolynomial := function(s, reverse, precomputed_map)
n := #s;
coefficients := [CyclotomicField(128)!0 : i in [1..n]];
for i := 1 to n do
if s[i] eq "?" then
m := CyclotomicField(128)!0;
else
m := precomputed_map[StringToCode(s[i]) mod 128 + 1];
if reverse then
coefficients[n - i + 1] := m^-1;
else
coefficients[i] := m;
end if;
end if;
end for;
return PolynomialRing(CyclotomicField(128))!coefficients;
end function;
CountQuestionMarks := function(b)
count := 0;
for i := 1 to #b do
if b[i] eq "?" then
count +:= 1;
end if;
end for;
return count;
end function;
FFTCount := function(a, b)
precomputed_map := RootOfUnityMap();
poly_a := StringToPolynomial(a, false, precomputed_map);
poly_b := StringToPolynomial(b, true, precomputed_map);
start_time := Cputime();
poly_prod := poly_a * poly_b;
end_time := Cputime();
match_count := 0;
expected_matches := #b - CountQuestionMarks(b);
for i := 0 to #a - #b do
correlation := Real(Coefficient(poly_prod, i + #b - 1));
if Abs(correlation - expected_matches) lt 1e-3 then
match_count +:= 1;
end if;
end for;
return match_count;
end function;
Of course, the NaiveCount method also requires a trivial adjustment inside the for loop:
for j := 1 to #b do
if b[j] eq "?" then
continue;
elif a[i + j - 1] ne b[j] then
match := false;
break;
end if;
end for;
Finally we can run on some big input like a := “abcd”^5000 and b := “a??d”^1000 and get the following output:
Optimized string to polynomial conversion time: 0.040 seconds
Optimized string to polynomial conversion time: 0.170 seconds
FFT polynomial multiplication time: 2.280 seconds
Result checking time: 0.200 seconds
FFTCount result: 4001
Execution time for FFTCount: 2.700 seconds
NaiveCount result: 4001
Execution time for NaiveCount: 6.390 seconds
—
This shows that polynomial multiplication can be quite effective, even for tasks as specific as counting substrings. The final implementation in one code block with some additional debugging information:
R<x> := PolynomialRing(CyclotomicField(128));
RootOfUnityMap := function()
mapping := [];
root := CyclotomicField(128).1;
for i := 0 to 127 do
Append(~mapping, root^i);
end for;
return mapping;
end function;
StringToPolynomial := function(s, reverse, precomputed_map)
start_time := Cputime();
n := #s;
coefficients := [CyclotomicField(128)!0 : i in [1..n]];
for i := 1 to n do
if s[i] eq "?" then
m := CyclotomicField(128)!0;
else
m := precomputed_map[StringToCode(s[i]) mod 128 + 1];
if reverse then
coefficients[n - i + 1] := m^-1;
else
coefficients[i] := m;
end if;
end if;
end for;
polynomial := PolynomialRing(CyclotomicField(128))!coefficients;
end_time := Cputime();
print "Optimized string to polynomial conversion time: ", end_time - start_time, " seconds";
return polynomial;
end function;
CountQuestionMarks := function(b)
count := 0;
for i := 1 to #b do
if b[i] eq "?" then
count +:= 1;
end if;
end for;
return count;
end function;
FFTCount := function(a, b)
precomputed_map := RootOfUnityMap();
poly_a := StringToPolynomial(a, false, precomputed_map);
poly_b := StringToPolynomial(b, true, precomputed_map);
start_time := Cputime();
poly_prod := poly_a * poly_b;
end_time := Cputime();
print "FFT polynomial multiplication time: ", end_time - start_time, " seconds";
match_count := 0;
result_check_start_time := Cputime();
expected_matches := #b - CountQuestionMarks(b);
for i := 0 to #a - #b do
correlation := Real(Coefficient(poly_prod, i + #b - 1));
if Abs(correlation - expected_matches) lt 1e-3 then
match_count +:= 1;
end if;
end for;
result_check_end_time := Cputime();
print "Result checking time: ", result_check_end_time - result_check_start_time, " seconds";
return match_count;
end function;
NaiveCount := function(a, b)
total_matches := 0;
for i := 1 to #a - #b + 1 do
match := true;
for j := 1 to #b do
if b[j] eq "?" then
continue;
elif a[i + j - 1] ne b[j] then
match := false;
break;
end if;
end for;
if match then
total_matches +:= 1;
end if;
end for;
return total_matches;
end function;
CompareAndTime := function(a, b)
start_time := Cputime();
result1 := FFTCount(a, b);
end_time := Cputime();
print "FFTCount result: ", result1;
print "Execution time for FFTCount: ", end_time - start_time, " seconds";
start_time := Cputime();
result2 := NaiveCount(a, b);
end_time := Cputime();
print "NaiveCount result: ", result2;
print "Execution time for NaiveCount: ", end_time - start_time, " seconds";
return 0;
end function;
a := "abcd"^5000;
b := "a??d"^1000;
CompareAndTime(a, b);