Introduction

Given a string s and a substring sub, a common problem is counting how often sub occurs as a substring within s. While the naive algorithm often performs well in practice, it can be useful to explore other alternatives. Of course, there are many interesting algorithms available, each with its own strengths. For example, for Leetcode challenges I typically use the polynomial rolling-hash algorithm as my go-to approach. However, it does have some limitations. One notable challenge is adapting it to handle pattern matching where a character may match any other non-empty character — something not easily addressed with the rolling-hash method. For instance, if the pattern is "a??" and the string is "ababa", the question marks in the pattern should match any character given the total answer “2”, but the rolling-hash algorithm doesn’t naturally handle this type of wildcard matching.

Of course, the naive algorithm can be easily adjusted to handle this case, but for certain combinations of s and sub, its running time complexity can become proportional to the product of the lengths of the strings, which is a bit inefficient. Today, we’ll explore another approach: using polynomial multiplication with FFT (Fast Fourier Transform) for pattern matching. This method provides a more efficient way to solve the problem, especially for larger strings. As with some of our previous posts, we’ll use Magma to implement this approach due to its convenient and powerful capabilities for handling polynomials and number-theoretic computations.

The simplified idea behind the approach

Let’s first consider a simpler case where special characters are not allowed. Imagine we map each character in the strings to an integer value, for example, using their ASCII codes. Now, suppose we want to compare two strings a and b of length n. We can define a function to measure the difference between a and b by looking at the squared differences of corresponding characters. Specifically, consider the expression: f(k) = i (ak+i − bi)2.

This sum f(k) is zero if and only if for each index i, we have ak+i = bi​, meaning the substring of a starting at position k in exactly matches b.

Now let’s expand the expression for f(k). When we expand this, we get:

i (ak+i − bi)2 = ∑i (ak+i)2 - 2 ∑ ak+i bi + ∑i (bi)2

Now, the interesting part is how we can efficiently recalculate this expression as we roll over index k when shifting through the string a. The third term is a constant because b is fixed. The a2 term ​ is also straightforward: as we move the window one position to the right, we simply remove the first element of the previous window and add the new element at the end.

The real challenge lies in the middle term, which involves multiplying the corresponding characters from the two strings. This is where convolution and Fast Fourier Transform (FFT) come into play. By using convolution, we can efficiently compute the sum over all i for each possible position k in the string. This allows us to calculate the cross term efficiently.

To see how the polynomial multiplication helps, let’s consider the product of two polynomials:

(a0 + a1x + a2x2 + … )(b0 + b1x + b2x2 + … ) = k (i​ a​k bi−k​)xk

This is essentially the convolution of the coefficients a_i​ and b_i​. Each coefficient of the resulting polynomial is obtained by summing the products of corresponding coefficients from the two polynomials. This is similar to what we need in pattern matching, where we calculate sums of products between corresponding parts of the two strings.

So far, this may not seem much simpler, but the key point is that we can compute the product of two polynomials of degree n in approximately n log⁡(n) operations, rather than the n2 operations required by the naive approach.

The simple matching algorithm

This leads to the following algorithm:

  1. Map the string to polynomials. Note that the second string should be reversed to match the desired order of indices.
  2. Compute their product using FFT (which is the default in Magma).
  3. Roll over the coefficients and update the sum of squares of differences. If the result is zero, a match is found, and we increase the match count.

Implementation of the simple matching algorithm

Here is the code in Magma to achieve it:

R<x> := PolynomialRing(Rationals());

StringToPolynomial := function(s, reverse)
    polynomial := 0;
    n := #s;
    for i := 1 to #s do
        m := StringToCode(s[i]);
        if reverse then
            term := m * x^(n - i);
        else
            term := m * x^(i - 1);
        end if;
        polynomial := polynomial + term;
    end for;
    return polynomial;
end function;

FFTCount := function(a, b)
    b_squared_sum := 0;
    window_square_sum := 0;

    for i := 1 to #b do
        b_squared_sum +:= StringToCode(b[i])^2;
        window_square_sum +:= StringToCode(a[i])^2;
    end for;

    poly_a := StringToPolynomial(a, false);
    poly_b := StringToPolynomial(b, true);
    poly_prod := poly_a * poly_b;

    match_count := 0;
    
    for i := 1 to #a - #b + 1 do
        correlation := Real(Coefficient(poly_prod, i + #b - 2));
        diff_sum := window_square_sum - 2 * correlation + b_squared_sum;
        
        if diff_sum eq 0 then
            match_count +:= 1;
        end if;

        if i + #b le #a then
            old_char := StringToCode(a[i]);
            new_char := StringToCode(a[i + #b]);
            window_square_sum := window_square_sum - old_char^2 + new_char^2;
        end if;
    end for;

    return match_count;
end function;

As you can see the code is quite elementary. All the computational complexity is hidden inside the polynomial multiplication. Lets also add a “naive” count approach and compare performances:

NaiveCount := function(a, b)
    total_matches := 0;

    for i := 1 to #a - #b + 1 do
        match := true;
        for j := 1 to #b do
            if a[i + j - 1] ne b[j] then
                match := false;
                break;
            end if;
        end for;
        
        if match then
            total_matches +:= 1;
        end if;
    end for;

    return total_matches;
end function;

CompareAndTime := function(a, b)
    start_time := Cputime();
    result1 := FFTCount(a, b);
    end_time := Cputime();
    print "FFTCount result: ", result1;
    print "Execution time for FFTCount: ", end_time - start_time, " seconds";

    start_time := Cputime();
    result2 := NaiveCount(a, b);
    end_time := Cputime();
    print "NaiveCount result: ", result2;
    print "Execution time for NaiveCount: ", end_time - start_time, " seconds";
    return 0;
end function;

a := "aaaaaaaaa"^1000;
b := "aaaa"^1000;

CompareAndTime(a, b);

On small inputs, the naive algorithm naturally outperforms the FFT-based approach. However, when the strings are several thousand characters long and structured in a way that quick escape conditions provide little benefit, the FFT approach shows a significant advantage:

FFT polynomial multiplication time:  1.040  seconds
FFTCount result:  5001
Execution time for FFTCount:  1.090  seconds
NaiveCount result:  5001
Execution time for NaiveCount:  9.400  seconds

The idea behind the pattern matching

A natural question arises: how can this approach be adapted to handle pattern matching with special characters? The first intuitive idea is to replace the coefficient corresponding to ? in sub with zero. This adjustment ensures that the contribution of the corresponding ak+i becomes zero, essentially allowing ak+i to match any character—exactly as needed. However, the contribution of ak+i2 must still be accounted for in the overall computation.

One possible approach is to map the coefficients not to integers but to powers of an N-th root of unity, where N is the size of the alphabet, in our case 128 will be more than enough. The key algebraic idea is that the product of an N-th root of unity with its conjugate equals 1, while the product of two non-conjugate, non-trivial roots has a real part strictly less than 1. This allows the real part of the product of coefficients to serve as our ‘character equality indicator function‘ distinguishing matches from mismatches.

The algorithm

This leads to the following algorithm:

  1. Map the string to a polynomial where the i-th character is represented by ζi, where ζ= ζN​ is a primitive N-th root of unity. The ‘?‘ characters are mapped to zero. The second polynomial should be reversed, and its coefficients should be complex conjugated.
  2. Compute their product using FFT.
  3. Loop over the coefficients of the resulting polynomial and check which have a real part equal to the length of the substring sub minus the number of ‘?’ characters occurring in it.

The implementation

Let’s first adjust the code for the initial step. Since Magma provides native support for number theory and algebraic geometry, modifying the code is straightforward. We simply initialize the polynomial ring to have coefficients in the cyclotomic field with N=128:

R<x> := PolynomialRing(CyclotomicField(128));

RootOfUnityMap := function()
    mapping := [];
    root := CyclotomicField(128).1;
    for i := 0 to 127 do
        Append(~mapping, root^i);
    end for;
    return mapping;
end function;

StringToPolynomial := function(s, reverse, precomputed_map)
    n := #s;
    coefficients := [CyclotomicField(128)!0 : i in [1..n]];
    for i := 1 to n do
        if s[i] eq "?" then
            m := CyclotomicField(128)!0;
        else
            m := precomputed_map[StringToCode(s[i]) mod 128 + 1];
            if reverse then
                coefficients[n - i + 1] := m^-1;
            else
                coefficients[i] := m;
            end if;
        end if;
    end for;

    return PolynomialRing(CyclotomicField(128))!coefficients;
end function;

CountQuestionMarks := function(b)
    count := 0;
    for i := 1 to #b do
        if b[i] eq "?" then
            count +:= 1;
        end if;
    end for;
    return count;
end function;

FFTCount := function(a, b)
    precomputed_map := RootOfUnityMap();
    poly_a := StringToPolynomial(a, false, precomputed_map);
    poly_b := StringToPolynomial(b, true, precomputed_map);

    start_time := Cputime();
    poly_prod := poly_a * poly_b;
    end_time := Cputime();
    
    match_count := 0;
    
    expected_matches := #b - CountQuestionMarks(b);
    for i := 0 to #a - #b do
        correlation := Real(Coefficient(poly_prod, i + #b - 1));
        if Abs(correlation - expected_matches) lt 1e-3 then
            match_count +:= 1;
        end if;
    end for;

    return match_count;
end function;

Of course, the NaiveCount method also requires a trivial adjustment inside the for loop:

        for j := 1 to #b do
            if b[j] eq "?" then
                continue;
            elif a[i + j - 1] ne b[j] then
                match := false;
                break;
            end if;
        end for;


Finally we can run on some big input like a := “abcd”^5000 and b := “a??d”^1000 and get the following output:

Optimized string to polynomial conversion time:  0.040  seconds
Optimized string to polynomial conversion time:  0.170  seconds
FFT polynomial multiplication time:  2.280  seconds
Result checking time:  0.200  seconds
FFTCount result:  4001
Execution time for FFTCount:  2.700  seconds
NaiveCount result:  4001
Execution time for NaiveCount:  6.390  seconds

This shows that polynomial multiplication can be quite effective, even for tasks as specific as counting substrings. The final implementation in one code block with some additional debugging information:

R<x> := PolynomialRing(CyclotomicField(128));

RootOfUnityMap := function()
    mapping := [];
    root := CyclotomicField(128).1;
    for i := 0 to 127 do
        Append(~mapping, root^i);
    end for;
    return mapping;
end function;

StringToPolynomial := function(s, reverse, precomputed_map)
    start_time := Cputime();
    n := #s;
    coefficients := [CyclotomicField(128)!0 : i in [1..n]];
    for i := 1 to n do
        if s[i] eq "?" then
            m := CyclotomicField(128)!0;
        else
            m := precomputed_map[StringToCode(s[i]) mod 128 + 1];
            if reverse then
                coefficients[n - i + 1] := m^-1;
            else
                coefficients[i] := m;
            end if;
        end if;
    end for;

    polynomial := PolynomialRing(CyclotomicField(128))!coefficients;

    end_time := Cputime();
    print "Optimized string to polynomial conversion time: ", end_time - start_time, " seconds";
    return polynomial;
end function;

CountQuestionMarks := function(b)
    count := 0;
    for i := 1 to #b do
        if b[i] eq "?" then
            count +:= 1;
        end if;
    end for;
    return count;
end function;

FFTCount := function(a, b)
    precomputed_map := RootOfUnityMap();
    poly_a := StringToPolynomial(a, false, precomputed_map);
    poly_b := StringToPolynomial(b, true, precomputed_map);

    start_time := Cputime();
    poly_prod := poly_a * poly_b;
    end_time := Cputime();
    print "FFT polynomial multiplication time: ", end_time - start_time, " seconds";
    
    match_count := 0;
    result_check_start_time := Cputime();
    
    expected_matches := #b - CountQuestionMarks(b);
    for i := 0 to #a - #b do
        correlation := Real(Coefficient(poly_prod, i + #b - 1));
        if Abs(correlation - expected_matches) lt 1e-3 then
            match_count +:= 1;
        end if;
    end for;

    result_check_end_time := Cputime();
    print "Result checking time: ", result_check_end_time - result_check_start_time, " seconds";

    return match_count;
end function;

NaiveCount := function(a, b)
    total_matches := 0;
    for i := 1 to #a - #b + 1 do
        match := true;
        for j := 1 to #b do
            if b[j] eq "?" then
                continue;
            elif a[i + j - 1] ne b[j] then
                match := false;
                break;
            end if;
        end for;
        if match then
            total_matches +:= 1;
        end if;
    end for;
    return total_matches;
end function;

CompareAndTime := function(a, b)
    start_time := Cputime();
    result1 := FFTCount(a, b);
    end_time := Cputime();
    print "FFTCount result: ", result1;
    print "Execution time for FFTCount: ", end_time - start_time, " seconds";

    start_time := Cputime();
    result2 := NaiveCount(a, b);
    end_time := Cputime();
    print "NaiveCount result: ", result2;
    print "Execution time for NaiveCount: ", end_time - start_time, " seconds";

    return 0;
end function;

a := "abcd"^5000;
b := "a??d"^1000;

CompareAndTime(a, b);