Introduction.
As the title suggests, I’ll attempt to explain why mathematicians often value proofs more than the final result itself. The short answer is that a proof reveals how something works. It shows which arguments are being used, how they might apply in broader contexts, and which components are essential versus incidental. In this way, a proof offers a deeper understanding of the structure behind the result — something far more profound than the bare fact alone.
And I think I have a good example in mind to illustrate this point. I recently came across an elegant proof of an elementary combinatorial formula involving binomial coefficients. Honestly, I didn’t care much about the final formula — in the end, it’s just one of hundreds of similar identities you can find in any combinatorics book. What truly struck me was the proof itself, and I couldn’t resist sharing it.
The formula and the magic trick.
Statement of the result.
For integers \( \ell \) and \( n \) such that \( 0 \leq \ell < n \), we have: \[ \sum_{k=0}^{n} (-1)^k {n\choose k} k^{\ell} = 0, \] where \( {n\choose k} = \frac{n!}{k! (n - k)!} \) denotes the binomial coefficient, and \( 0^0 \) is read as \( 1 \) when \( \ell = 0 \). So for instance, for \( n = 3 \) and \( \ell = 2 \), we get: \[ \sum_{k=0}^{3} (-1)^k {3\choose k} k^2 = (-1)^0 \cdot 1 \cdot 0^2 + (-1)^1 \cdot 3 \cdot 1^2 + (-1)^2 \cdot 3 \cdot 2^2 + (-1)^3 \cdot 1 \cdot 3^2 = 0 - 3 + 12 - 9 = 0. \]
Alright, at first glance this might seem like just another formula involving binomial coefficients and powers. But is it really telling us something special about the binomial coefficients, or is it more about the behavior of the powers? How can we generalize it? What is the real meaning of these numbers? To get closer to the answers we need to look at the proof and reflect on it. The proof I’m going to present appears as an intermediate lemma in notes by Richard Evan Schwartz that I stumbled upon online.
The proof.
Consider the product \( (1 - q)^n \). According to the binomial theorem it can be expanded as: \[ (1 - q)^n = \sum_{k=0}^{n} (-1)^k {n\choose k} q^k. \] Plugging in \( q = e^{x} \), we get: \[ (1 - e^{x})^n = \sum_{k=0}^{n} (-1)^k {n\choose k} e^{x k}. \]
Now, expressing \[ e^{x} = 1 + x + \frac{x^2}{2!} + \cdots , \] we see that \( 1 - e^{x} = -x - \frac{x^2}{2!} - \cdots \) has no constant term, so the series expansion of the left-hand side \( (1 - e^{x})^n \) starts from \( x^n \). This implies that if we differentiate this expression \( \ell \) times for \( \ell < n \) and then substitute \( x = 0 \), the result is zero. On the other hand, since \( \frac{d^{\ell}}{dx^{\ell}} e^{xk} = k^{\ell} e^{xk} \), differentiating the right-hand side \( \ell \) times and plugging in \( x = 0 \) gives us the sum \[ \sum_{k=0}^{n} (-1)^k {n\choose k} k^{\ell}. \] What a brilliant trick!
Looking at this proof, one can already spot a few obvious generalizations. For instance, what happens if \( \ell = n \)? We know that the coefficient of \( x^n \) in the expansion is \( (-1)^{n} \). So, when we differentiate \( n \) times and set \( x = 0 \) we obtain \((-1)^{n}n! \), which in combination with previous observations gives us: \[ \sum_{k=0}^{n} (-1)^k {n\choose k} k^{n} = (-1)^{n}n! \]
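Both statements so far are easy to sanity-check by brute force. The snippet below is only a small sketch: the function name `alternating_power_sum` is my own, and it relies on Python evaluating `0 ** 0` as `1`, which is the convention the identity needs for \( \ell = 0 \).

```python
from math import comb, factorial


def alternating_power_sum(n: int, ell: int) -> int:
    """Return sum_{k=0}^{n} (-1)^k * C(n, k) * k^ell, with 0^0 read as 1."""
    return sum((-1) ** k * comb(n, k) * k ** ell for k in range(n + 1))


# Check the vanishing for 0 <= ell < n and the boundary case ell = n.
for n in range(1, 8):
    for ell in range(n):
        assert alternating_power_sum(n, ell) == 0
    assert alternating_power_sum(n, n) == (-1) ** n * factorial(n)
```

Of course, a finite check like this proves nothing; it only confirms that the signs and the range \( \ell < n \) are stated correctly.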
But reflecting a bit deeper, we can uncover far more interesting generalizations.
A first generalization
Remember, everything started from the product
\[
(1 - q)^n = \sum_{k=0}^n (-1)^k {n\choose k} q^k.
\]
But why this particular product?
Just as easily, for any finite sequence of positive integers \( a_1, a_2, \ldots, a_n \), we can consider the product
\[
(1 - q^{a_1})(1 - q^{a_2}) \cdots (1 - q^{a_n}) = \sum_{k} c_k q^{k}.
\]
What are the coefficients \( c_k \) in terms of the \( a_i \)?
Since \( q^{x} \cdot q^{y} = q^{x + y} \), the coefficient \( c_k \) equals the number of ways to write \( k \) as a sum of an even number of elements chosen from \( a_1, \ldots, a_n \) minus the number of ways to write \( k \) as a sum of an odd number of such elements (each \( a_i \) may be used at most once, and equal values at different positions count separately). This might sound complicated, but it is actually quite straightforward. When we expand the brackets, each selection of numbers that sums to \( k \) produces a monomial \( q^{k} \); if an odd number of them was selected, the sign in front of this monomial is \( -1 \), otherwise it is \( +1 \).
Mimicking the initial approach, we plug in \( q = e^{x} \) again. Each factor \( 1 - e^{a_i x} \) has no constant term, so the product again starts from \( x^n \), and differentiating both sides \( \ell \) times at \( x = 0 \) yields the desired identity:
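To make the combinatorial description of the \( c_k \) concrete, here is a minimal sketch that computes the coefficients in two ways: by multiplying out the product, and by counting selections with signs. The helper names `expand_product` and `signed_selection_counts` are made up for this illustration.

```python
from itertools import combinations


def expand_product(exponents):
    """Coefficients of prod_i (1 - q^{a_i}), as a list indexed by the power of q."""
    coeffs = [1]
    for a in exponents:
        shifted = [0] * a + coeffs        # coefficients of q^a * (current polynomial)
        padded = coeffs + [0] * a         # current polynomial, padded to the same length
        coeffs = [c - s for c, s in zip(padded, shifted)]
    return coeffs


def signed_selection_counts(exponents):
    """(# even-size selections summing to k) - (# odd-size ones), for every k."""
    counts = [0] * (sum(exponents) + 1)
    for size in range(len(exponents) + 1):
        # combinations() picks by position, so repeated values count separately.
        for chosen in combinations(exponents, size):
            counts[sum(chosen)] += (-1) ** size
    return counts


for seq in ([1, 1, 1], [1, 2, 3], [2, 2, 5, 7]):
    assert expand_product(seq) == signed_selection_counts(seq)
```

For the sequence of three ones this gives `[1, -3, 3, -1]`, i.e. the signed binomial coefficients \( (-1)^k {3 \choose k} \).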
\[
\sum_k c_k k^{\ell} = 0 \quad \text{for } \ell < n.
\]
So, there is nothing particularly special about the binomial coefficients \( (-1)^k {n \choose k} \) in our original formula! Up to a sign, they simply express the number of ways to write \( k \) as a sum using a set of \( n \) ones, and of course there are exactly \( {n \choose k} \) ways to do so.
Let’s "invent" another formula of a similar type.
The next natural sequence to try is \( 1, 2, 3, \ldots, n \). So we consider the product:
\[
(1 - q)(1 - q^2)(1 - q^3) \cdots (1 - q^{n}) = \sum b_{n,k} q^{k}.
\]
This finite product might ring a bell for those familiar with the Pentagonal Number Theorem, but let’s not get distracted by that.
As before, for \( 0 \leq \ell < n \), we get the identity for free: \[ \sum_k b_{n,k} \, k^{\ell} = 0. \] For instance, when \( n = 3 \), the product is: \[ (1 - q)(1 - q^2)(1 - q^3) = 1 - q - q^2 + q^4 + q^5 - q^6. \] So, for \( \ell = 2 \), we compute: \[ 1 \cdot 0^2 - 1 \cdot 1^2 - 1 \cdot 2^2 + 1 \cdot 4^2 + 1 \cdot 5^2 -1 \cdot 6^2 = 0 - 1 - 4 + 16 + 25 -36 = 0. \]
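The vanishing of these "moments" can be checked mechanically for any sequence of positive integers, not just \( 1, 2, \ldots, n \). The following is a rough sketch; `product_coefficients` and `moment` are illustrative names of my own, and the polynomial is stored as a dictionary from powers to coefficients.

```python
def product_coefficients(exponents):
    """Return {power: coefficient} for prod_i (1 - q^{a_i})."""
    poly = {0: 1}
    for a in exponents:
        updated = dict(poly)
        # Multiplying by (1 - q^a): subtract a copy of the polynomial shifted by a.
        for power, coeff in poly.items():
            updated[power + a] = updated.get(power + a, 0) - coeff
        poly = updated
    return poly


def moment(exponents, ell):
    """Return sum_k c_k * k^ell for the coefficients c_k of the product."""
    return sum(c * k ** ell for k, c in product_coefficients(exponents).items())


# The sequence 1, 2, ..., n from the text.
for n in range(1, 7):
    seq = list(range(1, n + 1))
    assert all(moment(seq, ell) == 0 for ell in range(n))

# An arbitrary sequence with repetitions works just as well.
assert all(moment([2, 5, 5, 7], ell) == 0 for ell in range(4))
```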
A second generalization: towards a q-analogue
Reflecting on the proof again
Let’s first pause and ponder which properties were crucial in the reasoning above. To compute derivatives at \( x = 0 \) quickly, we relied on inserting functions of the form \( f_i(x) = 1 + x g_i(x) \) into the product \( \prod_i (1 - f_i(x)) \), so that every factor is divisible by \( x \). On the other hand, to group terms effectively when expanding the product, we used the fundamental identity: \[ x^{a + b} = x^a x^b. \] This multiplicative rule was key in expressing the coefficients on the right-hand side as the difference between the number of ways of writing a number as a sum of an even versus odd number of elements. These two properties are in many ways characteristic of the exponential function, which strongly suggests that we may have already captured all natural generalizations of the original identity. But mathematics often thrives on one powerful trick: take a known idea and place it in a slightly different context. When I first saw this proof, I immediately had the instinct to explore it in other settings. Two natural candidates came to mind: the quantum exponential and the Carlitz exponential. In this last part of the post I will briefly describe my findings regarding the former.
Prerequisites for the q-analogue
The setup for quantum calculus begins with a parameter \( q \), and a special operator called the \( q \)-derivative, denoted \( D_q \). This operator maps a function \( f(x) \) to: \[ D_q f(x) = \frac{f(qx) - f(x)}{qx - x}. \] If you’ve never seen this before, it may seem like a strange definition, but it’s actually a remarkably rich generalization of ordinary differentiation. It uncovers deep connections across mathematics: from combinatorics and number theory to special functions and even quantum physics.
Here, we’ll briefly outline just a few key facts and notions from \( q \)-calculus (for details, you can check for instance this book):
- Linearity: The operator \( D_q \) is linear. Moreover, there exist \( q \)-analogues of product and chain rules.
- Power rule: For \( n \in \mathbb{N} \), \[ D_q(x^n) = \frac{q^n - 1}{q - 1} x^{n-1}. \]
- \( q \)-Factorial: Define \[ [n]_q = \frac{q^n - 1}{q - 1} = 1 + q + q^2 + \cdots + q^{n-1}, \] and the \( q \)-factorial as \[ [n]_q! = [n]_q [n-1]_q \cdots [1]_q. \]
- \( q \)-Exponential: Define the \( q \)-exponential function as \[ e_q(z) = \sum_{n=0}^{\infty} \frac{z^n}{[n]_q!}. \] From the previous facts, we find: \[ D_q(e_q(z)) = e_q(z), \] mirroring the familiar property of the classical exponential function.
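The facts above are simple enough to experiment with using exact rational arithmetic. The snippet below is a minimal sketch under my own conventions: a polynomial (or truncated power series) is a list of coefficients `[c_0, c_1, ...]`, the parameter \( q \) is a fixed `Fraction`, and `q_derivative` applies the power rule coefficient by coefficient. None of these helper names come from a library.

```python
from fractions import Fraction


def q_number(n, q):
    """[n]_q = 1 + q + ... + q^(n-1); the empty sum gives [0]_q = 0."""
    return sum(q ** i for i in range(n))


def q_factorial(n, q):
    """[n]_q! = [n]_q [n-1]_q ... [1]_q, with [0]_q! = 1."""
    result = Fraction(1)
    for i in range(1, n + 1):
        result *= q_number(i, q)
    return result


def q_derivative(coeffs, q):
    """Apply D_q to sum_n c_n x^n via the power rule D_q(x^n) = [n]_q x^(n-1)."""
    return [q_number(n, q) * c for n, c in enumerate(coeffs)][1:]


def q_exp_series(order, q):
    """Coefficients of e_q(z) up to and including z^order."""
    return [1 / q_factorial(n, q) for n in range(order + 1)]


q = Fraction(2, 3)

# D_q(e_q) = e_q: differentiating the series truncated at order 6
# reproduces the series truncated at order 5.
assert q_derivative(q_exp_series(6, q), q) == q_exp_series(5, q)

# The power rule agrees with the defining difference quotient at a sample point.
x = Fraction(5, 4)
assert ((q * x) ** 3 - x ** 3) / (q * x - x) == q_number(3, q) * x ** 2
```

Working with a concrete rational \( q \) instead of a symbolic one keeps the sketch dependency-free; a computer algebra system would let one treat \( q \) as an indeterminate.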
Computing the q-analogue
So what are we doing now? Just as before, consider the product: \[ (1 - x)^n = \sum c_{n,k} x^k, \] but now instead of substituting \( x = e^z \), we insert the \( q \)-exponential function: \[ x = e_q(z). \] This time we apply the \( q \)-derivative \( D_q \) \( \ell \) times to both sides, and then set \( z = 0 \). Why will the result be zero (again)? Well, as before, the series \( (1 - e_q(z))^n \) starts from \( z^n \), so all \( q \)-derivatives of order \( \ell < n \) vanish at \( z = 0 \). This follows from the power rule: every application of \( D_q \) lowers the lowest power by 1, so after \( \ell < n \) applications the series still starts from \( z^{n - \ell} \) with \( n - \ell \geq 1 \), and it vanishes at zero. On the other hand, we can explicitly compute \( D_q^\ell \left( (e_q(z))^k \right) \big|_{z=0} \) by expanding brackets and thus obtain a formula involving \( c_{n,k} \) and analogues of \( k^\ell \) in the \( q \)-calculus setup. (To emphasise it again: here, the role of \( k^\ell \) is played by the value of the \( \ell \)-th \( q \)-derivative of \( (e_q(z))^k \) at \( z = 0 \).)
For comparison, let’s also evaluate the formula for \( n = 3 \), \( \ell = 2 \). We compute the second \( q \)-derivative at \( z = 0 \) of: \[ f_q(z) = 1 - 3 e_q(z) + 3 (e_q(z))^2 - (e_q(z))^3. \] By the power rule, \( D_q^2(z^2) = [2]_q [1]_q = [2]_q! \), so applying \( D_q^2 \) to a power series and setting \( z = 0 \) returns \( [2]_q! \) times its coefficient of \( z^2 \). Let’s go term by term:
- Term 1: The constant contributes 0 to any derivative.
- Term 2: \[ -3 \cdot D_q^2(e_q(z))\big|_{z=0} = -3, \] because the coefficient of \( z^2 \) in \( e_q(z) \) is \( \frac{1}{[2]_q!} \).
- Term 3: The coefficient of \( z^2 \) in \( (e_q(z))^2 \) equals \[ \frac{2}{[2]_q!} + 1, \] so after applying \( D_q^2 \), the final contribution is: \[ 3 \cdot \left(2 + [2]_q! \right). \]
- Term 4: The coefficient of \( z^2 \) in \( (e_q(z))^3 \) equals \[ 3 + 3 \cdot \frac{1}{[2]_q!}, \] therefore the final contribution of this summand is \[ -1 \cdot \left(3 \cdot [2]_q! + 3 \right). \]
Putting it all together: \[ -3 + 3 \cdot (2 + [2]_q!) - (3 + 3 \cdot [2]_q!). \] Recalling that \( [2]_q! = [2]_q [1]_q = (1 + q)(1) = q + 1 \), the expression becomes: \[ -3 + 3(2 + q + 1) - (3 + 3(q + 1)) = -3 + 3(q + 3) - (3q + 6) = 0. \] A new identity confirmed!
Remark: Often when working with a \( q \)-analogue it is interesting to plug \( q = 1 \) back into the final formula. For instance, the \(q\)-factorial, the \(q\)-exponential and the \(q\)-binomials become their “regular” counterparts. So in the last formula with \(q = 1\) we get \( -3 + 3 \cdot 4 - 9 \). Looks familiar, right?
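The hand computation for \( n = 3 \), \( \ell = 2 \) can be repeated for other \( n \) and \( \ell \) with truncated power series. What follows is only a sketch under stated assumptions: \( q \) is a concrete rational number with \( [m]_q \neq 0 \) for all \( m \leq \ell \), series are coefficient lists truncated at order \( \ell \), and the names `q_power_analogue` and `q_alternating_sum` are invented for this illustration.

```python
from fractions import Fraction
from math import comb


def q_number(n, q):
    """[n]_q = 1 + q + ... + q^(n-1)."""
    return sum(q ** i for i in range(n))


def q_factorial(n, q):
    """[n]_q! with [0]_q! = 1."""
    result = Fraction(1)
    for i in range(1, n + 1):
        result *= q_number(i, q)
    return result


def multiply(a, b, order):
    """Product of two coefficient lists, truncated after z^order."""
    out = [Fraction(0)] * (order + 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j <= order:
                out[i + j] += x * y
    return out


def q_power_analogue(k, ell, q):
    """D_q^ell (e_q(z))^k at z = 0, i.e. [ell]_q! times the z^ell coefficient."""
    e_q = [1 / q_factorial(n, q) for n in range(ell + 1)]
    series = [Fraction(1)] + [Fraction(0)] * ell   # the constant series 1
    for _ in range(k):
        series = multiply(series, e_q, ell)
    return q_factorial(ell, q) * series[ell]


def q_alternating_sum(n, ell, q):
    """sum_k (-1)^k C(n, k) * (q-analogue of k^ell)."""
    return sum((-1) ** k * comb(n, k) * q_power_analogue(k, ell, q)
               for k in range(n + 1))


q = Fraction(5, 7)
assert q_power_analogue(2, 2, q) == 2 + (1 + q)        # Term 3 without the factor 3
assert q_power_analogue(3, 2, q) == 3 + 3 * (1 + q)    # Term 4 without the sign
assert all(q_alternating_sum(n, ell, q) == 0 for n in range(1, 6) for ell in range(n))

# At q = 1 the analogue of k^ell is k^ell itself.
assert q_power_analogue(3, 2, Fraction(1)) == 9
```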