October 18, 2022

Cryptographic Primitives (Mostly Sum-Check)

Because I'm a cryptographer, people frequently ask me questions about cryptography. Sometimes, when the questions are really good, they turn into blog posts to share with the rest of the world. Today's question would be “what is a cryptographic primitive?” but as it turns out, that one is too hard. Instead, this post will cover:

  • Why is it so hard to define?
  • How is such an ambiguous term useful?
  • Can you provide one (1) example of how it's used in Space and Time?

How hard could it be?

Why is it hard? Other attempts at explaining the term list a bunch of primitives. Giving plenty of examples is usually good, but unless you already know how they’re used in more complex systems it’s hard to see what makes them “primitive.” Wikipedia takes a top-down approach to defining them: 

Cryptographic primitives are well-established, low-level cryptographic algorithms that are frequently used to build cryptographic protocols for computer security systems.
- Cryptographic primitive - Wikipedia

So now the problem is coming into focus. A cryptographic primitive is a low-level building block that often relies on very high-level “moon math.” Cryptographers somehow resolve this contradiction, and that's when the term becomes useful.

The more I encounter the term "cryptographic primitive," the less I feel confident that I truly understand what it means. Is it just me, or is there no universal definition for the term? Or does the term have some contextual sensitivity that I'm missing?
- The most relatable post on StackExchange

An analogy

In the absence of a real definition of a “cryptographic primitive,” I am going all-in on this analogy (so I really hope it lands).

Furniture is like a cryptosystem. The “primitives” are Allen wrenches, dowels, L-brackets, wood glue, metal brackets, solid wood, plywood, engineered wood, etc.
- Quoting myself for emphasis

If I had to give a definitive answer to where the division is between a cryptographic primitive and a cryptosystem (or furniture and a component) it’d be that designing the primitive isn’t done “in-house.” The cryptosystem designer takes no credit for it. What they take credit for is putting the primitives together in a novel way to create a more complex system. Somebody else may have engineered the particleboard, but the furniture company designed the bookshelf. Likewise, Space and Time is building Proof of SQL™, but we use existing libraries for components like polynomial commitments.

That’s not to say that creating cryptographic primitives isn’t a creative or difficult task. Quite the opposite. A good primitive is going to be used over and over again in contexts that its developer can’t possibly predict. You know, kind of like a screwdriver (which actually takes a lot of engineering). But once you’ve got that marvel of modern engineering, you can build something even grander, like a TV stand.

I like the furniture metaphor better than “tools” or “building blocks” because it also lets us make one more conceptual jump that’s very useful. When explaining cryptographic protocols, the explanation of what they actually do often gets entangled with how they do it. Really, these are two separate axes to group them by, but protocols are usually grouped by functionality, not construction. Classifying protocols by their primitives instead allows us to make broad statements like, “protocols based on symmetric-key primitives are generally faster and plausibly post-quantum secure,” without specifying whether we’re talking about encryption, or multiparty computation, or cryptographically secured voting schemes. It’s a bit like saying “flat-pack furniture is cheaper to ship but less sturdy” without specifying whether you’re talking about tables or chairs.

Data Structures 🤝 Algorithms

There's a joke about the structure of research presentations that goes, “the first section is for everyone, the second section is for the experts, and the last section is for the speaker.” We’ve hit the second section. It will make more sense if you’ve programmed before.

An expressive programming language or query language is going to have data types. So in order to prove anything about programs in that language, we're going to need a way to represent those data types in our proof system. As I alluded to in a previous blog post, that means we turn them into math.

What kind of data types do we need? Well, most can be broken down into scalar types, which are kind of unbreakable atoms (your booleans, integer types, maybe characters depending on the language), and composite types (like tuples and arrays), which are built up from simpler types. Handling scalar data types is fairly straightforward; we just encode them as numbers.[1]
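
To make that concrete, here’s a minimal sketch in Python. The prime and the encoding rules are illustrative assumptions on my part, not what any particular proof system actually uses:

```python
# An illustrative prime modulus; real systems use larger, carefully chosen primes.
P = 2**61 - 1

def encode_scalar(value):
    """Encode a boolean, integer, or single character as an element of F_p,
    i.e. an integer mod P."""
    if isinstance(value, bool):
        return int(value) % P      # True -> 1, False -> 0
    if isinstance(value, str) and len(value) == 1:
        return ord(value) % P      # a character becomes its code point
    return value % P               # integers just reduce mod P

assert encode_scalar(True) == 1
assert encode_scalar(-1) == P - 1  # negatives wrap around mod P
```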

In contrast with scalar data types, a composite data type is a thing that has stuff in places. The thing about composite data types is you want to look inside them. We'll focus on a very simple, list-like, one-dimensional array example. If you've got an array called `Customers` then the most basic thing you will want to do with it is index into it. You call `Customers[i]` and you get the ith customer in the list (zero-indexed, of course).

We can do this with an interpolating polynomial. Basically, for any array `A` we can construct a polynomial ã(x) so that plugging in i for x gives you the entry `A[i]`.[2]
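
Here’s a minimal sketch of that idea in Python, using Lagrange interpolation over 𝔽p. The prime and the array are made up for illustration; a real proof system would use a larger prime and far more efficient evaluation:

```python
P = 2**61 - 1  # illustrative prime modulus for F_p

def eval_interpolated(A, x):
    """Evaluate, at the point x, the unique polynomial ã of degree < len(A)
    with ã(i) = A[i] for each index i, working mod P."""
    n = len(A)
    total = 0
    for i in range(n):
        # Lagrange basis polynomial L_i: L_i(i) = 1 and L_i(j) = 0 for j != i.
        num, den = 1, 1
        for j in range(n):
            if j != i:
                num = num * (x - j) % P
                den = den * (i - j) % P
        # Division mod P means multiplying by the inverse (see footnote [2]).
        total = (total + A[i] * num * pow(den, -1, P)) % P
    return total

Customers = [42, 7, 13, 99]
# Plugging in i gives back Customers[i], just like indexing:
assert all(eval_interpolated(Customers, i) == Customers[i]
           for i in range(len(Customers)))
```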

The crown jewel of interactive proofs / the worst for-loop you’ve seen in your life

Data structures and algorithms are often taught together because they work synergistically. Data structures are more useful when there are efficient algorithms to work with them. We just described how to represent a generic composite data type so that our proof system can reason about it. But we noted that looking at one particular place in an array requires evaluating a polynomial, and that just doesn’t seem scalable. What kind of algorithms can we even use with this?

The answer is the sum-check protocol of Lund, Fortnow, Karloff, and Nisan. It’s been a staple of interactive proofs since the ’90s. This may seem a little contrived, but what it does is let a prover convince a verifier that the sum of a multi-variable polynomial g over the hypercube H = {0, 1}^m (that’s all ordered m-tuples of zeroes and ones) is, well, whatever that value is. Let’s call it y.

Before we even try to clarify some of the technical definitions, let's build up an understanding of what it can actually be used for. The first step is to check anime cat math twitter for some insight.

(Also, quick note: the sum in sum-check is over zeroes and ones, but if we read each tuple as a binary number, we’re just summing over the range 0 to 2^m − 1.) So combining what we learned in the crash-course on data structures for proof systems and this helpful tweet, we get that sum-check gives us a way to provably iterate over an array, but it's very limited in what we can do during the iteration. So our task in designing Proof of SQL™ is to turn other, more complicated operations into a summation.
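
In code, that quick note is just the observation that the following two loops visit the same indices. A small Python sanity check (the value of m is arbitrary here):

```python
from itertools import product

m = 3
# Iterating over the hypercube {0, 1}^m in lexicographic order...
hypercube = list(product((0, 1), repeat=m))
# ...is iterating over 0 to 2^m - 1, reading each tuple as a binary number.
as_ints = [int("".join(map(str, bits)), 2) for bits in hypercube]
assert as_ints == list(range(2**m))
```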

That's why the definition of the sum-check protocol is confusing, contrived, and kind of backwards. When trying to apply the protocol, you don't ask, "what polynomials do I care about?" or "what is special about the hypercube here?" The real technical challenge is to choose or construct the polynomial so that this summation is meaningful. Annotated, the statement that sum-check proves looks more like this:
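
In symbols, with the g, m, and y from above (the annotation is my gloss on the for-loop reading):

$$\sum_{x_1 \in \{0,1\}} \sum_{x_2 \in \{0,1\}} \cdots \sum_{x_m \in \{0,1\}} g(x_1, x_2, \ldots, x_m) = y$$

Each nested sum is one level of the for-loop over an m-bit index, g is the loop body, and y is the final value of the running total that the prover is claiming. And here is a minimal Python sketch of one honest run of the protocol for a multilinear g, given by its table of evaluations on the hypercube. This is my simplification for illustration; real implementations handle higher-degree round polynomials and get the final evaluation of g from a polynomial commitment rather than recomputing it:

```python
import random

P = 2**61 - 1  # illustrative prime

def sum_check(table):
    """Run sum-check for a multilinear polynomial g given as its evaluations
    on {0,1}^m (x1 is the most significant bit of the index). Returns the
    proven sum y and the verifier's random challenges."""
    m = len(table).bit_length() - 1
    y = sum(table) % P              # the claimed sum over the hypercube
    claim = y
    challenges = []
    for _ in range(m):
        half = len(table) // 2
        # Prover: the round polynomial g_j(X) has degree 1 for multilinear g,
        # so sending g_j(0) and g_j(1) determines it completely.
        g0 = sum(table[:half]) % P
        g1 = sum(table[half:]) % P
        # Verifier: the two halves must add up to the running claim.
        assert (g0 + g1) % P == claim
        # Verifier: sample a random challenge r and update the claim to g_j(r).
        r = random.randrange(P)
        claim = (g0 + r * (g1 - g0)) % P
        challenges.append(r)
        # Prover: fold the table, fixing x_j = r in the multilinear extension.
        table = [(table[b] + r * (table[b + half] - table[b])) % P
                 for b in range(half)]
    # The final claim must equal g at the random point; in a real system this
    # single evaluation comes from a polynomial commitment.
    assert claim == table[0]
    return y, challenges

A = [3, 1, 4, 1, 5, 9, 2, 6]        # an "array" as evaluations on {0,1}^3
y, _ = sum_check(A)
assert y == sum(A) % P              # the prover proved the sum of the array
```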

Example

Here's one example of how you could use it. Many (all?) languages have a map functionality: you give map a list-like thing (or several of them) and a function, and map applies that function to each element of the list and returns a list of the results.

If that function (call it f) is a polynomial, and the multivariate polynomial a(x1, x2, … xm) represents an array, then the composition f(a(x1, x2, … xm)) is also a polynomial. In fact, plugging in a point to this polynomial gives the result of the map at that point!

And since we can also represent the result array as a polynomial r(x1, x2, … xm), the prover can run sum-check to convince the verifier that they computed the correct result. If we set it up right, the difference f(a(x1, x2, … xm)) − r(x1, x2, … xm) is zero at every point of the hypercube exactly when r really is the result of applying the map to a. Then the sum is just adding up a bunch of zeroes, and the whole thing is zero.

(Okay, there’s a little more care to be taken with some randomized weighting and stuff but I promise, this is almost something we actually use in Proof of SQL™.)
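
Here’s a toy Python version of that randomized-weighting idea, stripped of the sum-check machinery. Everything in it (the weights, the prime, f, the arrays) is an illustrative assumption, and the real construction is more involved; this just shows why a random weighting catches a wrong result:

```python
import random

P = 2**61 - 1                         # illustrative prime

def f(t):
    return (t * t + 1) % P            # the polynomial being mapped: f(t) = t^2 + 1

A = [3, 1, 4, 1, 5, 9, 2, 6]          # input array
R = [f(a) for a in A]                 # claimed result of mapping f over A

# Verifier picks random weights. If R is correct, every difference is zero
# and the weighted sum is zero. If R is wrong anywhere, the weighted sum is
# nonzero except with probability 1/P over the choice of weights.
weights = [random.randrange(P) for _ in A]
check = sum(w * (f(a) - r) for w, a, r in zip(weights, A, R)) % P
assert check == 0
```

In practice the weights usually aren’t an arbitrary random vector; they typically come from evaluating a structured polynomial at a random point, which is what lets the whole check be phrased as one big sum over the hypercube and handed to sum-check.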

Summary

Thank you for reading this far. This dove very deeply into details, so I wanted to zoom back out and cover the main takeaways. Hopefully after reading this you can go to your next blockchain/crypto conference and make small talk with a cryptography researcher. You can say, "hey, have you heard about Space and Time's Proof of SQL™? It's like a SNARK but their main cryptographic primitive is sum-check." And the cryptographer will say, "oh, that's smart. That way the prover can avoid doing an FFT." Then you'll look up and smile and share a knowing glance, like two carpenters admiring a well-done dovetail joint (and knowing that once the wood glue dries, it doesn't even need nails or screws). And you'll realize that knowing a super-technical and exact definition of cryptographic primitives isn't important. What matters is knowing you made a friend.

Footnotes

[1] Specifically, everything is going to turn into an element of 𝔽p, which is a fancy way of saying everything is an integer mod p, where p is a prime.

[2] This motivates the choice to work over 𝔽p. By choosing p to be prime, we get a very useful property that we’ve just come to expect from real numbers: everything but zero has a multiplicative inverse. This ends up having far-reaching consequences, such as the existence of interpolating polynomials over 𝔽p. Unfortunately, most of the accessible literature on polynomial interpolation (here’s a resource) deals with real numbers or sometimes complex numbers, but because of the nice property noted above (in mathematical terms, 𝔽p is a field), a lot of the theory goes through, mutatis mutandis.
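
As a tiny illustration (in Python 3.8+, with a made-up small prime), the inverse that makes the interpolation formulas work is a single built-in call:

```python
p = 101                    # any prime works
a = 42
inv = pow(a, -1, p)        # multiplicative inverse of a mod p
assert (a * inv) % p == 1  # the property we've come to expect from real numbers
```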