zk
A mathematical introduction to ZKPs

Setup:

  • Consider a prover PP and a verifier VV.
  • A common input xx is known to both PP and VV.
  • A secret input ww (the "witness") is known only to PP.

Protocol:

  • PP and VV interact for nn rounds.
  • In each round, PP sends a message aa to VV, VV sends a challenge cc to PP, and PP sends a response zz to VV.
  • At the end of the interaction, VV outputs a decision (Accept or Reject) based on the messages it has received.

Mathematically, this can be modeled as follows:

  1. PP computes a=A(x,w,r)a = A(x, w, r), where rr is a random value, and sends aa to VV - AA is the "Announcement" function, used by the Prover to generate an initial message.
  2. VV computes c=C(x,a,s)c = C(x, a, s), where ss is a random value, and sends cc to PP - CC is the "Challenge" function, used by the Verifier to create a challenge for the Prover.
  3. PP computes z=Z(x,w,a,c,r)z = Z(x, w, a, c, r) and sends zz to VV - ZZ is the "Zero-Knowledge Response" function, used by the Prover to respond to the Verifier's challenge.
  4. VV computes d=D(x,a,c,z)d = D(x, a, c, z) and accepts if d=1d = 1 and rejects if d=0d = 0 - DD is the "Decision" function, used by the Verifier to decide whether to accept or reject the Prover's claim.

This protocol satisfies the following properties:

  • Completeness: If x,wx, w satisfy the relation RR, then for any a,ca, c computed by P,VP, V according to the protocol, VV accepts with high probability.
  • Soundness: If x,wx, w do not satisfy the relation RR, then for any a,c,za, c, z computed by PP and any a,ca, c computed by VV according to the protocol, VV rejects with high probability.
  • Zero-Knowledgeness: For any verifier strategy VV^*, there exists a simulator SS such that for any x,wx, w satisfying RR, the distribution of transcripts between P,VP, V^* is close to the distribution of transcripts between S,VS, V^*.

These are the three properties that we want to achieve in a ZKP. The first two are pretty straightforward: if the prover and verifier follow the protocol, then the verifier will accept if the statement is true and reject if the statement is false.

The last point is what makes this a zero-knowledge proof: no matter how VV^* behaves, there's a simulator SS that can create a transcript that looks just like a real interaction between PP and VV^*, without knowing the secret ww.

So why isn't a SHA256 password hash matching scheme a ZKP?

This was a pretty stupid question I had when I first started learning about ZKs. It's not a stupid question, it was intuitive at the time because I didn't truly understand ZKs. But a hashing scheme is not a ZK, a ZKP has certain requirements (outlined above).

  1. Interactive Process: ZKPs are typically interactive, involving a series of exchanges between the Prover and Verifier. The Prover doesn't simply present a proof; the Verifier actively participates in the process by sending challenges to the Prover. In the case of password hashing, the process is non-interactive. The user submits a password, it gets hashed, and the hash is compared with the stored value. This is not explicitly outlined, but you'll see why this is important as we go on.

  2. Zero-Knowledge Property: In a ZKP, the Verifier learns nothing more than the fact that the statement is true. In the case of password hashing, if the hash function is somehow compromised, or if the hashed passwords are not stored securely, an attacker could learn more than just whether the password is correct or not. They could potentially learn the actual password.

  3. Completeness and Soundness: While password hashing does have completeness (if the user provides the correct password, they will be authenticated) and soundness (an incorrect password will not authenticate), these alone do not make it a ZKP.

  4. Simulation: A critical aspect of ZKPs is the ability to simulate the Verifier's view without the Prover's secret input. The Verifier's view in a ZKP should be simulatable without knowledge of the Prover's secret. In password hashing, there is no such simulation process.

  5. Use of randomness in the challenge: In a ZKP, the Verifier's challenge should be random. In password hashing, the challenge is not random; it is the hash of the password.