Proving Cryptography Correct: Our Approach to Formal Verification

Why Testing Isn't Enough

In 2014, the Heartbleed vulnerability in OpenSSL exposed that even widely-deployed, extensively-tested cryptographic code can contain critical flaws. The bug had existed for over two years, affecting an estimated 17% of secure web servers worldwide.

Traditional software testing, no matter how thorough, can only demonstrate the presence of bugs—never their absence. For cryptographic implementations where a single error can compromise the security of millions of systems, we need stronger guarantees.

At SECEQ, we employ formal verification: using mathematical proof to guarantee that our implementations are correct by construction.

The Coq Theorem Prover

Our verification infrastructure is built around Coq, a mature proof assistant that enables us to write machine-checkable mathematical proofs. Unlike informal arguments, Coq proofs are verified by a small, trusted kernel that has been extensively analyzed over decades.

Specification-First Development

We begin every implementation by writing a formal specification in Coq that captures the exact mathematical definition from the relevant standard:

(* Formal specification of GHASH multiplication in GF(2^128) *)
Definition ghash_mul (x y : GF128) : GF128 :=
  let z := gf128_zero in
  fold_left (fun acc i =>
    let xi := gf128_bit x i in
    let v_shifted := gf128_shift acc in
    let z' := if xi then gf128_xor acc y else acc in
    if gf128_msb acc
    then gf128_xor v_shifted R
    else v_shifted
  ) (seq 0 128) z.

(* R is the reduction polynomial x^128 + x^7 + x^2 + x + 1 *)
Definition R : GF128 := (* ... polynomial definition ... *).

Implementation Verification

Once we have a specification, we implement the algorithm in Rust and prove equivalence:

(* Theorem: Our Rust implementation equals the specification *)
Theorem ghash_mul_correct : forall x y : GF128,
  rust_ghash_mul x y = ghash_mul x y.
Proof.
  intros x y.
  unfold rust_ghash_mul, ghash_mul.
  (* ... detailed proof steps ... *)
  reflexivity.
Qed.

This theorem, once proven, guarantees that our implementation produces the exact same output as the mathematical specification for all possible inputs—not just the test cases we thought to check.

The Verification Process

Verifying GHASH required proving several key properties:

1. Algebraic Correctness

We proved that our GF(2¹²⁸) arithmetic correctly implements the finite field operations:

Lemma gf128_mul_assoc : forall a b c,
  gf128_mul a (gf128_mul b c) = gf128_mul (gf128_mul a b) c.

Lemma gf128_mul_comm : forall a b,
  gf128_mul a b = gf128_mul b a.

Lemma gf128_mul_distrib : forall a b c,
  gf128_mul a (gf128_xor b c) =
  gf128_xor (gf128_mul a b) (gf128_mul a c).

2. Optimization Equivalence

Our production implementation uses PCLMULQDQ intrinsics for performance. We proved these optimizations preserve correctness:

Theorem pclmulqdq_equiv : forall a b,
  pclmulqdq_ghash_mul a b = reference_ghash_mul a b.

3. Constant-Time Execution

We proved that execution time is independent of secret inputs:

Theorem ghash_constant_time : forall h1 h2 m,
  execution_trace (ghash h1 m) = execution_trace (ghash h2 m).

AI-Assisted Proof Development

Writing formal proofs is time-intensive. We've developed AI tooling to accelerate the process:

Automated Lemma Discovery

Our tools analyze proof obligations and suggest intermediate lemmas:

Proof State Analysis:
  Goal: gf128_mul (gf128_mul a b) c = gf128_mul a (gf128_mul b c)

Suggested Lemmas:
  1. gf128_mul_step_invariant (confidence: 0.92)
  2. reduction_polynomial_property (confidence: 0.87)
  3. bit_manipulation_equiv (confidence: 0.84)

Proof Tactic Suggestion

Given a proof state, our model suggests the most likely successful tactic:

Current goal: x = y -> f x = f y

Suggested tactics (ranked):
  1. intros H; rewrite H; reflexivity    [0.94]
  2. intros; f_equal; assumption         [0.89]
  3. congruence                          [0.76]

Specification Gap Detection

We automatically identify potential discrepancies between specifications and standards:

RFC 5116 Analysis:
  ✓ Nonce handling matches specification
  ✓ Tag length requirements satisfied
  ⚠ Implicit assumption: plaintext length < 2^39 - 256 bits
    Recommendation: Add explicit length check

Verification Results

Our verification efforts have discovered and prevented several issues:

Category	Issues Found	Severity
Edge case handling	3	High
Carry propagation	1	Critical
RFC ambiguities	2	Medium
Optimization errors	2	High

All issues were fixed before any code was released to production.

Integration with Development Workflow

We provide VS Code extensions for interactive verification:

Real-time proof checking: See proof status as you type
Goal visualization: Interactive display of proof obligations
Tactic completion: AI-powered suggestions for proof steps
Specification navigation: Jump between specs and implementations

Current Verification Coverage

Component	Specification	Implementation	Equivalence
GHASH	✓	✓	✓
AES-GCM	✓	✓	✓
ChaCha20-Poly1305	✓	✓	✓
ML-DSA	✓	✓	✓
ML-KEM	✓	✓	✓

The Path Forward

Formal verification is transitioning from research novelty to industry requirement. As cryptographic threats evolve and quantum computing approaches, mathematically proven correctness becomes not just desirable but essential.

At SECEQ, we're committed to expanding our verification coverage to include all post-quantum algorithms in HPCrypt, ensuring that the cryptographic foundations of tomorrow's systems are built on solid mathematical ground.