Good design is understandable.
Most STARK tutorials are written by cryptographers for cryptographers. Dense notation. Assumed knowledge. Hand-waving at the hard parts.
This is written by a builder for builders.
If you can write code, you can understand zero-knowledge proofs. The mathematics is just careful bookkeeping. The cryptography is just well-chosen primitives. The complexity is just layers of simple ideas.
This article teaches you to build a STARK proof system from scratch.
Not production-grade. Not optimized. Just functional—enough to prove you understand.
What we are building
A STARK (Scalable Transparent Argument of Knowledge) proves computational correctness with three properties:
- Completeness: If the computation is correct, the verifier accepts
- Soundness: If the computation is incorrect, the verifier rejects (with high probability)
- Zero-knowledge: The proof reveals nothing except correctness
We will build a toy STARK that proves: "I know a number x such that x² = y (mod p)" without revealing x.
Simple problem. Complete proof system. All the core ideas.
Part 1: Finite Fields
Zero-knowledge proofs live in finite fields—arithmetic modulo a prime.
Why? Polynomials over finite fields have unique properties we need:
- Division always works (except by zero)
- Schwartz-Zippel lemma (testing polynomial equality)
- Efficient FFT algorithms (fast polynomial operations)
Basic implementation:
```typescript
class FieldElement {
  constructor(
    public value: bigint,
    public modulus: bigint
  ) {
    // Normalize into [0, modulus), handling negative inputs
    this.value = ((value % modulus) + modulus) % modulus;
  }

  add(other: FieldElement): FieldElement {
    return new FieldElement(this.value + other.value, this.modulus);
  }

  multiply(other: FieldElement): FieldElement {
    return new FieldElement(this.value * other.value, this.modulus);
  }

  // Extended Euclidean algorithm for modular inverse
  inverse(): FieldElement {
    let [old_r, r] = [this.value, this.modulus];
    let [old_s, s] = [1n, 0n];
    while (r !== 0n) {
      const quotient = old_r / r;
      [old_r, r] = [r, old_r - quotient * r];
      [old_s, s] = [s, old_s - quotient * s];
    }
    return new FieldElement(old_s, this.modulus);
  }

  // Square-and-multiply exponentiation
  pow(exponent: bigint): FieldElement {
    let result = new FieldElement(1n, this.modulus);
    let base: FieldElement = this;
    let exp = exponent;
    while (exp > 0n) {
      if (exp % 2n === 1n) {
        result = result.multiply(base);
      }
      base = base.multiply(base);
      exp = exp / 2n;
    }
    return result;
  }
}
```
Example usage:
```typescript
// Work in the field mod 97 (a small prime)
const p = 97n;
const a = new FieldElement(5n, p);
const b = new FieldElement(10n, p);

console.log(a.add(b).value);      // 15n
console.log(a.multiply(b).value); // 50n
console.log(a.inverse().value);   // 39n (because 5 * 39 = 195 ≡ 1 mod 97)
```
Why this matters: All STARK arithmetic happens in finite fields. Master this, master the foundation.
Part 2: Polynomials
Polynomials are the language of STARKs. We represent computation as polynomials, commit to polynomials, and verify polynomial properties.
Key insight: A polynomial of degree d is uniquely defined by d+1 points.
Lagrange interpolation finds that polynomial:
```typescript
class Polynomial {
  constructor(public coefficients: FieldElement[]) {}

  // Evaluate polynomial at point x
  evaluate(x: FieldElement): FieldElement {
    let result = new FieldElement(0n, x.modulus);
    let power = new FieldElement(1n, x.modulus);
    for (const coef of this.coefficients) {
      result = result.add(coef.multiply(power));
      power = power.multiply(x);
    }
    return result;
  }

  // Lagrange interpolation: find the unique polynomial through the points
  static interpolate(
    points: Array<[FieldElement, FieldElement]>
  ): Polynomial {
    const n = points.length;
    const modulus = points[0][0].modulus;
    const coefficients: FieldElement[] = [];
    for (let i = 0; i < n; i++) {
      coefficients[i] = new FieldElement(0n, modulus);
    }
    // Build Lagrange basis polynomials
    for (let i = 0; i < n; i++) {
      const [x_i, y_i] = points[i];
      let basis = [new FieldElement(1n, modulus)];
      for (let j = 0; j < n; j++) {
        if (i === j) continue;
        const [x_j] = points[j];
        // Multiply basis by (x - x_j) / (x_i - x_j)
        const denominator = x_i.add(x_j.multiply(
          new FieldElement(-1n, modulus)
        )).inverse();
        const newBasis: FieldElement[] = [];
        for (let k = 0; k <= basis.length; k++) {
          newBasis[k] = new FieldElement(0n, modulus);
        }
        for (let k = 0; k < basis.length; k++) {
          // Coefficient of x^k
          newBasis[k] = newBasis[k].add(
            basis[k].multiply(x_j).multiply(denominator).multiply(
              new FieldElement(-1n, modulus)
            )
          );
          // Coefficient of x^(k+1)
          newBasis[k + 1] = newBasis[k + 1].add(
            basis[k].multiply(denominator)
          );
        }
        basis = newBasis;
      }
      // Add y_i * basis to the running coefficients
      for (let k = 0; k < basis.length; k++) {
        coefficients[k] = coefficients[k].add(y_i.multiply(basis[k]));
      }
    }
    return new Polynomial(coefficients);
  }
}
```
Example:
```typescript
const p = 97n;
const points: Array<[FieldElement, FieldElement]> = [
  [new FieldElement(1n, p), new FieldElement(2n, p)],  // (1, 2)
  [new FieldElement(2n, p), new FieldElement(5n, p)],  // (2, 5)
  [new FieldElement(3n, p), new FieldElement(10n, p)], // (3, 10)
];

const poly = Polynomial.interpolate(points);

// Verify: polynomial passes through all points
console.log(poly.evaluate(new FieldElement(1n, p)).value); // 2n
console.log(poly.evaluate(new FieldElement(2n, p)).value); // 5n
console.log(poly.evaluate(new FieldElement(3n, p)).value); // 10n
```
Why this matters: We encode computation as polynomial constraints. "Correct computation" = "polynomial has specific values at specific points."
Part 3: Merkle Trees and Commitments
To make proofs succinct, we cannot send entire polynomials. We send commitments (cryptographic hashes) and respond to queries (random spot-checks).
Merkle trees let us commit to data and later prove specific values:
```typescript
import { createHash } from 'crypto';

class MerkleTree {
  root: string;
  private tree: string[][];

  constructor(leaves: string[]) {
    this.tree = [leaves.map(leaf => this.hash(leaf))];
    // Build tree bottom-up
    while (this.tree[this.tree.length - 1].length > 1) {
      const currentLevel = this.tree[this.tree.length - 1];
      const nextLevel: string[] = [];
      for (let i = 0; i < currentLevel.length; i += 2) {
        const left = currentLevel[i];
        // Odd-sized level: pair the last node with itself
        const right = i + 1 < currentLevel.length
          ? currentLevel[i + 1]
          : left;
        nextLevel.push(this.hash(left + right));
      }
      this.tree.push(nextLevel);
    }
    this.root = this.tree[this.tree.length - 1][0];
  }

  private hash(data: string): string {
    return createHash('sha256').update(data).digest('hex');
  }

  // Generate proof for leaf at index
  getProof(index: number): string[] {
    const proof: string[] = [];
    let currentIndex = index;
    for (let level = 0; level < this.tree.length - 1; level++) {
      const levelSize = this.tree[level].length;
      const siblingIndex = currentIndex % 2 === 0
        ? currentIndex + 1
        : currentIndex - 1;
      // If the sibling is missing (odd-sized level), the node was
      // paired with itself during construction, so use its own hash
      proof.push(
        siblingIndex < levelSize
          ? this.tree[level][siblingIndex]
          : this.tree[level][currentIndex]
      );
      currentIndex = Math.floor(currentIndex / 2);
    }
    return proof;
  }

  // Verify a leaf with its proof
  static verify(
    root: string,
    leaf: string,
    index: number,
    proof: string[]
  ): boolean {
    let hash = createHash('sha256').update(leaf).digest('hex');
    let currentIndex = index;
    for (const sibling of proof) {
      const left = currentIndex % 2 === 0 ? hash : sibling;
      const right = currentIndex % 2 === 0 ? sibling : hash;
      hash = createHash('sha256').update(left + right).digest('hex');
      currentIndex = Math.floor(currentIndex / 2);
    }
    return hash === root;
  }
}
```
Example:
```typescript
const leaves = ['data0', 'data1', 'data2', 'data3'];
const tree = new MerkleTree(leaves);
console.log('Root:', tree.root);

const proof = tree.getProof(2); // Prove 'data2' is in the tree
const verified = MerkleTree.verify(tree.root, 'data2', 2, proof);
console.log('Verified:', verified); // true
```
Why this matters: In STARKs, we commit to polynomial evaluations via Merkle roots. Verifier queries random positions. We prove those values without revealing the entire polynomial.
Succinctness: Tree with 1 million leaves → proof size ~20 hashes (log₂(1M) ≈ 20).
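That logarithmic growth is easy to sanity-check. The `proofLength` helper below is a hypothetical illustration (not part of the MerkleTree class above) that just rounds up log₂ of the leaf count:

```typescript
// Merkle proof length grows with tree depth, i.e. log2 of the leaf count.
// Hypothetical helper for illustration; assumes a full binary tree.
function proofLength(leafCount: number): number {
  return Math.ceil(Math.log2(leafCount));
}

console.log(proofLength(1_000_000)); // 20 sibling hashes
console.log(proofLength(1024));      // 10 sibling hashes
```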
Part 4: Constraint Systems (Algebraic Intermediate Representation)
A constraint system encodes computation as polynomial equations.
Example: Fibonacci sequence
The Fibonacci sequence satisfies: f(n+2) = f(n+1) + f(n)
As a constraint system:
- Execution trace: Table of computation steps
- Boundary constraints: Initial/final values
- Transition constraints: Step-to-step rules
Fibonacci trace:
| Step | Value |
|------|-------|
| 0    | 1     |
| 1    | 1     |
| 2    | 2     |
| 3    | 3     |
| 4    | 5     |
| 5    | 8     |
Constraints:
- Boundary: trace[0] = 1, trace[1] = 1
- Transition: For all i: trace[i+2] - trace[i+1] - trace[i] = 0
Key insight: If we encode the trace as a polynomial T(x), the constraints become polynomial equations:
- T(0) = 1
- T(1) = 1
- T(x+2) - T(x+1) - T(x) = 0 for specific x values
Why this matters: "Correct computation" becomes "polynomial satisfies constraints." Proving computation = proving polynomial properties.
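The constraints above can be checked directly in code. This sketch verifies the Fibonacci boundary and transition constraints row by row over a small prime field; a real STARK checks the equivalent polynomial identities instead, and the names here are illustrative:

```typescript
// Check Fibonacci boundary and transition constraints on a trace (mod p).
// Sketch only: a real STARK checks these as polynomial identities, not per-row.
const p = 97n;
const trace = [1n, 1n, 2n, 3n, 5n, 8n];

function checkBoundary(t: bigint[]): boolean {
  return t[0] === 1n && t[1] === 1n;
}

function checkTransitions(t: bigint[]): boolean {
  for (let i = 0; i + 2 < t.length; i++) {
    // trace[i+2] - trace[i+1] - trace[i] must be 0 (mod p)
    const diff = (((t[i + 2] - t[i + 1] - t[i]) % p) + p) % p;
    if (diff !== 0n) return false;
  }
  return true;
}

console.log(checkBoundary(trace) && checkTransitions(trace)); // true
console.log(checkTransitions([1n, 1n, 2n, 4n])); // false (4 ≠ 1 + 2)
```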
Part 5: The FRI Protocol (Low-Degree Testing)
Problem: Prover commits to a polynomial. How does Verifier know it's actually a polynomial (not random noise) and has low degree?
Solution: FRI (Fast Reed-Solomon Interactive Oracle Proof)
Core idea:
- If P(x) is a polynomial of degree < d, then it satisfies specific algebraic properties
- We can "fold" the polynomial repeatedly, reducing its degree
- After log(d) folds, we reach a constant polynomial (degree 0)
- Verifier spot-checks this process with random queries
FRI folding step:
Given polynomial P(x) of degree < d:
P(x) = P_even(x²) + x · P_odd(x²)
Where:
- P_even contains the even-degree terms
- P_odd contains the odd-degree terms
Random challenge: Verifier sends random α
Folded polynomial: P_folded(y) = P_even(y) + α · P_odd(y)
Degree of P_folded < d/2
Repeat: Fold again, and again, until degree = 0 (constant polynomial)
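The folding step above can be sketched in coefficient form. Splitting the coefficient array into even and odd positions and combining them with the challenge α halves the degree; the 97-element field and the function names below are illustrative:

```typescript
// One FRI folding step in coefficient form (sketch).
// P(x) = P_even(x²) + x · P_odd(x²)  →  P_folded(y) = P_even(y) + α · P_odd(y)
const MOD = 97n;
const mod = (v: bigint) => ((v % MOD) + MOD) % MOD;

function foldCoefficients(coeffs: bigint[], alpha: bigint): bigint[] {
  const folded: bigint[] = [];
  for (let i = 0; i < coeffs.length; i += 2) {
    const even = coeffs[i];                                 // coefficient of an even power
    const odd = i + 1 < coeffs.length ? coeffs[i + 1] : 0n; // the next (odd) power
    folded.push(mod(even + alpha * odd));
  }
  return folded;
}

// Degree-3 polynomial 3 + 5x + 2x² + 7x³ folds to degree 1:
console.log(foldCoefficients([3n, 5n, 2n, 7n], 4n)); // [23n, 30n]
```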
Verification: Verifier queries random points, checks:
- Merkle proofs for each layer
- Folding consistency: P(x) = P_even(x²) + x · P_odd(x²)
- Final layer is actually constant
Soundness: If the prover cheats (commits to a non-polynomial or one of too-high degree), each random query catches the inconsistency with high probability. Success probability ≈ (d/domain_size)^(query_count).
Why this matters: FRI is the "magic" that makes STARKs scalable. Verification cost is O(log d), not O(d).
Part 6: Putting It Together (Toy STARK)
Problem: Prove "I know x such that x² = y (mod p)" without revealing x.
Step 1: Define computation trace
| Step | Register |
|------|----------|
| 0    | x        |
| 1    | x²       |
Step 2: Define constraints
- Boundary: trace[1] = y (public value)
- Transition: trace[1] = trace[0]²
Step 3: Encode as polynomials
Interpolate polynomial T(x) through trace values:
- T(0) = x (secret)
- T(1) = x² (must equal y)
Step 4: Boundary constraint polynomial
B(x) = T(x) - y
Must equal 0 at x = 1 if the computation is correct.
Step 5: Transition constraint polynomial
C(x) = T(x+1) - T(x)²
Must equal 0 at x = 0 (forcing T(1) = T(0)²).
Step 6: Composition polynomial
Combine constraints: H(x) = B(x) / Z_boundary(x) + C(x) / Z_transition(x)
Where Z are "vanishing polynomials" (zero at constraint points).
If computation is correct, H(x) is a low-degree polynomial.
Step 7: Commit and prove
- Prover evaluates H(x) on a large domain
- Commits via Merkle tree
- Uses FRI to prove H is low-degree
- Verifier queries random positions
Step 8: Verification
- Check Merkle proofs for queried values
- Check constraint equations hold at those points
- Check FRI proof (low-degree test)
If all checks pass: computation is correct with high probability.
Zero-knowledge addition: Add random "blinding" polynomial to hide x. Details omitted for clarity.
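Steps 1-5 can be checked directly, in the clear. This sketch builds the two-row trace and tests both constraints; the real protocol hides x behind commitments and FRI, so this is only the underlying arithmetic, with illustrative names:

```typescript
// Direct check of steps 1-5 for the toy statement (sketch, in the clear).
// The real protocol never reveals x; here we just verify the constraint logic.
const P = 97n;
const norm = (v: bigint) => ((v % P) + P) % P;

// Step 1: the two-row trace [x, x²]
function buildTrace(x: bigint): bigint[] {
  return [norm(x), norm(x * x)];
}

// Step 2: boundary (trace[1] = y) and transition (trace[1] = trace[0]²)
function constraintsHold(trace: bigint[], y: bigint): boolean {
  const boundary = trace[1] === norm(y);
  const transition = trace[1] === norm(trace[0] * trace[0]);
  return boundary && transition;
}

const secretX = 13n;
const publicY = norm(secretX * secretX); // 169 mod 97 = 72
console.log(constraintsHold(buildTrace(secretX), publicY)); // true
console.log(constraintsHold(buildTrace(5n), publicY));      // false
```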
Part 7: Security Parameters
How to choose parameters:
1. Field size (p)
- Larger = more secure
- Typical: a 252- or 256-bit prime (some newer systems use small fields with extension towers)
- Minimum: 128 bits for production
2. Domain size (d)
- Must be a power of 2 (for FFT)
- At least 4× the constraint degree (blow-up factor)
- Typical: 2^16 to 2^20 for moderate computations
3. Query count (q)
- More queries = higher security, but larger proofs
- Typical: 10-12 queries for 80-bit security (combined with grinding)
4. FRI folding factor
- How much the degree drops each round
- Halving the degree each round (factor 2) is typical
- Larger factors mean fewer rounds but larger per-round openings
Security estimation (simplified for illustration):
Real-world systems achieve 80-bit security through:
- 10-12 FRI queries (~50 bits of security)
- Proof-of-work grinding (2^30 difficulty, ~30 bits)
- Combined: ~80 bits total
Note: This is a simplification. Actual security analysis involves the Johnson bound for list-decoding Reed-Solomon codes and relies on conjectures about hash function collision resistance. Production systems require rigorous cryptographic analysis.
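The simplified model can be written down as a rough estimator. The formula and the function name below are illustrations only; real analyses use the Johnson bound, not this arithmetic:

```typescript
// Rough security estimate matching the simplified model above.
// Each FRI query contributes about log2(blow-up factor) bits; grinding adds its bits.
// Illustration only; real analyses use the Johnson bound, not this formula.
function estimateSecurityBits(
  queries: number,
  blowupFactor: number, // domain size / trace length, e.g. 4 or 16
  grindingBits: number
): number {
  const bitsPerQuery = Math.log2(blowupFactor);
  return queries * bitsPerQuery + grindingBits;
}

console.log(estimateSecurityBits(12, 16, 30)); // 78
```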
Tradeoffs:
- More queries → higher security, larger proofs
- Larger domain → better soundness, slower proving
- Larger field → more security, slower arithmetic
Production choice: Balance security target (80-128 bits) with performance constraints.
Part 8: Common Pitfalls
1. Grinding attacks
Problem: Prover tries many random nonces until finding one that makes fake proof pass.
Solution: Proof-of-work requirement. Force prover to find nonce where hash(proof) < target. If difficulty is 2^k, grinding attack becomes 2^k times harder.
Example: Production systems typically use a grinding difficulty of 10-15 bits.
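A grinding requirement is just a small proof-of-work. This sketch uses Node's crypto module with a tiny difficulty so it runs instantly; real systems use far more bits, and the seed string is illustrative:

```typescript
import { createHash } from 'crypto';

// Toy proof-of-work: find a nonce whose SHA-256 digest starts with `bits` zero bits.
// A difficulty of k bits makes a grinding attack ~2^k times harder on average.
function grind(seed: string, bits: number): number {
  for (let nonce = 0; ; nonce++) {
    const digest = createHash('sha256').update(seed + nonce).digest();
    // Count leading zero bits of the digest
    let leading = 0;
    for (const byte of digest) {
      if (byte === 0) { leading += 8; continue; }
      leading += Math.clz32(byte) - 24; // clz32 counts from bit 31 of a 32-bit int
      break;
    }
    if (leading >= bits) return nonce;
  }
}

const nonce = grind('proof-commitments', 8);
console.log('Found nonce:', nonce); // ~256 attempts on average for 8 bits
```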
2. Under-constrained systems
Problem: Constraints don't fully specify computation. Multiple valid solutions exist.
Example: Proving "x² = 4" without constraining sign. Both x=2 and x=-2 satisfy constraint.
Solution: Add boundary constraints for all degrees of freedom. Test with known inputs.
3. Soundness errors
Problem: Incorrectly calculating soundness. System is less secure than claimed.
Solution: Use established formulas (Johnson bound for list-decoding). Verify with security audit.
4. FFT domain mismatch
Problem: Domain size not a power of 2, or doesn't contain needed roots of unity.
Solution: Always use d = 2^k. Verify field has primitive d-th root of unity.
5. Fiat-Shamir security
Problem: Converting interactive proof to non-interactive via hashing. Hash function collision could break soundness.
Solution: Use cryptographic hash (SHA-256, Keccak). Hash all public data and prior commitments.
Part 9: Performance Considerations
Prover time: O(d log d) where d is domain size
- Dominated by FFT operations
- Parallelizable (split domain across cores)
- Scales quasi-linearly with computation size
Verifier time: O(log² d + q log d)
- q queries, each with a log d Merkle proof
- FRI verification: log d rounds
- Independent of computation size (after commitment)
Proof size: O(q log d)
- q query responses
- Each with log d Merkle siblings
- FRI commitments: log d hashes
Memory: O(d)
- Store polynomial evaluations
- Merkle tree leaves
- Can stream to disk for large d
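Those asymptotics give a back-of-envelope size estimate. The constants below (32-byte hashes, one commitment per FRI round) are assumptions for illustration; real proofs also include openings for every FRI layer, which is why measured sizes run much larger:

```typescript
// Back-of-envelope proof size from the O(q log d) asymptotics.
// Assumes 32-byte hashes; deliberately ignores per-layer FRI openings.
function estimateProofBytes(queries: number, domainSize: number): number {
  const hashBytes = 32;
  const depth = Math.ceil(Math.log2(domainSize));   // Merkle path length
  const merkleBytes = queries * depth * hashBytes;  // query openings
  const friBytes = depth * hashBytes;               // one commitment per FRI round
  return merkleBytes + friBytes;
}

// 12 queries over a 2^20 domain:
console.log(estimateProofBytes(12, 2 ** 20)); // 8320 bytes
```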
Optimization strategies:
- Choose minimal domain size: d = 4 × max_constraint_degree
- Parallelize FFT: Split across CPU cores or GPU
- Batch commitments: One Merkle tree for all polynomials
- Optimize field arithmetic: Use native 64-bit arithmetic when possible
- Minimize constraint degree: Higher degree = larger domain = slower proving
Real-world performance example:
- 100 transactions batched
- Proof generation: ~30-35 seconds (consumer hardware)
- Verification: ~60,000 gas (on-chain)
- Security: ~80 bits
- Proof size: ~100 KB
This represents achievable performance for custom STARK implementations focused on transaction batching.
You now understand how STARKs work:
- Finite fields provide the arithmetic
- Polynomials encode computation
- Merkle trees enable succinct commitments
- Constraint systems define correctness
- FRI proves low degree (the magic)
- Security parameters balance performance and soundness
- Careful engineering avoids pitfalls
The rest is implementation—FFT optimizations, constraint design, parameter tuning.
What separates theory from production:
- Performance (making it fast enough to matter)
- Security (auditing every assumption)
- Integration (making it useful for real applications)
But the core ideas are simple. Just careful composition of simple primitives.
If you can build this toy example, you can understand any STARK system.
The math is public. The blockchain does not lie.
Start building.
Sources & Further Reading
Foundational Papers:
- Ben-Sasson et al. - Scalable, transparent, and post-quantum secure computational integrity
- Ulrich Haböck - Improved Soundness Analysis of the FRI Protocol
Educational Tutorials:
- Alan Szepieniec - Anatomy of a STARK
- RISC Zero - STARK by Hand
- Three Sigma - Arithmetization: STARKs and AIR
- LambdaClass - How to Code FRI from Scratch
StarkWare Resources:
- StarkWare - Arithmetization I
- StarkWare - Low Degree Testing
- StarkWare - Safe and Sound: A Deep Dive into STARK Security
Academic Background:
- Bhaskar - Understanding Zero Knowledge Proofs Part 4: Polynomial Commitments
- Berkeley - ZKP Course Lecture 8
All code examples are simplified for educational purposes. Production implementations require additional optimizations and security considerations.