Cryptography and blockchain

My projects.

Built a full Nakamoto blockchain. [github]
Built a Rust implementation of Tendermint. [github]
Built a Go implementation of Pod. [github]
Worked on prediction markets for content curation with the Ethereum Foundation.
Launched Dappnet to decentralize the DeFi frontend.
Launched Mergeswap with some friends, storage proof bridge of ETHW to Eth mainnet, $100k volume.
Reverse-engineered StarkNet traces, contributed to first open-source ZK prover. (pullreq)
Papers (self-published):
- Verifiable bandwidth sampling in P2P networks
- Byzantine Google File System using Tendermint Consensus
Articles:

Smart, interesting people to follow.

Starkware
LambdaClass / Fede
Paradigm / Gakonst
Vitalik
Dan Boneh
Remco Bloemen
Alex Obadia
Michael Saylor - see his bitcoin thesis.
Arthur Hayes.

Blockchains.

Blockchains are byzantine fault tolerant (BFT) replicated state machines.

They usually consist of:

state, whether its a mapping of accounts to balances, or more complex things like positions in lending protocols
a virtual machine (VM) paired with an execution environment, that runs smart contracts (programs which modify state and have built-in API’s)
users creating transactions, which are cryptographically signed intents to execute functions on smart contracts on the VM.
a cryptocurrency for paying for the fees of running the network
a resource metering system like gas, which measures the cost of one transaction.
a consensus protocol which decides on the order of transactions applied, which is usually byzantine-fault-tolerant (BFT)
a regular interval at which transactions are applied, termed a block
the space inside one block, termed blockspace, where transactions can be filled. This space is paid for via fees. And fees are priced according to an automated market maker for gas.
a set of incentives (block rewards) for nodes to (1) store the data (state and history), (2) execute transactions and (3) continue providing blockspace via consensus.

A blockchain’s state can be derived by getting every transaction in an ordered sequence $T$, and applying the state transition function serially.

Concepts:

Blockspace.
Gas markets.
- Mempools.
- Progressive gas auction (PGA) (first-price auction).
- Congestion pricing.
  - Vitalik’s Position paper on resource pricing
  - EIP1559.
- Multidimensional resource pricing
Data availability.
- L1 as a DA layer. Blobspace.
- Data Availability Sampling: From Basics to Open Problems
- Celestia (paper)
- EigenDA.
- Data availability committees.
Blockchain state models:
- UXTO - Bitcoin.
- Account-based - Ethereum, Solana.
- Private UXTO (privacy systems - Aztec, Zcash).
- Comparison with legal systems:
  - UXTO: deeds
  - Account-based: Torrens titles
State growth and history.
- State is the up-to-date balances of the blockchain, history is the full set of all blocks and transactions.
- History is necessary for verification in replicated state machines (but not ZK), as well as historical data analysis.
- Articles:
  - How to Raise the Gas Limit, Part 1: State Growth
  - How to Raise the Gas Limit, Part 2: History Growth
Execution.
- EVM.
- Cosmos.
- SVM.
- zkVM.
Settlement.
Sequencing.
- Sequencing refers to the algorithm to derive the transaction sequence. In Bitcoin and Ethereum, this is just the consensus algorithm.
- In other cases (L2’s), sequencing can be centralized.
Merklelization.
- Bitcoin and Ethereum make use of merkle trees to commit state, which makes it possible to proof individual state leaves in places outside of the chain (e.g. storage proofs).
Models for integrity:
- Replicated state machine. $O(N)$ where $N$ is number of transactions
- ZK proof. $O(log(N))$ to $O(1)$, depending on scheme.
Scaling.
- Optimistic rollups.
  - Interactive binary search for fraudulent opcode.
- ZK rollups.
  - Use L1 as the DA layer for L2 transactions, post proof and state diff.
- Based rollups.
  - Using Ethereum L1 as a base layer, and enshrining a Lisp-style eval opcode.
Security.
- Proof-of-work (PoW).
  - Objectivity w/ checkpointing in early days.
- Proof-of-stake (PoS).
  - Weak subjectivity and this.
  - Restaking.
Tokenomics.
- Coinbase / block reward mechanism.
- Issuance curve.
- Deflationary vs. inflationary.
Bridging / interoperability.
- Vitalik’s R3 paper on chain interop
- Hash-lock time contracts (HLTC’s).
MEV

Bitcoin.

Bitcoin whitepaper
Readings:
Tech:
- Uxtreeo
- Bitcoin Core PR review group
History:
- Bitcoin is worse is better - Gwern
Philosophy.
- Saylor’s Bitcoin thesis.

Ethereum / EVM.

Ethereum whitepaper
Ethereum yellowpaper (technical).
Eth 1.0 (POW)
- Greediest Heaviest Observed Sub-Tree (GHOST)
Eth 2.0 (POS)
- Casper finality gadget
- Architecture
  - Consensus layer
  - Execution layer
  - PeerDAS
Readings
- Erigon state sync

Cosmos.

Cosmos chains are based on CometBFT (aka. Tendermint).
Execution.
- There is no execution environment - smart contracts are simply written in Go, and call out to standard Cosmos API’s for storage.
- There is no gas metering - Cosmos chains define their own fees for each smart contract function.
- Bridging is standard, thanks to the IBC protocol.

Solana.

Solana whitepaper
Proof of history.
- Validators run hashes recursively, that is, H(0 ++ t0), H(H(t0) ++ t1), H(H(H(t0) ++ t1) ++ t2).
- Each hash’s input is the last hash concatenated with current state.
- Transactions are inserted into t as they are received.
- The output of H functions as a logical clock, that defines a forward transaction sequence.

Rogue.

Blockchain-specific devices.

Smart contracts.
- The execution model of blockchains. Small pieces of code which have their own persistent state, can emit events, can call other contracts (subject to call depth limitations) and can receive tokens just like any other externally owned account (operated by users).
- First published by Nick Szabo.
Oracles.
- Data feeds from the external world, usually involving prices of assets (USD, BTC) or in prediction markets, adjudications of outcomes.
- Centralized:
  - ChainLink.
- Token-based.
  - Kleros.
  - SchellingCoin.
Bridges.
Keepers.
- Small agents which “keep” systems up to date.
- Examples: liquidation keepers.
Indexers.
- Although blockchains are databases, the indexing is not enshrined at the blockchain layer. Instead, external third-party indexers allow you to query the chain.

MEV.

MEV is known as Miner Extractible Value / Max Extractible Value.

The basic idea - you can manipulate the transaction sequence (include, uninclude, reorder) on a blockchain in order to make profit.

A model for Bitcoin’s security and the declining block subsidy
Flash Boys 2.0: Frontrunning, Transaction Reordering, and Consensus Instability in Decentralized Exchanges
A Formalization of Cross-Domain Maximal Extractable Value
Builder-Proposer separation, relayers, bundles.
Sandwich attacks, salmonella, sybolic re-execution, multi-block MEV, MEV unbundling attacks.
Flashbots research collective
MEV and me
Time, slots, and the ordering of events in Ethereum Proof-of-Stake
Priority is All You Need
Concurrent Block Proposers
Leaderless Auctions

Consensus protocols.

Nakamoto (paper)
- Basis of Bitcoin
- Proof-of-work = difficulty readjustment + hashcash.
- Stateless SPV proofs
Ethereum 1.0 (POW).
- Greediest Heaviest Observed Sub-Tree (GHOST)
Ethereum 2.0 (POS)
- Casper finality gadget
- How does proof of stake fit into traditional Byzantine fault tolerance research?
Tendermint (paper)
- How does Tendermint work?
- Rust implementation
Malachite (Tendermint v2)
- Multi-Proposer Construction on top of Tendermint
Pod (paper)
- Introduction from LambdaClass
- Go implementation
DAG-based mempools / consensus.
- Insight: you can separate transaction ordering from transaction dissemination.
- Narwhal and Tusk: A DAG-based Mempool and Efficient BFT Consensus
- Cosmos without Tendermint: Exploring Narwhal and Bullshark

Markets

Markets are where buyers and sellers transact. Markets are usually priced in pairs, e.g. BTC/USD refers to a market with BTC and USD in it. Market makers are organisations that keep the price steady, or otherwise facilitate liquidity such that people can buy and sell when they want to. Market manipulation refers to when people commit short-term trades in order to move the price, and frighten other buyers into selling (and some more complex strategies).

Markets are implemented in a few ways on-chain:

orderbooks (e.g. CLOB) + matching engines (compute mid-point, fill orders).
automated market makers

Automated market makers.

Constant product $xy=k$ (CPPM)
- Uniswap v1
- Formal model
Stablecoin-efficient curves.
- StableSwap (used in Curve)
- Uniswap v3
Synthetix simulated liquidity AMM
Research
- Improved Price Oracles: Constant Function Market Makers

Perpetual futures.

Crypto is the first place perpetual futures were widely implemented (on BitMEX). For a good engineering rundown, I suggest reading the Synthetix futures protocol proposal, which details the mathematics.

Prediction markets.

Prediction markets are markets where you bet on the outcome of an event.

For example, “president of USA in 2025” is the event, and “Donald J Trump”, “Kamala Harris” etc. are the outcomes. Each outcome is tokenised, and sold via a market maker. For example, “Donald J Trump” might trade at 60c. The price reflects the probability of the outcome - meaning 60c is a 60% odds. All outcomes prices sum to $1, following the law that all probabilities for mutually exclusive events must sum to 1.

The market has defined resolution critiera, whereby the outcome is defined according to some data source plus additional conditions. Participants can buy/sell outcome tokens at any time. In Polymarket, they use a CLOB to fill trades. Prices reflect supply/demand tension and are filled at midpoint.

Types:
- Discrete.
  - Conditional tokens (Gnosis)
  - Basis of Polymarket (which uses a centralized CLOB matcher ?).
- Continuous.
  - https://www.paradigm.xyz/2024/12/distribution-markets

Stablecoins.

Algorithmic, backed by ETH, priced in USD.
- DAI v1.
- RAI.
- GHO.
Algorithmic, backed by BTC, priced in BTC.
- NakaDollar, which is the basis of Ethena.
  - Backed by 1x short 0.5 BTC + 0.5 BTC
Centralized.
- Centralized stablecoins are minted by centralized issuers like Tether and Circle. Their business model is that of a bank - they take customer deposits and generally-speaking buy treasury bills for risk-free yield. Ironically we have seen a bank run on DAI v2 (due to being collateralised by USDC) but not on Tether - their proprietary allocation has been defensive against rumours, whereas MakerDAO’s is open to the public to see.
- USDC (Circle USDC).
- USDT (Tether).

Privacy protocols.

Protocols:
- Zcash
- Aztec
- Tornado Cash
- Mimblewimble / Grin
Articles:
- How do privacy pools like Tornado Cash work?

Zero Knowledge Proofs (ZK).

ZK allows you to create proofs of computation that you can verify faster than you can run them. For a relation $F(x)=y$, you can run $F(x)$ and generate a proof $P$. The proof $P$ can be verified, $\mathrm{verify}(P)$, which shows $\mathrm{verify}(P)=F(x)$. Interestingly, while $F(x)$ is $O(N)$ to evaluate, $\mathrm{verify}(P)$ is $O(log(N))$ or even $O(1)$ in some schemes. $^{1}$

These are referred to as succinct arguments of knowledge. The ZK stands for zero knowledge. Schemes that employ the ZK aspect mean that your input to $F(x)$, $x$, remains encrypted, such that anyone reading your proof $P$ cannot know what $x$ is. This means you could for example, prove that your age is over 18, without revealing your ID. But most times you see ZK proofs, it’s only referring to the succinct part.

Not only that, but $F(x)$ can of course implement $\mathrm{verify(x)}$, whereby the function we are proving is in itself verifying proofs. This is referred to as recursive proofing, and sometimes associated with proof aggregation.

Using ZK proofs, you can compress the entire bitcoin verification work down to a single 20kB proof. The big-O notation can refer to time and space - some schemes are fast to prove, but generate larger proofs (STARK’s). Some schemes generate very small proofs (groth16) that are only 3 field multiplications to verify. You can even mix and match schemes using recursive proofing to get the benefits of both.

$^{1}$ this explanation is simplified. You are in fact proving the relation, $F(x) = y$ and $\mathrm{verify}(P)$ technically only returns a 1 or a 0. In order to get a return value, typically we wrap the execution inside a bootloader of sorts, which shows what program we’re proving and takes as input the hash of the return data. Also - generating a proof is not running $F(x)$ directly, rather converting $F(x)$ to a form like an arithmetic circuit or an algebraic intermediate representation (AIR).

Articles.
Concepts.
- Prover-verifier distinction.
- Satisfiability.
- Commitment scheme.
- STARK’s.
  - Trace - history of CPU registers and operations.
  - Arithmetization - transform trace to set of linear equations.
  - Interpolation - transform trace to set of polynomial equations.
  - Composition polynomial.
  - Proving low-degreeness.
    - Reed-Solomun encoding.
    - Interactive oracle proofs of proximity.
- SNARK’s.
  - Groth16.
    - Arithmetic circuit.
      - Quadratic Arithmetic Program (QAP).
Foundational papers.
Machine Learning proofs.
- Zero-Knowledge Proofs of Training for Deep Neural Networks
Further reading:
- Remco Bloemen’s blog
- LambdaClass blog
Books.
- Proofs, arguments, and zero knowledge
Open-source.
- Reverse-engineered StarkNet traces, contributed to first open-source ZK prover. (pullreq)

Multi-party-computation (MPC) and FHE.

Multi-party computation is akin to a shared encrypted CPU, where you upload a program, run it with inputs, and get an output. It can be used for example to do data analysis without revealing data to third parties. Imagine a program designed to detect how old you are - you encrypt your genome, send it to a remote server, it does MPC, and you get back the result - without them ever seeing your data. That’s the magic.

Threshold ECDSA wallets.
SPDZ
- general multiparty computation protocol secure against an active adversary corrupting up to $n−1$ of the $n$ players
- compute securely arithmetic circuits over any finite field $\displaystyle \mathbb{F}_{p^k}$
- Paper 1, paper 2
Renegade encrypted DEX (whitepaper)
- Renegade constructs a non-custodial dark pool - a CLOB-style DEX where the amounts are encrypted.
- Uses:
  - MPC to perform order matching and transfers (SPDZ-style maliciously-secure MPC-with-abort).
  - coSNARK’s to generate proofs of the MPC between trading parties.
  - Blockchain to settle ZK proofs, which advance the state machine.
Fully Homomorphic Encryption (FHE)
- Tiny BFV HE implemented in Tinygrad

Cryptographic techniques.

New Directions in Cryptography - a seminal paper by Diffie and Helman.
Accumulators and commitment schemes.
- Merkle trees.
- Sparse merkle trees (dicts/maps) (paper)
- Merkle mountain ranges.
- BLS signatures.
  - BLS signatures can be aggregated, as in Ethereum 2.0 consensus, hence they qualify as accumulators for specifically signatures.
- Vector commitments
  - Overview
Commit-reveal.
- Used in SchellingCoin, and atomic swaps.
Bloom filters.
Skip ratchets.
- Used in Signal, Matrix.
- How does a skip ratchet work?
Probabilistic sampling:
- Consensus.
  - Bitcoin, Ethereum probabilistically elect a leader per-round using a seed.
    - In Bitcoin, this is based on hashpower, adjusted in realtime using difficulty.
    - In Ethereum, this is based on the RANDAO hash, which is formed from block proposers contributing randomness through BLS signatures.
  - Arweave.
    - Arweave uses probabilistic sampling to test if nodes are storing content. Each block, a set of data roots is randomly sampled as a challenge to storage providers that they are storing that content.
- ZK proofs.
  - STARK proofs are based on a verifier probabilistically sampling random points on the curve given by the prover, the curve being constructed from a computational trace, representing the program’s execution.
Recursion in cryptographic computation schemes:
- ZK proofs
  - SNARK’s can themselves implement verifiers for SNARK’s (recursive verification)
- FHE.
  - FHE circuits theoretically implement decryptors of their own encrypted state
ZK proofs.
- SNARK’s
  - Groth16
- STARK’s
  - Binius
  - Circle STARK’s
- Collaborative SNARK’s (coSNARK’s) (paper, presentation): construct a SNARK proof between $N$ mutually distrusting parties.
Multi-party computation.
- SPDZ (paper)
- Threshold signing schemes. Private keys controlled by $N$ parties, where $M$ ($M<N$) can construct a signature.
  - Threshold ECDSA with identifiable abort (paper).
    - The classic. Used in many crypto protocols.
  - FROST (paper).
Public key schemes.
- RSA.
- ECDSA.
  - Compressed public keys / recovery trick.
- BLS signatures
- Schnorr signatures
- Ring signatures
- Merkle puzzles
  - This is pre-RSA and very interesting.
- Merkle signature scheme / one-time signatures
Curves.
- secp256k1, ed25519
- pairing-friendly 2 cycles: mnt4, mnt6
- pairing-friendly curves: bn254
Distributed Hash Tables (DHT’s).
- Kademlia
Information retrieval
- Private Information Retrieval (PIR).
- Fuzzy Message Detection
  - Problem: in a private cryptocurrency like Zcash, how do you check your balance without revealing your address to the node?
  - Penumbra implementation
Puzzles
- Hashcash, POW puzzles
- Verifiable Delay Functions (VDF’s)
Esoteric inventions.
- Verifiable Encryption under Committed Key (VECK)
  - Enables fair data exchange - atomic swaps of data for $.
  - Blog post
- Rateless Invertible Bloom Filters
  - Paper
  - Video
- Short-lived zero-knowledge proofs and signatures
- Low Overhead Broadcast Encryption from Multilinear Maps
- Secret Sharing with Certified Deletion (quantum)
- Proof of Latency Using a Verifiable Delay Function
- Order-preserving encryption
Functional Encryption
- Short: Classic public key encryption + new ability to generate keys specific to functions that compute over the encrypted data and return public output.
- Example usage of private contact tracing: Phones encrypt users’ location logs. Health authorities hold functional keys that only reveal whether two users were co-located within a time window—without revealing their full location histories
Indistinguishability Obfuscation

Security / trust models.

Trusted Execution Environments (TEE’s)
Liveness
Market manipulation
- See iMKR Synthetix debacle
Censorship resistance
Capture resistance
Slashing.
- Ethereum - automatic.
- Solana - manual - social slashing.
Malicious M-of-N / byzantine faults
- Bitcoin: 51% attack
- Tendermint: $3F+1$
- Ethereum: $\frac{n}{3}$ accountable safety
Collusion
Technical
- DNS attacks
- BGP attacks
- Network Time Protocol attacks
Mechanism design
- Incentives Build Robustness in BitTorrent

Data availability.

Data availability schemes are ways to ensure block data (transactions) can be scaled. They basically redundantly extend data using Reed-Solomun, distribute shards of that data widely to full nodes, and light nodes sample these data shares in order to check it’s available. Reed-Solomun allows us to do $M$ of $N$ reconstruction of the data, meaning we don’t require all $N$ nodes.

P2P networks.

Distributed Hash Tables
- See distributed systems.
- Ethereum uses a Kademlia DHT.
Articles.
- Taxonomy: P2P vs. Federated vs. Cryptonetwork
- IPFS vs. BitTorrent

Reputation systems.

Projects.
- Evidence-Based Subjective Logic (cited by the EU)
Eigentrust / PageRank.
BitTorrent tit-for-tat (paper).

Product-market fit.

Article: Crypto-product market fit

Misc blog posts.

“Applied economics”, mechanism design
- The blockchain folk theorem - reasonable mathematics-heavy paper that talks about Bitcoin and proof-of-work in terms of game theory.
- https://blog.ethereum.org/2014/11/25/proof-stake-learned-love-weak-subjectivity/
- https://blog.ethereum.org/2015/01/23/superrationality-daos/
- https://blog.ethereum.org/2015/02/14/subjectivity-exploitability-tradeoff/
- https://blog.ethereum.org/2014/09/02/software-bounded-rationality/
- https://blog.ethereum.org/2015/01/28/p-epsilon-attack/
- https://blog.ethereum.org/2014/06/19/mining/
- https://vitalik.eth.limo/general/2019/04/03/collusion.html

Notes.

Notes on Zcash