‹ back Home/Cryptography and blockchain

Cryptography and blockchain

Ideas.

My projects.

Smart, interesting people to follow.

Blockchains.

Blockchains are byzantine fault tolerant (BFT) replicated state machines.

They usually consist of:

  • state, whether its a mapping of accounts to balances, or more complex things like positions in lending protocols
  • a virtual machine (VM) paired with an execution environment, that runs smart contracts (programs which modify state and have built-in API’s)
  • users creating transactions, which are cryptographically signed intents to execute functions on smart contracts on the VM.
  • a cryptocurrency for paying for the fees of running the network
  • a resource metering system like gas, which measures the cost of one transaction.
  • a consensus protocol which decides on the order of transactions applied, which is usually byzantine-fault-tolerant (BFT)
  • a regular interval at which transactions are applied, termed a block
  • the space inside one block, termed blockspace, where transactions can be filled. This space is paid for via fees. And fees are priced according to an automated market maker for gas.
  • a set of incentives (block rewards) for nodes to (1) store the data (state and history), (2) execute transactions and (3) continue providing blockspace via consensus.

A blockchain’s state can be derived by getting every transaction in an ordered sequence $T$, and applying the state transition function serially.

Concepts:

  • Blockspace.
  • Gas markets.
  • Data availability.
  • Blockchain state models:
    • UXTO - Bitcoin.
    • Account-based - Ethereum, Solana.
    • Private UXTO (privacy systems - Aztec, Zcash).
    • Comparison with legal systems:
  • State growth and history.
  • Execution.
    • EVM.
    • Cosmos.
    • SVM.
    • zkVM.
  • Settlement.
  • Sequencing.
    • Sequencing refers to the algorithm to derive the transaction sequence. In Bitcoin and Ethereum, this is just the consensus algorithm.
    • In other cases (L2’s), sequencing can be centralized.
  • Merklelization.
    • Bitcoin and Ethereum make use of merkle trees to commit state, which makes it possible to proof individual state leaves in places outside of the chain (e.g. storage proofs).
  • Models for integrity:
    • Replicated state machine. $O(N)$ where $N$ is number of transactions
    • ZK proof. $O(log(N))$ to $O(1)$, depending on scheme.
  • Scaling.
    • Optimistic rollups.
      • Interactive binary search for fraudulent opcode.
    • ZK rollups.
      • Use L1 as the DA layer for L2 transactions, post proof and state diff.
    • Based rollups.
      • Using Ethereum L1 as a base layer, and enshrining a Lisp-style eval opcode.
  • Security.
    • Proof-of-work (PoW).
      • Objectivity w/ checkpointing in early days.
    • Proof-of-stake (PoS).
  • Tokenomics.
    • Coinbase / block reward mechanism.
    • Issuance curve.
    • Deflationary vs. inflationary.
  • Bridging / interoperability.
  • MEV

Bitcoin.

Ethereum / EVM.

Cosmos.

  • Cosmos chains are based on CometBFT (aka. Tendermint).
  • Execution.
    • There is no execution environment - smart contracts are simply written in Go, and call out to standard Cosmos API’s for storage.
    • There is no gas metering - Cosmos chains define their own fees for each smart contract function.
    • Bridging is standard, thanks to the IBC protocol.

Solana.

  • Solana whitepaper
  • Proof of history.
    • Validators run hashes recursively, that is, H(0 ++ t0), H(H(t0) ++ t1), H(H(H(t0) ++ t1) ++ t2).
    • Each hash’s input is the last hash concatenated with current state.
    • Transactions are inserted into t as they are received.
    • The output of H functions as a logical clock, that defines a forward transaction sequence.

Rogue.

Blockchain-specific devices.

  • Smart contracts.
    • The execution model of blockchains. Small pieces of code which have their own persistent state, can emit events, can call other contracts (subject to call depth limitations) and can receive tokens just like any other externally owned account (operated by users).
    • First published by Nick Szabo.
  • Oracles.
    • Data feeds from the external world, usually involving prices of assets (USD, BTC) or in prediction markets, adjudications of outcomes.
    • Centralized:
      • ChainLink.
    • Token-based.
  • Bridges.
  • Keepers.
    • Small agents which “keep” systems up to date.
    • Examples: liquidation keepers.
  • Indexers.
    • Although blockchains are databases, the indexing is not enshrined at the blockchain layer. Instead, external third-party indexers allow you to query the chain.

MEV.

MEV is known as Miner Extractible Value / Max Extractible Value.

The basic idea - you can manipulate the transaction sequence (include, uninclude, reorder) on a blockchain in order to make profit.

Consensus protocols.

Markets

Markets are where buyers and sellers transact. Markets are usually priced in pairs, e.g. BTC/USD refers to a market with BTC and USD in it. Market makers are organisations that keep the price steady, or otherwise facilitate liquidity such that people can buy and sell when they want to. Market manipulation refers to when people commit short-term trades in order to move the price, and frighten other buyers into selling (and some more complex strategies).

Markets are implemented in a few ways on-chain:

  • orderbooks (e.g. CLOB) + matching engines (compute mid-point, fill orders).
  • automated market makers

Automated market makers.

Perpetual futures.

Crypto is the first place perpetual futures were widely implemented (on BitMEX). For a good engineering rundown, I suggest reading the Synthetix futures protocol proposal, which details the mathematics.

Prediction markets.

Prediction markets are markets where you bet on the outcome of an event.

For example, “president of USA in 2025” is the event, and “Donald J Trump”, “Kamala Harris” etc. are the outcomes. Each outcome is tokenised, and sold via a market maker. For example, “Donald J Trump” might trade at 60c. The price reflects the probability of the outcome - meaning 60c is a 60% odds. All outcomes prices sum to $1, following the law that all probabilities for mutually exclusive events must sum to 1.

The market has defined resolution critiera, whereby the outcome is defined according to some data source plus additional conditions. Participants can buy/sell outcome tokens at any time. In Polymarket, they use a CLOB to fill trades. Prices reflect supply/demand tension and are filled at midpoint.

Stablecoins.

  • Algorithmic, backed by ETH, priced in USD.
  • Algorithmic, backed by BTC, priced in BTC.
    • NakaDollar, which is the basis of Ethena.
      • Backed by 1x short 0.5 BTC + 0.5 BTC
  • Centralized.
    • Centralized stablecoins are minted by centralized issuers like Tether and Circle. Their business model is that of a bank - they take customer deposits and generally-speaking buy treasury bills for risk-free yield. Ironically we have seen a bank run on DAI v2 (due to being collateralised by USDC) but not on Tether - their proprietary allocation has been defensive against rumours, whereas MakerDAO’s is open to the public to see.
    • USDC (Circle USDC).
    • USDT (Tether).

Privacy protocols.

Cryptographic techniques.

  • New Directions in Cryptography - a seminal paper by Diffie and Helman.
  • Accumulators and commitment schemes.
  • Commit-reveal.
  • Bloom filters.
  • Skip ratchets.
  • Probabilistic sampling:
    • Consensus.
      • Bitcoin, Ethereum probabilistically elect a leader per-round using a seed.
        • In Bitcoin, this is based on hashpower, adjusted in realtime using difficulty.
        • In Ethereum, this is based on the RANDAO hash, which is formed from block proposers contributing randomness through BLS signatures.
      • Arweave.
        • Arweave uses probabilistic sampling to test if nodes are storing content. Each block, a set of data roots is randomly sampled as a challenge to storage providers that they are storing that content.
    • ZK proofs.
      • STARK proofs are based on a verifier probabilistically sampling random points on the curve given by the prover, the curve being constructed from a computational trace, representing the program’s execution.
  • Recursion in cryptographic computation schemes:
    • ZK proofs
      • SNARK’s can themselves implement verifiers for SNARK’s (recursive verification)
    • FHE.
      • FHE circuits theoretically implement decryptors of their own encrypted state
  • ZK proofs.
    • SNARK’s
      • Groth16
    • STARK’s
    • Collaborative SNARK’s (coSNARK’s) (, presentation): construct a SNARK proof between $N$ mutually distrusting parties.
  • Multi-party computation.
    • SPDZ ()
    • Threshold signing schemes. Private keys controlled by $N$ parties, where $M$ ($M<N$) can construct a signature.
      • Threshold ECDSA with identifiable abort ().
        • The classic. Used in many crypto protocols.
      • FROST ().
  • Public key schemes.
  • Curves.
    • secp256k1, ed25519
    • pairing-friendly 2 cycles: mnt4, mnt6
    • pairing-friendly curves: bn254
  • Distributed Hash Tables (DHT’s).
  • Information retrieval
  • Puzzles
    • Hashcash, POW puzzles
    • Verifiable Delay Functions (VDF’s)
  • Esoteric inventions.
  • Functional Encryption
    • Short: Classic public key encryption + new ability to generate keys specific to functions that compute over the encrypted data and return public output.
    • Example usage of private contact tracing: Phones encrypt users’ location logs. Health authorities hold functional keys that only reveal whether two users were co-located within a time window—without revealing their full location histories
  • Indistinguishability Obfuscation

Zero Knowledge Proofs (ZK).

ZK allows you to create proofs of computation that you can verify faster than you can run them. For a relation $F(x)=y$, you can run $F(x)$ and generate a proof $P$. The proof $P$ can be verified, $\mathrm{verify}(P)$, which shows $\mathrm{verify}(P)=F(x)$. Interestingly, while $F(x)$ is $O(N)$ to evaluate, $\mathrm{verify}(P)$ is $O(log(N))$ or even $O(1)$ in some schemes. $^{1}$

These are referred to as succinct arguments of knowledge. The ZK stands for zero knowledge. Schemes that employ the ZK aspect mean that your input to $F(x)$, $x$, remains encrypted, such that anyone reading your proof $P$ cannot know what $x$ is. This means you could for example, prove that your age is over 18, without revealing your ID. But most times you see ZK proofs, it’s only referring to the succinct part.

Not only that, but $F(x)$ can of course implement $\mathrm{verify(x)}$, whereby the function we are proving is in itself verifying proofs. This is referred to as recursive proofing, and sometimes associated with proof aggregation.

Using ZK proofs, you can compress the entire bitcoin verification work down to a single 20kB proof. The big-O notation can refer to time and space - some schemes are fast to prove, but generate larger proofs (STARK’s). Some schemes generate very small proofs (groth16) that are only 3 field multiplications to verify. You can even mix and match schemes using recursive proofing to get the benefits of both.

$^{1}$ this explanation is simplified. You are in fact proving the relation, $F(x) = y$ and $\mathrm{verify}(P)$ technically only returns a 1 or a 0. In order to get a return value, typically we wrap the execution inside a bootloader of sorts, which shows what program we’re proving and takes as input the hash of the return data. Also - generating a proof is not running $F(x)$ directly, rather converting $F(x)$ to a form like an arithmetic circuit or an algebraic intermediate representation (AIR).

Multi-party-computation (MPC) and FHE.

Multi-party computation is akin to a shared encrypted CPU, where you upload a program, run it with inputs, and get an output. It can be used for example to do data analysis without revealing data to third parties. Imagine a program designed to detect how old you are - you encrypt your genome, send it to a remote server, it does MPC, and you get back the result - without them ever seeing your data. That’s the magic.

  • Threshold ECDSA wallets.
  • SPDZ
    • general multiparty computation protocol secure against an active adversary corrupting up to $n−1$ of the $n$ players
    • compute securely arithmetic circuits over any finite field $\displaystyle \mathbb{F}_{p^k}$
    • Paper 1, paper 2
  • Renegade encrypted DEX (whitepaper)
    • Renegade constructs a non-custodial dark pool - a CLOB-style DEX where the amounts are encrypted.
    • Uses:
      • MPC to perform order matching and transfers (SPDZ-style maliciously-secure MPC-with-abort).
      • coSNARK’s to generate proofs of the MPC between trading parties.
      • Blockchain to settle ZK proofs, which advance the state machine.
  • Fully Homomorphic Encryption (FHE)

Indistinguishability obfuscation.

iO = (public inputs, public outputs, hidden logic)

Using iO, you might design a sealed state signer - a function which produces signatures over a keypair, where the keypair is embedded in the logic - and thus hidden from any user of the function.

This would be very powerful for token bridges. To bridge tokens from one blockchain to another, we typically require a counterparty to operate wallets on both networks. When a user makes a deposit, the counterparty signs a transaction withdrawing the same amount on the other chain.

All current bridge designs use either a single signer, a multisig, or a threshold signer (M-of-N, threshold ECDSA typically) for their counterparty. As there is no way to program what types of data the counterparty can sign, this makes the bridges susceptible to theft and hacks.

Now imagine we design a function which custodies a private key, and given a ZK proof of a deposit, outputs a signature for a withdrawal of that same amount. Applying IO, anyone can use this function to generate signatures, the signatures are generated only according to predefined conditions, and the private key is never leaked.

Public inputs (deposit transaction, ZK proof), public outputs (signature for withdrawal), hidden logic (private key).

Contrast this with the benefits of other cryptographic techniques:

  • Succinct proofs: offer succinct verifiable proofs of computation
    • (input) -> F(input) -> (output)
    • F(input) is O(N)
    • prove(F(input)) produces proof
    • verify(proof) is O(log N) or O(1)
  • MPC/FHE: offer encrypted computation that does not leak inputs/outputs
    • (input) -> F(input) -> (output)
    • input, F(input), and output all remain encrypted.

Read more:

Security / trust models.

  • Trusted Execution Environments (TEE’s)
  • Liveness
  • Market manipulation
  • Censorship resistance
  • Capture resistance
  • Slashing.
    • Ethereum - automatic.
    • Solana - manual - social slashing.
  • Malicious M-of-N / byzantine faults
    • Bitcoin: 51% attack
    • Tendermint: $3F+1$
    • Ethereum: $\frac{n}{3}$ accountable safety
  • Collusion
  • Technical
  • Mechanism design

Data availability.

Data availability schemes are ways to ensure block data (transactions) can be scaled. They basically redundantly extend data using Reed-Solomun, distribute shards of that data widely to full nodes, and light nodes sample these data shares in order to check it’s available. Reed-Solomun allows us to do $M$ of $N$ reconstruction of the data, meaning we don’t require all $N$ nodes.

P2P networks.

Reputation systems.

Product-market fit.

 
0:00