
Machine learning

In 2025, I’m going deep into ML and want a job. There is a large cache of notes here.

Sub-pages.

Talks.

Papers.

Distillations.

AI is statistics (science) applied to big data (engineering).

My story of the field.

I’ve been interested in ML since high school, when DeepDream came out. But I chose to go into crypto so I could travel the world. Now I’m back into ML.

Modern AI has existed since 2010, when we combined big data (ImageNet) with big compute (GPUs). There has been very steady, linear progress in the capabilities of AI since then.

I believe the progress will continue linearly. The major drivers are:

  1. Energy (power grids).
    • Add more compute, get more intelligence (see the scaling-law sketch after this list).
  2. Statistics.
    • At its core, the ChatGPT unlock came down to four things: attention, scaling compute, a good dataset, and RLHF.
    • Core unlocks like attention (see the attention sketch after this list) and TTT.
  3. Software.
    • Cut Cross-Entropy is one example.
    • Quantization is another (see the int8 sketch after this list).
  4. Hardware.
    • GPUs, tensor cores, TPUs.
    • Optimizing for hardware layout.
  5. Data.
    • TikTok gets this: online learning makes the system better, which drives more usage, which produces more training data.
  6. Product.
    • This is probably the most counterintuitive one here. But hear me out.
    • DeepSeek is interesting because the reward signal comes from an external tool: Python evaluating math answers (see the reward sketch after this list).
    • OpenAI has the best-in-class consumer product, and their next iteration as of March 2025 is building tooling integrations.
    • Tooling is the cheapest way to get more signal, and thus more data.
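
To make "add more compute, get more intelligence" concrete, here is a minimal sketch of the Chinchilla-style scaling law from Hoffmann et al. (2022). The coefficient values are their published fit (treat the exact numbers as approximate), and the model sizes in the usage lines are purely illustrative.

```python
# Chinchilla-style parametric loss: L(N, D) = E + A/N^alpha + B/D^beta,
# where N is parameter count and D is training tokens. Lower loss ~ "more intelligence".
def chinchilla_loss(N, D, E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    """Predicted pretraining loss for N parameters trained on D tokens."""
    return E + A / N**alpha + B / D**beta

# Roughly 4x the compute (2x parameters, 2x tokens) gives a lower predicted loss.
print(chinchilla_loss(7e9, 140e9))    # a 7B model on 140B tokens
print(chinchilla_loss(14e9, 280e9))   # double both: predicted loss goes down
```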
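
Since attention keeps coming up above, here is a minimal single-head scaled dot-product attention in NumPy. The function name and toy shapes are my own illustration; real implementations add masking, multiple heads, and learned projections.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)     # token-to-token similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V                                  # weighted sum of values

# Toy usage: 4 tokens with 8-dimensional queries, keys, and values.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)            # shape (4, 8)
```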
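
A toy sketch of why quantization counts as a software win: storing weights as int8 plus a single float scale uses roughly a quarter of the memory of float32, at a small reconstruction cost. The per-tensor symmetric scheme and function names are illustrative.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: int8 weights plus one fp32 scale."""
    scale = max(float(np.abs(w).max()) / 127.0, 1e-12)   # map the largest weight to +/-127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale                  # approximate original weights

w = np.random.randn(1024, 1024).astype(np.float32)
q, s = quantize_int8(w)
mean_err = np.abs(w - dequantize(q, s)).mean()           # small error, ~4x less memory
```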
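
A toy sketch of the "reward from an external tool" idea: a Python checker scores the model's final answer against a known result, so no human labeller is needed. The function and the binary reward are my own illustration, not DeepSeek's actual implementation.

```python
def math_reward(model_answer: str, ground_truth: float, tol: float = 1e-6) -> float:
    """Return 1.0 if the model's final numeric answer matches the known result, else 0.0."""
    try:
        value = float(model_answer.strip())
    except ValueError:
        return 0.0                       # unparseable answer earns no reward
    return 1.0 if abs(value - ground_truth) < tol else 0.0

# Example: scoring answers to "What is 12 * 7?"
print(math_reward("84", 12 * 7))   # 1.0 (correct: reward flows with no human in the loop)
print(math_reward("85", 12 * 7))   # 0.0 (incorrect)
```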

Questions.

Interesting ideas.

On Sama

I like this position - https://ia.samaltman.com/