Tony Chen
  • about
  • blog (current)
  • Autoscaling Autoresearch: Give your agents elastic GPUs on Modal

    How agentic research workloads can scale up and down elastically with Modal to accelerate experimentation.

    1 min read   ·   April 14, 2026   ·   Modal

    2026

  • Understand Policy Gradient by Building Cross Entropy from Scratch

    Understanding RL, especially policy gradient, could be non-trivial, particularly if you like grasping intuitions like I do. In this post, I will walk through a thread of thoughts that really helped me understand PG by starting from more familiar supervised learning setting.

    1 min read   ·   June 11, 2023

    2023