Chris Liu
Posts
2026-03-02
Off-Policy Drift in LLM RL
2024-01-28
Deriving Direct Preference Optimization
2023-06-05
A Minimal Example of Double Descent
2020-12-24
Deriving Policy Gradient