🌱 chrisliu298's Garden
Tag: rlhf
1 item with this tag.
Jan 28, 2024
Deriving Direct Preference Optimization
llm
rl
rlhf