🌱 chrisliu298's Garden


    Tag: rl

    2 items with this tag.

    • Jan 28, 2024

      Deriving Direct Preference Optimization

      • llm
      • rl
      • rlhf
    • Dec 24, 2020

      Deriving Policy Gradient

      • rl
