Skip to main content

Policy Gradients

Meaning

Policy Gradients refer to a class of reinforcement learning algorithms that directly learn a parameterized policy function, which maps states to actions. They optimize the policy’s expected return through gradient ascent.