the calculation of advantage values #11

@caokaifa

Thank you for sharing such excellent work. I have a question about how the advantage values are calculated. Initially, the data comes from the absolute_advantage in progress or stage 2. However, stage 2 has not been trained at that point, so how is the absolute_advantage from progress calculated? Is it computed using Monte Carlo returns?
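
To make the question concrete, here is a minimal sketch of what I mean by a Monte Carlo estimate. It is only an illustration of the general technique, not a claim about how this repository computes absolute_advantage. The function names, the per-step reward list, the baseline values, and the discount factor gamma are all hypothetical.

```python
from typing import List


def monte_carlo_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Discounted return G_t = r_t + gamma * G_{t+1} for every step of one episode.

    Hypothetical helper: assumes a complete episode with one scalar reward per step.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def mc_advantages(
    rewards: List[float], baselines: List[float], gamma: float = 0.99
) -> List[float]:
    """Advantage estimate A_t = G_t - b_t, where b_t is some per-step baseline.

    The baseline is an assumption here (for example a value estimate);
    it is not taken from the repository.
    """
    returns = monte_carlo_returns(rewards, gamma)
    return [g - b for g, b in zip(returns, baselines)]


if __name__ == "__main__":
    episode_rewards = [1.0, 1.0, 1.0]
    print(monte_carlo_returns(episode_rewards, gamma=0.5))
    print(mc_advantages(episode_rewards, [1.0, 1.0, 1.0], gamma=0.5))
```

If the progress-based values are instead derived some other way (for example directly from a progress signal without discounting), it would help to know which.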
