Conversation

PeterSH6 (Collaborator)

  • Add a split placement tutorial by monkey-patching the `fit` function in `ray_trainer`
  • Place the actor/rollout/ref models on one set of GPUs while mapping the critic and reward model to the remaining GPUs
  • Currently, for simplicity, we only parallelize the execution of `actor.update_actor` and `critic.update_critic`. Operations in the experience-preparation stage could be parallelized further.
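The core idea in the bullets above can be sketched as follows: replace the trainer's sequential `fit` with a version that dispatches the actor and critic updates concurrently, which pays off once the two models live on disjoint GPU groups. This is a minimal, self-contained illustration of the monkey-patch pattern only; the class and method names (`Trainer`, `update_actor`, `update_critic`) are stand-ins, not verl's actual API.

```python
# Hypothetical sketch of the monkey-patch described in this PR.
# Names are illustrative stand-ins, not verl's real classes.
from concurrent.futures import ThreadPoolExecutor


class Trainer:
    def update_actor(self, batch):
        return f"actor_updated({batch})"

    def update_critic(self, batch):
        return f"critic_updated({batch})"

    def fit(self, batch):
        # Original sequential version: critic waits for the actor.
        return [self.update_actor(batch), self.update_critic(batch)]


def parallel_fit(self, batch):
    # Dispatch both updates at once. With the actor/rollout/ref group
    # and the critic/RM group placed on disjoint GPUs, the two calls
    # no longer contend for the same devices.
    with ThreadPoolExecutor(max_workers=2) as pool:
        actor_future = pool.submit(self.update_actor, batch)
        critic_future = pool.submit(self.update_critic, batch)
        return [actor_future.result(), critic_future.result()]


# Monkey-patch: swap in the parallel version without editing the class source.
Trainer.fit = parallel_fit

print(Trainer().fit("b0"))
```

In verl itself the dispatched calls would be asynchronous Ray remote calls rather than thread-pool futures, but the patch-and-overlap structure is the same.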

@vermouth1992 (Collaborator)

Actually, the critic/RM group can also host rollout. That placement is the most efficient in most cases, but it requires changing the code.

@PeterSH6 PeterSH6 merged commit 6e8667b into main Dec 11, 2024
2 checks passed
@PeterSH6 PeterSH6 deleted the gm/placement branch December 11, 2024 14:41
yuchenwang3 pushed a commit to yuchenwang3/verl that referenced this pull request Apr 25, 2025
* [example] add a split placement tutorial

* lint
kaiyliu pushed a commit to kaiyliu/knowl_verl that referenced this pull request Jun 27, 2025