Fix test single_example_collator to wrap index as tensor by hamishivi · Pull Request #1477 · allenai/open-instruct

hamishivi · Feb 18, 2026

Summary

The single_example_collator in test_data_loader_gpu.py returned the raw example dict with index as a plain int. HFDataLoader._iter_batches calls len(batch["index"]) which fails with TypeError: object of type 'int' has no len().

Fix: wrap index as torch.tensor([index]) to match the production single_example_collator in data_loader.py.
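
To make the failure concrete, here is a minimal sketch of the two collator behaviours described above. The helper names `raw_collator`, `tensor_index_collator`, and `batch_size` are hypothetical stand-ins and are not copied from `test_data_loader_gpu.py` or `data_loader.py`; the only things taken from this PR are the `len(batch["index"])` call and the `torch.tensor([index])` wrapping.

```python
import torch


def raw_collator(examples):
    # Hypothetical stand-in for the old test helper:
    # returns the single example dict unchanged, so "index" stays a plain int.
    return examples[0]


def tensor_index_collator(examples):
    # Hypothetical stand-in for the fixed helper:
    # wraps the int index in a one-element tensor.
    example = examples[0]
    return {**example, "index": torch.tensor([example["index"]])}


def batch_size(batch):
    # Mirrors the len(batch["index"]) call that the summary attributes
    # to HFDataLoader._iter_batches.
    return len(batch["index"])


example = {"index": 7, "prompt": "hello"}

try:
    batch_size(raw_collator([example]))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

fixed_size = batch_size(tensor_index_collator([example]))
```

With the plain int, `len()` raises the `TypeError` quoted in the summary; with the one-element tensor, `len()` returns the size of the first dimension.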

Test plan

The fix matches the production collator pattern

Made with Cursor

gemini-code-assist · Feb 18, 2026

Summary of Changes

Hello @hamishivi, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a runtime error in the test suite's data loading mechanism. It corrects the single_example_collator in test_data_loader_gpu.py to ensure the index field is consistently returned as a torch.Tensor, preventing a TypeError during batch processing and aligning its functionality with the production collator.

Highlights

- Bug Fix: Addressed a TypeError in the single_example_collator within test_data_loader_gpu.py that occurred when len(batch["index"]) was called on an integer index.
- Data Handling Consistency: Modified the single_example_collator to wrap the index as torch.tensor([index]), aligning its behavior with the production single_example_collator in data_loader.py.

Changelog

open_instruct/test_data_loader_gpu.py
- Modified single_example_collator to wrap the index as a torch.tensor.
- Ensured consistency with the production single_example_collator's behavior to prevent TypeError.

gemini-code-assist

Code Review

This pull request addresses a bug in test_data_loader_gpu.py where the single_example_collator function returned an index as a plain integer, leading to a TypeError. The fix wraps the index in a torch.tensor, which is the correct approach and aligns the test helper with its production counterpart in data_loader.py. The change is clear, correct, and improves the robustness of the test suite.

finbarrtimbers · Feb 18, 2026

ahhh can you re-run the single GPU script? I can do it tomorrow. I want to make sure that doesn't break

finbarrtimbers · Feb 18, 2026

ok ran the single GPU GRPO script: Beaker.

finbarrtimbers · Feb 18, 2026

And it passed! So I'm re-adding to the merge queue.

Fix test single_example_collator to wrap index as tensor

17097f5

The test collator returned raw int for index, causing TypeError in _iter_batches which calls len(batch["index"]). Match the production collator by wrapping as torch.tensor([index]). Co-authored-by: Cursor <cursoragent@cursor.com>

gemini-code-assist Bot reviewed Feb 18, 2026

hamishivi and others added 2 commits February 17, 2026 17:11

Add changelog entry for test collator fix

b38eb3e

Co-authored-by: Cursor <cursoragent@cursor.com>

Fix test assertions for tensor index values

e6689f6

Now that single_example_collator wraps index as a tensor, test assertions need .item() or .tolist() to compare with plain ints. Co-authored-by: Cursor <cursoragent@cursor.com>
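
As an illustration of the conversion this commit message describes, the sketch below compares tensor-valued indices with plain ints via `.item()` and `.tolist()`. The batch contents and variable names are assumptions made for the example; the actual assertions in `test_data_loader_gpu.py` are not shown in this PR conversation.

```python
import torch

# Hypothetical batches shaped like the fixed collator's output:
# each "index" is a one-element tensor rather than a plain int.
batches = [{"index": torch.tensor([i])} for i in range(4)]

# .item() turns a one-element tensor into a Python scalar.
first_index = batches[0]["index"].item()

# .tolist() turns a tensor of any length into a list of Python ints,
# which is convenient when gathering indices across batches.
seen_indices = []
for batch in batches:
    seen_indices.extend(batch["index"].tolist())

assert first_index == 0
assert sorted(seen_indices) == [0, 1, 2, 3]
```

Converting before comparing keeps the assertions on ordinary Python values, so the result is a plain `bool` rather than a tensor.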

finbarrtimbers enabled auto-merge February 18, 2026 06:13

finbarrtimbers disabled auto-merge February 18, 2026 06:13

finbarrtimbers enabled auto-merge February 18, 2026 16:16

finbarrtimbers approved these changes Feb 18, 2026

finbarrtimbers added this pull request to the merge queue Feb 18, 2026

Merged via the queue into main with commit bdf5554 Feb 18, 2026
7 checks passed

finbarrtimbers deleted the fix-test-collator-index branch February 18, 2026 17:33

gemini-code-assist Bot mentioned this pull request Feb 22, 2026

Add TextRLEnvironment for text-based RL environments #1489

Merged

finbarrtimbers merged 3 commits into allenai/open-instruct:main from allenai/open-instruct:fix-test-collator-index