Fix gpt oss #1101
Merged
bryce13950 merged 13 commits into dev-3.x-folding (TransformerLensOrg/TransformerLens:dev-3.x-folding) on Nov 5, 2025
1c7660e to 9fa5125
This commit addresses hook alias resolution issues for GPT-OSS MoE models and adds comprehensive unit tests.

Changes:
1. Fixed JointGateUpMLPBridge hook_aliases to use gate.hook_out instead of in.hook_out/input.hook_out, which don't exist in this bridge type
2. Added 7 comprehensive unit tests in test_gpt_oss_moe.py that verify:
   - Model loads without downloading weights (using meta device)
   - Bridge creation works correctly
   - MLP uses JointGateUpMLPBridge (not regular MLPBridge)
   - Compatibility mode hooks are accessible
   - Experts structure is correct (batched tensors, not iterable modules)
   - Hook aliases resolve correctly
   - No incorrect BlockBridge wrapper around experts

Root cause:
- JointGateUpMLPBridge inherits from MLPBridge, which has hook_aliases expecting in.hook_out or input.hook_out submodules
- JointGateUpMLPBridge creates gate and up submodules instead, causing an AttributeError when resolving aliases
- Solution: Override hook_aliases at class level to use gate.hook_out

Testing: All 7 tests pass, verifying GPT-OSS loads correctly and hooks work in compatibility mode without downloading the full 20B parameter model.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
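The root cause described above can be illustrated with a small, self-contained sketch. It is not the TransformerLens implementation: the classes `ToyMLPBridge`, `BrokenJointBridge`, `FixedJointBridge`, the helper `resolve_alias`, and the alias name `hook_pre` are all hypothetical stand-ins chosen for illustration. Only the target paths `in.hook_out`, `input.hook_out`, and `gate.hook_out` come from the commit message.

```python
from types import SimpleNamespace


def resolve_alias(bridge, alias):
    """Try each dotted target path for an alias; return the first that resolves."""
    targets = bridge.hook_aliases[alias]
    if isinstance(targets, str):
        targets = [targets]
    for path in targets:
        obj = bridge
        try:
            for part in path.split("."):
                obj = getattr(obj, part)
        except AttributeError:
            continue  # this target does not exist on the bridge, try the next
        return obj
    raise AttributeError(
        f"{type(bridge).__name__}: cannot resolve {alias!r} via {targets}"
    )


class ToyMLPBridge:
    # Hypothetical parent: its alias targets assume an `in` or `input` submodule.
    hook_aliases = {"hook_pre": ["in.hook_out", "input.hook_out"]}

    def __init__(self):
        self.input = SimpleNamespace(hook_out="input.hook_out")


class BrokenJointBridge(ToyMLPBridge):
    # Inherits the parent's aliases but builds `gate` and `up` submodules instead.
    def __init__(self):
        self.gate = SimpleNamespace(hook_out="gate.hook_out")
        self.up = SimpleNamespace(hook_out="up.hook_out")


class FixedJointBridge(BrokenJointBridge):
    # Class-level override: point the alias at a submodule that actually exists.
    hook_aliases = {"hook_pre": "gate.hook_out"}


if __name__ == "__main__":
    print(resolve_alias(ToyMLPBridge(), "hook_pre"))
    try:
        resolve_alias(BrokenJointBridge(), "hook_pre")
    except AttributeError as err:
        print("broken:", err)
    print(resolve_alias(FixedJointBridge(), "hook_pre"))
```

In this sketch the override lives on the class rather than on the instance, so every instance of the subclass sees the corrected targets without any change to the constructor, which mirrors the "override hook_aliases at class level" approach the commit describes.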