
Add BERT support to FMS #467

Merged

ani300 merged 9 commits into foundation-model-stack:main from foundation-model-stack:bert
Oct 6, 2025

Conversation

ani300 (Collaborator) commented Sep 9, 2025

Needed for fine-tuning, and some other testing going on.

Still needs a full HF checkpoint adapter for inference.

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
ani300 changed the base branch from main to roberta-classification September 9, 2025 21:11
JRosenkranz (Collaborator) left a comment:

Should we add more model expectation tests?

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
fms/models/hf/utils.py (resolved)
if position_ids is None:
    position_ids = ((~is_pad).cumsum(1) - 1).clamp(min=0)
else:
Collaborator:

is this the bert case?

ani300 (Collaborator, Author):

yes, this is the BERT case
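The line under discussion derives position IDs from the padding mask rather than taking them as input. A minimal self-contained sketch of that computation, with an illustrative `pad_id` and input tensor (not taken from the PR itself):

```python
import torch

# Hypothetical pad token id and batch; only the position_ids line mirrors the PR.
pad_id = 0
input_ids = torch.tensor([[5, 7, 9, pad_id, pad_id],
                          [3, pad_id, pad_id, pad_id, pad_id]])
is_pad = input_ids == pad_id

# Non-pad tokens receive positions 0, 1, 2, ...; pad slots are clamped to 0.
position_ids = ((~is_pad).cumsum(1) - 1).clamp(min=0)
# → tensor([[0, 1, 2, 2, 2],
#           [0, 0, 0, 0, 0]])
```

The cumulative sum over the inverted pad mask counts non-pad tokens seen so far, so subtracting 1 yields 0-based positions and `clamp(min=0)` keeps leading or trailing pad slots from going negative.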

fms/models/roberta.py (resolved)
head_bias: bool = True,
dropout: float = 0.0,
do_pooling: bool = False,
apply_pooling_fn: bool = False,
Collaborator:

why are we removing this?

ani300 (Collaborator, Author):

trying to fully match both the BERT and RoBERTa code in HF, which was a mess

ani300 changed the base branch from roberta-classification to main October 1, 2025 15:04
JRosenkranz (Collaborator) left a comment:

lgtm

ani300 merged commit 68fd207 into main Oct 6, 2025
4 checks passed
2 participants