Add Roberta for classification to FMS #466

Merged: ani300 merged 5 commits into foundation-model-stack/foundation-model-stack:main from foundation-model-stack/foundation-model-stack:roberta-classification on Oct 1, 2025

Conversation

@ani300 (Collaborator) commented Sep 9, 2025

Needed for testing fine-tuning on AIU

Signed-off-by: Antoni Viros i Martin <aviros@ibm.com>
"""
if self.apply_pooling_fn:
x = x[:, 0]
x = self.pooler_linear(x)
Collaborator:
Does adding this break anything existing? We were not doing this before?

Collaborator (Author):
We were never pooling before, so we never hit this path; this also was not matching the HF implementation, but now it does.
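
For context, here is a minimal sketch of the HF-style RoBERTa classification head the reply refers to: pooling takes the hidden state of the first token and then projects it. This illustrates the general pattern only; the class name, hidden_size, and num_labels are assumptions, not the FMS code.

    import torch
    import torch.nn as nn

    class ClassificationHead(nn.Module):
        # Illustrative sketch of an HF-style RoBERTa classification head;
        # not the FMS implementation. Names and defaults are assumptions.
        def __init__(self, hidden_size: int, num_labels: int, dropout: float = 0.0):
            super().__init__()
            self.dense = nn.Linear(hidden_size, hidden_size)
            self.dropout = nn.Dropout(dropout)
            self.out_proj = nn.Linear(hidden_size, num_labels)

        def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
            x = hidden_states[:, 0]        # pool: first token (<s> / [CLS])
            x = self.dropout(x)
            x = torch.tanh(self.dense(x))  # HF applies tanh after the dense layer
            x = self.dropout(x)
            return self.out_proj(x)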

@JRosenkranz (Collaborator) left a comment:

Can we add a new model expectation test for this?
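
A generic pytest-style shape check along those lines, using the illustrative ClassificationHead sketched above rather than FMS's actual model expectation harness:

    import torch

    def test_classification_head_output_shape():
        torch.manual_seed(0)
        hidden = torch.randn(2, 8, 16)    # (batch, seq, hidden)
        head = ClassificationHead(hidden_size=16, num_labels=3)
        logits = head(hidden)
        assert logits.shape == (2, 3)     # one logit per label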

    str_to_activation(classifier_activation_fn),
    dropout=classifier_dropout,
    apply_pooling_fn=True,
    do_pooling=True,
Collaborator:
Why do we need both of these?

dropout: float
    the dropout to use directly after activation (default is 0.0)
apply_pooling_fn: bool
do_pooling: bool
Collaborator:
Ah, I see, this answers my question above.
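
One plausible reading of the two flags, inferred from the earlier diff hunk rather than from the FMS docstring (whose descriptions are truncated here): do_pooling could control whether a pooled representation is produced at all, while apply_pooling_fn gates the learned pooler projection.

    # Hypothetical wiring, inferred from the diff; not the actual FMS code.
    def maybe_pool(self, x: torch.Tensor) -> torch.Tensor:
        if not self.do_pooling:       # assumption: keep token-level states
            return x
        if self.apply_pooling_fn:     # from the diff: first-token pool + projection
            x = x[:, 0]
            x = self.pooler_linear(x)
        return x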

@JRosenkranz (Collaborator) left a comment:
lgtm

@ani300 merged commit 9b6e5db into main on Oct 1, 2025 (4 checks passed)