Remove task logic with lm_eval 0.4.4 for agg_score #143

Merged
mergify[bot] merged 1 commit into instructlab:main from danmcp:aggfix
Oct 1, 2024

Conversation

danmcp (Contributor) commented Oct 1, 2024

lm_eval used to return an extra entry corresponding to the requested tasks (for example, mmlu_pr). As of 0.4.4, the entries are the same whether or not the tasks are custom, and the extra entry has been removed. The agg score therefore needs to be calculated from the individual task scores returned, so the logic can be shared with mmluevaluator.

Without this change, the overall_score for mmlu_branch is returned as 0.0 with lm_eval 0.4.4.
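
The idea of deriving the aggregate from the individual task scores can be sketched as below. This is a minimal illustration, not the code merged in this PR. The function name `aggregate_score`, the task names, and the unweighted mean are assumptions made for the example. The `"acc,none"` metric key is likewise assumed to match the per-task result entries.

```python
from typing import Mapping


def aggregate_score(
    task_results: Mapping[str, Mapping[str, float]],
    metric: str = "acc,none",  # assumed metric key; adjust to the actual results
) -> float:
    """Return the unweighted mean of `metric` across all tasks that report it.

    Hypothetical helper: it relies only on the per-task entries, so it does
    not depend on any extra group-level entry being present in the results.
    """
    scores = [metrics[metric] for metrics in task_results.values() if metric in metrics]
    if not scores:
        # No usable per-task scores; return 0.0 rather than dividing by zero.
        return 0.0
    return sum(scores) / len(scores)


# Illustrative per-task results (made-up task names and values).
example_results = {
    "task_a": {"acc,none": 0.5},
    "task_b": {"acc,none": 0.75},
}
overall_score = aggregate_score(example_results)
```

Because the helper only iterates over per-task entries, the same function could in principle serve both the custom-task and standard-task evaluators.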

Signed-off-by: Dan McPherson <dmcphers@redhat.com>
@nathan-weinberg nathan-weinberg removed the request for review from alinaryan October 1, 2024 01:35
@mergify mergify bot merged commit c05af4d into instructlab:main Oct 1, 2024
@mergify mergify bot removed the one-approval label Oct 1, 2024