return overall_score from MTBenchBranch.judge_answers() #138
Merged
danmcp merged 2 commits into instructlab:main from alimaredia:mtbench-branch-judgement-return-overall-score on Sep 28, 2024
Conversation
Contributor
@sallyom FYI on the API change
Contributor
Note: This can't merge until the CLI is changed to accept both the 2-element and the 3-element tuple return at the same time.
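One way a caller could tolerate both return shapes during the transition is sketched below. This is not the actual CLI change: `normalize_judgement` is a hypothetical helper, and the old return order `(qa_pairs, error_rate)` is an assumption based on the description in this PR.

```python
from typing import Any, Optional, Tuple


def normalize_judgement(result: tuple) -> Tuple[Optional[float], Any, float]:
    """Hypothetical helper: accept either return shape from judge_answers().

    Assumed shapes (not confirmed against the library source):
      old: (qa_pairs, error_rate)
      new: (overall_score, qa_pairs, error_rate)
    """
    if len(result) == 3:
        overall_score, qa_pairs, error_rate = result
    elif len(result) == 2:
        qa_pairs, error_rate = result
        overall_score = None  # older library versions do not report it
    else:
        raise ValueError(f"unexpected judgement tuple of length {len(result)}")
    return overall_score, qa_pairs, error_rate


# Usage: both shapes come out as a 3-tuple.
old_shape = ([{"question_id": 1, "score": 8}], 0.0)
new_shape = (7.5, [{"question_id": 1, "score": 8}], 0.0)
print(normalize_judgement(old_shape))
print(normalize_judgement(new_shape))
```

Checking the tuple length keeps the caller working against either version of the library, at the cost of having to handle a missing `overall_score`.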
alimaredia added a commit to alimaredia/instructlab that referenced this pull request on Sep 27, 2024
This commit is in preparation of a commit to the eval library [1] that returns the overall score from MT-Bench-Branch judgement.
[1] instructlab/eval#138
Signed-off-by: Ali Maredia <amaredia@redhat.com>
force-pushed from 2e8afd7 to 510478a
danmcp approved these changes on Sep 27, 2024
Contributor danmcp left a comment:
Looks good, just need to fix the formatting error
danmcp pushed a commit to alimaredia/instructlab that referenced this pull request on Sep 27, 2024
This commit is in preparation of a commit to the eval library [1] that returns the overall score from MT-Bench-Branch judgement.
[1] instructlab/eval#138
Signed-off-by: Ali Maredia <amaredia@redhat.com>
This allows the overall_score to be shown by callers of the library along with qa pairs and the error rate.
This commit changes what a function in the library returns and thus is not backwards compatible.
Signed-off-by: Ali Maredia <amaredia@redhat.com>
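To illustrate why the change is not backwards compatible, here is a minimal sketch using a stand-in function rather than the real library. `fake_judge_answers` and its values are hypothetical; only the return order `(overall_score, qa_pairs, error_rate)` is taken from the docstring in this PR, and the old two-name unpacking is an assumption about existing callers.

```python
def fake_judge_answers():
    """Hypothetical stand-in for the new return shape of judge_answers()."""
    overall_score = 7.5
    qa_pairs = [{"question_id": 1, "score": 8}]
    error_rate = 0.0
    return overall_score, qa_pairs, error_rate


def old_style_caller(judge):
    """Unpacks two values, as a caller written for the old API might."""
    try:
        qa_pairs, error_rate = judge()
    except ValueError as exc:
        # Three values cannot be unpacked into two names.
        return f"unpack failed: {exc}"
    return "ok"


def new_style_caller(judge):
    """Unpacks three values, matching the new return shape."""
    overall_score, qa_pairs, error_rate = judge()
    return overall_score


print(old_style_caller(fake_judge_answers))
print(new_style_caller(fake_judge_answers))
```

Callers that unpack the result positionally need to be updated together with the library, which is why the CLI change has to land first.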
force-pushed from 510478a to 219bca1
nathan-weinberg approved these changes on Sep 27, 2024
Signed-off-by: Ali Maredia <amaredia@redhat.com>
alinaryan approved these changes on Sep 27, 2024
danmcp reviewed on Sep 27, 2024
    Returns:
        overall_score   overall score from the evaluation
        qa_pairs        Question and answer pairs (with scores) from the evaluation
        error_rate      percentage of questions dropped due to errors during evaluation
Contributor
If you do make another change to this commit, this Returns section is also missing for the mt_bench case above
This allows the overall_score to be shown by
callers of the library along with qa pairs
and the error rate.