ilab model evaluate command and eval library usage #1369

Merged
alinaryan merged 35 commits into instructlab:main from cdoern:eval on Jun 30, 2024

cdoern (Contributor) commented Jun 14, 2024

  1. Added support for ilab model evaluate, which allows users to run the MMLU Bench, MT Bench, MMLU Branch, and MT Branch benchmarks.
  2. Added an _evaluate class to config.yaml so that users get sane evaluation defaults that they can see and modify. These defaults funnel directly into the evaluation flags, as the training defaults now do.
    A sample evaluation class looks like:
evaluate:
  branch: null
  mmlu:
    batch_size: 5
    few_shots: 2
  mmlu_branch:
    sdg_path: generated
  model_name: null
  mt:
    judge_model: prometheus-eval/prometheus-8x7b-v2.0
    max_workers: 40
    output_dir: eval_data
  mt_branch:
    taxonomy_path: taxonomy
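
To illustrate how defaults like the ones above could funnel into command flags, here is a minimal sketch using only the standard library. It is not the code from this PR: the class names (EvaluateConfig, MmluConfig, and so on) and the resolve_option helper are hypothetical, chosen only to mirror the keys in the sample config.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


# Hypothetical classes mirroring the sample `evaluate` section above.
@dataclass
class MmluConfig:
    batch_size: int = 5
    few_shots: int = 2


@dataclass
class MmluBranchConfig:
    sdg_path: str = "generated"


@dataclass
class MtConfig:
    judge_model: str = "prometheus-eval/prometheus-8x7b-v2.0"
    max_workers: int = 40
    output_dir: str = "eval_data"


@dataclass
class MtBranchConfig:
    taxonomy_path: str = "taxonomy"


@dataclass
class EvaluateConfig:
    branch: Optional[str] = None
    model_name: Optional[str] = None
    mmlu: MmluConfig = field(default_factory=MmluConfig)
    mmlu_branch: MmluBranchConfig = field(default_factory=MmluBranchConfig)
    mt: MtConfig = field(default_factory=MtConfig)
    mt_branch: MtBranchConfig = field(default_factory=MtBranchConfig)


def resolve_option(cli_value, config_default):
    """Hypothetical helper: prefer an explicit CLI value, else the config default."""
    return config_default if cli_value is None else cli_value


cfg = EvaluateConfig()

# No flag passed on the command line: the config default is used.
batch_size = resolve_option(None, cfg.mmlu.batch_size)

# A flag passed on the command line overrides the config default.
max_workers = resolve_option(8, cfg.mt.max_workers)

print(asdict(cfg)["mmlu"])  # {'batch_size': 5, 'few_shots': 2}
```

The point of the sketch is the precedence rule: a value in config.yaml is visible and editable by the user, and an explicit flag still wins when one is given.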

mergify bot added the ci-failure label (PR has at least one CI failure) on Jun 14, 2024

cdoern (Contributor, Author) commented Jun 14, 2024

this is a placeholder for now, lmk when the library is somewhere I can access and import (with the actual code)

nathan-weinberg (Member) commented

@cdoern you can install directly from test.pypi.org for testing if you wish: https://test.pypi.org/project/instructlab-eval

once we have a 0.0.1 release we'll publish that to production PyPI

mergify bot added and removed the ci-failure label on Jun 18, 2024
russellb self-requested a review on June 19, 2024 00:26
(Four resolved review threads on src/instructlab/model/evaluate.py.)
mergify bot added the needs-rebase label (This Pull Request needs to be rebased) on Jun 20, 2024

mergify bot (Contributor) commented Jun 20, 2024

This pull request has merge conflicts that must be resolved before it can be
merged. @cdoern please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify bot removed the ci-failure label on Jun 23, 2024
mergify bot added the ci-failure label and removed the needs-rebase and ci-failure labels on Jun 23, 2024
(Six resolved review threads on src/instructlab/model/evaluate.py.)
mergify bot added and removed the ci-failure label on Jun 25, 2024
(One resolved review thread on src/instructlab/configuration.py.)
cdoern and others added 4 commits June 29, 2024 09:55
Signed-off-by: Charlie Doern <cdoern@redhat.com>
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Signed-off-by: Dan McPherson <dmcphers@redhat.com>
mergify bot added the ci-failure label and removed the needs-rebase and one-approval labels on Jun 29, 2024
Signed-off-by: Dan McPherson <dmcphers@redhat.com>
mergify bot removed the ci-failure label on Jun 29, 2024

danmcp (Contributor) left a comment

I am approving with a few notes:

(Two resolved review threads on src/instructlab/model/evaluate.py.)
mergify bot added the one-approval label (PR has one approval from a maintainer) on Jun 29, 2024

alinaryan (Contributor) left a comment

Thanks for the strong work on this!! Just have a few comments/q's:

(Three resolved review threads on src/instructlab/configuration.py and four on src/instructlab/model/evaluate.py.)

cdoern (Contributor, Author) commented Jun 30, 2024

made follow up issues!

cdoern (Contributor, Author) commented Jun 30, 2024

Given that tomorrow morning (7/1) is a deadline, this PR needs to be merged before then to add some form of evaluation support.

That being said, there is a stale change request on this PR, some pending reviews that haven't come in yet, etc.

We will be dismissing those in favor of deferring to follow-up issues, but if any of the reviewers have immediate follow-up concerns, please feel free to reach out!

russellb (Contributor) commented

I filed #1540 as a follow-up to get this tested in e2e CI

ktam3 added this to the 0.18.0 milestone on Jul 15, 2024

Labels: testing (Relates to testing)

Successfully merging this pull request may close these issues: [Epic] RHEL AI backend commands; [Epic] CLI Integration (July 15 GA)

8 participants
