feat: DK-Bench ilab implementation #2940
mergify[bot] merged 4 commits into instructlab:main from alimaredia:dk-bench-cli-only
Conversation
Force-pushed from fd3906b to 2b41058
| help="Comma-separated list of file formats for results of the DK-Bench evaluation. Valid options are csv, jsonl, and xlsx. If this option is not provided the results are written as a .jsonl file", | ||
| ) | ||
| @click.option( | ||
| "--model-prompt", |
Would it make more sense for this to be --system-prompt?

I thought about that, but I figured a user new to AI might understand that --model-prompt applies specifically to the model being tested, even if they didn't know that these prompts are generally called system prompts.
If you really think it should be --system-prompt, I don't have any objection to changing it.

--model-prompt would imply it's what we're prompting the model with, i.e., the user prompt. We should stick with --system-prompt in order to be consistent.
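For reference, a minimal sketch of what the renamed flag could look like; the help text and default here are illustrative, not the PR's actual wording:
import click

@click.command()
@click.option(
    "--system-prompt",
    default=None,
    help="System prompt for the model being evaluated in DK-Bench. If not set, the model's default system prompt is used.",
)
def evaluate(system_prompt):
    # Hypothetical stub showing only the renamed option.
    click.echo(f"System prompt override: {system_prompt!r}")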
| help="Temperature for the model when getting responses in the DK-Bench evaluation", | ||
| ) | ||
| @click.option( | ||
| "--dk-bench-model", |
How is this different from the --model flag that already exists in this CLI command?

The way this PR is currently written, the command does two things:
- If a user doesn't pass in a --dk-bench-model and the .jsonl file has response, reference, and user input, then the command just runs the evaluation on the file.
- If you pass in the dk_bench_model, then responses are generated from the model for the questions.
I had started by using the --model flag, but ran into some issues with Click always passing a value into model because it's set in every profile, even though the default in Pydantic is None. This made the first mode impossible to execute.
I'm thinking about requiring a student model in every evaluation run, since you'd expect a model to be evaluated in the ilab model evaluate command, thus getting rid of the first behavior above. One of the many benefits of this is we could go back to using the --model flag.
If in the future users want to just be able to do evaluation on a file with responses, then we can tweak or add to the command then.
@RobotSail what do you think?

@alimaredia Okay, I see your challenge.
What you could probably do is check if the JSONL file contains response fields already, and if so then you can just pass None in for the student model. Otherwise, you can just proceed with what's set as the student model.
This way you would still get the desired functionality without needing to create a separate field.
Otherwise, the user will be burdened with having a different model set depending on which eval they run, which will be very confusing for them.
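A rough sketch of the check being suggested here, assuming the input is line-delimited JSON whose records may carry a response field (the helper name and exact behavior are hypothetical, not the PR's code):
import json
from pathlib import Path
from typing import Optional

def resolve_student_model(input_file: str, configured_model: Optional[str]) -> Optional[str]:
    # If every record already has a non-empty 'response', skip generation
    # and evaluate the file as-is; otherwise fall back to the --model value.
    with Path(input_file).open(encoding="utf-8") as f:
        records = [json.loads(line) for line in f if line.strip()]
    if records and all(rec.get("response") for rec in records):
        return None
    return configured_model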
| except OpenAIError as exc: | ||
| raise ClientException(f"Connection Error {exc}") from exc | ||
|
|
||
| if len(models.data) != 1: |
Why would we error if the server has more than one model actively running? What if I want to be evaluating multiple models concurrently?

My thinking here was that most OpenAI-compatible endpoints, or at least the ones served with vLLM in my research, only had one model served per endpoint.
I'd like it for the user to not have to think about passing in a model name at all and just get it from the server, since getting the model name right is required to get the responses from the student model in the Eval library. I even had a flag for a model name and then removed it for that reason.
That being said, in my next iteration of the PR I'm thinking of re-adding the --model-name flag and checking if that model_name is in the list of model names returned by the server. If it's not, I'll throw an error. If no model_name is passed in and there is more than one model on the server, I'll throw an error telling the user to pass in the model name from the ones returned by the server.
@RobotSail what do you think?

@alimaredia since you're already performing logic to check if the student model exists in the list of served models, you can either delete this check entirely or have it only check that the server has at least one model being served.
In either of the proposed cases, you will have the same outcome of knowing whether or not a model exists. But it might help with UX if you can raise an error when the active server isn't hosting any models.
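Sketching what the combined check could look like with the OpenAI client the surrounding code already uses; RuntimeError stands in for the PR's own exception type, and the function name is made up for illustration:
from typing import Optional
from openai import OpenAI, OpenAIError

def pick_served_model(base_url: str, api_key: str, model_name: Optional[str]) -> str:
    try:
        client = OpenAI(base_url=base_url, api_key=api_key)
        served = [m.id for m in client.models.list().data]
    except OpenAIError as exc:
        raise RuntimeError(f"Connection Error {exc}") from exc
    if not served:
        raise RuntimeError("The server is not hosting any models")
    if model_name is None:
        if len(served) > 1:
            raise RuntimeError(f"Multiple models are served, pass --model-name from: {served}")
        return served[0]
    if model_name not in served:
        raise RuntimeError(f"'{model_name}' is not served by this endpoint; available: {served}")
    return model_name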
| backend=backend, | ||
| enable_serving_output=enable_serving_output, | ||
| ) | ||
| else: |
Are we sure that we have enough information here to know that a model is an endpoint?

I could add a check that curls the endpoint or something like that. I figured get_endpoint_model_name would throw an error if the endpoint wasn't valid anyway.

In this case you have only ruled out that the model either doesn't exist locally or is in an invalid format - not necessarily that it's an endpoint.
For the most part I think this logic is fine, but you should check, per the above, that it's not the case that they're trying to pass in a local model that simply doesn't exist.
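One way to make that distinction explicit, as a sketch (the function name and URL heuristic are assumptions, not what the PR ships):
from pathlib import Path

def classify_model_arg(model: str) -> str:
    # Treat the argument as an endpoint only when it clearly isn't a local path.
    if Path(model).expanduser().exists():
        return "local"
    if model.startswith(("http://", "https://")):
        return "endpoint"
    # Looks like a path but nothing exists there - surface that instead of
    # silently falling through to the endpoint branch.
    raise FileNotFoundError(f"Model '{model}' does not exist locally and is not a URL")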
| enable_serving_output=enable_serving_output, | ||
| ) | ||
| else: | ||
| logger.debug("DK-Bench model is an endpoint") |
Why are we calling it a dk-bench model? How is this different from a student model?

It's the same thing, but I'll rephrase this to "Model being evaluated in DK-Bench is an endpoint".
|
|
||
| api_key = os.environ.get("STUDENT_ENDPOINT_API_KEY", None) | ||
| if api_key is None: | ||
| logger.debug( |
If no API key was set then you should set it to be an empty string, not None.

Good point. Per this function: https://github.com/instructlab/eval/blob/main/src/instructlab/eval/mt_bench_common.py#L369 it looks like "NO_API_KEY" should be the value I'm passing in and checking for in this case.

You can set the API key to whatever string you want, provided that the student model is running in vLLM and not on a hosted OpenAI-compatible server which expects an API key. I'd say it would be worthwhile to simply set it to the default you suggested for consistency, and then add a debug log indicating that you've done so because none was provided.
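Putting the two comments together, the fallback could look roughly like this sketch:
import logging
import os

logger = logging.getLogger(__name__)

# Assumed default: the "NO_API_KEY" sentinel the eval library checks for.
api_key = os.environ.get("STUDENT_ENDPOINT_API_KEY")
if api_key is None:
    api_key = "NO_API_KEY"
    logger.debug(
        "STUDENT_ENDPOINT_API_KEY not set; defaulting to 'NO_API_KEY' "
        "(fine for vLLM, but a hosted server that validates keys will reject it)"
    )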
| gpus: int | None, | ||
| backend: str | None, | ||
| enable_serving_output: bool, | ||
| ) -> tuple: |
When returning something like a list, tuple, dict, etc., you must use the typing library for this:
from typing import Tuple
# ...
def foo() -> Tuple[str, int, bool]:
    return ('hello', 42, False)
| ) -> tuple: | |
| ) -> Tuple: |
| QNA_YAML = "qna.yaml" | ||
|
|
||
|
|
||
| def model_is_localand_valid(model: str) -> bool: |
typo
| def model_is_localand_valid(model: str) -> bool: | |
| def model_is_local_and_valid(model: str) -> bool: |
|
|
||
| def make_run_dir(output_dir: str) -> str: | ||
| now = datetime.now() | ||
| timestamp = now.strftime("%m-%d-%Y_%H-%M-%S-%f") |
It seems like we are using some magical string formatting here. If you don't want to use the built-in string formatting then could you please leave a comment explaining why we're doing it this way?

What's the default format for a timestamp that you mentioned below?
If the user runs this benchmark over and over I didn't want results files piling up in DEFAULTS.EVAL_DIR, so I'm creating a directory with the timestamp of the run in the name. I wanted a timestamp that goes down to the microsecond level just so that two runs couldn't possibly have the same directory.

@alimaredia You can use the datetime.isoformat() method if you're looking for a string. Otherwise you can use datetime.timestamp() to get a floating-point representation of the current UNIX timestamp.
| timestamp = now.strftime("%m-%d-%Y_%H-%M-%S-%f") | |
| timestamp = now.isoformat() |
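To make the suggestion concrete: isoformat() keeps microsecond precision by default, so the uniqueness the author wants is preserved (the job- prefix below is just an illustration):
from datetime import datetime

timestamp = datetime.now().isoformat()
# e.g. '2025-01-18T00:49:57.386910' - two back-to-back runs still differ.
run_dir_name = f"job-{timestamp}"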
| now = datetime.now() | ||
| timestamp = now.strftime("%m-%d-%Y_%H-%M-%S-%f") | ||
| if output_dir.endswith("/"): | ||
| output_dir = output_dir[:-1] |
Does this give you a single element or a list?

It gives you back a single element. That being said, there are better ways to strip a trailing slash from a path in Python, so I'll change this in my next iteration.

@alimaredia Ah I see. I recommend using pathlib.Path for any path-related logic here. It will prevent you from needing to write your own logic for these cases.
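A sketch of make_run_dir leaning on pathlib.Path as suggested; Path normalizes the trailing slash on its own, so the manual slicing disappears (the directory naming is assumed, not the PR's final scheme):
from datetime import datetime
from pathlib import Path

def make_run_dir(output_dir: str) -> str:
    run_dir = Path(output_dir) / f"job-{datetime.now().isoformat()}"
    run_dir.mkdir(parents=True, exist_ok=True)
    return str(run_dir)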
| return run_dir | ||
|
|
||
|
|
||
| def print_header(): |
Were you planning on expanding this function?

This is just the beginning of the output. I'll get rid of the function for the time being since it's so small.
| question_num = 0 | ||
| for score in result.scores: | ||
| question_num += 1 | ||
| print(f"Question #{question_num}: {score['domain_specific_rubrics']}/5") |
| question_num = 0 | |
| for score in result.scores: | |
| question_num += 1 | |
| print(f"Question #{question_num}: {score['domain_specific_rubrics']}/5") | |
| for i, score in enumerate(result.scores): | |
| print(f"Question #{i+1}: {score['domain_specific_rubrics']}/5") |
| average = round(average, 2) | ||
| print("----------------------------") | ||
| print(f"Average Score: {average}/5") |
Instead of rounding, you can simply use formatting syntax:
| average = round(average, 2) | |
| print("----------------------------") | |
| print(f"Average Score: {average}/5") | |
| print("----------------------------") | |
| print(f"Average Score: {average:.2f}/5") |
| response_df["scores"] = scores | ||
|
|
||
| now = datetime.now() | ||
| timestamp = now.strftime("%m-%d-%Y_%H-%M-%S-%f") |
Not against using timestamps but if you plan on using non-standard ones then please add a comment explaining why you picked it and what the format is intended to be.

What's a standard timestamp format? Here I'm trying to have a unique identifier for each run in the data.
| timestamp = now.strftime("%m-%d-%Y_%H-%M-%S-%f") | ||
| response_df["evaluation_run"] = f"run-{timestamp}" | ||
|
|
||
| print("Responses and scores written to:") |
The print() on line 246 outputs the full path of the file to the user. An example of the entire output right now is:
# DK-BENCH REPORT
## MODEL: finetuned
Question #1: 5/5
Question #2: 5/5
----------------------------
Average Score: 5.0/5
Total Score: 10/10
Responses and scores written to:
/home/ali/.local/share/instructlab/internal/eval_data/job-01-18-2025_00-49-57-386910/responses-finetuned-01-18-2025_00-49-57-388553.jsonl
INFO 2025-01-18 00:49:57,389 instructlab.model.evaluate:125: ᕦ(òᴗóˇ)ᕤ Model evaluate with DK-Bench completed! ᕦ(òᴗóˇ)ᕤ
Let's make sure the prints all happen in a single place. And also let's make sure we are using logger.info instead of printing. This way, if someone runs ilab on a server and redirects the output elsewhere, it will go to its intended destination rather than STDOUT.

We discussed this elsewhere - the results will continue to use print while log messages will use logger.
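A small sketch of the agreed split, with the report staying on stdout and operational messages going through the logger (function name and messages are illustrative):
import logging

logger = logging.getLogger(__name__)

def report_results(average: float, results_file: str) -> None:
    # User-facing report stays on stdout.
    print(f"Average Score: {average:.2f}/5")
    print(f"Responses and scores written to:\n{results_file}")
    # Operational messages go through the logger so they follow any redirection.
    logger.info("Model evaluate with DK-Bench completed!")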
| ) | ||
| create_excel_results_file(results_file, result) | ||
| else: | ||
| logger.debug("Output format %s is not valid", fmt) |
You're supporting QNA.yaml as a file type but it seems like you silently fail here.

This function is just about output types, and this initial PR is going to have only .jsonl as the input. I'll have another one open soon that supports qna.yaml as an input format, and I'll remove it from the list above in the meantime.

I see. In this case I recommend having a sane default if we didn't validate their output format and they got this far. But I think we discussed above that we should just validate as early as possible.
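A sketch of the validate-as-early-as-possible idea, written as a small helper that could back a click callback (the names and the fallback default are assumptions):
import click

VALID_FORMATS = {"csv", "jsonl", "xlsx"}

def parse_output_formats(value: str) -> list:
    formats = [fmt.strip() for fmt in value.split(",") if fmt.strip()]
    unknown = [fmt for fmt in formats if fmt not in VALID_FORMATS]
    if unknown:
        raise click.BadParameter(
            f"Unsupported output format(s): {', '.join(unknown)}. "
            f"Valid options are: {', '.join(sorted(VALID_FORMATS))}"
        )
    # Sane default when nothing was supplied.
    return formats or ["jsonl"]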
| server = None | ||
| try: | ||
| evaluator = RagasEvaluator() | ||
| if model is not None: |
In your type-hinting above, model is hinted as only ever being set to str, so this check will always be True. Is there something I'm missing here?

This check would go away per the changes I'm proposing in https://github.com/instructlab/instructlab/pull/2940/files#r1921098046
| ) | ||
| model_name = "no-model-provided" | ||
|
|
||
| finally: |
Why have we wrapped these statements in a broad try statement where we have no except statement and are only using finally? It seems like we should either handle the exceptions you expect to be raised, or we should just get rid of the try/finally blocks.

This try -> finally pattern is used for every other benchmark in evaluate.py. Take a look at this for an example: https://github.com/instructlab/instructlab/blob/main/src/instructlab/model/evaluate.py#L98
It seems like it has to do with shutting down local servers that are started up. @danmcp is that right?
That being said, this code is confusing to read, and I'm going to try to split this logic out into an evaluate_local_model function so that it's in the same place as the call to launch_server and looks like the rest of the code for the other benchmarks.

@alimaredia I completely understand why you'd wrap it this way. But my point here is that we need to catch the exception here if we're wrapping it. Otherwise we are silently failing and ignoring the failure, which is not a good pattern.

"Otherwise we are silently failing and ignoring the failure, which is not a good pattern."
There are multiple ways to manage the exceptions here. But as written, this wouldn't ignore any failures. They would simply bubble up to the caller.

@danmcp Yes, that's a good point, thank you for pointing that out.
What will actually happen here, then, is that everything inside of the try will execute; if an exception is raised at any point we automatically jump to the finally statement, and then the exception continues to bubble up.
Given that the finally statement here exists solely to shut down a server if we had created one in the first branch of the if statement, would it make sense to narrow the scope of this try/finally statement + server initialization to be under the if model is not None branch? It seems like the server bits only really apply to the first branch, but the exceptions will propagate up no matter what.

For additional context, although this is syntactically fine, the problem this introduces is that if someone unfamiliar with this codebase inspects this try/finally statement, they might attribute meaning to the entire statement being wrapped where there is none.
In this sense, limiting error-handling statements to their effective scope promotes interpretability in the codebase.

"Would it make sense to narrow the scope of this try/finally statement + server initialization to be under the if model is not None branch?"
+1 from me
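A self-contained sketch of the narrowed scope being agreed on here: the try/finally only wraps the branch that actually launches a server, exceptions still bubble up, and the helpers are stand-ins since the real launch_server signature isn't shown in this thread:
from types import SimpleNamespace
from typing import Optional

def start_student_server(model: str) -> SimpleNamespace:
    # Stand-in for the real server launcher.
    return SimpleNamespace(shutdown=lambda: print(f"shutting down server for {model}"))

def run_dk_bench(model: Optional[str]) -> None:
    if model is not None:
        server = None
        try:
            server = start_student_server(model)
            print("generating responses from the student model")
        finally:
            # Cleanup is the only reason for the finally; exceptions propagate.
            if server is not None:
                server.shutdown()
    else:
        print("evaluating pre-generated responses from the input file")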
RobotSail left a comment

Great work on this PR so far, we're definitely getting there. I've provided feedback on the stuff that still needs to be worked on. Let me know once you've addressed this and I'll take another look. Thanks!
Force-pushed from 2b41058 to 8628942
e2e workflow succeeded on this PR: View run, congrats!
| # Directory where DK-Bench evaluation results are stored. | ||
| output_dir: ~/.local/share/instructlab/internal/eval_data/dk_bench | ||
| # Comma-separated list of file formats for results of the DK-Bench evaluation. | ||
| output_file_formats: jsonl |
Couldn't we just set this to all of them in the profiles, rather than do the flag override in the E2E script?
This pull request has merge conflicts that must be resolved before it can be merged.
Force-pushed from 258bf1f to d55d4c8
| system_prompt: You are an advanced AI assistant designed to provide precise and | ||
| accurate information. Your primary goal is to answer queries with the most up-to-date | ||
| and factual information available. Focus on delivering clear, concise, and correct | ||
| responses. If you're uncertain about any aspect of the query, state your level | ||
| of confidence and provide the most accurate information you can. Your responses | ||
| should prioritize accuracy over all other considerations. | ||
| # Temperature for model getting responses during DK-Bench. | ||
| temperature: 0.0 |
These aren't under the dk_bench section, so is this setting system_prompt and temperature for benchmark models other than dk-bench?

Same question applies to the other test-data files too.

@alinaryan Great catch! Both temperature and system_prompt are settings which are common to all benchmarks, since they control how the student model generates information. That said, @alimaredia I would avoid setting system_prompt explicitly in these files and would instead allow ilab to figure it out internally. This should be used as more of an override setting when users are trying out different ideas.

"Same question applies to the other test-data files too."
We'll be bringing in more Ragas-based metrics that will also be reading in from the test-data files.
src/instructlab/configuration.py
Outdated
| base_branch: Optional[str] = Field(default=None, description="Base taxonomy branch") | ||
| dk_bench: _dkbench = Field( | ||
| default_factory=_dkbench, | ||
| description="Settings to run DK-Bench against a file of user created questions, reference answsers, and responses. If responses are not provided they are generated from a model", |
non blocking nit: s/answsers/answers/g
| MISTRAL_GGUF_MODEL_NAME = "mistral-7b-instruct-v0.2.Q4_K_M.gguf" | ||
| MODEL_REPO = "instructlab/granite-7b-lab" | ||
| JUDGE_MODEL_MT = "prometheus-eval/prometheus-8x7b-v2.0" | ||
| JUDGE_MODEL_DK = "gpt-4o" |
Why didn't we use Prometheus as the DK judge model? I recall the MT judge model was also GPT-4o previously, and users had concerns back then. Won't those concerns arise again now with the DK judge model being GPT-4o?

@alinaryan Prometheus was fine-tuned specifically for MT-Bench, based on research that the team had done to create a local model for that particular benchmark. In order to get accurate results on the domain knowledge use-case, we need to rely on SotA models to get scores which correlate to human preference. More work must be done to figure out what we can use as a local judge model that's not GPT-4.
Force-pushed from d55d4c8 to 715fc45
@alimaredia please make issues for the following follow-up items to be worked on soon after merging:
@alinaryan I think this answers many of your review concerns; we have been discussing these offline and on the PR.
@nathan-weinberg I've addressed all of the comments you made in your last round. I'm going to remove your request for changes.
E2E (NVIDIA L40S x4) workflow launched on this PR: View run
| response_df = result.dataset.to_pandas() | ||
| response_df["scores"] = scores | ||
|
|
||
| with pd.ExcelWriter(excel_file) as writer: |
We need to explicitly select openpyxl here in order to prevent the file from being corrupted.
| with pd.ExcelWriter(excel_file) as writer: | |
| with pd.ExcelWriter(excel_file, engine="openpyxl") as writer: |
This commit adds Domain Knowledge bench (DK-Bench) as an evaluation option as part of `ilab model evaluate`. This commit adds the CLI flags to enable running the benchmark. Signed-off-by: Ali Maredia <amaredia@redhat.com>
This commit does the following: - Passes in ctx.obj.config.serve and uses existing tls config vars passed into evaluate_data instead of relying on the click context to fill out the values. - Moves per-benchmark judge and output dir selection to cli/model/evaluate.py so that click context is separated from evaluate_data(). Signed-off-by: Ali Maredia <amaredia@redhat.com>
This commit does the following: 1. The output dir for mt_bench used to be ~/.local/share/instructlab/internal/eval_data; it is now ~/.local/share/instructlab/internal/eval_data/mt_bench. This commit explicitly changes that in all of the profiles, though this default should apply on its own. 2. Overrides the judge model in the config section for mt-bench-branch to match the judge model used in mt-bench for smaller hardware profiles. 3. Updates the evaluation section of the config for the custom profiles for CI. Though many of the values added are not changed from defaults, they are added as a reference in case further changes want to be made in the future. Signed-off-by: Ali Maredia <amaredia@redhat.com>
This PR does the following: - Adds running of dk_bench into e2e-ci.sh - Adds test questions for dk-bench - Adds the OPENAI_API_KEY to the x-large job. The addition of DK-Bench requires OPENAI_API_KEY to be set before e2e-ci.sh is run. If this key is not set, the program will exit with a warning. If a user of the script does not have an OPENAI_API_KEY and wishes to run the script without running DK-Bench, they may set OPENAI_API_KEY='NO_API_KEY'. Signed-off-by: Ali Maredia <amaredia@redhat.com>
Force-pushed from 715fc45 to 0513a87
RobotSail left a comment

Thank you so much for all of your hard work to push this PR forward @alimaredia. And thanks to everyone who has spent their valuable time reviewing this on such short notice. LGTM.
e2e workflow succeeded on this PR: View run, congrats!
This commit adds Domain Knowledge bench (DK-Bench) as an evaluation option as part of ilab model evaluate. This commit adds the CLI flags to enable running the benchmark.