Hi, I was wondering if anybody knows how to pass arguments to Captum correctly. You can forward arguments using the attribution_args dict, but Captum raises an error saying there is already a value for baselines. Say, for the purpose of testing, I try to set the baseline to 0 using Occlusion like so:

import inseq

attribution_model = inseq.load_model("gpt2", "occlusion")

out = attribution_model.attribute(
    "You can tell by the color of the fins if you can eat this",
    "You can tell by the color of the fins if you can eat this fish.",
    attributed_fn="contrast_prob_diff",
    contrast_targets="You can tell by the color of the fins if you can eat this steak.",
    step_scores=["contrast_prob_diff"],
    attribution_args={"baselines": 0},
)
out.show()

I get the following error:

TypeError: captum.attr._core.occlusion.Occlusion.attribute() got multiple values for keyword argument 'baselines'
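As far as I can tell, inseq already supplies a baseline when it calls Captum's attribute(), so the value forwarded through the dict collides with it. A minimal stand-in (a hypothetical signature, not Captum's real code) reproduces the same TypeError:

```python
# Stand-in for Captum's Occlusion.attribute(); the signature here is
# hypothetical, only the keyword collision matters.
def attribute(inputs, baselines=None, **kwargs):
    pass

extra_args = {"baselines": 0}

try:
    # inseq presumably passes its own baseline positionally, and the
    # forwarded dict then supplies `baselines` a second time.
    attribute("some input", "unk-baseline", **extra_args)
except TypeError as err:
    print(err)  # got multiple values for argument 'baselines'
```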


Hi @JanAthmer, thank you for your question!

Currently, there is no support for customizing the baseline passed to the underlying Captum method; the UNK token is used by default.

We have an open issue proposing the introduction of customizable baselines for methods such as occlusion and IG: #123. However, this is not currently among our priorities for the next release. If you want to test out the 0-vector baseline quickly, I suggest cloning the repo, installing it locally, and changing the following line to use 0s instead of unk_token_id:

baseline_ids_non_eos = batch["input_ids"].ne(self.eos_token_id).long() * self.tokenizer.unk_token_id
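As an illustration of what that edit does, here is a plain-Python stand-in for the tensor expression above (the token ids are toy values, not real GPT-2 ids):

```python
# Toy stand-in for: batch["input_ids"].ne(self.eos_token_id).long() * id
eos_token_id = 2
unk_token_id = 3
input_ids = [11, 42, 7, 2]  # toy batch row ending in EOS

# Current behaviour: every non-EOS position gets the UNK id,
# EOS positions get 0.
baseline_unk = [(1 if tok != eos_token_id else 0) * unk_token_id
                for tok in input_ids]

# Suggested edit: multiply by 0 instead, giving an all-zero baseline.
baseline_zero = [(1 if tok != eos_token_id else 0) * 0
                 for tok in input_ids]

print(baseline_unk)   # [3, 3, 3, 0]
print(baseline_zero)  # [0, 0, 0, 0]
```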

We are looking for contributors to the development of the toolkit, so if you'd be interested in tackling Issue #123, let me know!

Answer selected by gsarti