Add tests for layernorm and add op_handler functions #3890


Open

wants to merge 2 commits into base: FIX-pytorch-additivity-failed
Conversation

JulioHC00

Overview

Partially closes #3438 by adding support for LayerNorm and testing it

Description of the changes proposed in this pull request:

Adds LayerNorm to the op_handler dictionary with nonlinear_1d, and registers Identity as a passthrough. I've kept the structure of the tests as similar as possible to the existing ones for consistency.

Checklist

  • [x] All pre-commit checks pass.
  • [x] Unit tests added (if fixing a bug or adding a new feature)

Adds nonlinear_1d as the op_handler for LayerNorm and a passthrough for Identity
Tests a model with LayerNorm both as the first layer and as a later layer, plus the case where the background input matches the test input exactly in one of the features
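
For reference, a minimal sketch of that kind of additivity test (the model architecture, sizes, and use of check_additivity here are illustrative assumptions, not the PR's actual test code):

```python
import torch
import torch.nn as nn

import shap


def test_layernorm_additivity_sketch():
    torch.manual_seed(0)
    # LayerNorm as a hidden (non-first) layer; a second variant would put it first.
    model = nn.Sequential(
        nn.Linear(4, 8),
        nn.LayerNorm(8),
        nn.ReLU(),
        nn.Linear(8, 1),
    )
    background = torch.randn(20, 4)
    x = torch.randn(5, 4)
    # Edge case from the commit message: one feature of the test input
    # exactly equal to the background value (delta_in == 0 for that feature).
    x[0, 0] = background[0, 0]

    explainer = shap.DeepExplainer(model, background)
    # check_additivity raises if the attributions do not sum to f(x) - E[f(x)]
    explainer.shap_values(x, check_additivity=True)
```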
JulioHC00 changed the title from "Fix pytorch additivity failed" to "Add tests for layernorm and add op_handler functions" on Oct 15, 2024
```diff
@@ -400,6 +401,7 @@ def nonlinear_1d(module, grad_input, grad_output):
 op_handler["BatchNorm2d"] = linear_1d
 op_handler["BatchNorm3d"] = linear_1d

+op_handler["LayerNorm"] = nonlinear_1d
```
Collaborator

Hey, thanks for the update. Unfortunately this is wrong. Sorry, I only had this thought afterwards: the problem here is that multiple inputs (feeding one output) vary. So, for instance, let's look at the tensorflow implementation for multiplicative attribution and derive it from the definition of Shapley values:

$\phi_{i} = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N|-|S|-1)!}{|N|!} \left( f(S \cup \{i\}) - f(S) \right)$

Now let's assume we have just $x, y$ (and therefore $b_{x}, b_{y}$ as baselines).
We are looking for a multiplication (or any function that takes two arguments):
$f(x, y) = x \cdot y$
Then this formula becomes:
$\phi_{x} = \frac{1}{2} (f(x, y) - f(b_{x}, y)) + \frac{1}{2} (f(x, b_{y}) - f(b_{x}, b_{y}))$
We would need to do something similar for LayerNorm, BUT this is computationally expensive, so we might be able to use the approach that the tensorflow implementation takes for softmax (yes, the pytorch impl is wrong there too!). I don't fully understand this yet, but it doesn't seem too complicated.
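
To make the two-input case concrete, here is a quick numeric check of the multiplication formula above (illustrative numbers only, not library code):

```python
def f(x, y):
    return x * y

x, y = 3.0, 5.0      # inputs
bx, by = 1.0, 2.0    # baselines

# Two-player Shapley values for f, averaging over both orderings
phi_x = 0.5 * (f(x, y) - f(bx, y)) + 0.5 * (f(x, by) - f(bx, by))
phi_y = 0.5 * (f(x, y) - f(x, by)) + 0.5 * (f(bx, y) - f(bx, by))

# Efficiency/additivity: the attributions sum to f(x, y) - f(bx, by)
assert abs((phi_x + phi_y) - (f(x, y) - f(bx, by))) < 1e-12
print(phi_x, phi_y)  # 7.0 6.0
```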

Sorry, I just figured this out too late. Really appreciate your effort. Let me know how you want to proceed with this.

Author

Ah! It was too easy to be true. So, if I understand this correctly, in the original DeepLIFT paper they have the rescale rule, which only applies to a single input, and that's what's implemented in the nonlinear_1d function? And that's why this isn't valid for LayerNorm, which takes several inputs, so we need to use the full equation to derive the proper function. Is this right? If so, I think I understand and can try to work out how to translate the way it's done in tensorflow to pytorch.

P.S. Does this mean that the softmax implementation for pytorch also needs to be fixed?
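
For context, a simplified sketch of the rescale-rule idea behind nonlinear_1d (an illustration under stated assumptions, not the library's actual code): the local gradient of a single-input elementwise op is replaced by the finite-difference ratio delta_out / delta_in, which is exactly what stops being valid once one output depends on several inputs, as in LayerNorm.

```python
import torch


def rescale_multiplier(f, x, ref, eps=1e-6):
    # DeepLIFT-style rescale rule for a single-input elementwise op:
    # use (f(x) - f(ref)) / (x - ref) in place of the local gradient f'(x),
    # falling back to the gradient where the input delta is ~0.
    delta_in = x - ref
    delta_out = f(x) - f(ref)
    x_req = x.clone().requires_grad_(True)
    (grad,) = torch.autograd.grad(f(x_req).sum(), x_req)
    return torch.where(delta_in.abs() < eps, grad, delta_out / delta_in)


x = torch.tensor([2.0, -1.0, 0.5])
ref = torch.tensor([0.0, 0.0, 0.5])   # last feature matches the reference exactly
m = rescale_multiplier(torch.relu, x, ref)
print(m * (x - ref))                  # equals relu(x) - relu(ref) elementwise
```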

Collaborator

@JulioHC00 yes, that's right. And also we would need to fix the implementation for pytorch's softmax attribution. I believe that once we understand how the tensorflow softmax is implemented, we can apply this to pytorch LayerNorm and softmax as well.

Author

@CloseChoice I guess the only part I'm confused about is how this fits with the op_handlers. Don't these handle the gradients during backward propagation? How does that relate to implementing the Shapley values equation?
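
To illustrate the connection, here is a conceptual sketch (not shap's actual hook machinery): forward hooks record each module's input/output for the sample and the reference, and a backward hook swaps the ordinary gradient for the rescale multiplier, so backpropagation carries multipliers and multiplier * (x - ref) at the input becomes the attribution.

```python
import torch
import torch.nn as nn

act = nn.ReLU()

def save_io(module, inputs, output):
    # forward hook: remember input/output for [sample; reference] stacked on the batch dim
    module.saved_in, module.saved_out = inputs[0], output

def rescale_backward(module, grad_input, grad_output, eps=1e-6):
    # backward hook: replace the incoming gradient with the rescale multiplier
    n = module.saved_in.shape[0] // 2
    delta_in = module.saved_in[:n] - module.saved_in[n:]
    delta_out = module.saved_out[:n] - module.saved_out[n:]
    mult = torch.where(delta_in.abs() < eps,
                       grad_input[0][:n],
                       grad_output[0][:n] * delta_out / delta_in)
    return (torch.cat([mult, grad_input[0][n:]]),)

act.register_forward_hook(save_io)
act.register_full_backward_hook(rescale_backward)

x = torch.tensor([[2.0, -1.0]], requires_grad=True)
ref = torch.tensor([[0.0, 0.0]])
act(torch.cat([x, ref])).sum().backward()
print(x.grad * (x - ref))   # per-feature attributions, [[2., 0.]] here
```

This works because ReLU is elementwise (one input per output); for LayerNorm each output mixes all of its inputs, which is where the multi-input Shapley treatment discussed above comes in.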

Contributor

rjbruin commented on Mar 5, 2025

@CloseChoice When I integrate LayerNorm into a testcase and use op_handler['LayerNorm'] = softmax, the testcase passes. Do we need to create a testcase that shows how Softmax/LayerNorm are broken? What would be in such a testcase?
