Refactor process_inputs outside of create_aot_dispatcher_function (#130962)
This PR refactors process_inputs so that it runs earlier, outside of create_aot_dispatcher_function, so that the cache key can be computed from the inputs after they have been processed.
This way, if tensors have symint sizes/strides, we factor those symbols into the cache key instead of specializing on every possible concrete size and stride. A test that exercises this is incoming.
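To illustrate the idea (this is a minimal sketch, not the actual AOTAutogradCache implementation; `ProcessedTensorMeta` and `cache_key` are hypothetical names), hashing the *processed* input metadata lets dynamic dimensions appear as symbols rather than concrete values:

```python
import hashlib
from dataclasses import dataclass
from typing import List, Tuple, Union

# Hypothetical stand-in for a processed input's metadata: after input
# processing, dynamic sizes/strides may be symbols (e.g. "s0") rather than ints.
@dataclass(frozen=True)
class ProcessedTensorMeta:
    dtype: str
    sizes: Tuple[Union[int, str], ...]
    strides: Tuple[Union[int, str], ...]

def cache_key(metas: List[ProcessedTensorMeta]) -> str:
    # Hash the symbolic shape description, so two calls whose tensors differ
    # only in a dynamic dimension map to the same key instead of specializing
    # per concrete size/stride.
    payload = repr([(m.dtype, m.sizes, m.strides) for m in metas])
    return hashlib.sha256(payload.encode()).hexdigest()

# A dynamic dim shows up as a symbol ("s0"), so these two inputs whose first
# dimension differs at runtime still produce the same cache key.
a = ProcessedTensorMeta("float32", ("s0", 8), (8, 1))
b = ProcessedTensorMeta("float32", ("s0", 8), (8, 1))
assert cache_key([a]) == cache_key([b])
```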
# Guard behavior
Note that it is technically possible for tensors with symint sizes/strides to introduce guards during aot_dispatch if they trace through decompositions that branch on tensor size/stride. This can result in multiple graph modules with differing guards sharing the same key in the cache.
FXGraphCache has the same issue. The remote FXGraphCache intentionally does not handle it: it saves only the first result in the cache and misses the cache when the guards fail. The local FXGraphCache does handle it by storing multiple files and iterating through them, but we opt not to introduce that complexity for AOTAutogradCache until we deem it necessary (i.e., until models appear where saving multiple cache results with different guards under the same cache key becomes important). Instead, AOTAutogradCache saves a single entry per key, overwriting it on a cache miss caused by failed guards, as sketched below.
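As a rough, hypothetical sketch of that single-entry policy (names such as `SingleEntryCache`, `CacheEntry`, and `guards_hold` are illustrative and not the real AOTAutogradCache APIs):

```python
from typing import Callable, Dict, Optional

# Hypothetical entry: the compiled artifact plus a guard predicate recorded
# while tracing (e.g. branches on size/stride taken inside decompositions).
class CacheEntry:
    def __init__(self, compiled_fn: Callable, guards_hold: Callable[..., bool]):
        self.compiled_fn = compiled_fn
        self.guards_hold = guards_hold

class SingleEntryCache:
    """One entry per key: on a guard miss, the caller recompiles and overwrites."""

    def __init__(self) -> None:
        self._entries: Dict[str, CacheEntry] = {}

    def lookup(self, key: str, *args) -> Optional[Callable]:
        entry = self._entries.get(key)
        if entry is not None and entry.guards_hold(*args):
            return entry.compiled_fn  # hit: guards still hold for these inputs
        return None  # miss: no entry, or the stored entry's guards failed

    def store(self, key: str, entry: CacheEntry) -> None:
        # Overwrite whatever was there; we deliberately keep only the latest
        # result rather than a list of entries with differing guards.
        self._entries[key] = entry
```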
Pull Request resolved: #130962
Approved by: https://github.com/bdhirsh