Description
PR #7990 implements a caching pipeline which caches all steps but the last. I haven't found much discussion on this topic specifically, so I can only speculate on why the last step is not cached. With this issue, I would like to make the case for caching all pipeline steps.
Arguments pro:
- I would guess the last step is not cached because it is usually not a transformer. However, the last step may well be a transformer, for example a pipeline of preprocessing steps inside of a parent pipeline that ends in a non-transforming estimator.
- Even if the last step is not a transformer, there could be cases where the user would want to cache it.
- If all steps are cached, it is easy to recreate the behaviour currently implemented in PR #7990: put all the steps you want cached in a caching pipeline, and insert that pipeline into a non-caching pipeline that holds the steps you don't want cached. Conversely, it is difficult to cache all steps in a pipeline with the current behaviour.
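
To make the nesting argument concrete, here is a minimal sketch. It assumes the `memory` parameter of `sklearn.pipeline.Pipeline` as it exists in released scikit-learn; the PR under discussion may have exposed caching differently at the time. The step names, estimators, and toy data are illustrative choices of mine, not taken from the PR.

```python
from shutil import rmtree
from tempfile import mkdtemp

from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Temporary directory used as the joblib cache location.
cache_dir = mkdtemp()

# Inner pipeline: the steps we would like cached. Its last step (PCA)
# is itself a transformer, which is the case described in the first bullet.
preprocessing = Pipeline(
    [("scale", StandardScaler()), ("pca", PCA(n_components=2))],
    memory=cache_dir,
)

# Outer pipeline: no memory argument, so nothing is cached at this level.
model = Pipeline([("prep", preprocessing), ("clf", LogisticRegression())])

# Toy data, only there to make the sketch runnable.
X, y = make_classification(n_samples=100, n_features=5, random_state=0)
model.fit(X, y)
preds = model.predict(X)

# Remove the cache directory once it is no longer needed.
rmtree(cache_dir)
```

With the current behaviour, only `scale` is memoized here: `pca` is the last step of the inner pipeline and is therefore refit on every call. If all steps of a caching pipeline were cached, the same layout would cache both `scale` and `pca` while leaving `clf` uncached, which reproduces the semantics of #7990.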