Suggestion: cache all Pipeline steps by default #9007

Open
@lsorber

Description

PR #7990 implements a caching pipeline which caches all steps but the last. I haven't found much discussion on this topic specifically, so I can only speculate on why the last step is not cached. With this issue, I would like to make the case for caching all pipeline steps.

Arguments pro:

  1. I would guess the last step is not cached because it is usually not a transformer. However, the last step may well be a transformer, for example a pipeline of preprocessing steps inside of a parent pipeline that ends in a non-transforming estimator.
  2. Even if the last step is not a transformer, there could be cases where the user would want to cache it.
  3. If all steps are cached, it is easy to recreate the behaviour currently implemented in PR #7990 ([MRG+3] ENH Caching Pipeline by memoizing transformer): put the steps you want cached in a caching pipeline, and insert that pipeline into a non-caching pipeline alongside the steps you don't want cached. Conversely, it is difficult to cache all steps in a pipeline with the current behaviour.
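
A minimal sketch of the nesting described in points 1 and 3, assuming scikit-learn's `Pipeline(memory=...)` parameter as introduced by PR #7990. The step names, estimators, and synthetic data are illustrative choices and not part of the proposal itself. With the current behaviour, the inner pipeline's own last step (`reduce` here) is not cached even though it is a transformer; with the proposed behaviour it would be.

```python
from shutil import rmtree
from tempfile import mkdtemp

from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Temporary directory used as the cache location (illustrative choice).
cache_dir = mkdtemp()

# Inner pipeline: preprocessing only, so its last step is a transformer.
# Passing `memory` makes this the "caching pipeline".
preprocessing = Pipeline(
    steps=[("scale", StandardScaler()), ("reduce", PCA(n_components=2))],
    memory=cache_dir,
)

# Outer pipeline: no `memory`, so it caches nothing itself.
# The final estimator lives here, outside the caching pipeline.
model = Pipeline(
    steps=[("preprocess", preprocessing), ("clf", LogisticRegression())]
)

# Small synthetic dataset, just to make the sketch runnable.
X, y = make_classification(n_samples=100, n_features=5, random_state=0)
model.fit(X, y)
predictions = model.predict(X)

# Remove the cache directory once it is no longer needed.
rmtree(cache_dir)
```

If every step of a caching pipeline were cached, this arrangement would give full control: steps inside `preprocessing` are cached, steps in the outer pipeline are not.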
