Add parameter to skip saving to cache when caching is enabled #594

shaunabanana · Aug 10, 2023

For some use cases, especially apps that use LLMs only for specific tasks, the prompt may be a known list of few-shot examples in which only the last one changes with user input. In such cases, caching the completion on every call does little for inference speed, and might actually slow inference down when the cache sits on slow memory or disk. Ideally, the app would cache those predefined prompts at startup and then skip caching in subsequent calls.

This PR adds a parameter (save_cache) to create_completion() and other related functions, as well as to the completions API in the server. It defaults to True so that existing behavior is unchanged, and it only takes effect when caching is enabled.
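To illustrate the intended control flow, here is a minimal sketch that does not use the library itself. ToyCompleter, its generate callable, the dict-based cache, and the cache_writes counter are all hypothetical stand-ins introduced for this example; only the names create_completion() and save_cache come from this PR, and the real cache is more involved than a plain dict.

```python
from typing import Callable, Dict, Optional


class ToyCompleter:
    """Hypothetical stand-in for a model wrapper with an optional cache."""

    def __init__(self, generate: Callable[[str], str],
                 cache: Optional[Dict[str, str]] = None) -> None:
        self.generate = generate   # assumed: any prompt -> text function
        self.cache = cache         # None means caching is disabled
        self.cache_writes = 0      # counts writes, for demonstration only

    def create_completion(self, prompt: str, save_cache: bool = True) -> str:
        # Reads are unaffected by save_cache: a cached prompt is still served.
        if self.cache is not None and prompt in self.cache:
            return self.cache[prompt]
        text = self.generate(prompt)
        # save_cache only matters when a cache is configured.
        if self.cache is not None and save_cache:
            self.cache[prompt] = text
            self.cache_writes += 1
        return text


few_shot = "Q: 2+2?\nA: 4\n"
llm = ToyCompleter(generate=lambda p: f"<completion for {len(p)} chars>", cache={})

# Warm the cache once at startup with the fixed few-shot prompt.
llm.create_completion(few_shot)

# Per-request call: the user-specific tail changes every time, so skip the write.
answer = llm.create_completion(few_shot + "Q: 3+3?\nA:", save_cache=False)

# With no cache configured, save_cache has no effect.
uncached = ToyCompleter(generate=lambda p: "ok")
uncached.create_completion(few_shot, save_cache=True)
```

After these calls the toy cache holds only the few-shot prompt, which is the pattern described above: one write at startup, none per request.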

shaunabanana added 4 commits August 10, 2023 16:24

Add parameter to skip saving to cache

4b084f7

Merge branch 'abetlen:main' into main

4cf0861

Merge branch 'abetlen:main' into main

1783320

Merge branch 'abetlen:main' into main

3158306

abetlen force-pushed the main branch 2 times, most recently from 8c93cf8 to cc0fe43 · November 14, 2023 20:24
