Add batched inference #771

Open
@abetlen

Description

  • Use llama_decode instead of the deprecated llama_eval in the Llama class
  • Implement batched inference support for the generate and create_completion methods of the Llama class (see the sketch after this list)
  • Add support for streaming / infinite completion
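
For reference, a rough sketch of what batched decoding could look like against the low-level bindings. This is an assumption-laden illustration, not the planned implementation: it presumes the llama_cpp.llama_cpp ctypes module mirrors the llama.cpp C API (llama_batch_init, llama_decode, llama_batch_free, and the llama_batch fields), exact wrapper signatures vary across versions, and the model path and token ids are placeholders.

```python
# Minimal sketch: decode two prompts in a single llama_batch via the
# low-level ctypes bindings. Placeholder model path and token ids.
import llama_cpp

MODEL_PATH = b"./models/model.gguf"  # placeholder path

llama_cpp.llama_backend_init(False)  # numa=False; newer versions take no argument
model = llama_cpp.llama_load_model_from_file(
    MODEL_PATH, llama_cpp.llama_model_default_params()
)
ctx = llama_cpp.llama_new_context_with_model(
    model, llama_cpp.llama_context_default_params()
)

# Two tokenized prompts to decode in parallel, one KV-cache sequence each.
seqs = [[1, 15043], [1, 3492, 526]]  # placeholder token ids

n_tokens = sum(len(s) for s in seqs)
batch = llama_cpp.llama_batch_init(n_tokens, 0, len(seqs))

i = 0
for sid, tokens in enumerate(seqs):
    for pos, tok in enumerate(tokens):
        batch.token[i] = tok
        batch.pos[i] = pos          # position within its own sequence
        batch.n_seq_id[i] = 1
        batch.seq_id[i][0] = sid    # route the token to sequence `sid`
        # Request logits only for the last token of each prompt.
        batch.logits[i] = int(pos == len(tokens) - 1)
        i += 1
batch.n_tokens = n_tokens

# A single llama_decode call evaluates both prompts; per-sequence logits
# can then be read back with llama_get_logits_ith for sampling.
if llama_cpp.llama_decode(ctx, batch) != 0:
    raise RuntimeError("llama_decode failed")

llama_cpp.llama_batch_free(batch)
llama_cpp.llama_free(ctx)
llama_cpp.llama_free_model(model)
llama_cpp.llama_backend_free()
```

This is also the shape that streaming would build on: keep the batch alive, append one sampled token per sequence per step, and call llama_decode again each iteration.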
