Xinference is a powerful and versatile library designed to serve LLMs,
speech recognition models, and multimodal models, even on your laptop.
With Xorbits Inference, you can effortlessly deploy and serve your own
models or state-of-the-art built-in models using just a single command.
Installation and Setup
Xinference can be installed via pip from PyPI:
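A typical installation pulls in all optional backends via the `all` extras group (the extras name follows the Xinference documentation; adjust if your version differs):

```shell
pip install "xinference[all]"
```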
LLM
Xinference supports various models compatible with GGML, including chatglm, baichuan, whisper, vicuna, and orca. To view the built-in models, run the command:
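The listing command below is a sketch based on the Xinference CLI; the exact subcommand and flags may differ between versions, so check `xinference --help` on your install:

```shell
xinference list --all
```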
Wrapper for Xinference
You can start a local instance of Xinference by running:
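A minimal sketch of starting the server locally (the default endpoint `http://127.0.0.1:9997` is an assumption based on Xinference's documented defaults):

```shell
# Start a local Xinference instance; the server listens on port 9997 by default.
xinference
```

Once the server is up, you can launch a built-in model from another terminal to obtain a model UID for use with the wrapper.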
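With a model launched (for example, `xinference launch -n orca -s 3 -q q4_0`, which prints a model UID), the LangChain community wrapper can be pointed at the running server. This is a sketch: the `server_url` and the placeholder UID are assumptions you must replace with your own values, and it requires a live Xinference server.

```python
# Sketch of using the LangChain community wrapper against a running
# Xinference server. Assumes the server is at the default local endpoint
# and that a model has already been launched via `xinference launch`.
from langchain_community.llms import Xinference

llm = Xinference(
    server_url="http://127.0.0.1:9997",  # assumption: default local endpoint
    model_uid="<model_uid>",             # replace with the UID printed by `xinference launch`
)

# Generate a completion from the launched model.
print(llm.invoke("Q: What is the capital of France? A:"))
```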
Usage
For more information and detailed examples, refer to the example for Xinference LLMs.
Embeddings
Xinference also supports embedding queries and documents. See the example for Xinference embeddings for a more detailed demo.
Xinference LangChain partner package install
Install the integration package with:
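Assuming the partner package is published on PyPI under this name:

```shell
pip install langchain-xinference
```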
Chat Models
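A sketch of the chat-model import, assuming the `langchain-xinference` partner package exposes `ChatXinference` under this module path (verify against the package's own documentation):

```python
# Assumption: import path follows the langchain-xinference partner package layout.
from langchain_xinference.chat_models import ChatXinference
```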
LLM
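A sketch of the LLM import, assuming the partner package mirrors the community wrapper's class name (verify the exact path in the package docs):

```python
# Assumption: import path follows the langchain-xinference partner package layout.
from langchain_xinference.llms import Xinference
```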