
Llama CPP

Compatibility

Only available on Node.js.

This module is based on the node-llama-cpp Node.js bindings for llama.cpp, allowing you to work with a locally running LLM. It lets you use a much smaller quantized model capable of running on a laptop, ideal for testing and sketching out ideas without running up a bill!

Setup

You'll need to install major version 3 of the node-llama-cpp module to communicate with your local model.

npm install -S node-llama-cpp@3

npm install @langchain/community @langchain/core

You will also need a local Llama 3 model (or a model supported by node-llama-cpp). You will need to pass the path to this model to the LlamaCpp module as part of the parameters (see example).

Out of the box, node-llama-cpp is tuned for running on macOS with support for the Metal GPU of Apple M-series processors. If you need to turn this off, or need support for the CUDA architecture, refer to the node-llama-cpp documentation.
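
As a rough sketch of what that configuration can look like (assuming node-llama-cpp v3's getLlama entry point; check the node-llama-cpp docs for the authoritative option names), you might select or disable the GPU backend when loading the library directly:

import { getLlama } from "node-llama-cpp";

// Assumption: node-llama-cpp v3 accepts a `gpu` option on getLlama().
// `false` disables GPU acceleration entirely; "cuda" would request the CUDA backend.
const llama = await getLlama({ gpu: false });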

For advice on obtaining and preparing a Llama 3 model, see the documentation for the LLM version of this module.

A note to LangChain.js contributors: if you want to run the tests associated with this module, you will need to put the path to your local model in the LLAMA_PATH environment variable.
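
A minimal sketch of how a test might read that variable (the actual test harness may handle it differently):

// Read the local model path from the environment, failing fast if it is unset.
const llamaPath = process.env.LLAMA_PATH;
if (!llamaPath) {
  throw new Error("Please set the LLAMA_PATH environment variable.");
}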

Usage

Basic use

We need to provide a path to our local Llama 3 model. Note that the embeddings property is always set to true in this module.

import { LlamaCppEmbeddings } from "@langchain/community/embeddings/llama_cpp";

const llamaPath = "/Replace/with/path/to/your/model/gguf-llama3-Q4_0.bin";

// Initialize the embeddings model with the path to the local GGUF file.
const embeddings = await LlamaCppEmbeddings.initialize({
  modelPath: llamaPath,
});

// embedQuery returns a Promise, so it must be awaited.
const res = await embeddings.embedQuery("Hello Llama!");

console.log(res);

/*
[ 15043, 365, 29880, 3304, 29991 ]
*/

Document embedding

import { LlamaCppEmbeddings } from "@langchain/community/embeddings/llama_cpp";

const llamaPath = "/Replace/with/path/to/your/model/gguf-llama3-Q4_0.bin";

const documents = ["Hello World!", "Bye Bye!"];

// Initialize the embeddings model with the path to the local GGUF file.
const embeddings = await LlamaCppEmbeddings.initialize({
  modelPath: llamaPath,
});

// embedDocuments returns one embedding per input document.
const res = await embeddings.embedDocuments(documents);

console.log(res);

/*
[ [ 15043, 2787, 29991 ], [ 2648, 29872, 2648, 29872, 29991 ] ]
*/
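
Once initialized, the embeddings instance can be passed to any LangChain component that accepts an Embeddings object. A minimal sketch using LangChain's in-memory MemoryVectorStore (the store and texts here are illustrative, not part of this module):

import { LlamaCppEmbeddings } from "@langchain/community/embeddings/llama_cpp";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

const llamaPath = "/Replace/with/path/to/your/model/gguf-llama3-Q4_0.bin";

const embeddings = await LlamaCppEmbeddings.initialize({
  modelPath: llamaPath,
});

// Build an in-memory vector store, embedding each text with the local model.
const vectorStore = await MemoryVectorStore.fromTexts(
  ["Hello World!", "Bye Bye!"],
  [{ id: 1 }, { id: 2 }],
  embeddings
);

// Retrieve the single most similar document to the query.
const results = await vectorStore.similaritySearch("Hello", 1);
console.log(results);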
