buildvoc/GenerativeAIExamples

Chat with Llama-3.1-Nemotron-Nano-4B-v1.1

A React-based chat interface for interacting with an LLM, featuring Retrieval-Augmented Generation (RAG) and an NVIDIA Dynamo backend serving NVIDIA Llama-3.1-Nemotron-Nano-4B-v1.1.

Project Structure

.
├── frontend/           # React frontend application
├── backend-rag/        # RAG service backend
└── backend-dynamo/     # NVIDIA Dynamo backend service
    └── llm-proxy/      # Proxy server for NVIDIA Dynamo

Prerequisites

  • Node.js 18 or higher
  • Python 3.8 or higher
  • NVIDIA GPU with CUDA support (for LLM serving with NVIDIA Dynamo)
  • Docker (optional, for containerized deployment)
  • Git
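
Before starting, it can help to confirm the tools above are installed. The following is a convenience sketch, not part of the project; the binary names (`node`, `python3`, `nvidia-smi`) are common defaults and may differ on your system:

```shell
# Check that each prerequisite is on PATH; prints "found" or "missing" per tool.
for cmd in node python3 git docker nvidia-smi; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd: found"
  else
    echo "$cmd: missing"
  fi
done
```

`docker` and `nvidia-smi` are only needed for containerized deployment and GPU serving, respectively, so "missing" for those is fine on a local client machine.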

Configuration

Frontend

The frontend configuration is managed through YAML files in frontend/public/config/:

  • app_config.yaml: Main application configuration:
    • API endpoints
    • UI settings
    • File upload settings

See frontend/README.md for details.
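
As a rough illustration only, such a config file might look like the sketch below. The key names, ports, and values here are hypothetical, not the actual schema; frontend/README.md documents the real one:

```yaml
# Hypothetical sketch -- see frontend/README.md for the actual app_config.yaml schema.
api:
  rag_endpoint: "http://localhost:5000"   # illustrative endpoint, not the real key name
  llm_endpoint: "http://localhost:8000"
ui:
  title: "Chat with Llama-3.1-Nemotron-Nano-4B-v1.1"
upload:
  max_file_size_mb: 10
```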

Backend

Each service has its own configuration files.

Setup

Llama-3.1-Nemotron-Nano-4B-v1.1 running on a GPU Server

This step should be performed on a machine with a GPU.

Set up the NVIDIA Dynamo backend running Llama-3.1-Nemotron-Nano-4B-v1.1 by following the instructions in backend-dynamo/README.md.

Local client with a local RAG database

These steps can be performed locally and don't require a GPU.

  1. Clone the repository:

    git clone <this-repository-url>
    cd react-llama-client
  2. Install frontend dependencies:

    cd frontend
    npm install
  3. Set up backend services:

    For Unix/macOS:

    # RAG Backend
    cd backend-rag
    python -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
    deactivate
    
    # LLM Proxy
    cd ../backend-dynamo/llm-proxy
    python -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt

    For Windows:

    # RAG Backend
    cd backend-rag
    python -m venv venv
    .\venv\Scripts\activate
    pip install -r requirements.txt
    deactivate
    
    # LLM Proxy
    cd ..\backend-dynamo\llm-proxy
    python -m venv venv
    .\venv\Scripts\activate
    pip install -r requirements.txt
  4. Start the services (each in a new terminal):

    For Unix/macOS:

    # Start frontend (in frontend directory)
    cd frontend
    npm start
    
    # Start RAG backend (in backend-rag directory)
    cd backend-rag
    source venv/bin/activate
    python src/app.py
    
    # Start LLM proxy (in backend-dynamo/llm-proxy directory)
    cd backend-dynamo/llm-proxy
    source venv/bin/activate
    python proxy.py

    For Windows:

    # Start frontend (in frontend directory)
    cd frontend
    npm start
    
    # Start RAG backend (in backend-rag directory)
    cd backend-rag
    .\venv\Scripts\activate
    python src\app.py
    
    # Start LLM proxy (in backend-dynamo\llm-proxy directory)
    cd backend-dynamo\llm-proxy
    .\venv\Scripts\activate
    python proxy.py
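
Once all three services are running, you can also exercise the LLM proxy directly from a script. The sketch below is an assumption, not part of the repository: it presumes the proxy forwards an OpenAI-compatible `/v1/chat/completions` endpoint (a common setup for NVIDIA Dynamo deployments), and the port and model name are illustrative:

```python
import json
import urllib.request

# Hypothetical proxy URL -- adjust the port to match your llm-proxy configuration.
PROXY_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(prompt: str,
                       model: str = "nvidia/Llama-3.1-Nemotron-Nano-4B-v1.1") -> dict:
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def ask(prompt: str) -> str:
    """POST a prompt to the proxy and return the assistant's reply text."""
    payload = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        PROXY_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

If the endpoint path or response shape differs in your deployment, check the proxy's source in backend-dynamo/llm-proxy for the actual routes.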

About

Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
