Experimental examples are sample code and deployments for RAG pipelines that are not tested by NVIDIA personnel.
-
NVIDIA RAG Streaming Document Ingestion Pipeline
This example demonstrates the construction of a performance-oriented pipeline that accepts a stream of heterogeneous documents, divides the documents into smaller segments or chunks, computes the embedding vector for each chunk, and uploads the text chunks along with their associated embeddings to a vector database. The pipeline builds on the Morpheus SDK to take advantage of end-to-end asynchronous processing. It showcases pipeline parallelism (including CPU and GPU-accelerated nodes) as well as a mechanism to horizontally scale out data ingestion workers.
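The chunk, embed, and upload stages described above can be sketched with plain Python `asyncio` queues. This is an illustrative sketch only and does not use the Morpheus SDK: the names `chunk_text`, `toy_embedding`, `Record`, and `ingest` are hypothetical, the "embedding" is a deterministic hash-based placeholder rather than a real model, and the "vector database" is an in-memory list.

```python
import asyncio
import hashlib
from dataclasses import dataclass

_DONE = object()  # sentinel that tells the next stage the stream has ended


@dataclass
class Record:
    doc_id: str
    text: str
    embedding: list


def chunk_text(text, size=40, overlap=10):
    """Split text into fixed-size character windows that overlap."""
    if not text:
        return []
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def toy_embedding(chunk, dim=8):
    """Placeholder embedding (assumption): scaled bytes of a SHA-256 digest."""
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


async def chunker(docs, out_q):
    # Stage 1: split each incoming document into chunks.
    for doc_id, text in docs:
        for chunk in chunk_text(text):
            await out_q.put((doc_id, chunk))
    await out_q.put(_DONE)


async def embedder(in_q, out_q):
    # Stage 2: attach an embedding vector to every chunk.
    while (item := await in_q.get()) is not _DONE:
        doc_id, chunk = item
        await out_q.put(Record(doc_id, chunk, toy_embedding(chunk)))
    await out_q.put(_DONE)


async def uploader(in_q, store):
    # Stage 3: "upload" records; here the store is just a Python list.
    while (item := await in_q.get()) is not _DONE:
        store.append(item)


async def ingest(docs):
    """Run the three stages concurrently, connected by bounded queues."""
    chunks_q = asyncio.Queue(maxsize=16)
    records_q = asyncio.Queue(maxsize=16)
    store = []
    await asyncio.gather(
        chunker(docs, chunks_q),
        embedder(chunks_q, records_q),
        uploader(records_q, store),
    )
    return store


if __name__ == "__main__":
    records = asyncio.run(ingest([("doc-a", "x" * 100), ("doc-b", "short")]))
    print(len(records), "records stored")
```

The bounded queues let each stage work on a different item at the same time, which is the idea behind pipeline parallelism. In this sketch the stages share a single thread and only interleave; a real deployment would run the embedding stage on a GPU and scale out by running several ingestion workers.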
-
NVIDIA Multimodal RAG Assistant
This example can ingest PDFs, PowerPoint slides, Word files, and other documents with complex data formats, including text, images, slides, and tables. Users ask questions through a text interface, optionally accompanied by an image query, and the assistant responds with text and can reference images, slides, and tables, along with source links and downloads.
-
Run RAG-LLM in Azure Machine Learning
This example shows the configuration changes, relative to a setup that uses Docker containers and local GPUs, that are required to run the RAG-LLM pipelines in Azure Machine Learning.