prompt-testing

Here are 22 public repositories matching this topic...

promptfoo / promptfoo

Test your prompts, agents, and RAGs. AI Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.

testing ci evaluation ci-cd pentesting cicd vulnerability-scanners prompts evaluation-framework red-teaming rag llm prompt-engineering llmops prompt-testing llm-eval llm-evaluation llm-evaluation-framework

Updated Dec 6, 2025
TypeScript

msoedov / agentic_security

Star

Agentic LLM Vulnerability Scanner / AI red teaming kit 🧪

agent-framework ai-red-team prompt-testing llm-security llm-vulnerabilities llm-evaluation llm-fuzzing llm-evaluation-framework llm-guardrails llm-scanner llm-jailbreaks llm-fuzzer llm-fuzzer-aggregator agent-security

Updated Nov 30, 2025
Python

babelcloud / LLM-RGB

Star

LLM Reasoning and Generation Benchmark. Evaluate LLMs in complex scenarios systematically.

benchmark prompt llm prompt-engineering prompt-testing

Updated May 25, 2025
TypeScript

jhd3197 / Prompture

Sponsor

Star

Prompture is an API-first library for requesting structured JSON output from LLMs (or any structure), validating it against a schema, and running comparative tests between models.

openai toon json-validation structured-output pydantic llm prompt-engineering ai-testing prompt-testing

Updated Nov 22, 2025
Python

aralyekta / prompttester

Star

Test, compare, and optimize your AI prompts in minutes

prompt-testing llm-tools llm-test llm-evaluation prompt-test llm-testing

Updated Aug 13, 2025
JavaScript

prompt-foundry / typescript-sdk

Star

The prompt engineering, prompt management, and prompt evaluation tool for TypeScript, JavaScript, and NodeJS.

typescript gpt open-ai gpt-3 gpt-4 llm prompt-engineering llmops prompt-testing prompt-manager prompt-management llm-eval llm-test llm-ops llm-evaluation prompt-evaluation

Updated Nov 15, 2025
TypeScript

bluewave-labs / evalwise

Sponsor

Star

EvalWise is a developer-friendly platform for LLM evaluation and red teaming that helps test AI models for safety, compliance, and performance issues

rag llm prompt-engineering llmops prompt-testing evals llm-evaluation rag-evaluation llm-evaluation-toolkit

Updated Nov 20, 2025
Python

calibrtr / llm-prompt-test

Star

LLM Prompt Test helps you test Large Language Models (LLMs) prompts to ensure they consistently meet your expectations.

testing tdd test prompt test-automation testing-tools prompts large-language-models llm prompt-engineering prompt-testing

Updated May 22, 2024
TypeScript

syamsasi99 / prompt-evaluator

Star

prompt-evaluator is an open-source toolkit for evaluating, testing, and comparing LLM prompts. It provides a GUI-driven workflow for running prompt tests, tracking token usage, visualizing results, and ensuring reliability across models like OpenAI, Claude, and Gemini.

electron react typescript datascience developer-tools ai-evaluation llm prompt-engineering prompt-testing promptfoo ai-evaluation-tools ai-evaluation-metrics ai-evaluation-framework

Updated Dec 4, 2025
TypeScript

yukinagae / genkitx-promptfoo

Star

Community Plugin for Genkit to use Promptfoo

plugin testing firebase ai evaluation prompt prompts evaluation-framework llm llmops prompt-testing llm-eval llm-evaluation llm-evaluation-framework promptfoo genkit genkitx genkit-plugin

Updated Jan 3, 2025
TypeScript

amansoomro062 / atelier

Star

An open-source AI prompt engineering playground with live code execution. Test OpenAI & Claude prompts, execute JavaScript, and iterate in real-time.

playground ai nextjs openai developer-tools claude llm prompt-engineering prompt-testing anthropic prompt-optimization system-prompts

Updated Nov 8, 2025
TypeScript

yukinagae / promptfoo-sample

Star

Sample project demonstrates how to use Promptfoo, a test framework for evaluating the output of generative AI models

testing evaluation prompts evaluation-framework llm llmops prompt-testing llm-eval llm-evaluation llm-evaluation-framework promptfoo

Updated Sep 10, 2024

radoslaw-sz / maia

Star

A pytest-based framework for testing multi AI agents systems. It provides a flexible and extensible platform for complex multi-agent simulations. Supports many integrations like LiteLLM, CrewAI, LangChain etc.

python framework ai test agents maia llm prompt-engineering ai-testing prompt-testing agentic ai-testing-tool

Updated Sep 24, 2025
TypeScript

jairerazodev / prompt-testing

Star

prompt-testing

Updated Jan 18, 2023

abdullahkhalid00 / prompt-db

Star

A collection of prompts that I use on a day-to-day basis for work and leisure.

markdown jinja2 text prompts prompt-engineering chatgpt prompt-testing prompt-template

Updated Sep 9, 2024

ashleysally00 / promptfoo-quickstart-guide

Star

Quickstart guide for using PromptFoo to evaluate LLM prompts via CLI or Colab.

openai colab model-evaluation cli-tool llm prompt-engineering prompt-testing promptfoo

Updated Nov 23, 2025

alinaleo27 / ai-rag-eval-qa

Star

AI RAG evaluation project using Ragas. Includes RAG metrics (precision, recall, faithfulness), retrieval diagnostics, and prompt testing examples for fintech/banking LLM systems. Designed as an AI QA Specialist portfolio project.

ai-qa prompt-testing llm-evaluation rag-evaluation ragas llm-testing

Updated Nov 17, 2025
Python

srdarkseer / PromptForge

Star

Visual prompt engineering platform for creating, testing, and versioning LLM prompts across multiple providers (OpenAI, Anthropic, Mistral, Gemini).

ai-tools llm prompt-engineering prompt-testing prompt-optimization

Updated Nov 5, 2025
TypeScript

yukinagae / genkit-promptfoo-sample

Star

Sample implementation demonstrating how to use Firebase Genkit with Promptfoo

testing evaluation prompts evaluation-framework llm llmops prompt-testing llm-eval llm-evaluation llm-evaluation-framework promptfoo genkit

Updated Sep 11, 2024
TypeScript

vihanga / prompt-sandbox

Star

Testing framework for LLM prompts. Started as a weekend project after getting tired of manually testing prompts in ChatGPT. Async experiment runner with BLEU/ROUGE/BERTScore metrics.

benchmarking rouge evaluation-metrics bleu llm bertscore prompt-engineering chatgpt prompt-testing

Updated Dec 5, 2025
Python

Improve this page

Add a description, image, and links to the prompt-testing topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the prompt-testing topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

prompt-testing

Here are 22 public repositories matching this topic...

promptfoo / promptfoo

msoedov / agentic_security

babelcloud / LLM-RGB

jhd3197 / Prompture

aralyekta / prompttester

prompt-foundry / typescript-sdk

bluewave-labs / evalwise

calibrtr / llm-prompt-test

syamsasi99 / prompt-evaluator

yukinagae / genkitx-promptfoo

amansoomro062 / atelier

yukinagae / promptfoo-sample

radoslaw-sz / maia

jairerazodev / prompt-testing

abdullahkhalid00 / prompt-db

ashleysally00 / promptfoo-quickstart-guide

alinaleo27 / ai-rag-eval-qa

srdarkseer / PromptForge

yukinagae / genkit-promptfoo-sample

vihanga / prompt-sandbox

Improve this page

Add this topic to your repo

Search code, repositories, users, issues, pull requests...

prompt-testing

Here are 22 public repositories matching this topic...

promptfoo / promptfoo

msoedov / agentic_security

babelcloud / LLM-RGB

jhd3197 / Prompture

aralyekta / prompttester

prompt-foundry / typescript-sdk

bluewave-labs / evalwise

calibrtr / llm-prompt-test

syamsasi99 / prompt-evaluator

yukinagae / genkitx-promptfoo

amansoomro062 / atelier

yukinagae / promptfoo-sample

radoslaw-sz / maia

jairerazodev / prompt-testing

abdullahkhalid00 / prompt-db

ashleysally00 / promptfoo-quickstart-guide

alinaleo27 / ai-rag-eval-qa

srdarkseer / PromptForge

yukinagae / genkit-promptfoo-sample

vihanga / prompt-sandbox

Improve this page

Add this topic to your repo