Ayush7614
Contributor

Updated CrewAI Evaluation Guide

This PR updates the CrewAI Guide.

  • Copy-edited and polished the CrewAI guide for clarity, consistency, and improved readability.
  • Added more test cases to the evaluation example (now 8 total), including real-world scenarios and intentional failure cases for bias detection and candidate count validation.
  • Enhanced Promptfoo Web Viewer documentation with a detailed overview of test cases, structured outputs, assertions, latency, and filtering tools.
  • Improved rendering with inline GIF demonstrations (crewai-eval.gif, promptfoo-view.gif) that show the evaluation flow and web viewer in action.
  • Revised the guide throughout (images, code, and screenshots) and double-checked that the example works as documented.

@Ayush7614 changed the title from "fix:(docs) updated crewai guide" to "fix: docs updated crewai guide" on Oct 7, 2025
Contributor

coderabbitai bot commented Oct 7, 2025

📝 Walkthrough

  • Updated documentation link from joaomdmoura/crewai to crewAIInc/crewAI and cleaned up formatting.
  • Tightened OPENAI_API_KEY messaging; clarified .env usage.
  • run_recruitment_agent now normalizes outputs to text, extracts JSON from Markdown, and returns structured errors (empty output, no JSON block, parse failures).
  • Public API tweaks: run_recruitment_agent signature default now model="openai:gpt-4.1"; call_api declaration removed/inlined.
  • promptfooconfig.yaml revamped: provider references file://./agent.py, assertions loosened, many new test cases added (including intentional failures).
  • Documentation expanded with new images, step-by-step result interpretation, and updated evaluation flow.
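
The output handling described for run_recruitment_agent (normalize to text, pull JSON out of a Markdown fence, return a structured error otherwise) can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper name `extract_json_from_output`, the `{"output": ...}` / `{"error": ...}` return shape, and the fence-only matching are all assumptions.

```python
import json
import re

# Build the fence marker programmatically so this snippet can itself
# live inside a fenced code block.
FENCE = "`" * 3

# Assumption: the agent wraps its JSON object in a Markdown code fence,
# optionally tagged "json".
_FENCED_JSON = re.compile(
    FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE,
    re.DOTALL,
)


def extract_json_from_output(raw):
    """Hypothetical helper: return {"output": ...} or {"error": ...}."""
    # Normalize whatever the agent returned to plain text.
    text = "" if raw is None else str(raw).strip()
    if not text:
        return {"error": "empty output"}

    match = _FENCED_JSON.search(text)
    if match is None:
        return {"error": "no JSON block found", "raw": text}

    try:
        return {"output": json.loads(match.group(1))}
    except json.JSONDecodeError as exc:
        return {"error": f"JSON parse failure: {exc}", "raw": text}
```

Returning a dictionary with an explicit error key, rather than raising, keeps each failure mode visible as data, which is one way an eval run could assert on intentional failure cases.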

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Mixed changes across docs, a key function’s behavior, configuration, and tests. Logic edits are localized but non-trivial (output normalization, JSON parsing paths, structured errors). Broad test updates and API surface adjustments add heterogeneity without deep algorithmic complexity.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The title succinctly conveys the primary change (updating the CrewAI guide documentation) and is directly related to the contents of the pull request. |
| Description Check | ✅ Passed | The description clearly outlines the scope of the guide updates, including copy edits, new test cases, enhanced viewer documentation, and rendering improvements, all of which correspond to the changes in the pull request. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 49484fe and e6a4f3e.

📒 Files selected for processing (1)
  • site/docs/guides/evaluate-crewai.md (8 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
site/docs/**/*.md

📄 CodeRabbit inference engine (.cursor/rules/docusaurus.mdc)

site/docs/**/*.md: Prioritize minimal edits when updating existing documentation; avoid creating entirely new sections or rewriting substantial portions; focus edits on improving grammar, spelling, clarity, fixing typos, and structural improvements where needed; do not modify existing headings (h1, h2, h3, etc.) as they are often linked externally.
Structure content to reveal information progressively: begin with essential actions and information, then provide deeper context as necessary; organize information from most important to least important.
Use action-oriented language: clearly outline actionable steps users should take, use concise and direct language, prefer active voice over passive voice, and use imperative mood for instructions.
Use 'eval' instead of 'evaluation' in all documentation; when referring to command line usage, use 'npx promptfoo eval' rather than 'npx promptfoo evaluation'; maintain consistency with this terminology across all examples, code blocks, and explanations.
The project name can be written as either 'Promptfoo' (capitalized) or 'promptfoo' (lowercase) depending on context: use 'Promptfoo' at the beginning of sentences or in headings, and 'promptfoo' in code examples, terminal commands, or when referring to the package name; be consistent with the chosen capitalization within each document or section.
Each markdown documentation file must include required front matter fields: 'title' (the page title shown in search results and browser tabs) and 'description' (a concise summary of the page content, ideally 150-160 characters).
Only add a title attribute to code blocks that represent complete, runnable files; do not add titles to code fragments, partial examples, or snippets that aren't meant to be used as standalone files; this applies to all code blocks regardless of language.
Use special comment directives to highlight specific lines in code blocks: 'highlight-next-line' highlights the line immediately after the comment, 'highligh...

Files:

  • site/docs/guides/evaluate-crewai.md
{site/**,examples/**}

📄 CodeRabbit inference engine (.cursor/rules/gh-cli-workflow.mdc)

Any pull request that only touches files in 'site/' or 'examples/' directories must use the 'docs:' prefix in the PR title, not 'feat:' or 'fix:'

Files:

  • site/docs/guides/evaluate-crewai.md
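
The title-prefix rule above can be expressed as a small check. This is only a sketch of the rule as worded here; the function names are hypothetical, and how the repository actually enforces the rule (if at all) is not shown in this review.

```python
# Directories that count as documentation-only, per the rule as quoted.
DOCS_ONLY_DIRS = ("site/", "examples/")


def is_docs_only(changed_files):
    """True when every changed path lives under site/ or examples/."""
    return bool(changed_files) and all(
        path.startswith(DOCS_ONLY_DIRS) for path in changed_files
    )


def title_follows_rule(title, changed_files):
    """Docs-only PRs must use the 'docs:' prefix; other PRs are not constrained here."""
    if is_docs_only(changed_files):
        return title.startswith("docs:")
    return True
```

For example, calling `title_follows_rule("docs: update crewai guide", ["site/docs/guides/evaluate-crewai.md"])` returns `True`, while the same file list with a `fix:` title returns `False`.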
site/**

📄 CodeRabbit inference engine (.cursor/rules/gh-cli-workflow.mdc)

If the change is a feature, update the relevant documentation under 'site/'

For documentation-only builds/tests, you may set SKIP_OG_GENERATION=true to skip OG image generation

Files:

  • site/docs/guides/evaluate-crewai.md
site/docs/**/*.{md,mdx}

📄 CodeRabbit inference engine (site/docs/CLAUDE.md)

site/docs/**/*.{md,mdx}: Use the term "eval" not "evaluation" in documentation and examples
Capitalization: use "Promptfoo" (capitalized) in prose/headings and "promptfoo" (lowercase) in code, commands, and package names
Every doc must include required front matter: title and description
Only add title= to code blocks when showing complete runnable files
Admonitions must have empty lines around their content (Prettier requirement)
Do not modify headings; they may be externally linked
Use progressive disclosure: put essential information first
Use action-oriented, imperative mood in instructions (e.g., "Install the package")

Files:

  • site/docs/guides/evaluate-crewai.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (17)
  • GitHub Check: Build Docs
  • GitHub Check: Generate Assets
  • GitHub Check: Test on Node 24.x and windows-latest
  • GitHub Check: Run Integration Tests
  • GitHub Check: Style Check
  • GitHub Check: Test on Node 24.x and ubuntu-latest
  • GitHub Check: webui tests
  • GitHub Check: Test on Node 22.x and ubuntu-latest
  • GitHub Check: Test on Node 20.x and macOS-latest
  • GitHub Check: Test on Node 20.x and windows-latest
  • GitHub Check: Test on Node 22.x and macOS-latest
  • GitHub Check: Build on Node 22.x
  • GitHub Check: Test on Node 22.x and windows-latest
  • GitHub Check: Redteam (Production API)
  • GitHub Check: Test on Node 20.x and ubuntu-latest
  • GitHub Check: Build on Node 20.x
  • GitHub Check: Build on Node 24.x
