Does anyone know of any benchmark datasets that I could use to
evaluate LlamaParse against other, simpler existing solutions?

The example code includes one comparison against processing the PDF without LlamaParse,
but I want to do more than a few one-off comparisons.

Thanks greatly in advance

  • john

Replies: 1 comment

This is a very tricky topic and something I've been grappling with too. Consider using an independent measure of your overall RAG pipeline (like the BEIR LlamaIndex implementation) and assess whether performance differs between the two parsing strategies. For direct benchmarks, have a flick through arXiv. I found this after a couple of searches, and it might be worth your while: https://arxiv.org/pdf/2412.07626
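To make the "same pipeline, two parsers" idea concrete, here is a minimal sketch of how such a comparison could be scored. Everything in it is an assumption for illustration: the chunk lists are hand-written stand-ins for parser output (not real LlamaParse results), the word-overlap ranking is a toy substitute for a real retriever, and the function names are hypothetical. It is not the BEIR evaluation, just hit rate and mean reciprocal rank computed over a small question set.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap(query, chunk):
    """Toy relevance score: number of distinct query tokens found in the chunk."""
    return len(set(tokenize(query)) & set(tokenize(chunk)))


def rank(query, chunks):
    """Return chunk indices ordered from most to least relevant."""
    return sorted(range(len(chunks)), key=lambda i: overlap(query, chunks[i]), reverse=True)


def evaluate(chunks, qa_pairs, k=1):
    """Compute hit rate at k and mean reciprocal rank for one parser's chunks.

    A chunk counts as correct if it contains the expected answer string.
    """
    hits = 0
    reciprocal_rank_sum = 0.0
    for query, answer in qa_pairs:
        for position, idx in enumerate(rank(query, chunks), start=1):
            if answer.lower() in chunks[idx].lower():
                if position <= k:
                    hits += 1
                reciprocal_rank_sum += 1.0 / position
                break
    n = len(qa_pairs)
    return {"hit_rate": hits / n, "mrr": reciprocal_rank_sum / n}


# Hypothetical output of a parser that keeps each table row together.
chunks_structured = [
    "Revenue in 2023 was 4.2 million dollars.",
    "Headcount in 2023 was 120 employees.",
]

# Hypothetical output of a parser that splits headers from values.
chunks_naive = [
    "Revenue Headcount 2023 2023",
    "4.2 million dollars 120 employees",
]

# Hypothetical question set with the answer string expected in the right chunk.
qa_pairs = [
    ("What was the revenue in 2023?", "4.2 million"),
    ("What was the headcount in 2023?", "120 employees"),
]

results = {
    "structured": evaluate(chunks_structured, qa_pairs),
    "naive": evaluate(chunks_naive, qa_pairs),
}

if __name__ == "__main__":
    for name, metrics in results.items():
        print(name, metrics)
```

In practice you would swap the hand-written chunks for the real output of each parsing strategy on the same documents, and the overlap ranking for whatever retriever your pipeline actually uses, keeping the question set fixed so that the parser is the only thing that changes.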
