Popular repositories

  1. bug-in-the-code-stack Public

    Forked from techandy42/bug_in_the_code_stack

    A new benchmark for measuring LLMs' ability to detect bugs in large codebases.

    Jupyter Notebook 32 4

  2. hamming-examples Public

    Examples of how to use Hamming for evals and observability

    TypeScript 6 1

  3. evals-ts Public

    TypeScript 1 2

  4. evals-py Public

    Python 1

  5. M3Exam Public

    Forked from DAMO-NLP-SG/M3Exam

    Data and code for paper "M3Exam: A Multilingual, Multimodal, Multilevel Benchmark for Examining Large Language Models"

    Python 1

  6. combined-nextjs-test Public

    TypeScript 1

Repositories

Showing 10 of 14 repositories

People

This organization has no public members. You must be a member to see who’s a part of this organization.

