deepeval
confident-ai/deepeval
7.8
Evaluation & Testing
★ 14.8k · ◇ 1.4k · Python · Apache-2.0 · today
Ragas
explodinggradients/ragas
7.7
Evaluation & Testing
★ 13.4k · ◇ 1.4k · Python · Apache-2.0 · 1mo ago
garak
NVIDIA/garak
7.3
Evaluation & Testing
★ 7.5k · ◇ 877 · HTML · Apache-2.0 · 1d ago
chinese-llm-benchmark
jeinlee1991/chinese-llm-benchmark
6.2
Evaluation & Testing
★ 5.9k · ◇ 236 · 6d ago
LLM-Engineers-Handbook
PacktPublishing/LLM-Engineers-Handbook
6.7
Evaluation & Testing
★ 4.9k · ◇ 1.2k · Python · MIT · 1mo ago
agenta
Agenta-AI/agenta
7.7
Evaluation & Testing
★ 4.0k · ◇ 508 · TypeScript · NOASSERTION · today
lmms-eval
EvolvingLMMs-Lab/lmms-eval
7.5
Evaluation & Testing
★ 4.0k · ◇ 560 · Python · NOASSERTION · 2d ago
AI-Infra-Guard
Tencent/AI-Infra-Guard
7.3
Evaluation & Testing
★ 3.5k · ◇ 345 · Python · Apache-2.0 · today
trulens
truera/trulens
7.3
Evaluation & Testing
★ 3.2k · ◇ 262 · Python · MIT · today
lmnr
lmnr-ai/lmnr
6.9
Evaluation & Testing
★ 2.8k · ◇ 191 · TypeScript · Apache-2.0 · today
aisheets
huggingface/aisheets
6.2
Evaluation & Testing
★ 1.6k · ◇ 136 · TypeScript · Apache-2.0 · 5d ago
FuzzyAI
cyberark/FuzzyAI
5.6
Evaluation & Testing
★ 1.3k · ◇ 188 · Jupyter Notebook · Apache-2.0 · 2mo ago
prompty
microsoft/prompty
6.8
Evaluation & Testing
★ 1.2k · ◇ 114 · TypeScript · MIT · today
uqlm
cvs-health/uqlm
6.6
Evaluation & Testing
★ 1.1k · ◇ 119 · Python · Apache-2.0 · today
judgeval
JudgmentLabs/judgeval
6.7
Evaluation & Testing
★ 1.0k · ◇ 90 · Python · Apache-2.0 · 1d ago
scenario
langwatch/scenario
5.9
Evaluation & Testing
★ 834 · ◇ 58 · TypeScript · MIT · today
Awesome-LLM-Eval
onejune2018/Awesome-LLM-Eval
5.0
Evaluation & Testing
★ 631 · ◇ 55 · MIT · 4mo ago
Awesome-LLM-in-Social-Science
ValueByte-AI/Awesome-LLM-in-Social-Science
5.0
Evaluation & Testing
★ 609 · ◇ 46 · MIT · 1mo ago
langtest
PacificAI/langtest
6.1
Evaluation & Testing
★ 555 · ◇ 49 · Python · Apache-2.0 · 19d ago
langtest
Pacific-AI-Corp/langtest
6.1
Evaluation & Testing
★ 555 · ◇ 49 · Python · Apache-2.0 · 19d ago
continuous-eval
relari-ai/continuous-eval
4.7
Evaluation & Testing
★ 516 · ◇ 38 · Python · Apache-2.0 · 1y ago
llm-leaderboard
JonathanChavezTamales/llm-leaderboard
4.8
Evaluation & Testing
★ 361 · ◇ 40 · JavaScript · NOASSERTION · 5mo ago
aimock
CopilotKit/aimock
5.7
Evaluation & Testing
★ 343 · ◇ 21 · TypeScript · MIT · today
palico-ai
palico-ai/palico-ai
4.5
Evaluation & Testing
★ 342 · ◇ 28 · TypeScript · MIT · 1y ago
rhesis
rhesis-ai/rhesis
5.4
Evaluation & Testing
★ 311 · ◇ 23 · Python · NOASSERTION · today
llms-tools
PetroIvaniuk/llms-tools
4.7
Evaluation & Testing
★ 306 · ◇ 40 · Apache-2.0 · 1mo ago
athina-evals
athina-ai/athina-evals
4.1
Evaluation & Testing
★ 299 · ◇ 21 · Python · 10mo ago
flutter-skill
ai-dashboad/flutter-skill
5.1
Evaluation & Testing
★ 190 · ◇ 23 · Dart · MIT · today
qaskills
PramodDutta/qaskills
4.0
Evaluation & Testing
★ 102 · ◇ 4 · TypeScript · today