STACKQUADRANT

NVIDIA-NeMo/Curator

Fine-tuning Tools

Scalable data pre processing and curation toolkit for LLMs

Overall Score: 6.2
GitHub Metrics
Stars: 1.6k
Forks: 278
Open Issues: 230
Watchers: 18
Contributors: 66
Weekly Commits: 0
Language: Python
License: Apache-2.0
Last Commit: Jun 3, 2026
Created: Mar 14, 2024
Latest Release: v1.2.0
Release Date: May 14, 2026
Synced: Jun 3, 2026
Quality Scores
Documentation Quality (weight: 20%)
5.0

No dedicated docs site. Description: 58 chars. Stars signal: 1,596. Contributors: 66. Score: 5/10

Community Health (weight: 20%)
5.8

Stars: 1,596. Contributors: 66. Watchers: 18. Forks: 278. Issue ratio: 14.4%. Score: 5.8/10
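The page does not say how the issue ratio is defined. A minimal sketch, assuming it is open issues divided by stars, reproduces both the 14.4% shown here and the stars/issues ratio of 7 reported under API Design & DX. The variable names are illustrative only.

```python
# Figures taken from the GitHub Metrics and Community Health lines.
stars = 1596
open_issues = 230

# Assumption: "issue ratio" means open issues as a share of stars.
issue_ratio_pct = round(open_issues / stars * 100, 1)

# Assumption: "stars/issues ratio" is the inverse, rounded to a whole number.
stars_per_issue = round(stars / open_issues)

print(f"Issue ratio: {issue_ratio_pct}%")
print(f"Stars/issues ratio: {stars_per_issue}")
```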

Maintenance Velocity (weight: 15%)
7.6

Last commit: 0d ago. Weekly commits: 0. Latest release: v1.2.0. Maturity bonus: 2.2y old. Score: 7.6/10
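The "0d ago" and "2.2y old" figures can be checked from the dates in the metrics block. This is a sketch under the assumption that both are measured against the sync date (Jun 3, 2026) and that a year is counted as 365.25 days; the page does not state either convention.

```python
from datetime import date

# Dates taken from the GitHub Metrics block.
created = date(2024, 3, 14)
last_commit = date(2026, 6, 3)
synced = date(2026, 6, 3)  # assumed reference point for "ago" and "old"

# Days since the last commit, relative to the sync date.
days_since_last_commit = (synced - last_commit).days

# Repository age in years, assuming 365.25 days per year.
age_days = (synced - created).days
age_years = round(age_days / 365.25, 1)

print(f"Last commit: {days_since_last_commit}d ago")
print(f"Age: {age_years}y")
```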

API Design & DX (weight: 20%)
5.6

Stars/issues ratio: 7. Dynamic language: Python. No dedicated API docs. Permissive license: Apache-2.0. Popularity signal: 1,596 stars. Score: 5.6/10

Production Readiness (weight: 15%)
7.1

Battle-tested: 1,596 stars. Peer review: 66 contributors. Versioned: v1.2.0. Licensed: Apache-2.0. Age: 2.2 years. Maintenance: last commit 0d ago. Score: 7.1/10

Ecosystem Integration (weight: 10%)
7.5

Fork interest: 278. Major ecosystem: Python. Integration-friendly: Apache-2.0. Adoption: 1,596 stars. Score: 7.5/10
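The six category scores and their weights appear to combine into the headline 6.2. The page does not publish its formula, so the following is only a sketch assuming a weight-normalized average; under that assumption the result rounds to the same value.

```python
# (score, weight) pairs copied from the Quality Scores section.
categories = {
    "Documentation Quality": (5.0, 0.20),
    "Community Health": (5.8, 0.20),
    "Maintenance Velocity": (7.6, 0.15),
    "API Design & DX": (5.6, 0.20),
    "Production Readiness": (7.1, 0.15),
    "Ecosystem Integration": (7.5, 0.10),
}

# The weights should sum to 1.0 (100%).
total_weight = sum(weight for _, weight in categories.values())

# Assumption: the overall score is the weighted mean of the category scores.
weighted_sum = sum(score * weight for score, weight in categories.values())
overall = weighted_sum / total_weight

print(f"Overall: {overall:.1f}")
```

The unrounded weighted mean is about 6.235, so the two largest contributions come from Community Health (5.8 × 20%) and Maintenance Velocity (7.6 × 15%).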

Tags
data, data-curation, data-prep, data-preparation, data-processing, data-processing-pipelines, data-quality, datacuration, datarecipes, deduplication
Radar chart axes: Documentation Quality, Community Health, Maintenance Velocity, API Design & DX, Production Readiness, Ecosystem Integration