STACKQUADRANT

gpustack/gpustack

Inference Engines

Performance-optimized AI inference on your GPUs. Unlock superior throughput by selecting and tuning engines like vLLM or SGLang.

GitHub Metrics
Stars
4.6k
Forks
461
Open Issues
464
Watchers
36
Contributors
39
Weekly Commits
0
Language
Python
License
Apache-2.0
Last Commit
Mar 2, 2026
Created
May 11, 2024
Latest Release
v2.0.3
Release Date
Jan 9, 2026
Synced: Mar 3, 2026
Quality Scores
Documentation Qualityw: 20%
0.0
Community Healthw: 20%
0.0
Maintenance Velocityw: 15%
0.0
API Design & DXw: 20%
0.0
Production Readinessw: 15%
0.0
Ecosystem Integrationw: 10%
0.0
Tags
ascendcudadeepseekdistributed-inferencegenaihigh-performance-inferenceinferencellamallmllm-inference
Radar
No scores yet