STACKQUADRANT

zhihu/ZhiLight

Inference Engines

A highly optimized LLM inference acceleration engine for Llama and its variants.

GitHub Metrics
Stars
906
Forks
102
Open Issues
5
Watchers
52
Contributors
9
Weekly Commits
0
Language
C++
License
Apache-2.0
Last Commit
Mar 2, 2026
Created
Dec 6, 2024
Latest Release
v0.4.8
Release Date
Dec 10, 2024
Synced: Mar 3, 2026
Quality Scores
Documentation Quality (weight: 20%)
0.0
Community Health (weight: 20%)
0.0
Maintenance Velocity (weight: 15%)
0.0
API Design & DX (weight: 20%)
0.0
Production Readiness (weight: 15%)
0.0
Ecosystem Integration (weight: 10%)
0.0
Tags
cuda, deepseek-r1, gpt, inference-engine, llama, llm, llm-inference, llm-serving, model-serving, pytorch
Radar
No scores yet