STACKQUADRANT

xlite-dev/Awesome-LLM-Inference

Inference Engines

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

Overall Score: 6.5
GitHub Metrics
Stars: 5.3k
Forks: 381
Open Issues:
Watchers: 132
Contributors: 39
Weekly Commits: 0
Language: Python
License: GPL-3.0
Last Commit: Apr 20, 2026
Created: Aug 27, 2023
Latest Release: v2.6.20
Release Date: Jun 17, 2025
Synced: Jun 3, 2026
Quality Scores
Documentation Quality (w: 20%)
5.5

No dedicated docs site. Description: 127 chars. Stars signal: 5,270. Contributors: 39. Score: 5.5/10

Community Health (w: 20%)
7.3

Stars: 5,270. Contributors: 39. Watchers: 132. Forks: 381. Issue ratio: 0.0%. Score: 7.3/10

Maintenance Velocity (w: 15%)
4.9

Last commit: 44d ago. Weekly commits: 0. Latest release: v2.6.20. Maturity bonus: 2.8y old. Score: 4.9/10
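The "44d ago" and "2.8y old" figures appear to be measured from the sync date rather than from today. The following is a minimal sketch that checks this reading against the dates listed under GitHub Metrics; the variable names and the 365.25-day year are assumptions for illustration, not StackQuadrant's documented method.

```python
from datetime import date

# Dates copied from the GitHub Metrics block.
synced = date(2026, 6, 3)
last_commit = date(2026, 4, 20)
created = date(2023, 8, 27)

# Assumption: both figures are measured relative to the sync date.
days_since_commit = (synced - last_commit).days
age_years = (synced - created).days / 365.25  # assumed year length

print(f"Last commit: {days_since_commit}d ago")
print(f"Age: {age_years:.1f}y")
```

Under these assumptions the computed values agree with the 44 days and 2.8 years reported above.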

API Design & DX (w: 20%)
7.0

Stars/issues ratio: 5270. Dynamic language: Python. No dedicated API docs. License: GPL-3.0. Popularity signal: 5,270 stars. Score: 7/10

Production Readiness (w: 15%)
7.0

Battle-tested: 5,270 stars. Peer review: 39 contributors. Versioned: v2.6.20. Licensed: GPL-3.0. Age: 2.8 years. Maintenance: last commit 44d ago. Score: 7/10

Ecosystem Integration (w: 10%)
7.2

Fork interest: 381. Major ecosystem: Python. License: GPL-3.0. Adoption: 5,270 stars. Score: 7.2/10
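The six weights listed above sum to 100%, and the headline 6.5 at the top of the page is consistent with a weighted mean of the six category scores. The sketch below illustrates that reading; the page does not state its aggregation formula, so the weighted-mean rule and the names `SCORES` and `weighted_overall` are assumptions.

```python
# (score, weight) pairs copied from the Quality Scores section.
SCORES = {
    "Documentation Quality": (5.5, 0.20),
    "Community Health": (7.3, 0.20),
    "Maintenance Velocity": (4.9, 0.15),
    "API Design & DX": (7.0, 0.20),
    "Production Readiness": (7.0, 0.15),
    "Ecosystem Integration": (7.2, 0.10),
}


def weighted_overall(scores):
    """Weighted mean of category scores (assumed aggregation rule)."""
    total_weight = sum(weight for _, weight in scores.values())
    weighted_sum = sum(score * weight for score, weight in scores.values())
    return weighted_sum / total_weight


overall = weighted_overall(SCORES)
print(round(overall, 1))
```

The weighted sum works out to about 6.465, which rounds to the 6.5 shown at the top of the page.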

Tags
awesome-llm, deepseek, deepseek-r1, deepseek-v3, flash-attention, flash-attention-3, flash-mla, llm-inference, minimax-01, mla
Radar chart axes: Documentation Quality, Community Health, Maintenance Velocity, API Design & DX, Production Readiness, Ecosystem Integration