Inference Engines
High-performance model inference and serving runtimes
lemonade-sdk/lemonade
7.0
★ 3.5k ◇ 261 C++
algorithmicsuperintelligence/optillm
6.5
★ 3.4k ◇ 268 Python
neuralmagic/deepsparse
6.1
★ 3.2k ◇ 190 Python
b4rtaz/distributed-llama
6.3
★ 2.9k ◇ 225 C++
spiceai/spiceai
6.9
★ 2.9k ◇ 185 Rust
FasterDecoding/Medusa
5.4
★ 2.7k ◇ 197 Jupyter Notebook
zhihu/ZhiLight
5.5
★ 904 ◇ 102 C++
ovg-project/kvcached
5.6
★ 852 ◇ 98 Python
nobodywho-ooo/nobodywho
6.2
★ 790 ◇ 55 Rust
andrewkchan/yalm
3.8
★ 570 ◇ 59 C++
zjhellofss/KuiperLLama
4.1
★ 527 ◇ 137 C++
jjang-ai/mlxstudio
4.8
★ 477 ◇ 32
interestingLSY/swiftLLM
3.9
★ 323 ◇ 37 Python