Inference Engines
High-performance model inference and serving runtimes
algorithmicsuperintelligence/optillm
6.6
★ 4.1k ◇ 355 Python
predibase/lorax
6.9
★ 3.8k ◇ 316 Python
neuralmagic/deepsparse
5.9
★ 3.2k ◇ 192 Python
spiceai/spiceai
7.0
★ 2.9k ◇ 197 Rust
b4rtaz/distributed-llama
6.2
★ 2.9k ◇ 232 C++
FasterDecoding/Medusa
5.4
★ 2.7k ◇ 201 Jupyter Notebook
ovg-project/kvcached
5.8
★ 1.1k ◇ 118 Python
nobodywho-ooo/nobodywho
6.2
★ 944 ◇ 66 Rust
zhihu/ZhiLight
5.3
★ 905 ◇ 102 C++
jjang-ai/mlxstudio
5.3
★ 763 ◇ 49
andrewkchan/yalm
3.7
★ 584 ◇ 62 C++
zjhellofss/KuiperLLama
4.0
★ 548 ◇ 142 C++
interestingLSY/swiftLLM
3.8
★ 329 ◇ 38 Python