STACKQUADRANT

Inference Engines

High-performance model inference and serving runtimes

37 repos

llama.cpp

8.0

LLM inference in C/C++, with quantized GGUF models and broad CPU/GPU backend support.

103.7k stars · 16.9k forks · C++

nomic-ai/gpt4all

7.2

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.

77.3k stars · 8.3k forks · C++

vLLM

8.6

A high-throughput and memory-efficient inference and serving engine for LLMs.

76.6k stars · 15.6k forks · Python

ray-project/ray

8.6

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

42.1k stars · 7.4k forks · Python

gitleaks/gitleaks

8.2

Find secrets with Gitleaks 🔑

26.0k stars · 2.0k forks · Go

liguodongiot/llm-action

6.7

This project shares the technical principles behind large language models along with hands-on experience (LLM engineering and putting LLM applications into production).

24.0k stars · 2.8k forks · HTML

Lightning-AI/litgpt

7.9

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

13.3k stars · 1.4k forks · Python

bentoml/OpenLLM

7.4

Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.

12.3k stars · 805 forks · Python
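Several servers in this list (OpenLLM, vLLM, shimmy) expose an "OpenAI-compatible" API, meaning clients talk to them with the same request shape as OpenAI's Chat Completions endpoint. A minimal sketch of that request body, using only the standard library — the model name and port are hypothetical examples, and actually sending the request would require a running server:

```python
import json

# Request body an OpenAI-compatible server accepts at POST /v1/chat/completions.
# "llama-3.1-8b-instruct" is a hypothetical model name, not a guaranteed deployment.
payload = {
    "model": "llama-3.1-8b-instruct",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what an inference engine does."},
    ],
    "temperature": 0.7,
    "max_tokens": 128,
}

body = json.dumps(payload)
# Sending it would look roughly like:
#   requests.post("http://localhost:3000/v1/chat/completions", data=body,
#                 headers={"Content-Type": "application/json"})
# which needs a live server, so it is omitted here.
```

Because the wire format is shared, the same client code can be pointed at any of these backends by changing only the base URL.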

mistralai/mistral-inference

6.9

Official inference library for Mistral models

10.8k stars · 1.0k forks · Jupyter Notebook

openvinotoolkit/openvino

8.2

OpenVINO™ is an open source toolkit for optimizing and deploying AI inference

10.1k stars · 3.2k forks · C++

Tiiny-AI/PowerInfer

6.8

High-speed Large Language Model Serving for Local Deployment

9.3k stars · 561 forks · C++

bentoml/BentoML

8.0

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

8.6k stars · 950 forks · Python

InternLM/lmdeploy

7.5

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

7.8k stars · 684 forks · Python

katanemo/plano

7.4

Plano is an AI-native proxy server and data plane for agentic apps - centralizing orchestration, safety, observability, and smart LLM routing so you can deliver agents faster.

6.3k stars · 399 forks · Rust

algorithmicsuperintelligence/openevolve

6.8

Open-source implementation of AlphaEvolve

6.0k stars · 950 forks · Python

flashinfer-ai/flashinfer

7.5

FlashInfer: Kernel Library for LLM Serving

5.4k stars · 896 forks · Python

kserve/kserve

7.7

Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes

5.3k stars · 1.4k forks · Go

xlite-dev/Awesome-LLM-Inference

6.6

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉

5.1k stars · 360 forks · Python

FellouAI/eko

7.3

Eko (Eko Keeps Operating) - Build Production-ready Agentic Workflow with Natural Language - eko.fellou.ai

4.9k stars · 436 forks · TypeScript

gpustack/gpustack

7.0

Performance-optimized AI inference on your GPUs. Unlock superior throughput by selecting and tuning engines like vLLM or SGLang.

4.8k stars · 497 forks · Python

Michael-A-Kuykendall/shimmy

6.2

⚡ Python-free Rust inference server — OpenAI-API compatible. GGUF + SafeTensors, hot model swap, auto-discovery, single binary. FREE now, FREE forever.

4.0k stars · 343 forks · Rust

ruvnet/RuVector

6.7

RuVector is a high-performance, real-time, self-learning vector graph neural network and database built in Rust.

3.8k stars · 464 forks · Rust

predibase/lorax

6.1

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

3.7k stars · 312 forks · Python
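The multi-LoRA pattern LoRAX implements is worth making concrete: one base model stays resident, and each request names the fine-tuned adapter to apply. A sketch of what two such requests might look like, assuming a TGI-style /generate endpoint with an adapter selection parameter — the endpoint details and adapter ids below are illustrative assumptions, not verified API:

```python
import json

# Two requests to the same hypothetical multi-LoRA server: the base model is
# shared, and each request selects a different fine-tuned adapter.
# The "acme/..." adapter ids are made-up examples.
request_a = {
    "inputs": "Classify the sentiment: great product!",
    "parameters": {"adapter_id": "acme/sentiment-lora", "max_new_tokens": 16},
}
request_b = {
    "inputs": "Translate to French: good morning",
    "parameters": {"adapter_id": "acme/translate-lora", "max_new_tokens": 32},
}

# Only the adapter differs between the two payloads, which is what lets one
# deployment serve thousands of fine-tunes without loading a full model each.
bodies = [json.dumps(r) for r in (request_a, request_b)]
```

Since LoRA adapters are small relative to the base weights, swapping them per request is cheap, which is the core of the "scales to 1000s of fine-tuned LLMs" claim.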