STACKQUADRANT

zengxiao-he/tessera

Inference Engines

From teacher to tiles — a from-scratch LLM distillation & serving engine: custom Triton/CUDA kernels, FSDP distillation, paged-KV continuous batching, speculative decoding, a Rust gateway, a JAX oracle, and interpretability tooling.

Overall score: 4.3
GitHub Metrics
Stars: 389
Forks: 4
Open Issues: (not shown)
Watchers: 4
Contributors: 1
Weekly Commits: 0
Language: Python
License: NOASSERTION
Last Commit: Jun 5, 2026
Created: Jun 5, 2026
Latest Release: none published
Release Date: none
Synced: Jun 29, 2026
Quality Scores
Documentation Quality (weight: 20%)
4.0

No dedicated docs site. Description: 232 chars. Stars signal: 389. Contributors: 1. Score: 4/10

Community Health (weight: 20%)
2.8

Stars: 389. Contributors: 1. Watchers: 4. Forks: 4. Issue ratio: 0.0%. Score: 2.8/10

Maintenance Velocity (weight: 15%)
5.3

Last commit: 24d ago. Weekly commits: 0. No releases published. Score: 5.3/10

API Design & DX (weight: 20%)
6.5

Stars/issues ratio: 389. Dynamic language: Python. No dedicated API docs. License: NOASSERTION. Popularity signal: 389 stars. Score: 6.5/10

Production Readiness (weight: 15%)
2.8

Battle-tested: 389 stars. Peer review: 1 contributor. No versioned releases. Licensed: NOASSERTION. Age: 0.1 years. Maintenance: last commit 24d ago. Score: 2.8/10

Ecosystem Integration (weight: 10%)
4.2

Fork interest: 4. Major ecosystem: Python. License: NOASSERTION. Adoption: 389 stars. Score: 4.2/10
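
The six category scores and their weights appear to combine into the single headline number near the top of the page. The sketch below assumes (the page does not state this) that the headline is a plain weighted mean of the category scores; the names SCORES and weighted_overall are illustrative, not part of any StackQuadrant API.

```python
# Category scores and weights (in percent) as listed on this page.
# Assumption: the headline score is the weighted mean of these values.
SCORES = {
    "Documentation Quality": (4.0, 20),
    "Community Health": (2.8, 20),
    "Maintenance Velocity": (5.3, 15),
    "API Design & DX": (6.5, 20),
    "Production Readiness": (2.8, 15),
    "Ecosystem Integration": (4.2, 10),
}


def weighted_overall(scores):
    """Return the weighted mean of (score, weight) pairs."""
    total_weight = sum(weight for _, weight in scores.values())
    weighted_sum = sum(score * weight for score, weight in scores.values())
    return weighted_sum / total_weight


overall = weighted_overall(SCORES)
print(f"Weighted mean, rounded to one decimal: {round(overall, 1)}")
```

The weights sum to 100%, and the weighted mean rounds to 4.3, which is consistent with the headline score shown for this repository.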

Tags
cuda, flash-attention, fsdp, inference-engine, jax, knowledge-distillation, kv-cache, llm, mechanistic-interpretability, ml-systems
Radar chart (axes: Documentation Quality, Community Health, Maintenance Velocity, API Design & DX, Production Readiness, Ecosystem Integration)