By MTC Team
-
LightLLM v1.0.0: Minimal Inter-Process Communication Overhead, Fastest DeepSeek-R1 Serving Performance on a Single H200, and Prototype Support for PD-Disaggregation
By MTC Team · We are delighted to announce the release of LightLLM v1.0.0.
-
Reducing Overhead with CUDA Graph
By MTC Team · LightLLM uses CUDA Graph to reduce overhead.