#66
LMSYS Org
Open research collective (UC Berkeley, Stanford, CMU) focused on large model systems. Builds SGLang, a high-throughput inference engine for LLMs and VLMs with radix-tree prefix caching and chunked prefill; FastChat, an open platform for training, fine-tuning, and serving LLMs; Vicuna, an open-source chat model fine-tuned from LLaMA; and S-LoRA, a system for serving many LoRA adapters concurrently.
Categories
Subcategories
RL, AGENT FRAMEWORKS, MEMORY
Links
