vLLM Blog
Sep 5, 2024
vLLM v0.6.0: 2.7x Throughput Improvement and 5x Latency Reduction
Jul 25, 2024
vLLM’s Open Governance and Performance Roadmap
Jul 23, 2024
Announcing Llama 3.1 Support in vLLM
Nov 14, 2023
Notes on vLLM vs. DeepSpeed-FastGen
Jun 20, 2023
vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention