by Vivek Gangasani, Banu Nagasundaram, Dmitry Soldatkin, Felipe Lopez, Siddharth Venkatesan
Amazon SageMaker Large Model Inference (LMI) container v15 has launched, powered by vLLM 0.8.4 with support for the vLLM V1 engine. This release improves LLM inference performance, broadens model compatibility (including multimodal models), and enables scalable deployments. Benchmarks show significant throughput gains, particularly under high concurrency. LMI v15 supports leading open-source models and can be deployed by pulling the container image from Amazon ECR and hosting it on a SageMaker endpoint.
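As a rough sketch of what deployment looks like, the snippet below configures an LMI container through environment variables and hosts it with the SageMaker Python SDK. The image URI, model ID, instance type, and option values are illustrative placeholders, not values from this announcement; consult the LMI documentation for the exact image URI in your Region and the options supported by v15.

```python
def lmi_container_env(model_id: str, tensor_parallel_degree: int = 1) -> dict:
    """Build environment variables for the LMI container.

    OPTION_* variables mirror serving.properties entries
    (e.g. option.rolling_batch -> OPTION_ROLLING_BATCH).
    Values here are illustrative defaults, not recommendations.
    """
    return {
        "HF_MODEL_ID": model_id,                # model to load from the Hugging Face Hub
        "OPTION_ROLLING_BATCH": "vllm",         # use the vLLM engine for continuous batching
        "OPTION_TENSOR_PARALLEL_DEGREE": str(tensor_parallel_degree),
        "OPTION_MAX_ROLLING_BATCH_SIZE": "64",  # cap on concurrently batched requests
    }


def deploy_lmi_endpoint(image_uri: str, role: str, model_id: str):
    """Deploy the LMI container to a SageMaker endpoint.

    Requires AWS credentials and the `sagemaker` package; shown for
    illustration only, so the import is kept local to this function.
    """
    import sagemaker

    model = sagemaker.Model(
        image_uri=image_uri,  # LMI v15 image pulled from Amazon ECR
        env=lmi_container_env(model_id, tensor_parallel_degree=8),
        role=role,
    )
    return model.deploy(
        initial_instance_count=1,
        instance_type="ml.g5.12xlarge",  # example GPU instance; size to your model
    )
```

In practice you would pass the Region-specific LMI v15 image URI from Amazon ECR and an IAM role with SageMaker permissions, then invoke the returned endpoint for inference.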