# Getting Started
Welcome to GuideLLM! This section walks you through installing the tool, setting up a benchmarking environment, running your first benchmark, and analyzing the results to optimize your LLM deployment for real-world inference workloads.
GuideLLM makes it simple to evaluate and optimize large language model deployments, helping you find the right balance between performance, resource utilization, and cost.
## Quick Start Guides
Follow the guides below in sequence to get the most out of GuideLLM and prepare your LLM deployments for production use. A command sketch for each step appears after the list.
- **Installation**: Learn how to install GuideLLM with pip, from source, or pinned to a specific version.
- **Start a Server**: Set up an OpenAI-compatible server using vLLM or another supported backend as the benchmark target.
- **Run Benchmarks**: Configure and run performance benchmarks against your LLM server under various load conditions.
- **Analyze Results**: Interpret benchmark results to understand throughput, latency, and reliability, and to tune your deployment accordingly.
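For the installation step, a minimal install from PyPI looks like the sketch below; the source-install URL assumes the project's GitHub repository, so adjust it if the code lives elsewhere:

```bash
# Install the latest GuideLLM release from PyPI
pip install guidellm

# Or install the development version straight from source
# (repository URL is an assumption; substitute the actual location)
pip install git+https://github.com/neuralmagic/guidellm.git
```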
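For the server step, one option is vLLM's built-in OpenAI-compatible entrypoint; the model ID below is only an example, and the server listens on http://localhost:8000 by default:

```bash
# Install vLLM and serve a model behind an OpenAI-compatible API
pip install vllm
vllm serve "meta-llama/Llama-3.1-8B-Instruct"
```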
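With the server up, a first benchmark can sweep through load levels against it using synthetic data; the flags below reflect recent GuideLLM releases, so verify them with `guidellm benchmark --help` for your installed version:

```bash
# Sweep from synchronous to maximum-throughput load, 30 seconds per level,
# using synthetic requests of 256 prompt tokens and 128 output tokens
guidellm benchmark \
  --target "http://localhost:8000" \
  --rate-type sweep \
  --max-seconds 30 \
  --data "prompt_tokens=256,output_tokens=128"
```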
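For the analysis step, the console summary reports request throughput alongside latency metrics such as time to first token (TTFT) and inter-token latency (ITL). To keep the full results for offline inspection, the report can also be written to a file; the `--output-path` flag is an assumption here, so confirm it in the CLI help:

```bash
# Save complete benchmark results for later analysis
# (--output-path is assumed; check `guidellm benchmark --help`)
guidellm benchmark \
  --target "http://localhost:8000" \
  --rate-type sweep \
  --max-seconds 30 \
  --data "prompt_tokens=256,output_tokens=128" \
  --output-path benchmarks.json
```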