Getting Started

Welcome to GuideLLM! This section walks you through installing the tool, setting up your benchmarking environment, running your first benchmark, and analyzing the results to optimize your LLM deployment for real-world inference workloads.

GuideLLM makes it straightforward to evaluate and optimize your large language model deployments, helping you balance performance, resource utilization, and cost.

Quick Start Guides

Follow the guides below in sequence to get the most out of GuideLLM and optimize your LLM deployments for production use. A condensed command-line sketch of the full workflow follows the list.

  • Installation


    Learn how to install GuideLLM using pip, from source, or with specific version requirements.

    Installation Guide

  • Start a Server


    Set up an OpenAI-compatible server using vLLM or other supported backends to benchmark your LLM deployments.

    Server Setup Guide

  • Run Benchmarks


    Learn how to configure and run performance benchmarks against your LLM server under various load conditions.

    Benchmarking Guide

  • Analyze Results


    Interpret benchmark results to understand throughput, latency, and reliability, and use those insights to optimize your deployments.

    Analysis Guide
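
For orientation, here is a minimal sketch of the full workflow as a single command-line session. It assumes a local vLLM backend and a recent guidellm CLI; the model ID, port, and benchmark flags shown are illustrative and may differ by version, so defer to the linked guides for the exact options your installation supports.

```bash
# 1. Install GuideLLM into your Python environment.
pip install guidellm

# 2. Start an OpenAI-compatible server with vLLM (any OpenAI-compatible
#    backend works; the model ID here is only an example).
vllm serve "meta-llama/Meta-Llama-3.1-8B-Instruct"

# 3. In a second terminal, point GuideLLM at the server and run a benchmark
#    sweep with synthetic data (flag names are illustrative; see the
#    Benchmarking Guide for the authoritative set of options).
guidellm benchmark \
  --target "http://localhost:8000" \
  --rate-type sweep \
  --max-seconds 60 \
  --data "prompt_tokens=256,output_tokens=128"
```

When the run finishes, GuideLLM reports per-load-level metrics such as request throughput and latency, which the Analysis Guide explains how to interpret.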