HomeCatalog🤖 AI / LLMLiteLLM
Screenshot of LiteLLM website

// screenshot of litellm.ai ↗

AI / LLM · PRO TIER

LiteLLMpro

LiteLLM is an OpenAI-compatible proxy that fronts 100+ LLM providers — OpenAI, Anthropic, Google, Mistral, Cohere, Hugging Face, Ollama, AWS Bedrock, Azure, and dozens more. Your code calls litellm.completion() (or the proxy's OpenAI-compatible REST endpoint) and LiteLLM routes to the actual provider with retries, fallbacks, cost tracking, and load balancing.

🤖 AI / LLM Min 1024 MB RAM Port 4000 (http) Tier pro
// What it is

A closer look.

LiteLLM is an OpenAI-compatible proxy that fronts 100+ LLM providers — OpenAI, Anthropic, Google, Mistral, Cohere, Hugging Face, Ollama, AWS Bedrock, Azure, and dozens more. Your code calls litellm.completion() (or the proxy's OpenAI-compatible REST endpoint) and LiteLLM routes to the actual provider with retries, fallbacks, cost tracking, and load balancing.

It's the "LLM router" pattern — the one piece of infrastructure that makes provider-switching painless.

// Use cases

What it's for.

Concrete scenarios where teams pick LiteLLM over the SaaS alternative.

Provider-agnostic LLM apps

write code once, switch backends via config

Cost optimization

route cheap queries to cheaper models automatically

High availability

fallback chains across providers when one is down

Budget enforcement

per-user / per-team spend limits with alerts

Migration

gradual swap from OpenAI to Anthropic without code changes

// Who it's for

Built for these teams.

If your team profile matches one of these, LiteLLM is a strong fit out of the box.

Profile A

AI platform teams

standardizing LLM access across the org

Profile B

Enterprises

needing audit trail + budget enforcement on LLM usage

Profile C

Multi-LLM apps

wanting to A/B test providers without code refactor

Profile D

Cost-conscious startups

routing dev traffic to cheap providers, prod to premium

Profile E

Resellers

offering "OpenAI-compatible API" while proxying to multiple backends

// Differentiators

Why teams pick LiteLLM.

When evaluating self-hosted options for this category, here are the dimensions on which LiteLLM consistently lands above the alternatives.

  • OpenAI-compatible API — every OpenAI SDK works without code changes
  • 100+ providers — most comprehensive LLM router in OSS
  • Cost + token tracking — built-in spend analytics per key
  • Routing rules — match queries to models by cost, latency, region
  • MIT license — clean for commercial / production
  • Active development — releases multiple times per week
// Integrations

Connects to.

The stack you'll plug LiteLLM into — services, protocols, and adjacent apps in the BluixApps catalog.

LLM providers
OpenAI, Anthropic, Google, AWS Bedrock, Azure, Mistral, Cohere, HuggingFace, Ollama, vLLM, custom
Observability
Langfuse, Helicone, OpenTelemetry, Prometheus metrics
Caching
Redis-backed response cache to avoid duplicate API calls
Auth
JWT, API keys with per-key rate limits + budgets
Database
Postgres for spend tracking, key management, audit log
Admin UI
built-in dashboard for keys, costs, model usage
SDK clients
Python (native), JS via OpenAI SDK pointed at proxy
// Adoption & deployment

Notable users & community

  • 15k+ GitHub stars
  • Adopted by major AI platform teams as standard LLM router
  • Featured in enterprise AI architecture guides
  • Backed by BerriAI with active commercial enterprise offering
  • Strong Discord, weekly releases, predictable roadmap

What we ship

  • Docker compose: LiteLLM proxy + Postgres + Redis
  • Pinned ghcr.io/berriai/litellm:latest (locked to release tag)
  • HTTPS via Let's Encrypt; admin UI with random master key
  • Pre-configured for Ollama detection on same VPS
  • Postgres for spend tracking + key persistence
  • Redis for response caching
  • Backup hook covers Postgres (keys + spend history)
// Tips & operations

Run it properly.

Operational guidance from running this in production — what to do before you scale, what to lock down, what surprises people.

// PERFORMANCE
Always run with database
without Postgres, spend tracking + key management don't persist
// SECURITY
Set budgets per key
without budget caps, a single buggy client can rack up huge bills
// OPERATIONS
Use response caching
Redis cache on identical prompts saves significant cost
// RELIABILITY
Monitor via Langfuse
built-in LiteLLM → Langfuse integration captures every call for debugging
// DEPLOYMENT
Health check models
LiteLLM's health endpoint pings each provider; integrate with uptime monitoring
// SCALING
Update frequently
provider APIs change; LiteLLM releases track them; stale versions = silent failures
1024
// min ram (MB)
5
// min disk (GB)
4000
// access port
http
// protocol
pro
// bluixapps tier
4000:4000 · ghcr.io/berriai/litellm:main-stable
// docker image

Project resources

Official sitelitellm.ai ↗
// Alternatives in AI / LLM

Compare with