vllm

Efficient and high-throughput inference engine for large language models.

About vllm

A high-throughput and memory-efficient inference and serving engine for LLMs.
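
To give a sense of what using it looks like, here is a minimal offline-inference sketch against vllm's Python API; the model name and sampling settings are illustrative assumptions, not part of this listing:

```python
# Minimal offline-inference sketch for vllm.
# Assumes `pip install vllm` on a supported Linux + GPU setup; the model
# name below is an illustrative placeholder, not a recommendation.
from vllm import LLM, SamplingParams

prompts = ["The capital of France is"]
params = SamplingParams(temperature=0.8, max_tokens=32)

llm = LLM(model="facebook/opt-125m")  # loads weights and allocates the KV cache
outputs = llm.generate(prompts, params)

for out in outputs:
    print(out.outputs[0].text)
```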

✅ Pros

  • High throughput for LLM inference
  • Memory-efficient serving engine
  • Open source and customizable

⚠️ Cons

  • Primarily supports Linux
  • Requires technical expertise to deploy (see the serving sketch after this list)
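
On the deployment point above: vllm can also run as a standalone server exposing an OpenAI-compatible HTTP API (started with `vllm serve <model>`). The sketch below assumes such a server is already listening on localhost:8000 and that the `openai` Python package is installed; the host, port, and model name are assumptions for illustration:

```python
# Sketch of querying a locally running vllm server through its
# OpenAI-compatible endpoint. Assumes the server was started with
# something like `vllm serve facebook/opt-125m`; host, port, and
# model name are illustrative assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.completions.create(
    model="facebook/opt-125m",
    prompt="The capital of France is",
    max_tokens=32,
)
print(resp.choices[0].text)
```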

Quick Info

  • Pricing: Free
  • License: Apache-2.0
  • GitHub Stars: 71,011
  • Forks: 13,647
  • Language: Python
  • Platforms: Linux, self-hosted
