
LLM Inference

Showing 17 agents

exllama

A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.
FastChat

A distributed multi-model LLM serving system with web UI and OpenAI-compatible RESTful APIs.
LMDeploy

A high-throughput and low-latency inference and serving framework for LLMs.
MInference

Speeds up long-context LLM inference with approximate and dynamic sparse attention.
prima.cpp

A distributed implementation of llama.cpp that lets you run 70B-scale LLMs on everyday home devices.
SGLang

A fast serving framework for large language models and vision language models.
SkyPilot

Run LLMs and batch jobs on any cloud with maximum cost savings, highest GPU availability, and managed execution.
TGI

A toolkit for deploying and serving Large Language Models (LLMs).
vLLM (OSS)

A high-throughput and memory-efficient inference and serving engine for LLMs.
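Several of the engines in this category (vLLM, FastChat, SGLang, LMDeploy) can expose OpenAI-compatible REST endpoints, so one client can target any of them by changing the base URL. A minimal sketch using only the standard library, assuming a server is already running locally; the port, model name, and endpoint path here are common defaults and illustrative assumptions, not details taken from this page:

```python
import json
from urllib import request

# Assumed local endpoint for an OpenAI-compatible server; the actual
# host/port depend on how the serving engine was launched.
BASE_URL = "http://localhost:8000"

def build_chat_request(model: str, prompt: str, max_tokens: int = 128) -> dict:
    """Build an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def chat(model: str, prompt: str) -> str:
    """POST the payload to /v1/chat/completions and return the reply text."""
    payload = build_chat_request(model, prompt)
    req = request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the request and response shapes follow the OpenAI API, swapping engines usually means changing only `BASE_URL` and the model name.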