LLM Inference AI Agents

DeepSpeed-MII

OSS

MII makes low-latency and high-throughput inference, similar to vLLM p…

deploy-llms-with-ansible

LLM Inference

OSS

Easily deploy any LLM on a VM with minimal configuration, u…

exllama

LLM Inference

OSS

A more memory-efficient rewrite of the HF transformers implementation of Lla…

FastChat

LLM Inference

OSS

A distributed multi-model LLM serving system with web UI and OpenAI-compati…

FasterTransformer

LLM Inference

OSS

NVIDIA framework for LLM inference (transitioned to TensorRT-LLM).

Infinity

LLM Inference

OSS

Inference for text embeddings in Python.

Liger-Kernel

LLM Inference

OSS

Efficient Triton kernels for LLM training.

LMDeploy

LLM Inference

OSS

A high-throughput and low-latency inference and serving framework for LLMs …

MInference

LLM Inference

OSS

To speed up long-context LLMs' inference, approximate and dynamic sparse …

mistral.rs

LLM Inference

OSS

Blazingly fast LLM inference.

prima.cpp

LLM Inference

OSS

A distributed implementation of llama.cpp that lets you run 70B-level LLMs…

SGLang

LLM Inference

OSS

SGLang is a fast serving framework for large language models and vision langu…

SkyPilot

LLM Inference

OSS

Run LLMs and batch jobs on any cloud. Get maximum cost savings, highest GPU…

TensorRT-LLM

LLM Inference

OSS

NVIDIA framework for LLM inference.

Text-Embeddings-Inference

LLM Inference

OSS

Inference for text embeddings in Rust, under the HFOIL licence.

TGI

LLM Inference

A toolkit for deploying and serving Large Language Models (LLMs).

vLLM

LLM Inference

OSS

A high-throughput and memory-efficient inference and serving engine for LLMs.
