A programming framework for agentic AI. Discord: https://aka.ms/autogen-dc. Roadmap: https://aka.ms/autogen-roadmap
Pretrain, finetune, deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit, LoRA, and more.
A high-performance inference system for large language models, designed for production environments.
The official evaluation suite and dynamic data release for MixEval.
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Multi-LoRA inference server that scales to thousands of fine-tuned LLMs
Minimalist web-searching app with an AI assistant that runs directly from your browser. Uses Web-LLM, Ratchet-ML, Wllama and SearXNG. Demo: https://felladrin-minisearch.hf.space
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
Friendli: the fastest serving engine for generative AI
The easiest way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Multi-model Inference Graph/Pipelines, LLM/RAG apps, and more!
Semantic embedding-based system for question answering from PDFs with visual analysis tools.
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, continuous batching, FlashAttention, PagedAttention, etc.
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
Popular large language models implemented from scratch (2024)
AICI: Prompts as (Wasm) Programs
Implementation of Model-Distributed Inference for Large Language Models, built on top of LitGPT
FlashInfer: Kernel Library for LLM Serving
LLM Inference analyzer for different hardware platforms
Recipes for on-device voice AI and local LLM
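Several of the serving projects listed above (vLLM, LMDeploy, FlashInfer) rely on continuous batching: rather than waiting for an entire batch of requests to finish generating, the server admits new requests and retires completed ones between decode steps. Below is a minimal, framework-free Python sketch of that scheduling loop; all names (`Request`, `decode_step`, `continuous_batching`) are illustrative and do not correspond to any project's real API.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Request:
    rid: int
    max_tokens: int                       # tokens still to generate
    output: list = field(default_factory=list)

def decode_step(req: Request) -> None:
    # Stand-in for one forward pass producing one token for this request.
    req.output.append(f"tok{len(req.output)}")
    req.max_tokens -= 1

def continuous_batching(requests, batch_size=2):
    """Run decode steps, admitting new requests between iterations."""
    waiting = deque(requests)
    running, finished = [], []
    while waiting or running:
        # Fill any free batch slots from the waiting queue.
        while waiting and len(running) < batch_size:
            running.append(waiting.popleft())
        for req in running:
            decode_step(req)
        # Retire completed requests immediately, freeing their slots for
        # the next iteration instead of stalling on the slowest request.
        finished.extend(r for r in running if r.max_tokens == 0)
        running = [r for r in running if r.max_tokens > 0]
    return finished

done = continuous_batching([Request(0, 1), Request(1, 3), Request(2, 2)])
print([r.rid for r in done])  # → [0, 1, 2]
```

The key property is visible in the loop: request 2 starts as soon as request 0 finishes, without waiting for request 1, which is what lets real servers keep GPU utilization high under mixed-length workloads.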