DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
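For orientation, a minimal sketch of the canonical DeepSpeed training loop: `deepspeed.initialize` wraps a plain PyTorch model in an engine that owns the backward pass and optimizer step. The toy model, data, and `ds_config` values below are illustrative placeholders, not code from the repository.

```python
import deepspeed
import torch
import torch.nn as nn

# Toy model and synthetic data so the sketch is self-contained.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
train_loader = [(torch.randn(8, 128), torch.randint(0, 10, (8,))) for _ in range(4)]

# Illustrative config; real configs usually also set ZeRO stage, fp16/bf16, etc.
ds_config = {
    "train_batch_size": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# deepspeed.initialize returns an engine that handles forward/backward/step.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

for inputs, labels in train_loader:
    inputs, labels = inputs.to(model_engine.device), labels.to(model_engine.device)
    loss = nn.functional.cross_entropy(model_engine(inputs), labels)
    model_engine.backward(loss)  # handles gradient scaling/accumulation
    model_engine.step()          # optimizer step + zero_grad, per ds_config
```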
Decentralized deep learning in PyTorch. Built to train models on thousands of volunteers across the world.
PyTorch library for cost-effective, fast and easy serving of MoE models.
Efficient global optimization toolbox in Rust: Bayesian optimization, mixtures of Gaussian processes, sampling methods
PyTorch implementation of grok
Surrogate Modeling Toolbox
Tutel MoE: An Optimized Mixture-of-Experts Implementation
[ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy"
Implementation of ST-MoE, the latest incarnation of MoE after years of research at Google Brain, in PyTorch
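For context on ST-MoE's core contribution: the paper introduces the router z-loss, which penalizes large logits entering the gating softmax to stabilize training. A minimal PyTorch sketch of that loss, which the paper adds to the task loss with a small coefficient (1e-3):

```python
import torch

def router_z_loss(router_logits: torch.Tensor) -> torch.Tensor:
    """ST-MoE router z-loss: mean over tokens of the squared log-sum-exp of the logits."""
    z = torch.logsumexp(router_logits, dim=-1)  # (num_tokens,)
    return (z ** 2).mean()

# Usage: total_loss = task_loss + 1e-3 * router_z_loss(logits)
logits = torch.randn(64, 8)  # 64 tokens routed over 8 experts
print(router_z_loss(logits))
```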
This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters".
[Preprint] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
A library for easily merging multiple LLM experts and efficiently training the merged LLM.
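The simplest form of expert merging is uniform parameter averaging across architecture-identical checkpoints. The sketch below illustrates only that baseline; it is not this library's API, and `average_state_dicts` is a hypothetical helper name.

```python
import torch

def average_state_dicts(paths):
    """Uniformly average parameters of architecture-identical checkpoints."""
    merged = None
    for p in paths:
        sd = torch.load(p, map_location="cpu")
        if merged is None:
            merged = {k: v.float().clone() for k, v in sd.items()}
        else:
            for k, v in sd.items():
                merged[k] += v.float()
    return {k: v / len(paths) for k, v in merged.items()}

# e.g. merged = average_state_dicts(["expert_a.pt", "expert_b.pt"])  # hypothetical paths
```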
Repository for "See More Details: Efficient Image Super-Resolution by Experts Mining", ICML 2024
MoE Decoder Transformer implementation with MLX
[arXiv'24] Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization
The idea to create the best LLM currently possible came to me while watching a YouTube video on GaLore, the successor to LoRA, and realizing how groundbreaking that technique is. I had been daydreaming about pretraining my own model; this (probably impossible to implement) concept is a refined version of that design.
[SIGIR'24] The official implementation code of MOELoRA.
[Paper][Preprint 2024] Mixture of Modality Knowledge Experts for Robust Multi-modal Knowledge Graph Completion
An LLM toolkit
Mistral and Mixtral (MoE) from scratch
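Since several entries above implement Mixtral-style sparse MoE layers, here is a generic PyTorch sketch of the core computation: a router picks the top-2 experts per token, and their feed-forward outputs are combined with gate weights renormalized over the chosen experts. This is a from-scratch illustration, not code from any listed repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoEBlock(nn.Module):
    """Top-k routed mixture-of-experts feed-forward block (Mixtral-style)."""

    def __init__(self, dim: int, hidden: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.SiLU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, dim) -> flatten tokens for routing
        tokens = x.reshape(-1, x.shape[-1])
        logits = self.router(tokens)                            # (T, E)
        weights, idx = torch.topk(logits, self.top_k, dim=-1)   # top-k experts per token
        weights = F.softmax(weights, dim=-1)                    # renormalize over chosen k
        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            mask = idx == e                                     # (T, k): tokens routed to expert e
            token_ids, slot = mask.nonzero(as_tuple=True)
            if token_ids.numel():
                out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(tokens[token_ids])
        return out.reshape(x.shape)

moe = SparseMoEBlock(dim=64, hidden=256, num_experts=8, top_k=2)
y = moe(torch.randn(2, 10, 64))  # (batch, seq, dim) in and out
```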