Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
Updated Jun 3, 2024 - Python
MeshXL: Neural Coordinate Field for Generative 3D Foundation Models; 3D generative fundamental models using NeurCF
State-of-the-art, multi-modal virtual assistant framework powered by LLaMA. Ame is under active development.
A paper list about large language models and multimodal models (Diffusion, VLM). From foundations to applications. It is only used to record papers for my personal needs.
ModelScope: bring the notion of Model-as-a-Service to life.
Start building LLM-empowered multi-agent applications in an easier way.
Basic implementation code for multimodal models and some applications or fine-tuning tasks based on them.
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal dialogue model that approaches GPT-4V performance.
GPT4V-level open-source multi-modal model based on Llama3-8B
Open Source Routing Engine for OpenStreetMap
Here we will track the latest AI Multimodal Models, including Multimodal Foundation Models, LLM, Audio, Image, Video, Music and 3D content. 🔥
A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting GPT-4V, Gemini, QwenVLPlus, 50+ HF models, and 20+ benchmarks
[MIR-2023-Survey] A continuously updated paper list for multi-modal pre-trained big models
Enhancing Large Vision Language Models with Self-Training on Image Comprehension.
A robust, all-in-one GPT interface for Discord. ChatGPT-style conversations, image generation, AI-moderation, custom indexes/knowledgebase, youtube summarizer, and more!
MM-Diff: High-Fidelity Image Personalization via Multi-Modal Condition Integration
VisualRWKV is the visual-enhanced version of the RWKV language model, enabling RWKV to handle various visual tasks.
Time-series prediction using a multi-modal 1D Convolutional Neural Network