Here are 8 cool open-source projects inspired by the potential of Gemini's multimodality
by shweta10 Updated: Dec 10, 2023
Guide Kit
LAVIS by salesforce
LAVIS - A One-stop Library for Language-Vision Intelligence
Python 5474 Version: v1.0.2 License: Permissive (BSD-3-Clause)
jina by jina-ai
🔮 Build multimodal AI services via cloud native technologies
Python 18565 Version: v3.17.0 License: Permissive (Apache-2.0)
unilm by microsoft
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Python 12771 Version: s2s-ft.v0.3 License: Permissive (MIT)
LLaVA by haotian-liu
Large Language-and-Vision Assistant built towards multimodal GPT-4 level capabilities.
Python 3169 Version: Current License: Permissive (Apache-2.0)
mmf by facebookresearch
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
Python 5241 Version: v0.3.1 License: Others (Non-SPDX)
discoart by jina-ai
🪩 Create Disco Diffusion artworks in one line
Python 3773 Version: v0.12.1 License: Others (Non-SPDX)
rerun by rerun-io
Log images, point clouds, etc., and visualize them effortlessly. Built in Rust using egui
Rust 2405 Version: prerelease License: Permissive (Apache-2.0)