Here are 8 cool open-source projects inspired by the potential of Gemini's multimodality
by shweta10 Updated: Dec 10, 2023
LAVIS by salesforce
LAVIS - A One-stop Library for Language-Vision Intelligence
Python
5474
Version: v1.0.2
License: Permissive (BSD-3-Clause)
jina by jina-ai
🔮 Build multimodal AI services via cloud native technologies
Python
18565
Version: v3.17.0
License: Permissive (Apache-2.0)
unilm by microsoft
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Python
12771
Version: s2s-ft.v0.3
License: Permissive (MIT)
LLaVA by haotian-liu
Large Language-and-Vision Assistant built towards multimodal GPT-4 level capabilities.
Python
3169
Version: Current
License: Permissive (Apache-2.0)
mmf by facebookresearch
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
Python
5241
Version: v0.3.1
License: Others (Non-SPDX)
discoart by jina-ai
🪩 Create Disco Diffusion artworks in one line
Python
3773
Version: v0.12.1
License: Others (Non-SPDX)
rerun by rerun-io
Log images, point clouds, etc., and visualize them effortlessly. Built in Rust using egui
Rust
2405
Version: prerelease
License: Permissive (Apache-2.0)