nousresearch / obsidian
Maybe the new state of the art vision model? we'll see 🤷
License: Apache License 2.0
By the way, I think Flamingo 3B is also a multimodal LVLM at the 3B size.
Hello
This repository depends on both deepspeed and fastapi. deepspeed doesn't support pydantic > 2.0.0,
which results in the error below (reported in DeepSpeed/issues/3963) when I install this repository with pip:
RuntimeError: Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback):
'FieldInfo' object has no attribute 'required'
For the time being, it would be nice to pin the fastapi and pydantic versions in this repository.
Addition: llava also requires transformers 4.31.0, but Obsidian depends on the Mistral integration, which needs transformers >= 4.34.0.
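To make the version conflict described above easier to spot before launching anything, here is a minimal sketch that compares installed package versions against the ranges mentioned in this thread. The bounds in ASSUMED_BOUNDS (pydantic below 2, transformers at least 4.34) are assumptions read off the reports here, not values taken from the repository's requirements, and parse_version, check and report are hypothetical helper names.

```python
from importlib import metadata


def parse_version(text):
    """Turn '4.34.0' or '2.0.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


# Assumed (lower inclusive, upper exclusive) bounds, taken from this thread only.
ASSUMED_BOUNDS = {
    "pydantic": (None, (2,)),         # below 2.x because of the deepspeed error
    "transformers": ((4, 34), None),  # at least 4.34 for the Mistral integration
}


def check(name, version_text, bounds=ASSUMED_BOUNDS):
    """Return True if version_text falls inside the assumed range for name."""
    low, high = bounds[name]
    version = parse_version(version_text)
    if low is not None and version < low:
        return False
    if high is not None and version >= high:
        return False
    return True


def report():
    """Print the status of every package listed in ASSUMED_BOUNDS."""
    for name in ASSUMED_BOUNDS:
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            print(f"{name}: not installed")
            continue
        status = "ok" if check(name, installed) else "outside assumed range"
        print(f"{name} {installed}: {status}")
```

Calling report() in the environment used for Obsidian prints one line per package. If the real constraints turn out to differ, only ASSUMED_BOUNDS needs to change.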
I didn't see in the readme how to actually use the model. I'd like to try using it as a replacement for the Llava models if that's even possible using the transformers library...
I had the misfortune of following the instructions 5 hours after the release of transformers v4.35. The instructions say to upgrade to the latest release, so I got the following error:
$ python -m llava.serve.controller --host 0.0.0.0 --port 10000 (obsidian)
[2023-11-02 21:08:06,589] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Traceback (most recent call last):
  File "[...]/miniconda3/envs/obsidian/lib/python3.10/runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "[...]/miniconda3/envs/obsidian/lib/python3.10/runpy.py", line 110, in _get_module_details
    __import__(pkg_name)
  File "[...]/Obsidian/llava/__init__.py", line 1, in <module>
    from .model import LlavaLlamaForCausalLM
  File "[...]/Obsidian/llava/model/__init__.py", line 3, in <module>
    from .language_model.llava_mpt import LlavaMPTForCausalLM, LlavaMPTConfig
  File "[...]/Obsidian/llava/model/language_model/llava_mpt.py", line 26, in <module>
    from .mpt.modeling_mpt import MPTConfig, MPTForCausalLM, MPTModel
  File "[...]/Obsidian/llava/model/language_model/mpt/modeling_mpt.py", line 19, in <module>
    from .hf_prefixlm_converter import add_bidirectional_mask_if_missing, convert_hf_causal_lm_to_prefix_lm
  File "[...]/Obsidian/llava/model/language_model/mpt/hf_prefixlm_converter.py", line 15, in <module>
    from transformers.models.bloom.modeling_bloom import _expand_mask as _expand_mask_bloom
ImportError: cannot import name '_expand_mask' from 'transformers.models.bloom.modeling_bloom' ([...]/miniconda3/envs/obsidian/lib/python3.10/site-packages/transformers/models/bloom/modeling_bloom.py)
As an immediate workaround, downgrading to v4.34 (pip install --upgrade transformers==4.34.0) works.
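The traceback above fails on a private helper, _expand_mask, so one way to see whether an environment is affected before starting the controller is to probe for that name. The following is a minimal sketch: has_symbol and main are hypothetical helpers, and the only names taken from the report are the module path and _expand_mask.

```python
import importlib


def has_symbol(module_name, symbol):
    """Return True if module_name imports without error and exposes symbol."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Broad on purpose: import failures in this thread surfaced as both
        # ImportError and RuntimeError, and a probe should report, not crash.
        return False
    return hasattr(module, symbol)


def main():
    target = "transformers.models.bloom.modeling_bloom"
    if has_symbol(target, "_expand_mask"):
        print("_expand_mask found; this particular import should not fail")
    else:
        print("_expand_mask not found; the report above says transformers==4.34.0 works")


if __name__ == "__main__":
    main()
```

A False result only says the name could not be imported. It does not distinguish a missing transformers install from a version that dropped the helper.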
sh script/download_mm_projector.sh should be sh scripts/download_mm_projector.sh
I tried it twice, confirmed the model had been fully downloaded, then loaded the web server. The error when submitting the prompt is: NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE
I tried it on my MPS device but it's not ready. I changed builder and model_worker to use MPS rather than CUDA and ran into the issue of half tensors like float16 not working, so I tried to move it to cpu but didn't get much further, which is when I tried the Colab notebook.
Thanks for any input as to why or how to fix the Colab, thanks guys
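For the MPS attempt described above, a device and dtype choice could be sketched as below. This is an illustration only: pick_device_and_dtype and describe_current_machine are hypothetical helpers, not code from builder or model_worker, and falling back to float32 off CUDA is an assumption based on the float16 problem reported in the post. Whether the model then runs on MPS or CPU is not confirmed by this thread.

```python
def pick_device_and_dtype(cuda_available, mps_available):
    """Choose a device name and a dtype name from availability flags."""
    if cuda_available:
        return "cuda", "float16"
    if mps_available:
        # Assumption: avoid half precision, since float16 was reported as failing on MPS.
        return "mps", "float32"
    return "cpu", "float32"


def describe_current_machine():
    """Probe the local machine with PyTorch if it is installed."""
    try:
        import torch
    except ImportError:
        return pick_device_and_dtype(False, False)
    # Older PyTorch builds have no torch.backends.mps, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps is not None and mps.is_available())
    return pick_device_and_dtype(torch.cuda.is_available(), mps_ok)
```

The returned dtype name can be turned into a real dtype with getattr(torch, name) at the point where the model is loaded.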