
xraygpt's Introduction

XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.

Omkar Thawakar*, Abdelrahman Shaker*, Sahal Shaji Mullappilly*, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Jorma Laaksonen, and Fahad Shahbaz Khan.

*Equal Contribution

Mohamed bin Zayed University of Artificial Intelligence, UAE

YouTube

🚀 News


  • Aug-04: Our paper has been accepted at BIONLP-ACL 2024 🔥
  • Jun-14: Our technical report is released here. 🔥🔥
  • May-25: Our technical report will be released very soon. Stay tuned!
  • May-19: Our code, models, and pre-processed report summaries are released.

Online Demo

You can try our demo using the provided examples or by uploading your own X-ray here: Link-1 | Link-2 | Link-3.

About XrayGPT


  • XrayGPT aims to stimulate research around the automated analysis of chest radiographs based on the given X-ray.
  • The LLM (Vicuna) is fine-tuned on medical data (100k real conversations between patients and doctors) and ~30k radiology conversations to acquire domain-specific and relevant features.
  • We generate interactive and clean summaries (~217k) from free-text radiology reports of two datasets (MIMIC-CXR and OpenI). These summaries serve to enhance the performance of LLMs through fine-tuning the linear transformation layer on high-quality data. For more details regarding our high-quality summaries, please check Dataset Creation.
  • We align a frozen medical visual encoder (MedCLIP) with the fine-tuned LLM (Vicuna) using a simple linear transformation layer (see the sketch below).
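
As a rough illustration of that last point, the sketch below projects frozen visual-encoder features into the LLM's embedding space with a single trainable linear layer. The dimensions, class name, and shapes are assumptions for illustration only, not the repository's actual implementation.

# Minimal sketch of the alignment idea (illustrative only; dimensions and
# names are assumptions, not the repository's actual code).
import torch
import torch.nn as nn

class VisionToLLMProjector(nn.Module):  # hypothetical name
    def __init__(self, vision_dim: int = 768, llm_dim: int = 4096):
        super().__init__()
        # The only trainable component: one linear transformation.
        self.proj = nn.Linear(vision_dim, llm_dim)

    def forward(self, image_feats: torch.Tensor) -> torch.Tensor:
        # image_feats: (batch, num_patches, vision_dim) from the frozen encoder
        return self.proj(image_feats)

vision_feats = torch.randn(1, 49, 768)   # stand-in for frozen MedCLIP features
llm_inputs = VisionToLLMProjector()(vision_feats)
print(llm_inputs.shape)                  # torch.Size([1, 49, 4096])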

(Overview figure: XrayGPT architecture)

Getting Started

Installation

1. Prepare the code and the environment

Clone the repository and create an Anaconda environment:

git clone https://github.com/mbzuai-oryx/XrayGPT.git
cd XrayGPT
conda env create -f env.yml
conda activate xraygpt

OR

git clone https://github.com/mbzuai-oryx/XrayGPT.git
cd XrayGPT
conda create -n xraygpt python=3.9
conda activate xraygpt
pip install -r xraygpt_requirements.txt
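
With either route, a quick sanity check (a minimal sketch; nothing here is specific to XrayGPT) confirms that PyTorch sees the GPU before moving on:

# Quick environment sanity check; run inside the activated xraygpt env.
import torch

print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))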

Setup

1. Prepare the Datasets for training

Refer to dataset_creation for more details.

Download the preprocessed annotations: mimic & openi. The respective image folders contain the images from each dataset.

The final dataset folder structure will be as follows:

dataset
├── mimic
│    ├── image
│    │   ├── abea5eb9-b7c32823-3a14c5ca-77868030-69c83139.jpg
│    │   ├── 427446c1-881f5cce-85191ce1-91a58ba9-0a57d3f5.jpg
│    │   .....
│    ├── filter_cap.json
├── openi
│    ├── image
│    │   ├── 1.jpg
│    │   ├── 2.jpg
│    │   .....
│    ├── filter_cap.json
...
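
Each filter_cap.json holds the report summary for every image. A minimal sketch for inspecting one split is below, assuming the MiniGPT-4-style layout of an "annotations" list with "image_id" and "caption" fields (verify against your downloaded file):

# Sketch: inspect one dataset split (field names are an assumption based on
# the MiniGPT-4-style annotation format).
import json
from pathlib import Path

root = Path("dataset/mimic")
with open(root / "filter_cap.json") as f:
    annotations = json.load(f)["annotations"]

print("samples:", len(annotations))
sample = annotations[0]
print("image:", root / "image" / f"{sample['image_id']}.jpg")
print("summary:", sample["caption"][:120])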

2. Prepare the pretrained Vicuna weights

We built XrayGPT on the v1 version of Vicuna-7B and fine-tuned Vicuna using curated radiology report samples. Download the Vicuna weights from vicuna_weights. The final weights should be in a single folder with a structure similar to the following:

vicuna_weights
โ”œโ”€โ”€ config.json
โ”œโ”€โ”€ generation_config.json
โ”œโ”€โ”€ pytorch_model.bin.index.json
โ”œโ”€โ”€ pytorch_model-00001-of-00003.bin
...   

Then, set the path to the Vicuna weights in the model config file "xraygpt/configs/models/xraygpt.yaml" at Line 16.
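
A small check like the one below can confirm the edit before training. It is a sketch, and the "model" / "llama_model" key names are assumptions based on the MiniGPT-4-style config, so match them to the actual YAML:

# Sketch: verify the Vicuna weights path in the model config resolves to a
# real folder ("model" / "llama_model" are assumed key names).
import os
import yaml

with open("xraygpt/configs/models/xraygpt.yaml") as f:
    cfg = yaml.safe_load(f)

vicuna_path = cfg["model"]["llama_model"]
assert os.path.isdir(vicuna_path), f"Vicuna weights not found at {vicuna_path}"
print("OK:", vicuna_path)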

To fine-tune Vicuna on radiology samples, please download our curated radiology and medical_healthcare conversational samples and refer to the original Vicuna repository for fine-tuning: Vicuna_Finetune

3. Download the pretrained MiniGPT-4 checkpoint

Download the pretrained MiniGPT-4 checkpoint: ckpt

4. Training of XrayGPT

A. First mimic pretraining stage

In the first pretraining stage, the model is trained on image-text pairs from the preprocessed MIMIC dataset.

To launch the first stage training, run the following command, replacing NUM_GPU with the number of GPUs to use. In our experiments, we used 4 AMD MI250X GPUs.

torchrun --nproc-per-node NUM_GPU train.py --cfg-path train_configs/xraygpt_mimic_pretrain.yaml

B. Second openi finetuning stage

In the second stage, we use a small, high-quality image-text pair dataset (OpenI) preprocessed by us.

Run the following command. In our experiments, we used a single AMD MI250X GPU.

torchrun --nproc-per-node NUM_GPU train.py --cfg-path train_configs/xraygpt_openi_finetune.yaml

Launching Demo on local machine

Download the pretrained XrayGPT checkpoint: link

Set the path to this checkpoint in "eval_configs/xraygpt_eval.yaml".

Try the Gradio demo.py on your local machine with the following command:

python demo.py --cfg-path eval_configs/xraygpt_eval.yaml  --gpu-id 0
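
The same chat pipeline can also be driven programmatically. The sketch below mirrors the chat.ask / chat.answer calls that appear in the repository's demo code; the Chat/CONV_VISION imports and the upload_img signature are assumptions based on the MiniGPT-4-style API, so check demo.py for the exact setup.

# Sketch: drive the chat programmatically instead of through Gradio.
# The ask/answer calls mirror those in the repository's demo code; the
# imports and upload_img signature are assumptions (see demo.py).
from xraygpt.conversation.conversation import Chat, CONV_VISION  # assumed import

# Build `model` and `vis_processor` from eval_configs/xraygpt_eval.yaml
# exactly as demo.py does (omitted here for brevity).
chat = Chat(model, vis_processor, device="cuda:0")

chat_state = CONV_VISION.copy()
img_list = []
chat.upload_img("example_xray.jpg", chat_state, img_list)

chat.ask("Could you provide a detailed description of the given x-ray?", chat_state)
answer, output_token = chat.answer(chat_state, img_list, num_beams=1,
                                   temperature=1, max_new_tokens=300,
                                   max_length=2000)
print(answer)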

Examples

(Example figures 1–4: sample XrayGPT conversations on chest X-rays.)

Acknowledgement


  • MiniGPT-4 Enhancing Vision-language Understanding with Advanced Large Language Models. We built our model on top of MiniGPT-4.
  • MedCLIP Contrastive Learning from Unpaired Medical Images and Texts. We used the medical-aware image encoder from MedCLIP.
  • BLIP2 The model architecture of XrayGPT follows BLIP-2.
  • Lavis This repository is built upon Lavis!
  • Vicuna The fantastic language ability of Vicuna is just amazing. And it is open-source!

Citation

If you're using XrayGPT in your research or applications, please cite using this BibTeX:

    @article{Omkar2023XrayGPT,
        title={XrayGPT: Chest Radiographs Summarization using Large Medical Vision-Language Models},
        author={Omkar Thawakar and Abdelrahman Shaker and Sahal Shaji Mullappilly and Hisham Cholakkal and Rao Muhammad Anwer and Salman Khan and Jorma Laaksonen and Fahad Shahbaz Khan},
        journal={arXiv preprint arXiv:2306.07971},
        year={2023}
    }

License

This repository is licensed under CC BY-NC-SA. Please refer to the license terms here.


xraygpt's Issues

Questions about the paper

Dear authors,
I have some questions about the paper content:
(1) What are the MedVicuna and RadVicuna in Table 1? I cannot find them in the paper or on the Internet.
(2) According to Figure 1, it seems only the Linear Transformation Layer is trained in the whole framework, so why do the contributions state that "The LLM (Vicuna) is fine-tuned on medical data"?
(3) In your work, is only the Linear Transformation Layer trained, while the LLM and MedCLIP are both frozen?

Cannot setup environment

I have tried setting up the environment using both pip and conda and have been unable to do so.
These are the error messages I am getting with each method:

  1. conda - ResolvePackageNotFound (for about 20 packages)
  2. pip - ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: '/croot/certifi_1671487769961/work/certifi'

My machine is running Mac OS.

Thanks for the work you're doing!

Ask a question in another language?

It kept answering me in English when I asked questions in another language. So what do you think about using other languages for training during fine-tuning, rather than pre-training the whole model?

Setting up environment issues

I've tried both requirements.txt and the env.yml file. Both are failing. I've made multiple changes as well.

I'm facing the same issue as well. I'm using Ubuntu 22.04.4 LTS. I modified the @file modifier. It was still giving one dependency issue or another.

accelerate==0.15.0 aiofiles==23.1.0 aiohttp==3.8.4 aiosignal==1.3.1 albumentations==1.3.0 altair==4.2.2 antlr4-python3-runtime==4.9.3 anyio==3.6.2 appdirs==1.4.4 argon2-cffi==21.3.0 argon2-cffi-bindings==21.2.0 arrow==1.2.3 asttokens==2.2.1 async-timeout==4.0.2 attrs==22.2.0 backcall==0.2.0 beautifulsoup4==4.12.2 bitsandbytes==0.37.0 bleach==6.0.0 blis==0.7.9 braceexpand==0.1.7 brotlipy==0.7.0 cachetools==5.3.0 catalogue==2.0.8 cchardet==2.1.7 certifi cffi chardet==3.0.4 charset-normalizer cmake==3.26.3 comm==0.1.3 confection==0.0.4 contourpy==1.0.7 cryptography cycler==0.11.0 cymem==2.0.7 dataclasses==0.6 datasets debugpy==1.6.7 decorator==5.1.1 decord==0.6.0 defusedxml==0.7.1 dill docker-pycreds==0.4.0 entrypoints==0.4 et-xmlfile==1.1.0 evaluate==0.4.0 executing==1.2.0 ExifRead-nocycle==3.0.1 fairscale==0.4.13 fastapi==0.95.1 fastChat==0.1 fastjsonschema==2.16.3 ffmpy==0.3.0 filelock==3.9.0 fire flit_core fonttools==4.38.0 fqdn==1.5.1 frozenlist==1.3.3 fschat @ git+https://github.com/lm-sys/FastChat.git@f34f28cedcb8906fd026f22ec3ef41435a8e24ac fsspec gensim==4.3.1 gitdb==4.0.10 GitPython==3.1.31 googletrans==3.0.0 gradio==3.23.0 gradio_client==0.0.8 h11==0.9.0 h2==3.2.0 hiq-python==1.1.12 hpack==3.0.0 hstspreload==2023.1.1 httpcore==0.9.1 httpx==0.13.3 huggingface-hub hyperframe==5.2.0 idna==2.10 imageio==2.27.0 img2dataset importlib-metadata==6.5.0 importlib-resources==5.12.0 iopath==0.1.10 ipykernel==6.22.0 ipython==8.12.0 ipython-genutils==0.2.0 isoduration==20.11.0 jedi==0.18.2 Jinja2==3.1.2 joblib==1.2.0 jsonpointer==2.3 jsonschema==4.17.3 jupyter-events==0.6.3 jupyter_client==8.2.0 jupyter_core==5.3.0 jupyter_server==2.5.0 jupyter_server_terminals==0.4.4 jupyterlab-pygments==0.2.2 kiwisolver==1.4.4 langcodes==3.3.0 lazy_loader==0.2 linkify-it-py==2.0.0 lit==16.0.1 llvmlite==0.39.1 markdown-it-py==2.2.0 markdown2==2.4.8 MarkupSafe==2.1.2 matplotlib==3.7.0 matplotlib-inline==0.1.6 mdit-py-plugins==0.3.3 mdurl==0.1.2 MedCLIP==0.0.3 mistune==2.0.5 mkl-fft mkl-random mkl-service==2.4.0 mpmath==1.3.0 multidict==6.0.4 multiprocess murmurhash==1.0.9 nbclassic==0.5.5 nbclient==0.7.3 nbconvert==7.3.1 nbformat==5.8.0 nest-asyncio==1.5.6 networkx==3.1 nltk==3.8.1 notebook==6.5.4 notebook_shim==0.2.2 numba numpy nvidia-cublas-cu11==11.10.3.66 nvidia-cuda-cupti-cu11==11.7.101 nvidia-cuda-nvrtc-cu11==11.7.99 nvidia-cuda-runtime-cu11==11.7.99 nvidia-cudnn-cu11==8.5.0.96 nvidia-cufft-cu11==10.9.0.58 nvidia-curand-cu11==10.2.10.91 nvidia-cusolver-cu11==11.4.0.1 nvidia-cusparse-cu11==11.7.4.91 nvidia-nccl-cu11==2.14.3 nvidia-nvtx-cu11==11.7.91 omegaconf==2.3.0 openai==0.27.0 opencv-python==4.7.0.72 opencv-python-headless==4.7.0.72 openpyxl==3.1.2 orjson==3.8.10 packaging==23.0 pandas==1.5.3 pandocfilters==1.5.0 parso==0.8.3 pathtools==0.1.2 pathy==0.10.1 peft==0.2.0 pexpect==4.8.0 pickleshare==0.7.5 Pillow==9.4.0 platformdirs==3.2.0 portalocker==2.7.0 preshed==3.0.8 prometheus-client==0.16.0 promise==2.3 prompt-toolkit==3.0.38 protobuf==3.20.3 psutil==5.9.4 ptyprocess==0.7.0 pure-eval==0.2.2 py-itree==0.0.19 pyarrow pycocoevalcap==1.2 pycocotools==2.0.6 pycparser pydantic==1.10.7 pydub==0.25.1 Pygments==2.15.1 pyllama==0.0.9 pynndescent==0.5.9 pyOpenSSL pyparsing==3.0.9 pyrsistent==0.19.3 PySocks python-dateutil==2.8.2 python-json-logger==2.0.7 python-multipart==0.0.6 pytz==2023.3 PyWavelets==1.4.1 PyYAML==6.0 pyzmq==25.0.2 qudida==0.0.4 regex==2022.10.31 requests responses==0.18.0 rfc3339-validator==0.1.4 rfc3986==1.5.0 rfc3986-validator==0.1.1 rich==13.3.4 scikit-image==0.20.0 scikit-learn==1.2.2 
scipy semantic-version==2.10.0 Send2Trash==1.8.0 sentence-transformers sentencepiece==0.1.97 sentry-sdk==1.19.1 setproctitle==1.3.2 shortuuid==1.0.11 six smart-open==6.3.0 smmap==5.0.0 sniffio==1.3.0 soupsieve==2.4.1 spacy==3.5.1 spacy-legacy==3.0.12 spacy-loggers==1.0.4 srsly==2.4.6 stack-data==0.6.2 starlette==0.26.1 svgwrite==1.4.3 sympy==1.11.1 tenacity==8.2.2 termcolor==2.2.0 terminado==0.17.1 textaugment==1.3.4 textblob==0.17.1 thinc==8.1.9 threadpoolctl==3.1.0 tifffile==2023.4.12 timm==0.6.13 tinycss2==1.2.1 tokenizers==0.13.2 toolz==0.12.0 torch==2.0.0 torchaudio torchvision tornado==6.3 tqdm traitlets==5.9.0 transformers triton==2.0.0 typer==0.7.0 typing_extensions tzdata==2023.3 uc-micro-py==1.0.1 umap-learn==0.5.3 uri-template==1.2.0 urllib3 uvicorn==0.21.1 wandb==0.12.21 wasabi==1.1.1 wavedrom==2.0.3.post3 wcwidth==0.2.6 webcolors==1.13 webdataset==0.2.48 webencodings==0.5.1 websocket-client==1.5.1 websockets==11.0.2 wget==3.2 xxhash==3.2.0 yarl==1.8.2 zipp==3.14.0

Currently using this. I'm getting a numpy dependency error with mkl-fft and mkl-random. Can you kindly give me an env.yml or requirements.txt file?

Issue with img2dataset dependencies

The user requested pyarrow==12.0.0
datasets 2.12.0 depends on pyarrow>=8.0.0
img2dataset 1.41.0 depends on pyarrow<8 and >=6.0.1
and it also conflicts with fire==0.5.0, as img2dataset needs fire below 0.5.

pre-trained vicuna_weights

Hello, thank you for sharing your data and code, but I wonder whether you missed uploading the pre-trained vicuna_weights.

The Vicuna weights I used have been fine-tuned, and the following problem occurred:

/home/jgzn/anaconda/envs/XrayGpt/lib/python3.9/site-packages/transformers/generation/utils.py:1255: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation)
warnings.warn(
Traceback (most recent call last):
  /home/jgzn/PycharmProjects/Xray/XrayGPT-main/demo1.py:69 in <module>
    answer, output_token = chat.answer(chat_state, img_list, num_beams=1,
                                       temperature=1, max_new_tokens=300,
                                       max_length=2000)
  /home/jgzn/PycharmProjects/Xray/XrayGPT-main/xraygpt/conversation/conversation.py:163 in answer
    outputs = self.model.llama_model.generate(inputs_embeds=embs,
                                              max_new_tokens=max_new_tokens,
                                              stopping_criteria=self.stopping_criteria, ...)
  /home/jgzn/anaconda/envs/XrayGpt/lib/python3.9/site-packages/torch/utils/_contextlib.py:115 in decorate_context
    return func(*args, **kwargs)
  /home/jgzn/anaconda/envs/XrayGpt/lib/python3.9/site-packages/transformers/generation/utils.py:1565 in generate
    return self.sample(input_ids, logits_processor=logits_processor,
                       logits_warper=logits_warper, ...)
  /home/jgzn/anaconda/envs/XrayGpt/lib/python3.9/site-packages/transformers/generation/utils.py:2612 in sample
    outputs = self(**model_inputs, return_dict=True,
                   output_attentions=output_attentions, ...)
  /home/jgzn/anaconda/envs/XrayGpt/lib/python3.9/site-packages/torch/nn/modules/module.py:1501 in _call_impl
    return forward_call(*args, **kwargs)
  /home/jgzn/anaconda/envs/XrayGpt/lib/python3.9/site-packages/accelerate/hooks.py:156 in new_forward
    output = old_forward(*args, **kwargs)
  /home/jgzn/PycharmProjects/Xray/XrayGPT-main/xraygpt/models/modeling_llama.py:676 in forward
    outputs = self.model(input_ids=input_ids, attention_mask=attention_mask,
                         position_ids=position_ids, ...)
  /home/jgzn/anaconda/envs/XrayGpt/lib/python3.9/site-packages/torch/nn/modules/module.py:1501 in _call_impl
    return forward_call(*args, **kwargs)
  /home/jgzn/anaconda/envs/XrayGpt/lib/python3.9/site-packages/accelerate/hooks.py:156 in new_forward
    output = old_forward(*args, **kwargs)
  /home/jgzn/PycharmProjects/Xray/XrayGPT-main/xraygpt/models/modeling_llama.py:517 in forward
    position_ids = position_ids.view(-1, seq_length).long()
RuntimeError: shape '[-1, 104]' is invalid for input of size 105

More evaluation results available?

Dear Authors,

Thanks for your work on this project. I'm really interested in it. I'm wondering whether you have done any further experiments on your generated reports? I would very much appreciate it if you could upload further results of your model.

Best!

try it on local machine

When I run demo.py following the README, I met the problem below:

Traceback (most recent call last):
  File "E:\XrayGPT-main\demo.py", line 60, in <module>
    model = model_cls.from_config(model_config).to('cuda:{}'.format(args.gpu_id))
  File "E:\envs\xraygpt\lib\site-packages\torch\nn\modules\module.py", line 1160, in to
    return self._apply(convert)
  File "E:\envs\xraygpt\lib\site-packages\torch\nn\modules\module.py", line 810, in _apply
    module._apply(fn)
  File "E:\envs\xraygpt\lib\site-packages\torch\nn\modules\module.py", line 810, in _apply
    module._apply(fn)
  File "E:\envs\xraygpt\lib\site-packages\torch\nn\modules\module.py", line 810, in _apply
    module._apply(fn)
  [Previous line repeated 3 more times]
  File "E:\envs\xraygpt\lib\site-packages\torch\nn\modules\module.py", line 833, in _apply
    param_applied = fn(param)
  File "E:\envs\xraygpt\lib\site-packages\torch\nn\modules\module.py", line 1158, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
NotImplementedError: Cannot copy out of meta tensor; no data!

My GPU is an RTX 2060; I would like to know whether the problem is caused by insufficient GPU memory.
I would appreciate any help.

Conversation Data

Hi,
Great work! Thanks for publishing your code and data.
What exactly do the radiology conversations in Healthcare_Radiology_vicuna.json contain? It still seems to be related to the data from ChatDoctor. Did you somehow filter this data, or how did you generate the radiology conversation data?

Thanks!

Some questions about the XrayGPT dataset

Thank you for your wonderful work on XrayGPT. I have some questions about it that I have to ask you.

I have downloaded the openi annotations file from https://mbzuaiac-my.sharepoint.com/:u:/g/personal/omkar_thawakar_mbzuai_ac_ae/EVYGprPyzdhOjFlQ2aNJbykBj49SwTGBYmC1uJ7TMswaVQ?e=qdqS8U, and I've downloaded the openi PNG image file from https://openi.nlm.nih.gov/imgs/collections/NLMCXR_png.tgz.

However, I found that the image_id field in filter_cap.json does not correspond to the PNG image names. What is the reason for this, and how do I deal with this issue?

Alternatively, I wonder whether I could just get the PNG images that match filter_cap.json, which would speed up our work.
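
A quick way to quantify the mismatch described above (a minimal sketch, assuming the "annotations"/"image_id" layout of filter_cap.json used elsewhere in this README) is to compare the annotated ids against the files on disk:

# Sketch: compare annotation image_ids against the files actually on disk
# (field names assume the annotation format described above).
import json
from pathlib import Path

root = Path("dataset/openi")
with open(root / "filter_cap.json") as f:
    annotated = {a["image_id"] for a in json.load(f)["annotations"]}
on_disk = {p.stem for p in (root / "image").iterdir()}

print("annotated ids:", len(annotated), "| images on disk:", len(on_disk))
print("sample ids with no matching file:", sorted(annotated - on_disk)[:5])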

Weight download error, cannot be used

(xraygpt) PS C:\Users\lenovo\Desktop\XrayGPT-main> python demo.py --cfg-path eval_configs/xraygpt_eval.yaml --gpu-id 0
Initializing Chat
Loading VIT
Traceback (most recent call last):
  File "C:\Users\lenovo\Desktop\XrayGPT-main\demo.py", line 60, in <module>
    model = model_cls.from_config(model_config).to('cuda:{}'.format(args.gpu_id))
  File "C:\Users\lenovo\Desktop\XrayGPT-main\xraygpt\models\mini_gpt4.py", line 358, in from_config
    model = cls(
  File "C:\Users\lenovo\Desktop\XrayGPT-main\xraygpt\models\mini_gpt4.py", line 71, in __init__
    self.visual_encoder, self.ln_vision = self.init_vision_encoder(
  File "C:\Users\lenovo\Desktop\XrayGPT-main\xraygpt\models\blip2.py", line 65, in init_vision_encoder
    visual_encoder = create_eva_vit_g(
  File "C:\Users\lenovo\Desktop\XrayGPT-main\xraygpt\models\eva_vit.py", line 433, in create_eva_vit_g
    state_dict = torch.load(cached_file, map_location="cpu")
  File "D:\anaconda3\envs\xraygpt\lib\site-packages\torch\serialization.py", line 797, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "D:\anaconda3\envs\xraygpt\lib\site-packages\torch\serialization.py", line 283, in __init__
    super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

Trying it on Local machine

HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/home/omkarthawakar/fahad/MiniGPT-4/Vicuna_Radiology_fp16/'. Use repo_type argument if needed.

How do I resolve it? Let me know; we can connect on Gmail.

How to run test.py on the mentioned Test set in the paper?

Thanks for making this innovative work on X-ray-based LLMs, XrayGPT, public. I followed the provided scripts to learn the training of a multimodal GPT model.

I am curious about the test.py script:

  1. What would be the test.yaml file to run the testing?
  2. What would be the test set to run test.py?
  3. Is there any reference data to evaluate the trained XrayGPT model?

Kind Regards,

Qformer training

Are you only training the linear projection layer? At any stage, are you training the Q-Former layers?

Prompt for dataset prep

Great work on the paper! I am intrigued by one step of the dataset prep described in README-DATASET.md.

Can you share the gpt-3.5-turbo prompts that were used for

  • Elimination of sentences containing comparisons to the patient's prior medical history.
  • Removal of de-identified symbols "__" while preserving the original meaning (a rough regex approximation of this step is sketched after this post).

This will be tremendously helpful in my learning, thank you!
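
The authors' exact prompts aren't reproduced here, but the second cleanup step above can be approximated deterministically. Below is a minimal sketch of my own (an illustration, not the authors' actual pipeline) that strips de-identified "___" placeholders from a report sentence:

# Illustrative sketch (not the authors' pipeline): strip de-identified
# "___" placeholders from report text, then tidy the whitespace.
import re

def strip_deid(text: str) -> str:
    text = re.sub(r"_{2,}", "", text)            # drop runs of underscores
    return re.sub(r"\s{2,}", " ", text).strip()  # collapse leftover spaces

print(strip_deid("AP view of the chest obtained ___ demonstrates clear lungs."))
# -> "AP view of the chest obtained demonstrates clear lungs."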

ๆจกๅž‹็š„่ฎญ็ปƒๅ‚ๆ•ฐ

ๆ‚จๅฅฝ๏ผŒ้žๅธธๆ„Ÿ่ฐขๆ‚จ็š„่ฟ™ไปฝๅทฅไฝœ๏ผ
ๆˆ‘ๆƒณ่ฏท้—ฎไธ€ไธ‹๏ผŒๅฏนไบŽๆจกๅž‹็š„ๅพฎ่ฐƒ๏ผŒๆ˜ฏๅ…จๅ‚ๅพฎ่ฐƒ่ฟ˜ๆ˜ฏไฝฟ็”จไบ†loraไน‹็ฑป็š„้ซ˜ๆ•ˆๅ‚ๆ•ฐๅพฎ่ฐƒๅ‘ข๏ผŒๅพฎ่ฐƒ็š„ๅ‚ๆ•ฐ้‡ๅคงๆฆ‚ๆ˜ฏๅคšๅฐ‘
้žๅธธๆ„Ÿ่ฐขๆ‚จ็š„ๅ›žๅค

Which part of the code loads the MedCLIP?

Dear authors,

Great work and efforts!
Quick question, I didn't find the part of the code where the medclip model is loaded as the visual encoder. Could you point me to the place where you load that?

Thank you so much in advance!
