R2Gen's Issues
Predict on one image
Hello! Can I use the model to make predictions on a single image? The checkpoint dimensions do not match; it asks for [761, 512] dimensions.
Thank you!
Caption Generation
Good morning. How can I visualize the predicted caption?
Operating environment
Hello Zhihong,
Thanks for open-sourcing your code. It's very nice work.
When I run the code, there are always errors about multiprocessing and threading. Could you describe the operating environment more specifically? Many thanks.
Do not have datasets access
Looking forward to your reply! Thanks a lot!
https://drive.google.com/file/d/1DS6NYirOXQf8qYieSVMvqNwuOlgAbM_E/view?usp=sharing
About the MIMIC dataset split
Hello, since I do not have the official MIMIC dataset, I would like to ask about the split: does the official release specify only a split ratio, or does it directly provide the split datasets?
In the paper "MIMIC-CXR: A large publicly available database of labeled chest radiographs" I read that no test set was released publicly, so is the split an officially specified ratio or an officially provided split dataset?
I have seen the authors of many papers say they used the official split, but since I do not have the official dataset I am curious. I hope you can clarify this for me.
The result of each run
Hi Zhihong:
Thank you for sharing the code for this meaningful work! Can you provide the results for each of the seeds?
Thanks!
The definition of ground truth report
May I ask how you processed the reports of the datasets to make the ground truth reports?
Did your model predict the 'findings' part of the reports or the 'impression' part or both?
You said in the paper, "For both datasets, we follow Li et al. (2018) to exclude the samples without reports." Yet I did not find in their paper the way to exclude samples. Moreover, they used the IU-XRay and CX-CHR dataset, but did not use the MIMIC-CXR dataset.
Looking forward to any reply.
about MIMIC dataset
Hi, for the MIMIC dataset at the Google Drive link provided by R2Gen, are the MIMIC images in a different format from the ones provided by PhysioNet, i.e., are they image files without preprocessing? Or are they the same?
The number of images in the MIMIC dataset
train_image_num: 270790, val_image_num: 2130, test_image_num: 3858
Dear Zhihong,
Your code is nice and clean. Thank you so much! Based on the JSON file you provided, I can only find around 270,000 images, which differs from Table 1 in your paper. The table has around 360,000 images. Do you have any selection criteria?
Kind Regards,
Donghao
Optimum epoch number to reproduce your result?
Hi,
Could you please share in which epoch you save the checkpoints?
Best,
dataset issues
Hello, when I downloaded your dataset, I found that the total number of decompressed IU-Xray pictures is 6,091, which differs from the figure of 7,470 in the dataset description in your paper. Is the number of pictures used in your paper 7,470 or 6,091? If it is 7,470, can you provide the complete dataset? Thanks for your reply!
Do differences between torch and torchvision versions cause a big difference in evaluation results?
hi~
Since I am running the project on an RTX 3090 with CUDA 11.6, I have configured it to run with torch 1.8.1 and torchvision 0.9.1. However, the difference between my evaluation results after training and those in the paper is between 1% and 3%.
1. Do I need to follow your requirements to the letter in order to get the original results? Is it possible that different versions of torch and torchvision cause large differences in the evaluation results?
2. Also, do you use multiple GPUs for training? If so, can you share the settings?
Looking forward to hearing from you~
Datasets did not contain all the data
Hi,
According to your paper, the IU X-Ray dataset contains 5226, 748, and 1496 images on train, val, and test, respectively.
However, the annotation.json in the dataset you published contains only 2,069, 296, and 590 images.
Is the shared dataset the one you used for training and testing?
If not, could you share your split dataset with me?
The MIMIC-CXR dataset has a similar problem: the split dataset contains 270,790, 2,130, and 3,858 images, which does not match the numbers in your paper.
Code to visualize the attention map
Hi, thanks for sharing your code. Could you provide the code for visualizing the attention map?
Thanks.
Problem on installing requirements
A beam search error occurs during the inference process
Problem on the visualization
Hi Zhihong,
Thank you for sharing your code.
I am interested in the visualization of the image-text attention mappings in the paper.
Can you share which approach you used for this? (Another repository or code?)
I have been trying to do this but haven't found a solution for Transformer-based models.
self.examples[i]['mask'] = [1] * len(self.examples[i]['ids'])
What does the mask denote, and what is it used for?
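For context, here is a minimal sketch (an assumption based on common practice, not the authors' exact code) of what such a padding mask typically does: 1 marks real tokens and 0 marks padding introduced when batching variable-length reports, so attention and the loss can ignore padded positions.

```python
# Two reports of different lengths, as token ids
ids_batch = [[5, 9, 2], [7, 3]]
# 1 for every real token, mirroring self.examples[i]['mask']
masks = [[1] * len(ids) for ids in ids_batch]

# Pad both ids and masks to the longest sequence in the batch
max_len = max(len(ids) for ids in ids_batch)
padded_ids = [ids + [0] * (max_len - len(ids)) for ids in ids_batch]
padded_masks = [m + [0] * (max_len - len(m)) for m in masks]

print(padded_ids)    # [[5, 9, 2], [7, 3, 0]]
print(padded_masks)  # [[1, 1, 1], [1, 1, 0]]
```

Downstream, positions where the mask is 0 would typically be excluded from attention and from the training loss.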
How to calculate overall precision, recall, and F1 from the values for the 14 categories?
Hello. I appreciate your excellent work in this paper, but when I try to calculate precision, recall, and F1, I don't know how you compute these values.
Can you tell me how you calculate the overall precision, recall, and F1 in Table 3 after computing the values for each of the 14 categories?
I get these values with the code at https://github.com/MIT-LCP/mimic-cxr/blob/master/txt/validation/compare_negbio_and_chexpert.ipynb
Looking forward to your reply!
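One plausible way (an assumption, not confirmed to be the authors' procedure) to turn 14 per-category results into overall numbers is micro-averaging: pool the TP/FP/FN counts across all categories, then compute precision, recall, and F1 once. A pure-Python sketch with made-up counts:

```python
# Toy (tp, fp, fn) counts for each of the 14 CheXpert categories
per_category = [
    (30, 5, 10), (12, 3, 4), (8, 2, 6), (20, 7, 5),
    (15, 4, 9), (6, 1, 2), (25, 6, 8), (10, 2, 3),
    (18, 5, 7), (9, 3, 4), (14, 4, 6), (11, 2, 5),
    (7, 1, 3), (22, 6, 9),
]

# Micro-averaging: pool counts over all categories first
tp = sum(c[0] for c in per_category)
fp = sum(c[1] for c in per_category)
fn = sum(c[2] for c in per_category)

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
```

The alternative, macro-averaging, would instead compute F1 per category and average the 14 scores; the two can differ noticeably when categories are imbalanced.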
Calculation of clinical accuracy for MIMIC dataset
Hello:
Thanks for sharing the code.
While going through the code, I did not find the code that calculates the clinical accuracy metrics (precision, recall, and F1).
Will it be possible to share that?
Thanks
Access to MIMIC-CXR
The link for the MIMIC-CXR download has been restricted. What is the way to request access? Is the MIMIC-CXR used in the study the same as MIMIC-CXR-JPG, so I can just download it from there?
Thanks
Base Model without RM and MCLN
@zhjohnchan @GuiminChen Hey, I wanted to run the model without the relational memory (RM) and MCLN. I tried to detach them by replacing MCLN with LN and ignoring the RM completely, but a positional-arguments error still occurs at DecoderLayer. Can you please guide me on how to do it?
Thanks.
About CE Metrics
Hello there,
Thanks for sharing the code.
I was using the stanfordmlgroup chexpert-labeler you mentioned in your paper.
After computing with it, the results contain '-1', '0', '1', and blank values.
I would like to know how to calculate TP, FN, FP, and TN from these.
Thanks.
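One common convention (an assumption, not confirmed by the authors) is that the labeler outputs 1 (positive), 0 (negative), -1 (uncertain), and blank (not mentioned). Uncertain labels are mapped to positive ("U-Ones") or negative ("U-Zeros"), blanks are treated as negative, and the binarized labels are then compared. A sketch:

```python
def binarize(label, uncertain_as=0):
    """Map a CheXpert-style label to 0/1. `uncertain_as` picks the
    U-Ones (1) or U-Zeros (0) convention; blank (None) counts as negative."""
    if label == 1:
        return 1
    if label == -1:
        return uncertain_as
    return 0  # 0 or blank -> negative

def confusion(gts, preds, uncertain_as=0):
    """Count TP/FP/FN/TN over paired ground-truth and predicted labels."""
    tp = fp = fn = tn = 0
    for g, p in zip(gts, preds):
        g, p = binarize(g, uncertain_as), binarize(p, uncertain_as)
        if g == 1 and p == 1:
            tp += 1
        elif g == 0 and p == 1:
            fp += 1
        elif g == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

print(confusion([1, 0, -1, None], [1, 1, 0, 0]))  # (1, 1, 0, 2)
```

Precision, recall, and F1 then follow from these counts in the usual way.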
About Mimic Dataset
Are the data loaded from the annotation.json file in the R2Gen code implementation all frontal chest X-ray images? Is there any difference between the dataset split made by annotation.json and the split in the mimic-cxr-2.0.0-split.csv file in the MIMIC dataset?
AssertionError on MIMIC-CXR dataset
The MIMIC-CXR dataset
It seems that the MIMIC dataset has been modified and the txt files have been removed. Can you share the MIMIC dataset with the txt files?
Can't Download MIMIC-CXR Dataset
Hello,
I can't download the MIMIC-CXR dataset from the Google Drive link:
https://drive.google.com/file/d/1DS6NYirOXQf8qYieSVMvqNwuOlgAbM_E/view
Can anyone provide a working link?
Thank you.
BASE Model
Hi,
Thanks for sharing the code. Your approach is very interesting.
I wonder how I can run the code in baseline mode (Base Model in Table 2).
Thanks in advance!
Error for multiple GPUs training
Dear,
Have you tried multiple-GPU training with this code? Thank you so much! I came across the following error.
RuntimeError: Expected tensor for argument #1 'input' to have the same device as tensor for argument #2 'weight'; but device 1 does not equal 0 (while checking arguments for cudnn_convolution)
Kind Regards,
Donghao
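For what it's worth, a minimal sketch (a common fix for this class of error, not the authors' code): the "device 1 does not equal 0" message typically appears when model replicas and inputs end up on different GPUs. Wrapping the model in `nn.DataParallel` and sending inputs to the primary device usually resolves it:

```python
import torch
import torch.nn as nn

# Stand-in model; the real project uses a CNN encoder plus Transformer
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
if torch.cuda.device_count() > 1:
    # Replicates the model across GPUs each forward pass;
    # inputs still go to the primary device only.
    model = nn.DataParallel(model)

images = torch.randn(4, 3, 32, 32).to(device)
out = model(images)
print(out.shape)  # torch.Size([4, 8, 30, 30])
```

If any submodule is manually pinned to a specific GPU (e.g. `.to("cuda:1")`) inside the model, DataParallel's replication will conflict with it, producing exactly this kind of device mismatch.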
Datasets
Hello,
I can't download the MIMIC-CXR dataset from the Google Drive link:
https://drive.google.com/file/d/1DS6NYirOXQf8qYieSVMvqNwuOlgAbM_E/view
Can anyone provide a working link?
Thank you.
About the COATT citation
Hello, I would like to ask why the BLEU-1 reported in the original COATT paper is 0.517, while R2Gen and CMN cite 0.455.
MIMIC Dataset
Hello, I am very interested in your work. On the MIMIC dataset, I can't reproduce the results in the paper using the run_mimic_cxr.sh file. Can you provide the MIMIC dataset so that I can learn more about your work? I submitted the Google Drive access request a long time ago.
Self-critical sequence training
Hi,
First of all, thank you for sharing your excellent work and nicely-written code!
I'm wondering if you have tried implementing self-critical sequence training for this model? I am currently trying to implement it on Google Colab. But I always get a CUDA Out Of Memory issue. If you have tried, did you encounter similar issues?
Best wishes,
Stephanie Shen
Inference
Hello and thank you for sharing your work!
I was wondering how to use your code and pretrained model weights to run inference on custom data.
Do you have a script or function for that?
Thank you very much in advance!
Wrong number of patients in the official IU-Xray instances at the Drive data link
Hello authors. First of all, fascinating work 🚀!
However, as I was studying your research and the Drive link you provided, I found a wrong number of instances for the patients that have only two images in the IU-Xray dataset.
I downloaded the official IU-Xray dataset as well, and after exploratory data analysis I found that there are more patients with only two images than in the data you provided:
About test code
Hello Zhihong,
Thanks for open-sourcing your code. It's very nice work.
I'd like to ask you a few questions about reproducing paper results.
I evaluate results by saving the generated sentences to a JSON file.
When I resume the model from the checkpoint you provided, using the following command:
CUDA_VISIBLE_DEVICES=4 python main.py \
--image_dir data/iu/images/ \
--ann_path data/iu/annotation.json \
--dataset_name iu_xray \
--max_seq_length 60 \
--threshold 3 \
--batch_size 16 \
--epochs 100 \
--save_dir results/reproduce_iu_xray \
--step_size 50 \
--gamma 0.1 \
--seed 9223 \
--resume data/model_iu_xray.pth
I see "Checkpoint loaded. Resume training from epoch 15", and the model generates output JSON files.
I use pycocoevalcap to evaluate the results. The results are as follows:
Bleu_1 | Bleu_2 | Bleu_3 | Bleu_4 | CIDEr | ROUGE_L | METEOR |
---|---|---|---|---|---|---|
0.4334 | 0.2863 | 0.2069 | 0.1554 | 0.5432 | 0.3245 | 0.1945 |
The results seem to differ somewhere.
Could you give me your test code or provide your generated results JSON file?
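As a sanity check on the evaluation setup: pycocoevalcap scorers conventionally expect two dictionaries mapping an image id to a list of strings (references and hypotheses). The sketch below illustrates that format with a hand-rolled BLEU-1 stand-in (clipped unigram precision only, ignoring the brevity penalty); the names `gts`/`res` are the usual convention, not taken from the authors' script.

```python
from collections import Counter

# pycocoevalcap-style inputs: image id -> list of sentences
gts = {"img1": ["the heart size is normal"]}  # ground-truth reports
res = {"img1": ["heart size is enlarged"]}    # generated reports

def bleu1(ref, hyp):
    """Clipped unigram precision (BLEU-1 without the brevity penalty)."""
    ref_counts = Counter(ref.split())
    hyp_tokens = hyp.split()
    clipped = sum(min(c, ref_counts[w]) for w, c in Counter(hyp_tokens).items())
    return clipped / len(hyp_tokens)

print(bleu1(gts["img1"][0], res["img1"][0]))  # 0.75
```

A mismatch in tokenization or in how the generated JSON is keyed against the annotation file is a frequent source of score gaps of this size.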
Random seed
Hi,
Could you please share the numbers that you used as random seeds to generate the results for the 5 runs in your experiments on IU-Xray and MIMIC?
Thanks
predict images and visual attention
Dear author:
Thank you for sharing your training code. I am also interested in the visualization of the image-text attention mappings in the paper, and in testing new images. Have you cleaned up the code?
CE: is MACRO or MICRO the standard in your paper?
Hello, I have run into some confusion. There are two different modes in the code for the CE metrics you provided. For example,
res_mlc['F1_MACRO'] = f1_score(gt, pred, average="macro")
res_mlc['F1_MICRO'] = f1_score(gt, pred, average="micro")
Which method is the standard one in your paper?
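To illustrate why the choice matters, here is a small pure-Python sketch (toy counts, not the paper's numbers) showing how macro and micro F1 can diverge when one finding is common and another is rare:

```python
def f1(tp, fp, fn):
    """F1 from raw counts, returning 0 when undefined."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

# Per-category (tp, fp, fn): one common finding, one rare one
counts = [(3, 0, 0), (0, 0, 1)]

# Macro: average of per-category F1 scores
macro = sum(f1(*c) for c in counts) / len(counts)
# Micro: pool the counts across categories, then compute F1 once
micro = f1(*map(sum, zip(*counts)))

print(macro, micro)  # 0.5 vs ~0.857
```

Macro weights every category equally (so the rare, missed finding drags the score down), while micro weights every instance equally; papers in this area report one or the other, which is exactly why the question is worth pinning down.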