
visinf / veto


Vision Relation Transformer for Unbiased Scene Graph Generation (ICCV 2023)

License: Apache License 2.0

Python 15.46% C 0.10% C++ 0.19% Cuda 1.21% Shell 0.01% Jupyter Notebook 83.02%

veto's Issues

About the graph constraint in the evaluation of MEET.

Under the graph constraint, the evaluator is supposed to take only one prediction for each subject-object pair.
However, in the implementation of MEET, you simply concatenate the pred_rel_scores and pred_rel_labels of each group. That means the evaluator will include 5 predictions for each subject-object pair, which might result in an unfair comparison.
Have you noticed this in your experiments?

for scores in triple_scores_ensemble:
    total_length += len(scores)
    incr_length.append(len(scores))
triple_scores_final = torch.zeros(total_length).cuda()
rel_pair_idx_final = torch.zeros(total_length, 2).cuda()
rel_class_prob_final = torch.zeros(total_length, num_rel_classes).cuda()
rel_labels_final = torch.zeros(total_length, dtype=torch.int64).cuda()
start = 0
for i, scores in enumerate(triple_scores_ensemble):
    triple_scores_final[start:start + incr_length[i]] = scores
    rel_pair_idx_final[start:start + incr_length[i]] = rel_pair_idx_ensemble[i]
    rel_class_prob_final[start:start + incr_length[i], chosen_labels_incr[i]] = rel_class_prob_ensemble[i]
    rel_labels_final[start:start + incr_length[i]] = rel_labels_ensemble[i]
    start += incr_length[i]
_, sorting_idx = torch.sort(triple_scores_final.view(-1), dim=0, descending=True)
rel_pair_idx_ensemble = rel_pair_idx_final[sorting_idx]
rel_class_prob_ensemble = rel_class_prob_final[sorting_idx]
rel_labels_ensemble = rel_labels_final[sorting_idx]
boxlist.add_field('rel_pair_idxs', rel_pair_idx_ensemble)  # (#rel, 2)
boxlist.add_field('pred_rel_scores', rel_class_prob_ensemble)  # (#rel, #rel_class)
boxlist.add_field('pred_rel_labels', rel_labels_ensemble)  # (#rel, )
results.append(boxlist)
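
To make the concern concrete, here is a minimal sketch of how the graph constraint could be enforced after merging the groups: keep only the highest-scoring prediction per subject-object pair. It is written in plain Python rather than torch so it runs standalone. The function name keep_top1_per_pair is hypothetical and not part of the repository; the argument names only mirror the fields in the excerpt above. It also assumes that triple scores from different groups are directly comparable, which may not hold.

```python
def keep_top1_per_pair(rel_pair_idxs, triple_scores, rel_labels):
    """Keep one prediction per (subject, object) pair: the one with the top score.

    Hypothetical helper, not taken from the repository. The three inputs are
    parallel sequences; the outputs are filtered and sorted by descending score.
    """
    best = {}  # (subject, object) -> index of the best prediction seen so far
    for i, (pair, score) in enumerate(zip(rel_pair_idxs, triple_scores)):
        key = tuple(pair)
        if key not in best or score > triple_scores[best[key]]:
            best[key] = i
    # Re-sort the survivors by score, as the evaluator expects ranked triples.
    keep = sorted(best.values(), key=lambda i: triple_scores[i], reverse=True)
    return (
        [tuple(rel_pair_idxs[i]) for i in keep],
        [triple_scores[i] for i in keep],
        [rel_labels[i] for i in keep],
    )


# Toy input: pair (0, 1) is predicted by three groups, pair (2, 3) by two.
pairs = [(0, 1), (0, 1), (2, 3), (0, 1), (2, 3)]
scores = [0.9, 0.4, 0.3, 0.95, 0.7]
labels = [5, 8, 2, 11, 7]
top_pairs, top_scores, top_labels = keep_top1_per_pair(pairs, scores, labels)
```

With this filtering, each pair contributes exactly one triple to the ranked list, which is what the graph-constrained recall assumes.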

Have you encountered any computing power issues?

I am currently using an Nvidia A100 GPU with CUDA 10.2 and the matching CUDA 10.2 build of PyTorch. However, I have encountered the following issue. I see that you are also using an Nvidia A100 GPU, with CUDA 10.1. Do you have this issue?
[screenshot: 099b0aa97dba26801ad49d5eeaca870]
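
If the error is about an unsupported CUDA compute capability (an assumption, since the screenshot content is not available here), the mismatch can be checked directly. The A100 has compute capability 8.0 (sm_80), and CUDA toolkits before 11.0 do not target it. Below is a minimal sketch of such a check; build_supports_device and parse_sm are hypothetical helper names, and the check only considers precompiled sm_ binaries, ignoring PTX just-in-time compilation.

```python
def parse_sm(arch):
    """Turn an architecture tag such as 'sm_80' into (major, minor) = (8, 0)."""
    digits = "".join(ch for ch in arch[len("sm_"):] if ch.isdigit())
    return int(digits[:-1]), int(digits[-1])


def build_supports_device(arch_list, capability):
    """Return True if any precompiled sm_ architecture can run on the device.

    A binary built for compute capability X.y runs on devices X.z with z >= y.
    Hypothetical helper; 'compute_' (PTX) entries are deliberately skipped.
    """
    major, minor = capability
    for arch in arch_list:
        if not arch.startswith("sm_"):
            continue
        a_major, a_minor = parse_sm(arch)
        if a_major == major and a_minor <= minor:
            return True
    return False


# Illustrative architecture list for an older build, checked against an A100.
old_build = ["sm_37", "sm_50", "sm_60", "sm_70"]
a100 = (8, 0)
supported = build_supports_device(old_build, a100)
```

With PyTorch installed, the two inputs can be read from torch.cuda.get_arch_list() and torch.cuda.get_device_capability(0). If the check fails, a PyTorch build for CUDA 11.0 or newer would be needed for this GPU.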

About the depth images

As mentioned in the paper, you built the depth dataset with the AdelaiDepth tool. The depth dataset is not currently in the repository, so the code cannot be run at this time. Will you share the depth dataset you built later? Looking forward to your reply.

about the inference.

Hi, I have some thoughts on your approach:
It seems that you ensemble the results of different models, which may cause an object pair to have more than one predicted relationship at test time. This does not meet the current unlimited SGG rules. As far as I know, all the methods you compared against are evaluated with the graph constraint, that is, only the top-1 result for each object pair is taken as the predicate in the final evaluation.

Excuse me, I can't find the file. Can you provide the file?

"depth_img_dir": "/visinf/home/gsudhakaran/scene_graphs/Depth-VRD/VG_depth_raw_full", #"/data/user/dataset/scene_graph/images/VG/vg_depth_1024/VG_100K", #"/path/home/user/scene_graphs/Depth-VRD/VG_depth_raw_full", #,

No depth map files found
