yejin0111 / add-gcn

ADD-GCN: Attention-Driven Dynamic Graph Convolutional Network for Multi-Label Image Recognition (ECCV 2020)

Home Page: https://arxiv.org/abs/2012.02994

Python 100.00%
add-gcn multi-label-image-classification pytorch computer-vision eccv2020

add-gcn's People

Contributors

yejin0111

add-gcn's Issues

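Question about the construction of the static graph

Hello, I have read the paper and the code, and I am confused about how the static graph is constructed.

  • According to the paper, the static graph is shared across all input data. In the code, however, the adjacency matrix of the static graph is also learned by the network, so it actually changes. I therefore do not understand whether the paper and the code fail to match here.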

ZeroDivisionError: float division by zero

File: util.py
Function: average_precision()
Question 1: A ZeroDivisionError (float division by zero) often occurs in the function in util.py that computes the average precision of each class. This may be because pos_count stays at zero, that is, the +1 operation is never performed.
How can I solve this?
Question 2: The check label == 1 appears in two places in the code of average_precision(). Is this an error?
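
One possible way to avoid the division by zero is to return a fixed value when a class has no positive samples. The sketch below is a plain-Python reimplementation written for illustration only, not the code from util.py. The name `average_precision_safe`, the list-based inputs, and the label encoding (1 positive, -1 negative, 0 difficult) are assumptions. It also illustrates that, if the two `label == 1` checks guard the positive count and the precision accumulation respectively, they can be merged into one branch without changing the result.

```python
def average_precision_safe(scores, targets, skip_difficult=True):
    """Average precision for one class, guarded against classes with no positives.

    scores  -- predicted scores, one per sample
    targets -- labels per sample: 1 positive, -1 negative, 0 difficult (assumed encoding)
    """
    # Rank sample indices by descending score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    pos_count = 0
    total_count = 0
    precision_sum = 0.0
    for i in order:
        label = targets[i]
        if skip_difficult and label == 0:
            continue  # difficult samples are ignored entirely
        total_count += 1
        if label == 1:
            pos_count += 1
            # Precision at this rank, accumulated only at positive samples.
            precision_sum += pos_count / total_count

    # Guard: with no positive samples AP is undefined; this sketch returns 0.0.
    if pos_count == 0:
        return 0.0
    return precision_sum / pos_count
```

Returning 0.0 is only one choice; skipping such classes when averaging into mAP may be more appropriate, depending on the dataset split.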

Inference pipeline

@Yejin0111 Hi, thanks for sharing the code base. I have a few queries:

  1. There is no test/inference .py file for running the model on an image or video. Can you provide one, or should we write it ourselves?
  2. The output of the model for a given image is a tensor like this:
    len(outputs[0])
    80
    outputs
    tensor([[-283.4771, 100.7004, -25.2342, -31.9534, 137.3260, -100.9781,
             42.4637, 287.5411, -41.0544, 32.0411, -62.8640, -84.3158,
             27.4881, -95.6437, -16.3595, 284.4416, 90.6528, 18.8440,
             145.0567, -112.4063, -55.4868, -88.7339, -18.2153, 41.3703,
             -2.9013, -51.1030, -53.5617, 11.8468, -112.8955, 77.1685,
             52.6853, 19.6216, -109.0874, -43.6641, -102.1985, 37.9849,
             172.5355, -3.5291, -218.5259, 155.2649, -61.3687, 13.4691,
             102.0848, 151.9249, 32.8274, -30.5526, -159.6230, 169.1592,
             136.3983, -139.8331, -97.2136, -105.8271, 97.7561, -17.9792,
             -11.9392, -105.7749, -169.9196, 119.8365, 47.2773, 79.9587,
             -38.9244, -117.4937, -27.4478, -82.6502, -29.5093, -16.8091,
             52.4464, -77.0061, 30.4399, 120.5782, 45.0289, -178.4777,
             -94.0661, 270.4139, 125.4486, -76.9411, -5.2130, -25.1496,
             -75.8140, -230.2440]], device='cuda:0')
    How do I obtain the output classes for this image?
  3. How can I visualize the intermediate layers as shown in the paper?
    Thanks in advance.
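
For the second question, a common approach in multi-label classification is to treat each of the 80 values as an independent logit, apply a sigmoid, and keep the classes whose probability exceeds a threshold. The sketch below assumes the model was trained with a sigmoid-based multi-label loss, which this thread does not confirm. The function `predict_labels`, the 0.5 threshold, and the placeholder class names are hypothetical. It uses plain Python so that it runs without a GPU; on a PyTorch tensor the equivalent thresholding is `torch.sigmoid(outputs) > 0.5`.

```python
import math

def predict_labels(logits, class_names, threshold=0.5):
    """Return (name, probability) pairs whose sigmoid probability exceeds threshold."""
    results = []
    for name, logit in zip(class_names, logits):
        # Numerically stable sigmoid for large-magnitude logits.
        if logit >= 0:
            prob = 1.0 / (1.0 + math.exp(-logit))
        else:
            e = math.exp(logit)
            prob = e / (1.0 + e)
        if prob > threshold:
            results.append((name, prob))
    # Most confident classes first.
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results

# First six logits from the output above, with placeholder class names.
logits = [-283.4771, 100.7004, -25.2342, -31.9534, 137.3260, -100.9781]
names = [f"class_{i}" for i in range(len(logits))]
predicted = predict_labels(logits, names)
print([name for name, _ in predicted])
```

In practice the placeholder names would be replaced by the dataset's category list in the same order used during training.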

effect of identity matrix

Hi, what is the effect of the following operations? Why not just take the mean of each class vector, for example:

out2=out2.mean(-1)

Thanks!
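
The screenshot attached to this issue is not preserved, so the exact operation being asked about is unknown. Assuming it multiplies a square (classes × classes) score matrix elementwise by an identity matrix and sums over the last axis, the toy sketch below shows how that differs from `mean(-1)`: the identity mask keeps only the diagonal entry of each row, while the mean mixes all entries of the row. The function names and the numbers are made up for illustration.

```python
def diagonal_via_identity(scores):
    """Multiply a square matrix elementwise by the identity and sum each row."""
    n = len(scores)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return [sum(scores[i][j] * identity[i][j] for j in range(n)) for i in range(n)]

def row_mean(scores):
    """Average every entry of each row, the analogue of mean(-1)."""
    return [sum(row) / len(row) for row in scores]

# Toy 3-class example: row i holds the scores computed from class i's feature vector.
scores = [[2.0, 0.5, -1.0],
          [0.0, 3.0, 1.5],
          [4.0, -2.0, 1.0]]
print(diagonal_via_identity(scores))  # [2.0, 3.0, 1.0]
print(row_mean(scores))               # [0.5, 1.5, 1.0]
```

Under this assumption, entry (i, i) is the score for class i predicted from class i's own vector, which is why the two reductions generally give different results.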

released

Hello, when will the source code be released?

AssertionError

Hello, thank you for your work!
Question: the following assertion often fails:
assert os.path.exists(model_dir) == True
AssertionError
How can I solve this?
Do you have a more detailed README.md? Which versions of CUDA and torchvision do you use?
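
The assertion fails whenever the path held in `model_dir` does not exist on disk, so the first thing to check is which path the script actually resolves. A minimal sketch of a more informative check follows; the helper `require_model_dir` is hypothetical and not part of the repository.

```python
import os

def require_model_dir(model_dir):
    """Fail with a descriptive error instead of a bare AssertionError."""
    if not os.path.isdir(model_dir):
        raise FileNotFoundError(
            f"Model directory not found: {os.path.abspath(model_dir)!r}. "
            "Create it or point the path argument at an existing directory."
        )
    return model_dir
```

Printing the absolute path makes it easier to spot a relative path that is being resolved against an unexpected working directory.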

Question about the ADD-GCN network

Hello. Thank you for sharing your excellent work!
Since I am new to computer vision and deep learning, I have some questions about the code.

z = v + z

Why is v added to the final z?

Another question:
Is it feasible to apply this method to regression problems?
If so, can you share some ideas on how to do it?

Looking forward to your response.

mAP in voc2007

How can I reach the top mAP (96%)?
My best mAP is only 92%. How can I close the gap?

args.seed = 1
args.lr = 0.05
args.image_size = 448
args.batch_size = 18 * gpu_num
args.epoch_step = [30, 40]

The test size is 576.

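How is the shape converted?

Hello, the paper says "we simply replace SAM with a Conv-LReLU block." The shape of X is H * W * C, while D-GCN requires the shape C*D. Convolution alone does not seem able to reduce the number of tensor dimensions, so how is the shape converted here?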

issue about s_m & s_r

Hi, thanks for your excellent work.

I noticed that both your paper and your code state, "we simply average s_m and s_r to produce the final scores s".
I then experimented with using only s_m as the final score, and the resulting mAP was similar or even higher.

Does this mean that the GCN plays little role in this work?
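
The comparison described here can be expressed as a weighted fusion of the two score vectors. The sketch below is illustrative only: `fuse_scores` and its `weight_m` parameter are hypothetical names, with `weight_m=0.5` giving the plain average quoted from the paper and `weight_m=1.0` giving the s_m-only variant tried in this issue.

```python
def fuse_scores(s_m, s_r, weight_m=0.5):
    """Weighted average of two per-class score lists; weight_m=0.5 is the plain mean."""
    if len(s_m) != len(s_r):
        raise ValueError("s_m and s_r must have one score per class")
    return [weight_m * m + (1.0 - weight_m) * r for m, r in zip(s_m, s_r)]
```

Evaluating mAP at `weight_m` values of 0.0, 0.5, and 1.0 would show how much each branch contributes on a given dataset.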

mAP in VOC2007

args.seed = 1
args.lr = 0.05
args.image_size = 448
args.batch_size = 16 * 2
args.epoch_step = [30, 40]
The test size is 576.
I followed the configuration above and used the model trained on COCO as the pre-trained model for Pascal VOC, but the best test mAP on VOC2007 is 94.04%. How can I overcome this?

Image-size and batch-size

On VOC2007, with an image size of 576x576 and a batch size of 18 (or 16) per GPU, I get an out-of-memory error. I am using 4 × 2080Ti GPUs with 11 GB each.

How much memory do your GPUs have?

About code references

As I don't see a license in your GitHub repository, may I quote and modify parts of your code in my own project?

My project is related to a face task. I would like to introduce the attention part and the GCN part into my method.
If allowed, I will cite the reference in the code.

Looking forward to your reply.

Setting cudnn.deterministic to False leads to faster and better training

Hi, thanks for this great work. I have found that setting cudnn.deterministic to False in line 49 of main.py makes training much faster and sometimes leads to better accuracy of the final model. I have observed this with COCO, VOC, and even my private dataset. Do you think this is a bug?
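
For reference, the two PyTorch flags involved are `torch.backends.cudnn.deterministic` and `torch.backends.cudnn.benchmark`. The sketch below shows one way to switch between a reproducible and a speed-oriented setup. The helper `configure_cudnn` is hypothetical, and how main.py sets these flags is not shown in this thread.

```python
import torch

def configure_cudnn(reproducible: bool) -> None:
    """Toggle cuDNN between reproducible and speed-oriented settings."""
    # deterministic=True restricts cuDNN to deterministic algorithms, which can be slower.
    torch.backends.cudnn.deterministic = reproducible
    # benchmark=True lets cuDNN auto-tune the fastest algorithm per input shape.
    torch.backends.cudnn.benchmark = not reproducible

# Speed-oriented setup, matching the change described in this issue.
configure_cudnn(reproducible=False)
```

A slowdown with deterministic mode enabled is expected behavior rather than a bug. Accuracy differences between runs may simply reflect run-to-run variance, so comparing several seeds would be needed before drawing conclusions.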
