messytable's People

Contributors

caizhongang, cunjunyu, junzhezhang, la-tale


messytable's Issues

Questions about detector performance and epipolar geometry

Dear authors, I have some questions about the detector performance and the epipolar geometry part.
Detector performance (Table 6 in Appendix)

  1. I want to use some detectors to generate bounding boxes. The original image size is 1920×1080; what input resolution did you use during training and testing in your paper? Is multi-scale training or testing adopted?
  2. When evaluating the detection mAP, is the PASCAL VOC metric used?
  3. When using generated bounding boxes to perform instance association, is the ASNet model trained on ground-truth boxes and then tested on generated ones, or do both ASNet training and testing use generated boxes?
    Epipolar geometry part ('def epipolar_soft_constraint' in utils.py)
  4. line 88: np.matmul(np.linalg.inv(T_b2r), T_a2r); I think it should be np.matmul(T_b2r, np.linalg.inv(T_a2r)). (See the convention sketch after this list.)
  5. line 121: bbox1_3dpt = np.matmul(np.linalg.inv(intrin1), np.array([*bbox1_2dpt, 1])). When transforming pixel coordinates to camera coordinates, I am not sure whether they can be multiplied directly, since the Z value in camera coordinates (X, Y, Z) is unknown.
  6. What is the meaning of find_foot? I didn't understand the calculation in lines 101~103.
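
For questions 4 and 5, here is a minimal self-contained sketch, not the authors' code. It assumes the convention that T_x2r maps camera-x coordinates into the reference frame (if the repo uses the opposite convention, the reversed order from question 4 would be the correct one), and it shows why multiplying by inv(intrin) yields a ray rather than a single 3D point.

    import numpy as np

    # Toy extrinsics; convention assumed here: p_r = T_a2r @ p_a.
    T_a2r = np.eye(4)
    T_b2r = np.eye(4)
    T_b2r[:3, 3] = [0.5, 0.0, 0.0]  # hypothetical 0.5 m offset for camera b

    # Under this convention, p_b = inv(T_b2r) @ T_a2r @ p_a, matching line 88.
    T_a2b = np.matmul(np.linalg.inv(T_b2r), T_a2r)

    # Question 5: inv(K) @ [u, v, 1] is a direction with Z normalized to 1.
    # Every point depth * ray projects to the same pixel, so the result is a
    # ray (sufficient for an epipolar constraint), not a single 3D point.
    K = np.array([[1000.0, 0.0, 960.0],  # hypothetical intrinsics
                  [0.0, 1000.0, 540.0],
                  [0.0, 0.0, 1.0]])
    ray = np.matmul(np.linalg.inv(K), np.array([1200.0, 600.0, 1.0]))
    print(T_a2b[:3, 3], ray)  # [-0.5 0. 0.] [0.24 0.06 1.]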

Thanks!

Error in code data.py

Hi,
Thanks for your great work! But your code here:

MessyTable/src/data.py, lines 88 to 94 at commit 97db60d:

for sec_crop_id, sec_crop_value in content['cameras'][sec_cam]['instances'].items():
    if a_crop_id != sec_crop_id and a_crop_value['subcls'] == sec_crop_value['subcls']:
        n_crop_id_subcls.append((scene, main_cam, sec_cam, a_crop_id, sec_crop_id))
    elif a_crop_id != sec_crop_id and a_crop_value['cls'] == sec_crop_value['cls']:
        n_crop_id_cls.append((scene, main_cam, sec_cam, a_crop_id, sec_crop_id))
    else:
        n_crop_id_others.append((scene, main_cam, sec_cam, a_crop_id, sec_crop_id))

The last else should be:

    elif a_crop_id != sec_crop_id:
        n_crop_id_others.append((scene, main_cam, sec_cam, a_crop_id, sec_crop_id))

shouldn't it?

Otherwise a_crop_id may be equal to sec_crop_id, yet the pair is still recorded as a negative pair. You can check in the final data samples that n_id, a_id, and p_id are all equal to one another in some triplets.
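
A minimal self-contained reproduction of the control-flow issue, with hypothetical ids and labels rather than the real dataset:

    # Anchor crop and two candidates; candidate 1 is the anchor itself.
    a_crop_id, a_subcls, a_cls = 1, 'cola', 'drink'
    candidates = [(1, 'cola', 'drink'), (2, 'chips', 'snack')]
    others = []
    for sec_crop_id, sec_subcls, sec_cls in candidates:
        if a_crop_id != sec_crop_id and a_subcls == sec_subcls:
            pass  # subcls-level negative
        elif a_crop_id != sec_crop_id and a_cls == sec_cls:
            pass  # cls-level negative
        else:
            # The anchor (sec_crop_id == 1) falls through to this bare else,
            # so it is wrongly recorded as a negative of itself.
            others.append(sec_crop_id)
    print(others)  # [1, 2] -- entry 1 is the bug

Guarding the final branch with elif a_crop_id != sec_crop_id, as suggested above, skips the anchor instead.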

Can I infer the ASNet model on my own dataset?

Hi there, thank you for sharing the code.
I have a multi-view dataset for detection, and I am interested in instance association using the ASNet model.
Can you give me some instructions on running inference with this model on a custom dataset?

Thanks a lot.

ImageNet dataset in paper

Hi,
Thanks for sharing your great work!

I tried to find the ImageNet dataset that contains multi-view images, but I can't tell which ImageNet dataset it is.
Could you let me know which ImageNet dataset you use?
And is there a pretrained model and configuration for the ImageNet dataset?

Thank you very much!

Two questions about the code

Hi, thank you for your good code. I have two questions. Could you give me some help?
First, is this a bug?
[screenshot]
Second, if pretrained=True, can I skip this line of code:
app.load_state_dict(torch.load('resnet18.pth'))?
Also, I cannot find resnet18.pth; is it downloaded from PyTorch?
[screenshot]
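
For context, a minimal sketch (not the repo's code) of the two common ways a ResNet-18 backbone is initialized in PyTorch; whether the repo's resnet18.pth is simply the saved torchvision checkpoint is exactly what this issue asks, so that part is an assumption.

    import torch
    import torchvision.models as models

    # Option 1: let torchvision download the ImageNet weights itself;
    # no local resnet18.pth file is needed in this case.
    backbone = models.resnet18(pretrained=True)

    # Option 2: load an explicit local checkpoint, as the repo's line does.
    # 'resnet18.pth' is assumed here to be the torchvision state dict
    # saved to disk beforehand, e.g.:
    #   torch.save(backbone.state_dict(), 'resnet18.pth')
    backbone.load_state_dict(torch.load('resnet18.pth'))

If the checkpoint matches option 2's assumption, the two paths yield the same weights and the explicit load is redundant when pretrained=True.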

Object name that subclass id represents

Hi,
Thanks for sharing your great work!

I am trying to use your dataset for object detection.
However, as far as I can tell, there is no class name (such as Coca-Cola, Pepsi, nachos, etc.) for each subclass id in the annotations.
Could you let me know what object each subclass id represents?

Thank you very much!

Some questions about the code (including the KM solver implementation)

Dear author,
Thanks for your excellent work. I have some questions about the code.
Test:

  1. I tested with your uploaded ASNet weights and got 0.525 AP and 0.214 FPR, which differs from the paper (AP: 0.524; FPR: 0.209). Are the weights different from the ones used in the paper, or should I change some parameters in the config file to reproduce the paper's results?
  2. If possible, could you please provide the C++ code for KMSolver? I hope to wrap it for a Python 3.6 environment. (A SciPy-based alternative is sketched after this list.)
    Train:
  3. How many GPU-hours does training take with your default settings?
  4. I don't understand the value settings of cam_selected_num (8) and triplet_batch_size (512, 8 GPUs). It seems that during training only part of the training data is used: 8 pairs out of 72 camera pairs, and 512 triplets from the set of triplets obtained from those 8 pairs. Why not use the full set? Does this follow other ReID tasks?
  5. Did you use data augmentation during training? I see that the default setting is false. Does that mean data augmentation harms performance? And if there is no rotation augmentation strategy, then in class ASNet we don't need the rotate function during the forward pass, is that right?
    Minor questions:
  6. In data.py, in 'def prepare_scenes(config)', it seems that 'content' is the same as 'scene_labels'; what is the purpose of lines 34~36?
  7. In utils.py, lines 32~33: img_feat = img_copy[int(max(0,y1-h*(zoomout_ratio-1)/2)):int(min(max_y-1,y2+1+h*(zoomout_ratio-1)/2)), int(max(0,x1-w*(zoomout_ratio-1)/2)):int(min(max_x-1,x2+1+w*(zoomout_ratio-1)/2)), :]. I think max_y-1 and max_x-1 should be max_y and max_x, since NumPy slice end indices are exclusive, so clipping at max_y-1 drops the last row and column.
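
For question 2, a SciPy-based sketch that solves the same assignment problem as a KM (Kuhn-Munkres) solver; the similarity matrix below is a toy stand-in, not the repo's data.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    # Toy pairwise matching scores between instances in two views.
    scores = np.array([[0.9, 0.1, 0.3],
                       [0.2, 0.8, 0.4],
                       [0.3, 0.5, 0.7]])

    # KM maximizes the total matching score; linear_sum_assignment does
    # the same with maximize=True (it minimizes cost by default).
    rows, cols = linear_sum_assignment(scores, maximize=True)
    print(rows, cols)  # [0 1 2] [0 1 2]: instance i in view A matches i in view B

This avoids wrapping C++ entirely, at the cost of SciPy's generic implementation rather than a tuned solver.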

Thanks!

MessyTable dataset marker corners

I noticed that the annotation data you provide includes 6x4 marker corners: six markers per scene, each with four corner points labeled in clockwise order. Do the six markers follow a fixed order? How can I match the corners across different views of the same scene?

Visualization of feature map activations and evaluation by angle

Hi,
Thanks for sharing your great work!

I have a few questions about your great work.

First, could you share the code for visualizing the feature map activations, or point me to reference code for it?

Second, when evaluating the metric with angle difference, do you select only the camera pairs that satisfy the angle difference for evaluation?

Thank you!
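
Regarding the first question, a generic sketch of feature-map visualization via PyTorch forward hooks; this is a common technique, not necessarily the authors' method (their code is what the issue asks for), and the layer choice and channel-mean reduction are illustrative assumptions.

    import torch
    import torchvision.models as models
    import matplotlib.pyplot as plt

    model = models.resnet18(pretrained=True).eval()
    activations = {}

    def save_activation(name):
        def hook(module, inputs, output):
            activations[name] = output.detach()
        return hook

    # Capture the output of layer4 (an assumed choice) on each forward pass.
    model.layer4.register_forward_hook(save_activation('layer4'))

    x = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed image
    with torch.no_grad():
        model(x)

    # Reduce the [C, H, W] feature map to [H, W] by averaging over channels
    # and render it as a heatmap.
    fmap = activations['layer4'][0].mean(dim=0)
    plt.imshow(fmap.numpy(), cmap='viridis')
    plt.title('layer4 mean activation')
    plt.savefig('activation.png')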

Redundant data in train dataset and performance issue

Hi,
Thanks for sharing your great work!

First, when I try to train ASNet with 1 GPU, there is an error in the dataset.
I found that there is redundant data (entries that contain no data) in train_set, and train_set's length is 5014 (when using 1 GPU), which is not a multiple of 9.
So I wonder whether train_set's length should be 5013 or 5014.

Second, when I evaluate the pretrained model from Google Drive, its performance is higher than in your paper
(AP: 0.525 for plain evaluation / AP: 0.719 for ESC, which is far higher than the paper).
So I wonder whether this is correct or there is something wrong in the code.

Finally, could you let me know which kind of GPUs you used for the 8-GPU setup?
Thank you!

One error about training the model using one gpu

Sorry to bother you. I'm trying to train the model with one GPU. I only changed the location where the dataset is stored, but there is always a data-read error (I think) during training: the variable samples_a_crop is an empty list. Can you give me some advice on solving this error? Thank you.

