
rpg_event_representation_learning's People

Contributors

danielgehrig18


rpg_event_representation_learning's Issues

Question about optical flow estimation

Hi,
Thanks for sharing your code. I was studying your paper and tried to implement the optical flow estimation part myself, but failed. Would it be possible for you to share the code for the optical flow estimation part? Looking forward to your reply, I would really appreciate it. Thanks a lot!

Little confusion

Hi!
I really appreciate that your great work, including the code and dataset, has been released. Going through the paper and the released code, I have two small points of confusion about the implementation.

  1. The kernel function is learned by an MLP which, according to the paper, takes the coordinates and timestamp of an event as input and produces an activation map around it. In the code this seems to correspond to the "ValueLayer" implementation, which only takes the timestamp t as input and produces the measurement directly. I can't figure out how the two match. If this implementation is a modified version, does the change improve the final performance? (See the sketch below.)
  2. About the look-up table: t is assigned a float value, so there are countless possible values. I am still confused about this part of the implementation, which does not appear in the released code.

I would really appreciate a reply; it would be very helpful for my current research.
Best wishes
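
For reference, a minimal sketch (not the repository's actual ValueLayer, just an illustration of the idea being asked about) of an MLP kernel that takes only a normalized event timestamp and outputs that event's contribution to one temporal channel:

import torch
import torch.nn as nn

class MLPKernel(nn.Module):
    # Toy per-event kernel: maps a normalized timestamp to a weight.
    # Illustrative only; see the ValueLayer in models.py for the real implementation.
    def __init__(self, hidden=30):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(1, hidden),
            nn.LeakyReLU(0.1),
            nn.Linear(hidden, 1),
        )

    def forward(self, t):
        # t: (N,) timestamps normalized to [-1, 1]
        return self.mlp(t[:, None]).squeeze(-1)  # (N,) kernel values

# usage: weight 1000 random timestamps
weights = MLPKernel()(torch.rand(1000) * 2 - 1)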

Questions about optical flow prediction.

Hi,

Thanks for all the effort you have put into this. Recently, I have been trying to migrate the released code to the optical flow prediction task, but a bunch of CUDA errors occurred that are hard to solve. I am wondering if it would be possible to share the code for optical flow prediction? I would appreciate it and look forward to hearing from you.

Best wishes,

Tony

See the representation of events

Hi,
Since training and testing only report the accuracy, the learned representations of the events themselves are never shown.
Is there any way to visualize the learned representations of events?
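
One way to inspect them (a sketch only, assuming the classifier exposes its quantization layer as an attribute such as model.quantization_layer; check models.py for the exact name) is to call that layer directly on an event batch and plot the channels of the resulting voxel grid:

import torch
import matplotlib.pyplot as plt

# `model` and `events` are assumed to come from the repository's testing
# script; the attribute name below is an assumption, not confirmed.
model.eval()
with torch.no_grad():
    vox = model.quantization_layer(events).cpu()  # learned representation before the CNN

channels = vox[0]  # first sample; each channel is one 2-D slice of the voxel grid
fig, axes = plt.subplots(1, channels.shape[0], figsize=(2 * channels.shape[0], 2))
for i, ax in enumerate(axes):
    ax.imshow(channels[i], cmap="gray")
    ax.axis("off")
plt.show()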

Another question is about the code:
idx_before_bins = x \
                + W * y \
                + 0 \
                + W * H * C * p \
                + W * H * C * 2 * b
Here, what's the intention of doing this?
Thanks
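
(A possible reading, stated as an assumption rather than an official answer: x, y, p, b are the per-event pixel coordinates, polarity and sample index, and W, H, C are the sensor width, height and number of time bins, so the sum flattens the 5-D coordinate (b, p, c, y, x) into a single linear index that put_(..., accumulate=True) can scatter into a flat voxel tensor; the 0 is a placeholder for the time-bin offset W * H * i_bin added later.) A small self-contained check of that index arithmetic:

import torch

# Illustrative only: the flat index addresses a row-major tensor of shape (B, 2, C, H, W).
B_, P_, C_, H_, W_ = 4, 2, 9, 180, 240
vox = torch.zeros(B_ * P_ * C_ * H_ * W_)

# one fictitious event at pixel (x, y), time bin c, polarity p, sample b
x, y, c, p, b = 10, 20, 3, 1, 2
idx = x + W_ * y + W_ * H_ * c + W_ * H_ * C_ * p + W_ * H_ * C_ * 2 * b
vox.put_(torch.tensor([idx]), torch.tensor([1.0]), accumulate=True)

# the same entry seen through the 5-D view
assert vox.view(B_, P_, C_, H_, W_)[b, p, c, y, x] == 1.0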

Confusion about the optical flow estimation

Hi! Thanks for sharing your work. I ran into a problem when doing the optical flow estimation experiment: an out-of-range error occurred at the line
vox.put_(idx.long(), values, accumulate=True)
I checked this issue and found that it was a problem with the samples I generated from the MVSEC dataset. I aggregated the events of the left and right event cameras and extracted the events every 20 ms as one sample. Is this what your article describes? How are the correct samples generated?
Looking forward to your reply! Thanks!
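
(Not a diagnosis of this specific case, just a common cause of such out-of-range indices: events whose pixel coordinates fall outside the voxel grid resolution the model was configured with. A minimal pre-filtering sketch, assuming events are stored as (x, y, t, p) rows:)

import numpy as np

def clip_events_to_resolution(events, H=180, W=240):
    # keep only events whose coordinates fit inside the (H, W) voxel grid
    x, y = events[:, 0], events[:, 1]
    mask = (x >= 0) & (x < W) & (y >= 0) & (y < H)
    return events[mask]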

Several questions on implementation

Hi,

Thanks for sharing your reference implementation. I was studying your code and paper and came up with several questions I could not answer by referring back to the paper. I was wondering if I could get your feedback on them. The questions are as follows:

  1. What is the purpose of concatenating the index of every event in the loader? (See the sketch at the end of this issue.)

ev = np.concatenate([d[0], i*np.ones((len(d[0]),1), dtype=np.float32)],1)

  2. What is the trilinear kernel initialization, and how should one modify (retrain) it when using another dataset?

  3. What is C in the voxel dimensions and why is its value set to 9?

voxel_dimension=(9,180,240), # dimension of voxel will be C x 2 x H x W

It seems, based on line 114 in models.py, that this is the number of bins, which is referred to as B in the paper. However, line 94 of the same file has a variable B, and it is not obvious to me how the calculation performed on it represents the bin size.

B = int((1+events[-1,-1]).item())

for i_bin in range(C):

  4. How should the C value be changed for event record lengths greater than 100 ms? I have a dataset with record lengths of about 1 s. If C is the number of bins, it might make sense that a greater C is necessary for longer record lengths, right?

  5. What is the purpose of the crop_and_resize_to_resolution method? Given that its output is the input to the classifier, it seems that its only purpose is to satisfy the requirement that the classifier's input be at least 224x224. Is this correct?

  6. How would you suggest processing an RGB event camera input with the learnable representation network you propose? One way that occurs to me would be to create three networks, one per channel, and combine their outputs in another hidden layer whose weights would also be learnt. This is of course not efficient and does not leverage the trilinear kernel and the information structure.

I know these are several questions and I hope they are not a burden; I would appreciate your hints and your time.

Thanks!
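
(Regarding questions 1 and 3, a short illustration under the assumption that the last column appended in the loader is the sample index: the B recovered in models.py is then the number of samples in the mini-batch, while C is the number of temporal bins, 9 by default.)

import numpy as np
import torch

# toy batch of two samples, each a small (num_events, 4) array of (x, y, t, p)
sample_events = [np.random.rand(5, 4).astype(np.float32),
                 np.random.rand(3, 4).astype(np.float32)]

# what the loader does: append the sample index as an extra column, then stack
batch = np.concatenate([
    np.concatenate([e, i * np.ones((len(e), 1), dtype=np.float32)], 1)
    for i, e in enumerate(sample_events)
])
events = torch.from_numpy(batch)

# B as recovered inside the model: number of samples in the mini-batch,
# not the number of temporal bins (that is C)
B = int((1 + events[-1, -1]).item())
print(B)  # -> 2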

Enquiry on Trilinear Kernel

Dear author,
I am interested in your paper, but I do not understand what the trilinear kernel is.
I found the implementation in models.py, where there is a step that initializes the kernel.

def trilinear_kernel(self, ts, num_channels):
        gt_values = torch.zeros_like(ts)

        gt_values[ts > 0] = (1 - (num_channels-1) * ts)[ts > 0]
        gt_values[ts < 0] = ((num_channels-1) * ts + 1)[ts < 0]

        gt_values[ts < -1.0 / (num_channels-1)] = 0
        gt_values[ts > 1.0 / (num_channels-1)] = 0

        return gt_values

However, I cannot figure out the meaning of the above code. It should give the ground-truth values of a trilinear kernel.
I then wrote a small demo to test it, as follows:

import torch

def trilinear_kernel(ts, num_channels):
    gt_values = torch.zeros_like(ts)

    gt_values[ts > 0] = (1 - (num_channels - 1) * ts)[ts > 0]
    gt_values[ts < 0] = ((num_channels - 1) * ts + 1)[ts < 0]

    gt_values[ts < -1.0 / (num_channels - 1)] = 0
    gt_values[ts > 1.0 / (num_channels - 1)] = 0

    return gt_values

ts = torch.rand((1, 2000))
ts.uniform_(-1, 1)
print(trilinear_kernel(ts,9))

The problem is that the result is all zeros, no matter how many times I run it.
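
(An observation, not the authors' explanation: with num_channels = 9 the kernel is non-zero only for |ts| < 1/(num_channels - 1) = 0.125, so uniform samples in [-1, 1] are mostly mapped to zero. Evaluating the demo function above on a dense grid inside that window shows the triangular shape:)

import torch

# |ts| must be below 1/8 = 0.125 for the kernel to be non-zero
ts = torch.linspace(-0.2, 0.2, 8)
print(trilinear_kernel(ts, 9))
# output rises linearly toward 1 as |ts| approaches 0 and is zero for |ts| >= 0.125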

Request for N-Caltech Data Splitting Details

I am trying to obtain a one-to-one correspondence between the Event data from N-Caltech and the Image data from Caltech. However, it appears that you have renamed files in the validation and testing sets of N-Caltech. To maintain consistency with your data split, could you share the details of your split, such as in a JSON file format? Thank you!

Question about multi-GPU training

Hi,

Great work! Thanks for sharing your code. Since the data format is not a standard mini-batch tensor of shape BatchID x Channel x Width x Height, I am wondering whether the model can be trained on multiple GPUs. If it can, how should I implement it?

Looking forward to your reply, any help would be appreciated. Thanks a lot!

Best wishes,

Siqi Li
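
(Not an official answer, just one possible direction: because the batch dimension here is the flat event dimension with an appended sample index, torch.nn.DataParallel's default split along dim 0 would cut samples apart, so a manual split by sample index may be needed before dispatching chunks to different devices. A rough sketch:)

import torch

def split_events_by_sample(events, num_chunks):
    # events: (N, 5) tensor with columns (x, y, t, p, sample_id),
    # as produced by the repository's collate function
    sample_id = events[:, -1].long()
    num_samples = int(sample_id.max().item()) + 1
    bounds = torch.linspace(0, num_samples, num_chunks + 1).round().long()
    chunks = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        mask = (sample_id >= lo) & (sample_id < hi)
        chunk = events[mask].clone()
        chunk[:, -1] -= float(lo)  # re-base sample ids so each chunk starts at 0
        chunks.append(chunk)
    return chunks

# each chunk can then be moved to its own device and the per-chunk
# outputs/losses gathered back on the main GPU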

Test accuracy lower with higher batch size

Hello!
I was testing a trained model, and found that the testing accuracy reported by the code is lower for bigger batch sizes.

For instance:
python testing.py --test $TEST_PATH --checkpoint log/model_best_caltech101.pth --batch_size 12 --num_workers 12
reports a noticeably higher accuracy than:
python testing.py --test $TEST_PATH --checkpoint log/model_best_caltech101.pth --batch_size 48 --num_workers 12

Any idea why this might be happening?

Edit:
It gets weirder: running the evaluation again reports a different accuracy each time (all tested with --batch_size 48).
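
(Not a confirmed diagnosis, just one frequent cause of batch-size-dependent numbers: averaging per-batch accuracies weights an incomplete last batch the same as full batches, whereas counting correct predictions over all test samples gives a batch-size-independent figure. A sketch of the latter, assuming the model's first output holds the class scores and the loader yields (events, labels):)

import torch

correct, total = 0, 0
with torch.no_grad():
    for events, labels in test_loader:    # test_loader and model are assumed to exist
        scores = model(events)[0]         # assumption: class scores are the first output
        pred = scores.argmax(dim=-1).cpu()
        correct += (pred == labels).sum().item()
        total += labels.numel()

print(f"test accuracy: {correct / total:.4f}")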

Validation accuracy lower than expected.

Hi,

I trained the model for 100 epochs with the default parameters provided in your GitHub code.
I observe that my training loss decreases as expected.

However, the validation loss increases with epoch. The validation accuracy is also close to 0.26, much lower than 0.817. I used the N-Caltech101 dataset as mentioned in this repo.

Is there something I am doing wrong? Please let me know.

