Thank you for sharing your code. I am trying to reproduce and understand it.
It would be a great help if you could explain how to run inference on custom inputs, i.e., providing an image, a query, and a dialog, and generating the answer. Thank you.
Hello author, although your paper was published several years ago, I still think it is very important, so I have recently been reading your code.
I have one question about the code and would like your advice. At the end of the decoder, you apply log-softmax to compute the log-probabilities of the candidate answers and then pass the result to nn.CrossEntropyLoss. As far as I know, nn.CrossEntropyLoss already applies log-softmax internally. What is the purpose of applying it twice?
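
To make the question concrete, here is a minimal sketch in plain Python (not taken from the repository; the function names and the example scores are my own, and cross_entropy only mimics the single-sample behaviour I understand nn.CrossEntropyLoss to have). It suggests that applying log-softmax before the loss does not change the loss value, because log-softmax applied to log-probabilities returns them unchanged, so the extra call looks redundant rather than harmful:

```python
import math

def log_softmax(scores):
    # Numerically stable log-softmax over a list of scores.
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]

def cross_entropy(scores, target):
    # Assumed single-sample behaviour of a cross-entropy loss that
    # applies log-softmax internally and then takes the negative
    # log-probability of the target class.
    return -log_softmax(scores)[target]

# Hypothetical scores for four candidate answers; index 3 is the correct one.
logits = [2.0, -1.0, 0.5, 3.0]
target = 3

# Loss computed directly on the raw scores.
loss_direct = cross_entropy(logits, target)

# Loss computed after an extra log-softmax, as in the decoder.
loss_double = cross_entropy(log_softmax(logits), target)

print(loss_direct, loss_double)
```

If the two values match in the real model as well, is the extra log-softmax kept only so that the decoder output can be read directly as log-probabilities at evaluation time?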