Comments (9)
Hi, we're curious about the test-std file format.
Will there be retrieval_candidates files and files containing API information for test-std?
Since the team model entry deadline is getting closer, and we don't have the detailed format of the test-std files, we are afraid that our code (which runs directly on the devtest set) may not run directly on the test-std files.
Are we allowed to modify the preprocessing code after Sep. 28 to run on test-std (only changing some details so that it runs successfully, without changing the models)?
from simmc.
Hello @billkunghappy,
Thanks for raising this concern.
Regarding the comment, your understanding is correct.
(a) In the API calls file, the last round on which evaluation is to be performed will be excluded.
(b) For retrieval candidates, the last round will contain the retrieval candidates but will not contain the gt_index field that gives the index of the ground truth response.
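Code that scores retrieval candidates on devtest can crash on teststd-format files if it indexes gt_index unconditionally. A minimal defensive sketch, using a toy stand-in shaped loosely like the devtest retrieval-candidates files (the field names are illustrative assumptions, not the official teststd schema):

```python
import json

# Toy stand-in for the public format described above: every round lists
# candidate indices, but the last (evaluation) round omits "gt_index".
sample = json.loads("""
{
  "retrieval_candidates": [
    {
      "dialogue_idx": 0,
      "candidates": [
        {"turn_idx": 0, "retrieval_candidates": [3, 7, 1], "gt_index": 0},
        {"turn_idx": 1, "retrieval_candidates": [5, 2, 9]}
      ]
    }
  ]
}
""")

def gt_indices(dialogue):
    """Return gt_index per round, with None where it is withheld."""
    return [rnd.get("gt_index") for rnd in dialogue["candidates"]]

labels = gt_indices(sample["retrieval_candidates"][0])
```

Using `dict.get` instead of `dict[...]` makes the withheld last round show up as `None` rather than a `KeyError`.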
With respect to the files, I just realized that the suffixes public and private got switched for API calls. While I will fix these later, please take a look at the wrongly named furniture_devtest_dials_api_calls_teststd_format_private.json (should be public) and the corresponding file for fashion.
Hope this helps!
Hello @seo-95 ,
Thanks for raising these two important concerns. We've updated the API calls file to include these two images for the last turn on which evaluation is performed. Of course, you're not allowed to use the ground truth API calls for subtask 1, but you can use them for subtask 2 (as per the table).
Apologies for the confusion earlier, hope this addresses your concerns.
Same issue here. The ground truth action is missing from the test-std draft file. Additionally, I can't even find the focus item id for each turn; only partial information about the item is reported inside visual_objects, and sometimes it is not sufficient.
{
"domain": "fashion",
"visual_objects": {
"OBJECT_2": {
"hemLength": [
"mini",
"knee_length"
],
"pattern": [
"chevron",
"animal"
],
"pos": "focus",
"skirtStyle": [
"peplum",
"a_line",
"body_con",
"loose",
"fit_and_flare"
],
"embellishments": [
"pleated"
],
"type": "skirt"
}
},
"system_transcript": "Here is the skirt from Pedals & Gears. It retails for $124 and is rated at 3.96.",
"turn_idx": 1,
"belief_state": {},
"transcript": "sure"
}
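For what it's worth, the only focus signal in a turn like the one above is the "pos": "focus" marker inside visual_objects. A small helper can pull it out and make its absence explicit (a sketch against the JSON shape quoted above, not an official challenge API):

```python
def focus_object(turn):
    """Return (name, attributes) of the object marked "pos": "focus" in a
    turn's visual_objects, or None when no focus annotation is present."""
    for name, obj in turn.get("visual_objects", {}).items():
        if obj.get("pos") == "focus":
            return name, obj
    return None

# The turn quoted above, abbreviated to the relevant fields
turn = {
    "domain": "fashion",
    "visual_objects": {"OBJECT_2": {"pos": "focus", "type": "skirt"}},
    "turn_idx": 1,
}
found = focus_object(turn)
```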
Hi all, sorry that this info was not included, we will look into this and release a new file soon.
Please do note though that we are providing {domain}_devtest_dials_teststd_format_public.json just as a guide before we release the future test-std set, and the results for Phase 1 should be reported on the {domain}_devtest_dials.json, released earlier.
Hello all, sorry for not including this information before.
For the test-std split, we will also release the corresponding API calls and retrieval candidates (public versions, which exclude the last round on which evaluation is done) in the format of the corresponding existing files.
To allow checking for compatibility, we will now release the devtest API calls and retrieval candidates in this format.
Hello, we still have 2 questions about the submission.
First, for Challenge Phase 1 we should submit our devtest prediction files. The README of the simmc repo tells us to follow the submission instructions.
However, there is nothing there about submitting devtest results. Should we email you the devtest results? If so, to which email address, and in what format?
Secondly, in an earlier comment @satwikkottur said:
For the test-std split, we will also release the corresponding API calls and retrieval candidates (public versions, excludes the last round on which evaluation is done) in the format of corresponding existing files.
What does "excludes the last round on which evaluation is done" mean?
I thought it meant excluding the last round's API entry in each dialogue in the API calls file (excluding the last round of retrieval candidates would be odd, since we need those candidates to predict retrieval scores).
But when I checked fashion_devtest_dials_api_calls_teststd_format_public.json and furniture_devtest_dials_api_calls_teststd_format_public.json, both files do include the last-round API information corresponding to fashion_devtest_dials_teststd_format_public.json and furniture_devtest_dials_teststd_format_public.json.
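One quick way to verify this observation programmatically: for each dialogue, check whether the API-calls file carries an entry for the dialogue's final turn. The structures below are toy stand-ins shaped like the devtest files (one action list per dialogue); the real teststd schema may differ.

```python
# Toy stand-ins: one dialogue with 3 turns, and its API-calls entries.
dials = {"dialogue_data": [
    {"dialogue": [{"turn_idx": 0}, {"turn_idx": 1}, {"turn_idx": 2}]},
]}
api_calls = [
    [{"turn_idx": 0, "action": "SearchDatabase"},
     {"turn_idx": 1, "action": "None"},
     {"turn_idx": 2, "action": "SpecifyInfo"}],
]

def covers_last_turn(dialogue, actions):
    """True if the API-calls entries include the dialogue's final turn,
    i.e. the last (evaluation) round was NOT excluded."""
    last = dialogue["dialogue"][-1]["turn_idx"]
    return any(a["turn_idx"] == last for a in actions)

flags = [covers_last_turn(d, a)
         for d, a in zip(dials["dialogue_data"], api_calls)]
```

Run over the real files, a list of all-`True` flags would confirm that the last-round API information is still present.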
Hi, I have 2 questions raised by the test-std dataset release, regarding the response_generation task.
- Since the action and attributes annotations are not available for the current turn (differently from what was defined in TASK_INPUTS.md), are we able to slightly modify the code (not the model) to avoid using this information?
- Whenever we encounter a potential SearchMemory or SearchDatabase in the k-th turn (the one on which generation and action prediction are evaluated), we do not have the annotation about the new focus item (during the first phase of the challenge it was included in the dials_api JSON file). Since the wizard's response is conditioned on the item she/he is looking at, how can we generate such a response if we do not have information about the item?
An example below (dialogue 1902):
{
"domain": "fashion",
"visual_objects": {},
"system_transcript": "",
"turn_idx": 2,
"belief_state": {},
"transcript": "Show me another coat, but one that 212 Localts more."
}
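Until this is clarified, a pipeline that conditions generation on the focus item needs an explicit fallback for turns like the one above, where visual_objects is empty. A hedged sketch (the helper name and fallback policy are assumptions, not part of the challenge code):

```python
def focus_attributes(turn, fallback=None):
    """Attributes of the focused item for conditioning generation, or the
    given fallback when the turn (like the teststd draft example above)
    has empty visual_objects and thus no focus annotation."""
    for obj in (turn.get("visual_objects") or {}).values():
        if obj.get("pos") == "focus":
            return {k: v for k, v in obj.items() if k != "pos"}
    return fallback

bare_turn = {"domain": "fashion", "visual_objects": {},
             "turn_idx": 2, "belief_state": {}}
attrs = focus_attributes(bare_turn, fallback={})
```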
Related Issues (20)
- Incorrect evaluation script provided for MM-DST baseline
- Incorrect hyperparameters?
- Baseline results for API call prediction
- action_evaluation expected file format
- Question about retrieval evaluation
- Baseline results
- Bug in baseline? (missing sigmoid)
- Possible bugs in evaluation script in SubTask #1
- Are we allowed to use "turn_label" fields for subtasks 1-2?
- Question about Fashion attributes
- SubTask #3 evaluation lower-case issue
- Bug in mm_dst baseline
- Question about submission models
- Question about the new evaluation method for Tasks 1 & 2
- KeyError caused by ~teststd_dials_retrieval_candidates_public.json
- How to get images
- Bug in run scripts/preprocess_simmc.sh
- Question about mm_action_prediction/scripts/train_simmc_model.sh
- How can I get images of fashion items?