
Comments (9)

billkunghappy avatar billkunghappy commented on May 29, 2024 2

Hi, we're curious about the test-std file format.
Will there be retrieval_candidates files and files containing API information for test-std?
Since the team model entry deadline is getting closer, we are afraid that, without the detailed format of the test-std files, our code (which runs directly on the devtest set) may not run directly on the test-std files.

Are we allowed to modify the preprocessing code after Sep. 28 to run on test-std? (We would only change the details needed to run it successfully, without changing the models.)

from simmc.

satwikkottur avatar satwikkottur commented on May 29, 2024 1

Hello @billkunghappy,

Thanks for raising this concern.

Regarding the comment, your understanding is correct.
(a) In the API calls file, the last round on which evaluation is to be performed will be excluded.
(b) For retrieval candidates, the last round will contain the retrieval candidates but will not contain the gt_index field that gives the index of the ground truth response.
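
A minimal sketch of what (a) and (b) could look like in preprocessing code. It is illustrative only and is not the organizers' script: the function name `to_public_format` and the per-round list layout are assumptions, and only the `gt_index` field name comes from this thread.

```python
import copy


def to_public_format(api_rounds, retrieval_rounds):
    """Hide evaluation-round ground truth for one dialog.

    api_rounds: per-round API-call dicts (hypothetical layout).
    retrieval_rounds: per-round dicts holding the candidates and 'gt_index'.
    """
    # (a) Drop the last round, the one on which evaluation is performed.
    public_api = copy.deepcopy(api_rounds[:-1])

    # (b) Keep the last round's candidates but remove its ground-truth index.
    public_retrieval = copy.deepcopy(retrieval_rounds)
    if public_retrieval:
        public_retrieval[-1].pop("gt_index", None)

    return public_api, public_retrieval
```

The inputs are deep-copied so the private versions stay untouched.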

With respect to the files, I just realized that the suffixes public and private got switched for API calls. While I will fix these later, please take a look at the wrongly named furniture_devtest_dials_api_calls_teststd_format_private.json (should be public) and the corresponding file for fashion.

Hope this helps!


satwikkottur avatar satwikkottur commented on May 29, 2024 1

Hello @seo-95 ,

Thanks for raising these two important concerns. We've updated the API calls file to include these two images for the last turn on which evaluation is performed. Of course, you're not allowed to use the ground truth API calls for subtask-1 but can use these for subtask-2 (as per the table).

Apologies for the confusion earlier, hope this addresses your concerns.


tsehsuan1102 avatar tsehsuan1102 commented on May 29, 2024

[Image]


seo-95 avatar seo-95 commented on May 29, 2024

Same issue here. The ground truth action is missing from the test-std draft file. Additionally, I can't find the focus item id for each turn; only partial information about the item is reported inside visual_objects, and sometimes it is not sufficient.

                {
                    "domain": "fashion", 
                    "visual_objects": {
                        "OBJECT_2": {
                            "hemLength": [
                                "mini", 
                                "knee_length"
                            ], 
                            "pattern": [
                                "chevron", 
                                "animal"
                            ], 
                            "pos": "focus", 
                            "skirtStyle": [
                                "peplum", 
                                "a_line", 
                                "body_con", 
                                "loose", 
                                "fit_and_flare"
                            ], 
                            "embellishments": [
                                "pleated"
                            ], 
                            "type": "skirt"
                        }
                    }, 
                    "system_transcript": "Here is the skirt from Pedals & Gears. It retails for $124 and is rated at 3.96.", 
                    "turn_idx": 1, 
                    "belief_state": {}, 
                    "transcript": "sure"
                }
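
For reference, a small sketch of how the focus item can be picked out of a turn like the one above. It assumes only the keys visible in that example (`visual_objects` and `pos`); the helper name `find_focus_objects` is hypothetical.

```python
def find_focus_objects(turn):
    """Return the visual objects of a turn whose 'pos' is 'focus'."""
    visual_objects = turn.get("visual_objects") or {}
    return {
        name: attrs
        for name, attrs in visual_objects.items()
        if isinstance(attrs, dict) and attrs.get("pos") == "focus"
    }


turn = {
    "domain": "fashion",
    "visual_objects": {"OBJECT_2": {"pos": "focus", "type": "skirt"}},
    "turn_idx": 1,
    "belief_state": {},
    "transcript": "sure",
}
focus = find_focus_objects(turn)
```

This returns only the attributes listed in the turn, so it does not recover the item id that is missing from the draft file.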


shanemoon avatar shanemoon commented on May 29, 2024

Hi all, sorry that this info was not included. We will look into this and release a new file soon.

Please do note though that we are providing {domain}_devtest_dials_teststd_format_public.json just as a guide before we release the future test-std set, and the results for Phase 1 should be reported on the {domain}_devtest_dials.json, released earlier.


satwikkottur avatar satwikkottur commented on May 29, 2024

Hello all, sorry for not including this information before.

For the test-std split, we will also release the corresponding API calls and retrieval candidates (public versions, excludes the last round on which evaluation is done) in the format of corresponding existing files.

In order to check for compatibility, we will now release devtest API calls and retrieval candidates in this format.


billkunghappy avatar billkunghappy commented on May 29, 2024

Hello, there are still two questions about the submission.
First, for Challenge Phase 1, we should submit our devtest prediction files. The README of the simmc repo tells us to follow the instructions in the Submission instruction.
However, the Submission instruction says nothing about submitting devtest results. Should we email you the devtest results? If so, to which email address, and in what submission format?

Second, in the previous comment, @satwikkottur said:
"For the test-std split, we will also release the corresponding API calls and retrieval candidates (public versions, excludes the last round on which evaluation is done) in the format of corresponding existing files."
What does "excludes the last round on which evaluation is done" mean?
I thought it meant excluding the last-round API call of each dialogue in the API calls file (excluding the last round of retrieval candidates would be odd, since we need those candidates to predict the retrieval candidate scores).
But when I check fashion_devtest_dials_api_calls_teststd_format_public.json and furniture_devtest_dials_api_calls_teststd_format_public.json, both files do include the last-round API information corresponding to fashion_devtest_dials_teststd_format_public.json and furniture_devtest_dials_teststd_format_public.json.


seo-95 avatar seo-95 commented on May 29, 2024

Hi, I have two questions about the response_generation task, raised by the test-std dataset release.

  1. Since the action and attribute annotations are not available for the current turn (unlike what was defined in TASK_INPUTS.md), may we slightly modify the code (not the model) to avoid using this information?

  2. Whenever we encounter a potential SearchMemory or SearchDatabase in the k-th turn (the one on which the generation and the action prediction are evaluated), we do not have the annotation for the new focus item (during the first phase of the challenge it was included in the dials_api JSON file). Since the wizard's response is conditioned on the item she/he is looking at, how can we generate such a response without information about the item?
    An example is given below (dialogue 1902):

                {
                    "domain": "fashion", 
                    "visual_objects": {}, 
                    "system_transcript": "", 
                    "turn_idx": 2, 
                    "belief_state": {}, 
                    "transcript": "Show me another coat, but one that 212 Localts more."
                }
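
To illustrate the kind of code-only change asked about in question 1, here is a hedged sketch that builds a generation context and simply skips annotations that are absent. The function name `build_context` and the "User: / Action:" text layout are hypothetical and not part of the challenge code.

```python
def build_context(turn, action=None):
    """Assemble a text context, skipping annotations that are absent."""
    parts = ["User: " + turn.get("transcript", "")]
    # The ground-truth action is not available for the evaluated turn,
    # so it is appended only when the caller actually has one.
    if action:
        parts.append("Action: " + action)
    # visual_objects may be empty, as in the example above.
    for name, attrs in sorted((turn.get("visual_objects") or {}).items()):
        parts.append(name + ": " + str(attrs.get("type", "unknown")))
    return " | ".join(parts)
```

With the turn above and no action, the context reduces to the user transcript alone, which is exactly the missing-information problem described in question 2.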

