
📊 Exploratory Query Evaluation Dataset

📋 Overview

Welcome to the Exploratory Query Evaluation Dataset, the evaluation data used in our work Mining Exploratory Queries for Conversational Search [Paper], which was accepted at WWW 2024! This dataset was created to facilitate the evaluation of our exploratory query generation task. We randomly sampled 100 queries from the MIMICS dataset to create the evaluation data.

📄 Dataset Details

  • Dataset Size: The dataset comprises 100 real user queries, randomly sampled from the MIMICS-Click dataset.
  • Query Statistics: On average, each query has 1.69 exploratory query groups and 15.48 exploratory queries; each group contains 9.16 exploratory queries on average.
  • Data Format: The dataset is provided as JSON files, one per query; each file is named after the real user query it covers (see the loading sketch below).
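
As a minimal loading sketch (assuming the JSON files sit together in one local directory; the `evaldata/` directory name here is hypothetical, not specified by the dataset):

```python
import json
from pathlib import Path

# Assumed layout: one JSON file per user query, e.g. evaldata/adidas jeans.json
DATA_DIR = Path("evaldata")

def load_query(query: str) -> list[dict]:
    """Load the exploratory query groups for one user query."""
    with open(DATA_DIR / f"{query}.json", encoding="utf-8") as f:
        return json.load(f)

groups = load_query("adidas jeans")
for group in groups:
    print(group["template"], "->", len(group["ground_truth"]), "exploratory queries")
```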

๐ŸŒ Data Collection Process

For each query, we aggregate the exploratory queries generated by all models we want to evaluate (including the ablation models and baselines), together with the human-written ones, to form a pool. This pooling follows the classic Cranfield paradigm and aims to ensure a fair evaluation of all models. We hired (and compensated) three annotators, each holding a Master's degree, to select high-quality exploratory queries and construct the final ground truth. Specifically, we asked the annotators to evaluate each exploratory query along three aspects: usefulness, faithfulness, and readability. We gave them the definitions (see the table below) and some examples to help them better understand these aspects. If an exploratory query satisfies all three aspects, it receives a Good label; otherwise, it receives a Bad label. The final label of each exploratory query is determined by a majority vote among the three annotators.

| Aspect | Definition |
| --- | --- |
| Usefulness | The exploratory query is parallel to the original user query and can meet users' exploratory needs. |
| Faithfulness | The exploratory query makes sense and provides faithful information that users can trust. |
| Readability | The exploratory query is free of grammatical errors, smooth, and easy to understand. |

Finally, we manually create exploratory query groups for each query (such as "Cartier women [MASK]" for the query "Cartier women watches") and assign the exploratory queries whose final label is Good in the pool to the corresponding groups as the ground truth.
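
For concreteness, here is a minimal sketch of the majority-vote labeling described above (the function name and label strings are illustrative, not part of the dataset):

```python
from collections import Counter

def final_label(annotator_labels: list[str]) -> str:
    """Majority vote among the three annotators over "Good"/"Bad" labels."""
    return Counter(annotator_labels).most_common(1)[0][0]

# A single annotator gives "Good" only if the query satisfies all three
# aspects: usefulness, faithfulness, and readability.
print(final_label(["Good", "Good", "Bad"]))  # -> Good
```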

๐Ÿ“ Data Format

Here is an example from the dataset, `adidas jeans.json`:

```json
[
    {
        "template": "[MASK] jeans",
        "ground_truth": [
            "calvin klein jeans",
            "tommy jeans",
            "converse jeans",
            "tommy hilfiger jeans",
            "clarks jeans",
            "nike jeans",
            "puma jeans",
            "hugo boss jeans",
            "new balance jeans",
            "reebok jeans",
            "wrangler jeans",
            "levis jeans",
            "fila jeans",
            "lee jeans"
        ]
    },
    {
        "template": "adidas [MASK]",
        "ground_truth": [
            "adidas shorts",
            "adidas pajamas",
            "adidas shoes",
            "adidas sweatpants",
            "adidas coats & jackets",
            "adidas pants",
            "adidas sweaters",
            "adidas shirts",
            "adidas track pants",
            "adidas jacket",
            "adidas hoodie",
            "adidas sweatshirts",
            "adidas sneakers"
        ]
    }
]
```

Here, 'adidas jeans' is the original user query. Each object's "ground_truth" field represents an exploratory query group: a list of high-quality exploratory queries used for model evaluation. The '[MASK]' token in the 'template' field marks the term of the original query that is replaced to form the group.
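
As an illustration of how a group's ground truth might be consumed, here is a hedged sketch that scores model-generated exploratory queries against one group with a simple set-based recall; this metric is chosen for illustration only and is not necessarily the one used in the paper:

```python
def recall(generated: list[str], ground_truth: list[str]) -> float:
    """Fraction of a group's ground-truth queries recovered by a model."""
    gt = {q.strip().lower() for q in ground_truth}
    hits = {q.strip().lower() for q in generated} & gt
    return len(hits) / len(gt)

group = {
    "template": "[MASK] jeans",
    "ground_truth": ["calvin klein jeans", "tommy jeans", "levis jeans"],
}
print(recall(["levis jeans", "gucci jeans"], group["ground_truth"]))  # ~0.33
```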

๐Ÿ“ Future Work

In the future, we will expand the dataset and add an exploratory question to each exploratory query group (e.g., 'What other brands are you also interested in?' for the group '[MASK] jeans') to make the dataset better suited to conversational search scenarios.
