
junhao-zhu / fusionquery


[VLDB 2024] Source code for FusionQuery: On-demand Fusion Queries over Multi-source Heterogeneous Data

License: Apache License 2.0

Python 100.00%


FusionQuery

Python implementation of FusionQuery, from the paper "FusionQuery: On-demand Fusion Queries over Multi-source Heterogeneous Data".

Dependencies

  • Python 3.8
  • sentence-transformers 2.2.2
  • faiss-gpu 1.7.2
  • numpy 1.23.1
  • pytorch 1.12.1

Datasets

This repo contains two datasets, Movie and Book. We release the KG version of both datasets in the data directory. Each data source is stored in three files: the entities of source n are stored in ent_ids_n, the relations in rel_ids_n, and the triples in triples_n. The queries run against each dataset are stored in query.json.
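The three-file layout described above can be read with a few lines of Python. The following is only a sketch: the README does not specify the file format, so the tab-separated `id<TAB>name` layout for ent_ids_n and rel_ids_n, the three integer ids per line in triples_n, and the helper names `load_id_map`, `load_triples`, and `load_source` are all assumptions made for illustration. Check them against the actual files before relying on this.

```python
from pathlib import Path


def load_id_map(path):
    """Read an id file into {id: name}.

    ASSUMPTION: each line is 'id<TAB>name'; the README does not state this.
    """
    id_map = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            idx, name = line.split("\t", 1)
            id_map[int(idx)] = name
    return id_map


def load_triples(path):
    """Read a triples file into a list of (head, relation, tail) ids.

    ASSUMPTION: each line holds three whitespace-separated integer ids.
    """
    triples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if len(parts) != 3:
                continue  # skip blank or malformed lines
            head, rel, tail = (int(p) for p in parts)
            triples.append((head, rel, tail))
    return triples


def load_source(root, n):
    """Load entities, relations, and triples of source n from directory root."""
    root = Path(root)
    entities = load_id_map(root / f"ent_ids_{n}")
    relations = load_id_map(root / f"rel_ids_{n}")
    triples = load_triples(root / f"triples_{n}")
    return entities, relations, triples
```

Under these assumptions, `load_source("./data/movie", 1)` would return the entity map, relation map, and triple list of source 1 of the Movie dataset.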

More datasets can be found on this website.

Run code

Run the entire FusionQuery workflow:

python main.py --data_root "./data/movie" \
--data_name movie \
--fusion_model FusionQuery \
--types JSON KG CSV \
--iters 20 \
--thres_for_query 0.9 \
--thres_for_fusion 0.4

The arguments are described in detail below.

| Argument | Explanation | Default |
|---|---|---|
| --data_root | root path of the data | ../data/movie |
| --data_name | name of the dataset used in the current experiment | movie |
| --fusion_model | data fusion method used in the framework (e.g., FusionQuery, DART, CASE, etc.) | FusionQuery |
| --types | data types used in the current experiment (a list) | JSON KG CSV |
| --iters | maximum number of iterations for convergence | 20 |
| --thres_for_query | initial matching threshold $\tau$ | 0 |
| --thres_for_fusion | threshold for data veracity | 0.5 |
| --gpu | GPU device id | 0 |
| --seed | random seed | 2021 |
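For readers who want to script experiments, the arguments listed above map naturally onto an `argparse` parser. This is a hedged sketch rather than the parser in main.py: the flag names and defaults come from the list above, while the value types (`int`, `float`, `str`), the use of `nargs="+"` for `--types`, and the function name `build_parser` are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags and defaults.

    ASSUMPTION: value types and nargs are guesses; main.py may differ.
    """
    p = argparse.ArgumentParser(description="FusionQuery arguments (sketch)")
    p.add_argument("--data_root", type=str, default="../data/movie",
                   help="root path of the data")
    p.add_argument("--data_name", type=str, default="movie",
                   help="name of the dataset used in the experiment")
    p.add_argument("--fusion_model", type=str, default="FusionQuery",
                   help="data fusion method, e.g. FusionQuery, DART, CASE")
    p.add_argument("--types", nargs="+", default=["JSON", "KG", "CSV"],
                   help="data types used in the experiment")
    p.add_argument("--iters", type=int, default=20,
                   help="maximum number of iterations for convergence")
    p.add_argument("--thres_for_query", type=float, default=0.0,
                   help="initial matching threshold tau")
    p.add_argument("--thres_for_fusion", type=float, default=0.5,
                   help="threshold for data veracity")
    p.add_argument("--gpu", type=int, default=0, help="GPU device id")
    p.add_argument("--seed", type=int, default=2021, help="random seed")
    return p


if __name__ == "__main__":
    # Parse the flags from the example command instead of sys.argv.
    args = build_parser().parse_args([
        "--data_root", "./data/movie",
        "--data_name", "movie",
        "--fusion_model", "FusionQuery",
        "--types", "JSON", "KG", "CSV",
        "--iters", "20",
        "--thres_for_query", "0.9",
        "--thres_for_fusion", "0.4",
    ])
    print(args)
```

Note that the example command overrides two defaults: it passes `--thres_for_query 0.9` and `--thres_for_fusion 0.4`, whereas the documented defaults are 0 and 0.5.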
