cu-dbmi / rtx-kg2-gateway
Enabling RTX-KG2 data access through various means.
License: BSD 3-Clause "New" or "Revised" License
While making further queries of the Kuzu database, I noticed there might be a discrepancy with multi-value LIST attributes of certain entities (mostly noticed with NODE entities). This issue highlights a need to double-check these values and make any necessary adjustments to ensure they are queryable as needed.
- My hope is to generalize some of the functionality for potential reuse with property graphs and Kuzu (at least in this context).
- There seems to be an opportunity to propose multi-dimensional property graph structures within Parquet as a strongly typed data storage alternative to JSON or TSV that may come with performance benefits. I felt the metadata storage components of Parquet were especially well-suited to shared schema and provenance understandings (along with default data citation within the files themselves).
- It's likely we could also share a Neo4J-compatible version of the data for those who may prefer it over embedded approaches.
Originally posted by @d33bs in #1 (comment)
Good that you show how to start Jupyter Lab! You might consider adding a short tutorial where you query the data, e.g. drawing their attention to a particular notebook where they can start trying queries and any setup they might need to do before running their first query. In the tutorial, you might even show a sample query, its result, and how you can do things with the result (e.g., drawing a graph of the resulting nodes).
Not necessary IMO, but you might consider giving a simple high-level overview of what Kuzu is doing, e.g. that it's creating an in-memory database on which to perform Cypher queries on the RTX-KG2 graph.
... I'd suggest having one or a few notebooks showing how to do that in detail, including getting into the schema of the dataset in the notebook. I see that you have a notebook called "example_cypher_kuzu" that shows an example query; perhaps that one could be extended to describe the data, etc. and show useful things you can do with Kuzu on the dataset?
Originally posted by @falquaddoomi in #1 (comment)
When using SQL LIMIT and OFFSET one must use ORDER BY to ensure deterministic results. This issue pertains to the use of DuckDB for extracting row-chunks of node and edge data for ingest into a Kuzu database and adding ORDER BY to ensure all results are extracted properly.
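As a minimal sketch of the chunked extraction described above, the following pages through a table with LIMIT and OFFSET while ordering by a unique column. It assumes a hypothetical `nodes` table and a hypothetical helper `fetch_in_chunks`, and it uses Python's built-in `sqlite3` as a stand-in for DuckDB so that it runs without extra dependencies; the same ORDER BY ... LIMIT ... OFFSET pattern is standard SQL and is not taken from this repository's code.

```python
import sqlite3


def fetch_in_chunks(conn, table, order_col, chunk_size):
    """Yield lists of rows from `table`, at most `chunk_size` rows at a time.

    ORDER BY on a unique column gives every page a stable position, so no
    row is skipped or returned twice across successive OFFSET values.
    """
    offset = 0
    while True:
        rows = conn.execute(
            # Table and column names are trusted, hypothetical identifiers here.
            f"SELECT * FROM {table} ORDER BY {order_col} LIMIT ? OFFSET ?",
            (chunk_size, offset),
        ).fetchall()
        if not rows:
            break
        yield rows
        offset += chunk_size


# Hypothetical sample data standing in for exported node rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?)",
    [(f"N{i:03d}", f"node {i}") for i in range(10)],
)

chunks = list(fetch_in_chunks(conn, "nodes", "id", chunk_size=4))
all_ids = [row[0] for chunk in chunks for row in chunk]
```

Ordering by a column that is not unique can still leave ties in an unspecified order, so a unique key (or a combination of columns that is unique) is the safer choice for the ORDER BY clause.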