databrickslabs / dbignite
License: Other
See example:

```mermaid
graph LR
    A[A] -->|transform| B(B)
    A[A] -->|transform| C(C)
    B[B] -->|transform| A(A)
    B[B] -->|transform| C(C)
    C[C] -->|transform| A(A)
    C[C] -->|transform| B(B)
```
Hi, I'm working on a POC to convert our FHIR bundles to OMOP, as outlined in https://github.com/databrickslabs/dbignite/blob/main/notebooks/dbignite-demo.py
To simplify the problem: I'm unable to import `data_model` because of how the project is structured.
```python
%pip install git+https://github.com/databrickslabs/dbignite.git
from dbignite.data_model import *
```
Anyway, there is more of the notebook I would like to test, but none of the required files come down when I install the package.
Long story short, I can fork it and fix it (and submit a PR if this repo is still being maintained), but I'm curious why it's in this state. Is this intended to be just a demo with support purposely removed, or is it an oversight? This also causes the solution accelerator on the Databricks website to not work.
```text
pip install git+https://github.com/databrickslabs/dbignite.git
Collecting git+https://github.com/databrickslabs/dbignite.git
  Cloning https://github.com/databrickslabs/dbignite.git to /private/var/folders/w8/wk1tpwvj6wv_fq0ltx90ng85j3k87_/T/pip-req-build-ik88df9f
  Running command git clone --filter=blob:none --quiet https://github.com/databrickslabs/dbignite.git /private/var/folders/w8/wk1tpwvj6wv_fq0ltx90ng85j3k87_/T/pip-req-build-ik88df9f
  Resolved https://github.com/databrickslabs/dbignite.git to commit ce29f58
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [1 lines of output]
      error in dbignite setup command: 'python_requires' must be a string containing valid version specifiers; Invalid specifier: '>=3.9.*'
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
```
https://packaging.python.org/en/latest/specifications/version-specifiers/#version-matching
```python
from setuptools import setup
from io import open
from os import path
import sys

DESCRIPTION = "Package for ingesting FHIR bunldes in deltalake"

this_directory = path.abspath(path.dirname(__file__))
with open(path.join(this_directory, "README.md"), encoding="utf-8") as f:
    LONG_DESCRIPTION = f.read()

try:
    exec(open("dbignite/version.py").read())
except IOError:
    print("Failed to load version file for packaging.", file=sys.stderr)
    sys.exit(-1)
VERSION = version  # `version` is defined by the exec() above

setup(
    name="dbignite",
    version=VERSION,
    python_requires='>=3.9.*',  # invalid: a ".*" wildcard is only legal with == and !=
    author="Amir Kermany, Nathan Buesgens, Rachel Sim, Aaron Zavora, William Smith, Jesse Young",
    author_email="[email protected]",
    description=DESCRIPTION,
    long_description=LONG_DESCRIPTION,
    long_description_content_type="text/markdown",
    url="https://github.com/databrickslabs/dbignite",
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: Other/Proprietary License",
        "Operating System :: OS Independent",
    ],
    packages=['dbignite', 'dbignite.omop', 'dbignite.hosp_feeds'],
    package_data={'': ["schemas/*.json"]},
    py_modules=['dbignite.data_model'],
)
```
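The failure can be reproduced outside pip with the `packaging` library (vendored by modern pip/setuptools). This is a minimal sketch, not code from the repo; the likely fix in setup.py is simply `python_requires='>=3.9'`:

```python
from packaging.specifiers import InvalidSpecifier, SpecifierSet

# Per the version-specifiers spec, a trailing ".*" wildcard is only legal
# with the == and != operators, so '>=3.9.*' is rejected, exactly as the
# pip traceback above reports.
SpecifierSet(">=3.9")    # valid: plain minimum-version constraint
SpecifierSet("==3.9.*")  # valid: wildcard is allowed with ==

try:
    SpecifierSet(">=3.9.*")  # the value currently in setup.py
except InvalidSpecifier as exc:
    print(f"rejected: {exc}")
```

Older setuptools fell back to a legacy specifier parser that tolerated this value, which is why the file installed cleanly in the past but fails with current pip.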
Running a local Mac environment for testing hits https://issues.apache.org/jira/browse/SPARK-41885
This is also acknowledged in the Delta project: delta-io/delta#1059
I think `entry.request.url` should be `entry.resource.resourceType`.
Line 31 in c0c6541
Supporting FHIR info: https://www.hl7.org/fhir/request.html#12.3.1 "This is NOT a resource. It is not part of the FHIR schema and cannot appear directly in FHIR instances."
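To illustrate the distinction with a hand-made bundle (not one from the repo): `entry.request.url` names the endpoint a transaction entry targets, while `entry.resource.resourceType` identifies the resource itself, which is what schema-driven parsing needs.

```python
import json

# Minimal illustrative transaction bundle; field layout follows FHIR R4.
bundle = json.loads("""
{
  "resourceType": "Bundle",
  "type": "transaction",
  "entry": [
    {"resource": {"resourceType": "Patient", "id": "p1"},
     "request": {"method": "POST", "url": "Patient"}}
  ]
}
""")

# entry.request.url is just the target endpoint of the HTTP action;
# entry.resource.resourceType is the actual type of the contained resource.
resource_types = [e["resource"]["resourceType"] for e in bundle["entry"]]
print(resource_types)  # ['Patient']
```

In a `collection`-type bundle, `entry.request` may be absent entirely, while `entry.resource.resourceType` is always present on a populated entry.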
```python
def setup_class(self) -> None:
    self.spark = (SparkSession.builder.appName("myapp")
                  .config("spark.jars.packages", "io.delta:delta-core_2.12:1.1.0")
                  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
                  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
                  .config("spark.driver.extraJavaOptions", "-Dio.netty.tryReflectionSetAccessible=true")
                  .config("spark.executor.extraJavaOptions", "-Dio.netty.tryReflectionSetAccessible=true")
                  .master("local")
                  .getOrCreate())
    self.spark.conf.set("spark.sql.shuffle.partitions", 1)

def teardown_class(self) -> None:
    self.spark.sql(f'DROP DATABASE IF EXISTS {TEST_DATABASE} CASCADE')

def assertSchemasEqual(self, schemaA: StructType, schemaB: StructType) -> None:
    """
    Test that the two given schemas are equivalent (column ordering ignored)
    """
    # both schemas must have the same length
    assert len(schemaA.fields) == len(schemaB.fields)
    # schemaA must equal schemaB
    assert schemaA.simpleString() == schemaB.simpleString()
```
This should be rewritten following pytest best practices: fixtures, top-level test functions, and snake_case method naming for tests.
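A minimal sketch of that rewrite, assuming pytest is available. The `Field`/`Schema` namedtuples are hypothetical stand-ins for pyspark's `StructField`/`StructType` so the example runs without a Spark installation; note the comparison also genuinely ignores column order, unlike `simpleString()` equality, which only passes when columns are in the same order.

```python
import pytest
from collections import namedtuple

# Hypothetical stand-ins for pyspark.sql.types.StructField / StructType.
Field = namedtuple("Field", ["name", "dataType"])
Schema = namedtuple("Schema", ["fields"])


@pytest.fixture(scope="session")
def spark():
    # In the real suite this would build and yield the SparkSession from
    # setup_class, then stop it on teardown; a placeholder keeps it light.
    yield "spark-session-placeholder"


def assert_schemas_equal(schema_a, schema_b):
    """Compare two schemas while ignoring column order."""
    fields_a = sorted((f.name, str(f.dataType)) for f in schema_a.fields)
    fields_b = sorted((f.name, str(f.dataType)) for f in schema_b.fields)
    assert fields_a == fields_b


def test_schemas_equal_ignores_column_order():
    a = Schema(fields=[Field("id", "string"), Field("ts", "timestamp")])
    b = Schema(fields=[Field("ts", "timestamp"), Field("id", "string")])
    assert_schemas_equal(a, b)
```

A session-scoped fixture builds the SparkSession once per test run, and top-level functions let pytest discover tests without a class wrapper.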
Comparing the FHIR R4 schema with the dbignite schema, we found differences in the following resources:
- Composition
- Encounter
- MedicationStatement
Please refer to the detailed schema comparison in the attachment:
FHIR vs dbignite schema comparision.xlsx