databrickslabs / dbignite
License: Other
See example:

```mermaid
graph LR
    A[A] -->|transform| B(B)
    A[A] -->|transform| C(C)
    B[B] -->|transform| A(A)
    B[B] -->|transform| C(C)
    C[C] -->|transform| A(A)
    C[C] -->|transform| B(B)
```
Hi, I'm working on a POC to convert our FHIR bundles to OMOP, as outlined in https://github.com/databrickslabs/dbignite/blob/main/notebooks/dbignite-demo.py
To simplify the problem: I'm unable to import `data_model` because of how the project is structured.
```python
%pip install git+https://github.com/databrickslabs/dbignite.git
from dbignite.data_model import *
```
Anyway, there is more of the notebook I would like to test, but none of the required files come down when I install the package.
Long story short, I can fork it and fix it (and submit a PR if this repo is still being maintained), but I'm curious why it's in this state. Is this intended to be just a demo with support purposely removed, or is it an oversight? This also causes the solution accelerator on the Databricks website to not work.
```text
pip install git+https://github.com/databrickslabs/dbignite.git
Collecting git+https://github.com/databrickslabs/dbignite.git
  Cloning https://github.com/databrickslabs/dbignite.git to /private/var/folders/w8/wk1tpwvj6wv_fq0ltx90ng85j3k87_/T/pip-req-build-ik88df9f
  Running command git clone --filter=blob:none --quiet https://github.com/databrickslabs/dbignite.git /private/var/folders/w8/wk1tpwvj6wv_fq0ltx90ng85j3k87_/T/pip-req-build-ik88df9f
  Resolved https://github.com/databrickslabs/dbignite.git to commit ce29f58
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [1 lines of output]
      error in dbignite setup command: 'python_requires' must be a string containing valid version specifiers; Invalid specifier: '>=3.9.*'
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
```
https://packaging.python.org/en/latest/specifications/version-specifiers/#version-matching
```python
from setuptools import setup
from io import open
from os import path
import sys

DESCRIPTION = "Package for ingesting FHIR bunldes in deltalake"

this_directory = path.abspath(path.dirname(__file__))
with open(path.join(this_directory, "README.md"), encoding="utf-8") as f:
    LONG_DESCRIPTION = f.read()

try:
    exec(open("dbignite/version.py").read())
except IOError:
    print("Failed to load version file for packaging.", file=sys.stderr)
    sys.exit(-1)
VERSION = version  # `version` is defined by the exec() above

setup(
    name="dbignite",
    version=VERSION,
    python_requires='>=3.9.*',  # invalid: a ".*" wildcard is only legal with == and !=
    author="Amir Kermany, Nathan Buesgens, Rachel Sim, Aaron Zavora, William Smith, Jesse Young",
    author_email="[email protected]",
    description=DESCRIPTION,
    long_description=LONG_DESCRIPTION,
    long_description_content_type="text/markdown",
    url="https://github.com/databrickslabs/dbignite",
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: Other/Proprietary License",
        "Operating System :: OS Independent",
    ],
    packages=['dbignite', 'dbignite.omop', 'dbignite.hosp_feeds'],
    package_data={'': ["schemas/*.json"]},
    py_modules=['dbignite.data_model'],
)
```
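The failure can be reproduced outside pip with the `packaging` library (vendored by modern pip/setuptools). This is a minimal sketch, not code from the repo; the likely fix in setup.py is simply `python_requires='>=3.9'`:

```python
from packaging.specifiers import InvalidSpecifier, SpecifierSet

# Per the version-specifiers spec, a trailing ".*" wildcard is only legal
# with the == and != operators, so '>=3.9.*' is rejected, exactly as the
# pip traceback above reports.
SpecifierSet(">=3.9")    # valid: plain minimum-version constraint
SpecifierSet("==3.9.*")  # valid: wildcard is allowed with ==

try:
    SpecifierSet(">=3.9.*")  # the value currently in setup.py
except InvalidSpecifier as exc:
    print(f"rejected: {exc}")
```

Older setuptools fell back to a legacy specifier parser that tolerated this value, which is why the file installed cleanly in the past but fails with current pip.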
Running a local Mac environment for testing hits https://issues.apache.org/jira/browse/SPARK-41885
This is also acknowledged in the Delta project: delta-io/delta#1059
I think `entry.request.url` should be `entry.resource.resourceType`.
Line 31 in c0c6541
Supporting FHIR info: https://www.hl7.org/fhir/request.html#12.3.1 "This is NOT a resource. It is not part of the FHIR schema and cannot appear directly in FHIR instances."
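To illustrate the distinction with a hand-made bundle (not one from the repo): `entry.request.url` names the endpoint a transaction entry targets, while `entry.resource.resourceType` identifies the resource itself, which is what schema-driven parsing needs.

```python
import json

# Minimal illustrative transaction bundle; field layout follows FHIR R4.
bundle = json.loads("""
{
  "resourceType": "Bundle",
  "type": "transaction",
  "entry": [
    {"resource": {"resourceType": "Patient", "id": "p1"},
     "request": {"method": "POST", "url": "Patient"}}
  ]
}
""")

# entry.request.url is just the target endpoint of the HTTP action;
# entry.resource.resourceType is the actual type of the contained resource.
resource_types = [e["resource"]["resourceType"] for e in bundle["entry"]]
print(resource_types)  # ['Patient']
```

In a `collection`-type bundle, `entry.request` may be absent entirely, while `entry.resource.resourceType` is always present on a populated entry.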
```python
def setup_class(self) -> None:
    self.spark = (SparkSession.builder.appName("myapp")
                  .config("spark.jars.packages", "io.delta:delta-core_2.12:1.1.0")
                  .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
                  .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
                  .config("spark.driver.extraJavaOptions", "-Dio.netty.tryReflectionSetAccessible=true")
                  .config("spark.executor.extraJavaOptions", "-Dio.netty.tryReflectionSetAccessible=true")
                  .master("local")
                  .getOrCreate())
    self.spark.conf.set("spark.sql.shuffle.partitions", 1)

def teardown_class(self) -> None:
    self.spark.sql(f'DROP DATABASE IF EXISTS {TEST_DATABASE} CASCADE')

def assertSchemasEqual(self, schemaA: StructType, schemaB: StructType) -> None:
    """
    Test that the two given schemas are equivalent (column ordering ignored)
    """
    # both schemas must have the same length
    assert len(schemaA.fields) == len(schemaB.fields)
    # schemaA must equal schemaB
    assert schemaA.simpleString() == schemaB.simpleString()
```
This should be rewritten following pytest best practices: fixtures, top-level test functions, and snake_case method naming for tests.
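A minimal sketch of that rewrite, assuming pytest is available. The `Field`/`Schema` namedtuples are hypothetical stand-ins for pyspark's `StructField`/`StructType` so the example runs without a Spark installation; note the comparison also genuinely ignores column order, unlike `simpleString()` equality, which only passes when columns are in the same order.

```python
import pytest
from collections import namedtuple

# Hypothetical stand-ins for pyspark.sql.types.StructField / StructType.
Field = namedtuple("Field", ["name", "dataType"])
Schema = namedtuple("Schema", ["fields"])


@pytest.fixture(scope="session")
def spark():
    # In the real suite this would build and yield the SparkSession from
    # setup_class, then stop it on teardown; a placeholder keeps it light.
    yield "spark-session-placeholder"


def assert_schemas_equal(schema_a, schema_b):
    """Compare two schemas while ignoring column order."""
    fields_a = sorted((f.name, str(f.dataType)) for f in schema_a.fields)
    fields_b = sorted((f.name, str(f.dataType)) for f in schema_b.fields)
    assert fields_a == fields_b


def test_schemas_equal_ignores_column_order():
    a = Schema(fields=[Field("id", "string"), Field("ts", "timestamp")])
    b = Schema(fields=[Field("ts", "timestamp"), Field("id", "string")])
    assert_schemas_equal(a, b)
```

A session-scoped fixture builds the SparkSession once per test run, and top-level functions let pytest discover tests without a class wrapper.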
Comparing the FHIR R4 schema with the dbignite schema, we found differences in the following resources:
- Composition
- Encounter
- MedicationStatement
Please refer to the detailed schema comparison in the attachment:
FHIR vs dbignite schema comparision.xlsx