
dbignite's People

Contributors

dependabot[bot], dmoore247, jesseryoung-db, kermany, natb1, nfx, rachelsim-ll, rachelsmc, rsrjohnson, sinchana-kj, vadim, willsmithdb, zavoraad

dbignite's Issues

FHIR to OMOP Notebook Doesn't Work

Hi, I'm working on a POC to convert our FHIR bundles to OMOP as outlined in https://github.com/databrickslabs/dbignite/blob/main/notebooks/dbignite-demo.py

Simplifying the problem: I'm unable to import data_model because of how the installed package is structured.

%pip install git+https://github.com/databrickslabs/dbignite.git
from dbignite.data_model import *

Anyway, there is more of the notebook I would like to test, but none of the expected files come down when I install the package.
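
A quick way to see what actually got installed is a check like the following (a sketch, run in the same environment as the %pip install above; it only assumes the package still imports under the name dbignite):

import pkgutil
import dbignite

print(dbignite.__file__)                                          # where the package landed
print([m.name for m in pkgutil.iter_modules(dbignite.__path__)])  # modules that actually shipped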

Long story short, I can fork it and fix it (and submit a PR if this repo is still being maintained), but I'm curious why it's in this state. Is this intended to be just a demo, with support purposely removed, or is it an oversight? It also breaks the solution accelerator on the Databricks website.

Encountering issue when running install

pip install git+https://github.com/databrickslabs/dbignite.git
Collecting git+https://github.com/databrickslabs/dbignite.git
Cloning https://github.com/databrickslabs/dbignite.git to /private/var/folders/w8/wk1tpwvj6wv_fq0ltx90ng85j3k87_/T/pip-req-build-ik88df9f
Running command git clone --filter=blob:none --quiet https://github.com/databrickslabs/dbignite.git /private/var/folders/w8/wk1tpwvj6wv_fq0ltx90ng85j3k87_/T/pip-req-build-ik88df9f
Resolved https://github.com/databrickslabs/dbignite.git to commit ce29f58
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [1 lines of output]
error in dbignite setup command: 'python_requires' must be a string containing valid version specifiers; Invalid specifier: '>=3.9.*'
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
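
For what it's worth, the failure can be reproduced outside of pip. A small sketch, assuming a recent release of the packaging library (the validator setuptools delegates to here):

from packaging.specifiers import InvalidSpecifier, SpecifierSet

try:
    SpecifierSet(">=3.9.*")        # the specifier from dbignite's setup.py
except InvalidSpecifier as exc:
    print(f"rejected: {exc}")      # '.*' is only valid with '==' and '!='

print(SpecifierSet(">=3.9"))       # the equivalent valid specifier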

setup.py invalid version specification

The python_requires value in setup.py is not a valid version specifier; per the spec, the '.*' wildcard is only permitted with the == and != operators: https://packaging.python.org/en/latest/specifications/version-specifiers/#version-matching

from setuptools import setup
from io import open
from os import path
import sys

DESCRIPTION = "Package for ingesting FHIR bundles in deltalake"
this_directory = path.abspath(path.dirname(__file__))

with open(path.join(this_directory, "README.md"), encoding="utf-8") as f:
    LONG_DESCRIPTION = f.read()

try:
    exec(open("dbignite/version.py").read())
except IOError:
    print("Failed to load version file for packaging.", file=sys.stderr)
    sys.exit(-1)
VERSION = version

setup(
    name="dbignite",
    version=VERSION,
    python_requires='>=3.9.*',  # invalid: '.*' is only allowed with '==' and '!='
    author="Amir Kermany, Nathan Buesgens, Rachel Sim, Aaron Zavora, William Smith, Jesse Young",
    author_email="[email protected]",
    description=DESCRIPTION,
    long_description=LONG_DESCRIPTION,
    long_description_content_type="text/markdown",
    url="https://github.com/databrickslabs/dbignite",
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: Other/Proprietary License",
        "Operating System :: OS Independent",
    ],
    packages=['dbignite', 'dbignite.omop', 'dbignite.hosp_feeds'],
    package_data={'': ["schemas/*.json"]},
    py_modules=['dbignite.data_model'],
)
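
Per the spec linked above, the fix is presumably just to drop the wildcard. A minimal sketch of the corrected argument, with everything else in the setup() call unchanged:

setup(
    name="dbignite",
    version=VERSION,
    python_requires=">=3.9",  # plain lower bound, no '.*' wildcard
    # ... remaining arguments exactly as above ...
)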

make pytest more compliant

    def setup_class(self) -> None:
        self.spark = (SparkSession.builder.appName("myapp")
                      .config("spark.jars.packages", "io.delta:delta-core_2.12:1.1.0")
                      .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
                      .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
                      .config("spark.driver.extraJavaOptions", "-Dio.netty.tryReflectionSetAccessible=true")
                      .config("spark.executor.extraJavaOptions", "-Dio.netty.tryReflectionSetAccessible=true")
                      .master("local")
                      .getOrCreate())
        self.spark.conf.set("spark.sql.shuffle.partitions", 1)

    def teardown_class(self) -> None:
        self.spark.sql(f'DROP DATABASE IF EXISTS {TEST_DATABASE} CASCADE')

    def assertSchemasEqual(self, schemaA: StructType, schemaB: StructType) -> None:
        """
        Test that the two given schemas are equivalent (column ordering ignored)
        """
        # both schemas must have the same length
        assert len(schemaA.fields) == len(schemaB.fields)
        # schemaA must equal schemaB
        assert schemaA.simpleString() == schemaB.simpleString()

This should be rewritten following pytest best practices: fixtures, top-level test functions, and snake_case test names (see the sketch below).
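
A minimal sketch of that shape, keeping the same Delta-enabled local SparkSession; the TEST_DATABASE value here is only a placeholder standing in for the existing constant:

import pytest
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType

TEST_DATABASE = "dbignite_test"  # placeholder; mirror the existing constant


@pytest.fixture(scope="session")
def spark():
    # Session-scoped fixture replaces setup_class/teardown_class.
    spark = (
        SparkSession.builder.appName("dbignite-tests")
        .config("spark.jars.packages", "io.delta:delta-core_2.12:1.1.0")
        .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
        .master("local[1]")
        .getOrCreate()
    )
    spark.conf.set("spark.sql.shuffle.partitions", 1)
    yield spark
    spark.sql(f"DROP DATABASE IF EXISTS {TEST_DATABASE} CASCADE")
    spark.stop()


def assert_schemas_equal(schema_a: StructType, schema_b: StructType) -> None:
    """Compare schemas field by field, ignoring column order."""
    assert sorted(f.simpleString() for f in schema_a.fields) == sorted(
        f.simpleString() for f in schema_b.fields
    )


def test_spark_session_is_usable(spark):
    # Top-level snake_case test function consuming the fixture.
    assert spark.range(1).count() == 1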
