databrickslabs
Name: Databricks Labs
Type: Organization
Bio: Labs projects to accelerate use cases on the Databricks Unified Analytics Platform
Delta Sharing + MLflow for ML model & experiment exchange (arcuate delta - a fan-shaped river delta)
Toolkit for Apache Spark ML: feature clean-up, feature importance calculation suite, information gain selection, distributed SMOTE, model selection and training, hyperparameter optimization and selection, and model interpretability.
Baseline for Databricks Labs projects written in Python
Manage your Databricks deployments and CI with code.
Databricks SDK for R (Experimental)
An experimental tool to synchronize source Databricks deployment with a target Databricks deployment.
Extensible Rules Engine for custom Dataframe / Dataset validation
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for testing, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.
🧱 Databricks CLI eXtensions - aka dbx is a CLI tool for development and advanced Databricks workflows management.
DeltaOMS is a solution that helps build a centralized repository of Delta transaction logs and associated operational metrics/statistics for your Delta Lakehouse. Unity Catalog is supported in the v0.7.0-rc1 release. Documentation: https://databrickslabs.github.io/delta-oms/v0.7.0-rc1/
A Java connector for delta.io/sharing/ that allows you to easily ingest data on any JVM.
A Swiss-Army-knife for your Data Intelligence platform administration.
A metadata-driven, DLT-based framework for bronze/silver pipelines
Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
Accelerator to rapidly deploy customized features for your business
Geospatial clustering at massive scale
DEPRECATED: Integrating Jupyter with Databricks via SSH
Lightweight SQL execution wrapper built only on top of the Databricks SDK
Old scripts for one-off ST-to-E2 migrations. Use "terraform exporter" linked in the readme.
An extension to the Apache Spark framework that allows easy and fast processing of very large geospatial datasets.
Capture deep metrics on one or all assets within a Databricks workspace
Databricks Plugin for PyLint
Cross-compiler into Databricks Lakehouse
Experimental or low-maturity things
HL7 Apache Spark Datasource
Databricks Add-on for Splunk
API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, avg, sum, count, etc.), AS OF joins, downsampling, and interpolation