Alex's Projects
Provides an alternative way to monitor Databricks cluster utilisation using the Ganglia web service on the driver node of each cluster. The Ganglia metrics can be exported to any Spark datasource format and then used to analyse cluster usage.
GDAL is an open source X/MIT licensed translator library for raster and vector geospatial data formats.
Using Databricks and HuggingFace for GenAI
Generative AI on AWS
GeoMesa is a suite of tools for working with big geo-spatial data in a distributed fashion.
Tutorials and examples for working with GeoMesa
Jupyter notebooks for gaining geospatial insights
Clone all of your GitHub repositories from the command line using Python
Genome-wide association studies identify genetic variations associated with a target disease or trait. Researchers and clinicians can use this information to better detect, treat and prevent chronic health conditions. This Solution Accelerator notebook builds on top of Glow
GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models
GRASS GIS - free and open source Geographic Information System (GIS)
Grok open release
Hexagonal hierarchical geospatial indexing system
Hackerrank code challenges
Compute Resource Usage Analysis and Monitoring of Container Clusters
The Kubernetes Package Manager
Apache Hive
Distributed training framework for TensorFlow, Keras, PyTorch, and MXNet.
Upserts and Incremental Processing on Big Data
Sample notebooks for optimized training and inference of Hugging Face models on Azure Databricks
Apache Iceberg
Implementation of the Decentralized Identity standards such as DID and Verifiable Credentials by W3C for the IOTA Tangle
Mirror of Apache Ignite
Build a similarity-based image recommendation system for e-commerce that takes into account the visual similarity of items as an input for making product recommendations.
All work related to infrastructure.
An ongoing list of pandas quirks
A high-throughput, distributed, publish-subscribe messaging system