This project forked from aws-samples/aws-glue-samples

AWS Glue Samples

AWS Glue is a serverless data integration service that makes it easier to discover, prepare, move, and integrate data from multiple sources for analytics, machine learning (ML), and application development. This repository has samples that demonstrate various aspects of the AWS Glue service, as well as various AWS Glue utilities.

You can find the AWS Glue open-source Python libraries in a separate repository at: awslabs/aws-glue-libs.

Getting Started

Workshops

  • AWS Glue Learning Series

    In this comprehensive series, you'll learn everything from the basics of Glue to advanced optimization techniques.

Tutorials

General

Data migration

Open Table Format

Development, Test, and CI/CD

Cost and Performance

Glue for Ray

Glue Data Catalog

Glue Crawler

Glue Data Quality

Glue ETL Code Examples

You can run these sample job scripts in AWS Glue ETL jobs, in a container, or in a local environment.

  • Join and Relationalize Data in S3

    This sample ETL script shows you how to use AWS Glue to load, transform, and rewrite data in Amazon S3 so that it can be queried and analyzed easily and efficiently.

  • Clean and Process

    This sample ETL script shows you how to take advantage of both Spark and AWS Glue features to clean and transform data for efficient analysis.

  • The resolveChoice Method

    This sample explores all four of the ways you can resolve choice types in a dataset using DynamicFrame's resolveChoice method.

  • Converting character encoding

    This sample ETL script shows you how to use an AWS Glue job to convert character encoding.

  • Notebook using open data lake formats

    The sample iPython notebook files show you how to use open data lake formats (Apache Hudi, Delta Lake, and Apache Iceberg) in AWS Glue Interactive Sessions and AWS Glue Studio Notebook.

  • Blueprint examples

    The sample Glue Blueprints show you how to implement blueprints that address common ETL use cases. The samples are located in the aws-glue-blueprint-libs repository.
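
As a rough illustration of the resolveChoice sample listed above, the sketch below builds the `specs` argument that `DynamicFrame.resolveChoice` accepts. It assumes the four ways are the documented actions `cast`, `project`, `make_cols`, and `make_struct`. The helper names `choice_spec` and `resolve`, and the column name used in the usage note, are hypothetical and not taken from the sample itself.

```python
# Hypothetical helpers: a sketch of building resolveChoice specs, not the sample's code.

# Actions that need a target type appended, e.g. "cast:long".
ACTIONS_WITH_TYPE = {"cast", "project"}
# Actions that are passed as a bare name.
ACTIONS_WITHOUT_TYPE = {"make_cols", "make_struct"}


def choice_spec(column, action, target_type=None):
    """Return a (column, action) tuple in the form resolveChoice expects."""
    if action in ACTIONS_WITH_TYPE:
        if not target_type:
            raise ValueError(f"action '{action}' needs a target type")
        return (column, f"{action}:{target_type}")
    if action in ACTIONS_WITHOUT_TYPE:
        if target_type is not None:
            raise ValueError(f"action '{action}' does not take a target type")
        return (column, action)
    raise ValueError(f"unknown action '{action}'")


def resolve(dynamic_frame, specs):
    """Apply the specs to a DynamicFrame (any object with resolveChoice)."""
    return dynamic_frame.resolveChoice(specs=specs)
```

In a Glue job, something like `resolve(dyf, [choice_spec("provider id", "cast", "long")])` would ask Glue to cast the ambiguous column to `long`, where `dyf` is a DynamicFrame you have already loaded and the column name is only an example.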

Utilities

Glue Custom Connectors

AWS Glue provides built-in support for the most commonly used data stores, such as Amazon Redshift, MySQL, and MongoDB. With Glue ETL Custom Connectors, you can subscribe to a third-party connector from AWS Marketplace or build your own connector to connect to data stores that are not natively supported.

  • Development

    Development guide with examples of connectors with simple, intermediate, and advanced functionality. These examples demonstrate how to implement Glue Custom Connectors based on the Spark Data Source or Amazon Athena Federated Query interfaces and plug them into the Glue Spark runtime.

  • Local Validation Tests

    This user guide describes validation tests that you can run locally on your laptop to integrate your connector with the Glue Spark runtime.

  • Validation

    This user guide shows how to validate connectors with the Glue Spark runtime in a Glue job system before deploying them for your workloads.

  • Glue Spark Script Examples

    Python script examples that use Spark, Amazon Athena, and JDBC connectors with the Glue Spark runtime.

  • Create and Publish Glue Connector to AWS Marketplace

    If you would like to partner or publish your Glue custom connector to AWS Marketplace, please refer to this guide and reach out to us at [email protected] for further details on your connector.
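
To give a feel for how a script might read through a connector, here is a minimal sketch. It assumes the `connection_type` strings follow the `custom.*` and `marketplace.*` pattern used by Glue for Spark, Athena, and JDBC connectors. The helper names are hypothetical, and the linked script examples remain the authoritative reference.

```python
# Hypothetical helpers: a sketch of reading through a Glue connector.

SOURCES = ("custom", "marketplace")    # your own connector vs. a subscribed one
INTERFACES = ("spark", "athena", "jdbc")


def connector_type(source, interface):
    """Build a connection_type string such as 'custom.jdbc'."""
    if source not in SOURCES:
        raise ValueError(f"source must be one of {SOURCES}")
    if interface not in INTERFACES:
        raise ValueError(f"interface must be one of {INTERFACES}")
    return f"{source}.{interface}"


def read_with_connector(glue_context, source, interface, options):
    """Read a DynamicFrame through the connector described by the arguments.

    glue_context is expected to be an awsglue GlueContext; options is the
    connection_options dict, whose keys depend on the connector in use.
    """
    return glue_context.create_dynamic_frame.from_options(
        connection_type=connector_type(source, interface),
        connection_options=options,
    )
```

Inside a Glue job you might call `read_with_connector(glueContext, "custom", "jdbc", {"connectionName": "my-connection", "dbTable": "my_table"})`. The connection and table names here are placeholders, and the exact option keys depend on the connector.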

License Summary

This sample code is made available under the MIT-0 license. See the LICENSE file.
