GithubHelp home page GithubHelp logo

mykovalenko / docker-presto-adls-wasb Goto Github PK

View Code? Open in Web Editor NEW

This project forked from arsenvlad/docker-presto-adls-wasb

0.0 2.0 0.0 172 KB

Example of a single node Presto with Azure Data Lake Store (ADLS) and Azure Storage Blob (WASB) access via Hive metastore

License: MIT License

Dockerfile 53.45% Shell 46.55%

docker-presto-adls-wasb's Introduction

Example of a single node Presto with Azure Data Lake Store (ADLS) and Azure Blob Storage (WASB)

Click to watch video Presto with ADLS and WASB

Start local Hive metastore and Presto containers

Clone this repo

git clone https://github.com/arsenvlad/docker-presto-adls-wasb

Run Hive and Presto containers using config specified in env.conf.private

docker-compose up --build

In a separate terminal window, list currently running containers

docker ps

Connect to Hive bash

In a separate terminal window, open interactive tty bash on the Hive container

docker exec -it dockerprestoadlswasb_hive_1 bash

In the Hive container bash session, open Hive CLI pointing to itself as an external metastore. If you get an error saying "Name node is in safe mode", wait for a few minutes and try again.

hive --hiveconf hive.metastore.uris=thrift://localhost:9083

Create table using Azure Storage Blobs (change the storage account name and container name to yours)

create table wasbtable1 (id int, name varchar(255)) row format delimited fields terminated by ',' stored as textfile location 'wasb://[email protected]/wasbtable1';

Create table using Azure Data Lake Store (change the ADLS account name to yours)

create table adltable1 (id int, name varchar(255)) row format delimited fields terminated by ',' stored as textfile location 'adl://avdatalake1.azuredatalakestore.net/adltable1';

Confirm you can see the tables show tables;

Connect to Presto bash

In a separate terminal window, open interactive tty bash on the Presto container

docker exec -it dockerprestoadlswasb_presto_1 bash

Presto is configured with a single node with Hive connector as described in /etc/motd

Use Presto CLI to connect to the running Presto server

/opt/presto/presto --server http://localhost:8080

List shemas in Hive catalog

show schemas from hive;

List tables in the Hive default catalog

show tables from hive.default;

Insert data into the tables

insert into hive.default.wasbtable1 (id, name) values (1,'1');
insert into hive.default.wasbtable1 (id, name) select id, name from hive.default.wasbtable1 union all select id, name from hive.default.wasbtable1 union all select id, name from hive.default.wasbtable1;

insert into hive.default.adltable1 (id, name) values (1,'1');
insert into hive.default.adltable1 (id, name) select id, name from hive.default.adltable1 union all select id, name from hive.default.adltable1 union all select id, name from hive.default.adltable1;

Select from the table

select * from hive.default.adltable1;

When using with HDInsight

NOTE: To access Azure HDInsight Hive Thrift Service your Docker host VM must be within the same network.

To find the URLs of the HDInsight Hive Thrift Service (i.e. hive.metastore.uri), SSH into the HDInsight cluster and run this grep command:

echo $(grep -n1 "hive.metastore.uri" /etc/hive/conf/hive-site.xml | grep -o "<value>.*/value>" | sed 's:<value>::g' | sed 's:</value>::g')

Presto with Azure Data Services

See azure-data-services.md for an example showing how to configure Presto connectors to Azure Data Services to query and join data from Azure CosmosDB (using MongoDB API), Azure SQL Database, Azure MySQL, Azure PostgreSQL and store the joined results in Azure Blob Storage.

Azure CosmosDB with MongoDB API, Azure SQL Database, Azure MySQL, Azure PostgreSQL

docker-presto-adls-wasb's People

Contributors

arsenvlad avatar

Watchers

James Cloos avatar mykovalenko avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.