geodocker / geodocker-jupyter-geopyspark
Jupyter+GeoNotebook + GeoPySpark Docker Container

License: Apache License 2.0


geodocker-jupyter-geopyspark's Introduction

GeoDocker Cluster

GeoDocker is a collection of Docker images encapsulating a distributed geo-processing platform based on GeoTrellis, GeoMesa, and GeoWave. The emphasis is on providing integration between these projects and exposing geo-processing functionality in the Hadoop ecosystem.

Project Status

This project is in active development. The layout and composition of GeoDocker may change as we explore our use-case further. Despite that, we're committed to maintaining sanity by providing publicly published, versioned, and tested images. Your feedback and contributions are always welcome.

Goals

  • Integrate GeoTrellis, GeoWave, and GeoMesa as a unified platform
  • Provide a realistic and convenient environment for distributed integration testing
  • Support deployment of GeoDocker to Amazon EMR
  • Explore and support other deployment options like DC/OS and ECS

Environment

Images

Build and Publish

It is not necessary to build and publish these containers in order to use them as-is; pre-built images are available on quay.io. Building is only necessary in order to customize and develop GeoDocker.

All images contain a Makefile which provides the following targets:

  • build: Builds the container with the latest tag
  • test: Runs the container tests
  • publish: Publishes the container with the latest tag and with the tag provided by the $TAG environment variable (ex: make publish TAG=ABC123)
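For reference, such a Makefile typically boils down to something like the following. This is a hedged sketch, not the actual file: the image name, test script, and tagging details vary per repository.

```makefile
# Hypothetical Makefile sketch for a GeoDocker image repository.
# IMG and test.sh are illustrative names, not the actual values.
IMG := quay.io/geodocker/example

build:
	docker build -t $(IMG):latest .

test: build
	./test.sh $(IMG):latest

publish: build
	docker push $(IMG):latest
	docker tag $(IMG):latest $(IMG):$(TAG)
	docker push $(IMG):$(TAG)

.PHONY: build test publish
```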

These targets are also used by Travis CI, as specified in .travis.yml.

Docker Compose: Local Cluster

Images which contain multiple container roles, or which depend on instances of other containers to function, also provide a docker-compose.yml file that makes it easy to bring up a local cluster. This cluster can be used for exploration, integration testing, and debugging.
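A minimal sketch of what such a docker-compose.yml might look like for the Accumulo image is shown below. The service names follow the docker-compose ps output in the session that follows, but the image tags, commands, and environment configuration here are assumptions, not the real file.

```yaml
# Hypothetical docker-compose.yml sketch; the actual file pins
# image versions and sets per-service environment variables.
version: "2"
services:
  zookeeper:
    image: quay.io/geodocker/zookeeper:latest
    ports:
      - "2181:2181"
  hdfs-name:
    image: quay.io/geodocker/hdfs:latest
    command: name
    ports:
      - "50070:50070"
  hdfs-data:
    image: quay.io/geodocker/hdfs:latest
    command: data
  accumulo-master:
    image: quay.io/geodocker/accumulo:latest
    command: master
  accumulo-tserver:
    image: quay.io/geodocker/accumulo:latest
    command: tserver
  accumulo-monitor:
    image: quay.io/geodocker/accumulo:latest
    command: monitor
    ports:
      - "50095:50095"
```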

# Build the latest container
~/proj/geodocker-accumulo $ make build
docker build -t quay.io/geodocker/accumulo:latest	.
Sending build context to Docker daemon 117.2 kB
Step 1 : FROM quay.io/geodocker/hdfs:latest
...

# Start a local multi-container cluster; use the -d option to start in background mode
~/proj/geodocker-accumulo $ docker-compose up
Creating geodockeraccumulo_zookeeper_1
Creating geodockeraccumulo_hdfs-name_1
Creating geodockeraccumulo_hdfs-data_1
Creating geodockeraccumulo_accumulo-master_1
Creating geodockeraccumulo_accumulo-tserver_1
Creating geodockeraccumulo_accumulo-monitor_1
Attaching to geodockeraccumulo_hdfs-name_1, geodockeraccumulo_zookeeper_1, geodockeraccumulo_hdfs-data_1, geodockeraccumulo_accumulo-master_1, geodockeraccumulo_accumulo-monitor_1, geodockeraccumulo_accumulo-tserver_1
...

# Inspect running containers
~/proj/geodocker-accumulo $ docker-compose ps
                Name                              Command               State                     Ports
--------------------------------------------------------------------------------------------------------------------------
geodockeraccumulo_accumulo-master_1    /sbin/entrypoint.sh master ...   Up
geodockeraccumulo_accumulo-monitor_1   /sbin/entrypoint.sh monitor      Up      0.0.0.0:50095->50095/tcp
geodockeraccumulo_accumulo-tserver_1   /sbin/entrypoint.sh tserver      Up
geodockeraccumulo_hdfs-data_1          /sbin/entrypoint.sh data         Up
geodockeraccumulo_hdfs-name_1          /sbin/entrypoint.sh name         Up      0.0.0.0:50070->50070/tcp
geodockeraccumulo_zookeeper_1          /sbin/entrypoint.sh zkServ ...   Up      0.0.0.0:2181->2181/tcp, 2888/tcp, 3888/tcp

# Inspect logs from running container
~/proj/geodocker-accumulo $ docker-compose logs hdfs-name
hdfs-name_1         | Formatting namenode root fs in /data/hdfs/name...
hdfs-name_1         | 16/07/14 02:30:16 INFO namenode.NameNode: STARTUP_MSG:
hdfs-name_1         | /************************************************************
hdfs-name_1         | STARTUP_MSG: Starting NameNode
hdfs-name_1         | STARTUP_MSG:   host = 46c38f89156b/172.19.0.3
hdfs-name_1         | STARTUP_MSG:   args = [-format]
...

# Run a command inside the cluster container
~/proj/geodocker-accumulo $ docker-compose run --rm accumulo-master bash -c "set -e \
		&& source /sbin/hdfs-lib.sh \
		&& wait_until_hdfs_is_available \
		&& with_backoff hdfs dfs -test -d /accumulo \
		&& accumulo shell -p GisPwd -e 'createtable test_table'"
Safe mode is OFF
2016-07-14 02:49:25,809 [trace.DistributedTrace] INFO : SpanReceiver org.apache.accumulo.tracer.ZooTraceClient was loaded successfully.
2016-07-14 02:49:25,973 [shell.Shell] ERROR: org.apache.accumulo.core.client.TableExistsException: Table test_table exists
make: *** [test] Error 1
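The session above sources with_backoff from /sbin/hdfs-lib.sh to retry commands until HDFS is ready. The real implementation is not shown here, but the retry-with-backoff pattern it names can be sketched as follows (a hypothetical reimplementation, not the library's code):

```shell
# Minimal retry-with-exponential-backoff helper, similar in spirit to
# the `with_backoff` function from /sbin/hdfs-lib.sh (the real
# implementation may differ in limits and logging).
with_backoff() {
  local max=5 delay=1 attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))       # double the wait between attempts
    attempt=$((attempt + 1))
  done
}

# A command that succeeds immediately returns on the first attempt.
with_backoff true && echo "succeeded"
```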


geodocker-jupyter-geopyspark's People

Contributors

echeipesh, jamesmcclain, jpolchlo, lossyrob


geodocker-jupyter-geopyspark's Issues

Latest Version of docker image breaks existing notebooks

I've had some existing code for a while, and when I updated to the latest image I received the error below. The last known image I was able to get my code working on was quay.io/geodocker/jupyter-geopyspark:e900b5f.

The code was simply doing this:

queried_spatial_layer = gps.query(uri=catalog_uri,
                                  layer_name=layer_name,
                                  layer_zoom=0,
                                  query_geom=county,
                                  num_partitions=100)


Py4JJavaError Traceback (most recent call last)
/home/hadoop/.local/lib/python3.4/site-packages/geopyspark/geotrellis/catalog.py in __init__(self, uri)
299 try:
--> 300 self.wrapper = pysc._gateway.jvm.geopyspark.geotrellis.io.AttributeStoreWrapper(uri)
301 except Py4JJavaError as err:

/usr/local/spark-2.1.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py in __call__(self, *args)
1400 return_value = get_return_value(
-> 1401 answer, self._gateway_client, None, self._fqn)
1402

/usr/local/spark-2.1.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
318 "An error occurred while calling {0}{1}{2}.\n".
--> 319 format(target_id, ".", name), value)
320 else:

Py4JJavaError: An error occurred while calling None.geopyspark.geotrellis.io.AttributeStoreWrapper.
: java.lang.AbstractMethodError: geotrellis.spark.io.s3.S3AttributeStore.geotrellis$spark$io$AttributeCaching$setter$geotrellis$spark$io$AttributeCaching$$x$1_$eq(Lscala/Tuple2;)V
at geotrellis.spark.io.AttributeCaching$class.$init$(AttributeCaching.scala:29)
at geotrellis.spark.io.s3.S3AttributeStore.<init>(S3AttributeStore.scala:38)
at geotrellis.spark.io.s3.S3LayerProvider.attributeStore(S3LayerProvider.scala:41)
at geotrellis.spark.io.AttributeStore$.apply(AttributeStore.scala:70)
at geotrellis.spark.io.AttributeStore$.apply(AttributeStore.scala:73)
at geopyspark.geotrellis.io.AttributeStoreWrapper.<init>(AttributeStoreWrapper.scala:29)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:236)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:748)

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
in ()
3 layer_zoom=0,
4 query_geom=county_wm,
----> 5 num_partitions=100)

/home/hadoop/.local/lib/python3.4/site-packages/geopyspark/geotrellis/catalog.py in query(uri, layer_name, layer_zoom, query_geom, time_intervals, query_proj, num_partitions, store)
185 store = AttributeStore.build(store)
186 else:
--> 187 store = AttributeStore.cached(uri)
188
189 pysc = get_spark_context()

/home/hadoop/.local/lib/python3.4/site-packages/geopyspark/geotrellis/catalog.py in cached(cls, uri)
326 return _cached_stores[uri]
327 else:
--> 328 store = cls(uri)
329 _cached_stores[uri] = store
330 return store

/home/hadoop/.local/lib/python3.4/site-packages/geopyspark/geotrellis/catalog.py in __init__(self, uri)
300 self.wrapper = pysc._gateway.jvm.geopyspark.geotrellis.io.AttributeStoreWrapper(uri)
301 except Py4JJavaError as err:
--> 302 raise ValueError(err.java_exception.getMessage())
303
304 @classmethod

ValueError: geotrellis.spark.io.s3.S3AttributeStore.geotrellis$spark$io$AttributeCaching$setter$geotrellis$spark$io$AttributeCaching$$x$1_$eq(Lscala/Tuple2;)V

Issue with visualizing GPS map outputs through default port structure.

After running a sample demo (for example, NLCD), I was unable to get the xyz tile server running in the GPS VM exposed through to my local machine.

@echeipesh came up with a nice workaround:

Expose a specific port mapping in the docker run initialization command, as per:

docker run -it --rm --name geopyspark \
  -p 8000:8000 -p 4040:4040 -p 7070:7070 \
  -v $HOME/.aws:/home/hadoop/.aws:ro \
  quay.io/geodocker/jupyter-geopyspark

And then include the 7070 port ID in the calls as per:

(screenshot of the calls using port 7070)

Not sure if there is a permanent solution for Macs (this is apparently not an issue on Linux machines), but this works for now.

Username Password?

Hi, I know this is not really an issue, but if I do

docker run -it --rm --name geopyspark \
  -p 8000:8000 -p 4040:4040 \
  quay.io/geodocker/jupyter-geopyspark

why is there no explanation of what the credentials are, or how to get in?

Decouple geopyspark and geopyspark-netcdf versions

The two do not iterate at the same rate, and while we have not yet imposed semver on these projects, decoupling them would make it much easier to iterate and test than bumping the geopyspark-netcdf version manually.

The deeper issue is that the way geopyspark-netcdf jars are discovered at runtime is coupled to the geopyspark version.

Cannot install pip packages - permission denied

Hi there

I'm using the docker image quay.io/geodocker/jupyter-geopyspark:blog (which also contains the fantastic geonotebook), and I'm trying to install geopandas with !pip install geopandas. Unfortunately I'm getting an error:

PermissionError: [Errno 13] Permission denied: '/usr/lib/python3.4/site-packages/descartes-1.1.0.dist-info'

Attempting to sudo gives /usr/bin/sh: sudo: command not found

Is there any way I can install additional pip packages (preferably through a notebook, so I get the correct Python environment)?
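One common workaround, offered here as an assumption rather than something verified against this image, is a per-user install: pip install --user writes into the per-user site-packages directory, which is writable without root. A quick way to see where that directory is:

```python
import site
import subprocess
import sys

# `pip install --user` targets the per-user site directory, which does
# not require root access (e.g. ~/.local/lib/pythonX.Y/site-packages).
user_site = site.getusersitepackages()
print(user_site.endswith("site-packages"))

# The notebook-cell equivalent of `!pip install --user geopandas`
# (left commented out; geopandas is only an example package):
# subprocess.check_call([sys.executable, "-m", "pip", "install", "--user", "geopandas"])
```

Whether the notebook's Python picks up the user site directory depends on how the image configures the environment, so this may still need testing inside the container.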

Many thanks for these fantastic docker containers!

Consider Dropping Support For Docker on EMR

The original plan was to use the Docker image on EMR, but now that we have RPMs, it might be worthwhile to consider dropping support for using Docker on EMR. Doing so would simplify the image and reduce its size.

In particular, the gdal-and-friends.tar.gz binary blob could be dropped from the base image, and the two Python tarballs could be dropped from the main image.

Recapture GDAL capabilities we lost with the move away from a native build

This set of code:

with rio.open('s3://mrgeo-source/srtm-v3-30/N00E006.hgt') as ds:
    bounds = ds.bounds
    height = ds.height
    width = ds.width
    crs = ds.get_crs()
    srs = osr.SpatialReference()
    srs.ImportFromWkt(crs.wkt)
    proj4 = srs.ExportToProj4()
    tile_cols = math.floor((width - 1) / 512) * 512
    tile_rows = math.floor((height - 1) / 512) * 512
    ws = [((x, x + 512), (y, y + 512)) for x in range(0,tile_cols, 512) \
                                          for y in range(0, tile_rows, 512)]
    print(bounds)
    print(height)
    print(width)
    print(crs)
    print(tile_cols)
    print(tile_rows)
    print(ws)

fails in the current container. In the rde/workshop-prep container, it succeeds.

I suspect this is due to the move away from a native GDAL build toward a pip-installed GDAL.

We should either figure out how to regain those capabilities with the pip-installed version, or move back to a native GDAL build.
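As a sanity check, the windowing arithmetic from the failing snippet can be reproduced without rasterio or GDAL at all; the failure is in opening the /vsis3-style path, not in this logic. A self-contained version:

```python
import math

def tile_windows(width, height, size=512):
    # Reproduce the windowing arithmetic from the snippet above:
    # clip the raster to whole `size`-pixel tiles, then enumerate
    # ((x0, x1), (y0, y1)) windows in column-major order.
    tile_cols = math.floor((width - 1) / size) * size
    tile_rows = math.floor((height - 1) / size) * size
    return [((x, x + size), (y, y + size))
            for x in range(0, tile_cols, size)
            for y in range(0, tile_rows, size)]

# A 1201x1201 SRTM tile (like N00E006.hgt) clips to 1024x1024,
# i.e. a 2x2 grid of 512-pixel windows.
print(len(tile_windows(1201, 1201)))  # → 4
```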

Support populating docker image with sample notebooks

I followed this command:

docker run -it --rm --name geopyspark \
   -p 8000:8000 \
   quay.io/geodocker/jupyter-geopyspark

And the notebooks do not appear in the home directory in jupyter.

It would be extremely helpful for a first time user of GPS to be able to do a docker run and get some sample notebooks to play with.

@jpolchlo mentioned that it looks like it's close to enabling this, but something must be missing in terms of populating the notebooks in /home/hadoop - or wherever they need to be.

Break-up Single Security Group in Terraform Setup

There is discussion in #42 and #45 about splitting the single security group used for the ECS instance, the EMR master, and the EMR worker into three separate security groups. I made two brief attempts to do that, but neither was successful.

Attempt 1

With direct dependencies between security groups shown with solid arrows and dependencies mediated by security group rules shown with dotted arrows, the following diagram shows one attempt.
The change is acceptable to Terraform and AWS at creation time. However, the lack of direct access between the ECS instance and the EMR workers causes Spark context creation to fail.

(diagram 1)

The actual change is here:

diff --git a/terraform/emr.tf b/terraform/emr.tf
index 95ed648..693eddd 100644
--- a/terraform/emr.tf
+++ b/terraform/emr.tf
@@ -10,8 +10,8 @@ resource "aws_emr_cluster" "emr-spark-cluster" {
     key_name         = "${var.key_name}"
     subnet_id        = "${var.subnet}"
 
-    emr_managed_master_security_group = "${aws_security_group.security-group.id}"
-    emr_managed_slave_security_group  = "${aws_security_group.security-group.id}"
+    emr_managed_master_security_group = "${aws_security_group.emr-master.id}"
+    emr_managed_slave_security_group  = "${aws_security_group.emr-worker.id}"
   }
 
   instance_group {
diff --git a/terraform/jupyterhub.tf b/terraform/jupyterhub.tf
index e46c0e6..9147ccf 100644
--- a/terraform/jupyterhub.tf
+++ b/terraform/jupyterhub.tf
@@ -3,7 +3,7 @@ resource "aws_spot_instance_request" "jupyterhub" {
   iam_instance_profile = "${var.ecs_instance_profile}"
   instance_type        = "m3.xlarge"
   key_name             = "${var.key_name}"
-  security_groups      = ["${aws_security_group.security-group.name}"]
+  security_groups      = ["${aws_security_group.ecs-instance.name}"]
   spot_price           = "0.05"
   wait_for_fulfillment = true
 
diff --git a/terraform/security-group.tf b/terraform/security-group.tf
index 727ffc3..5f993fc 100644
--- a/terraform/security-group.tf
+++ b/terraform/security-group.tf
@@ -1,9 +1,10 @@
-resource "aws_security_group" "security-group" {
+# ECS Instance
+resource "aws_security_group" "ecs-instance" {
   ingress {
     from_port = 0
     to_port   = 0
     protocol  = "-1"
-    self      = true
+    security_groups = ["${aws_security_group.emr-master.id}"]
   }
 
   ingress {
@@ -31,3 +32,78 @@ resource "aws_security_group" "security-group" {
     create_before_destroy = true
   }
 }
+
+# EMR Master
+resource "aws_security_group" "emr-master" {
+  lifecycle {
+    create_before_destroy = true
+  }
+}
+
+resource "aws_security_group_rule" "from-jupyterhub" {
+  type                     = "ingress"
+  from_port                = 0
+  to_port                  = 0
+  protocol                 = "-1"
+  source_security_group_id = "${aws_security_group.ecs-instance.id}"
+
+  security_group_id = "${aws_security_group.emr-master.id}"
+}
+
+resource "aws_security_group_rule" "from-workers" {
+  type                     = "ingress"
+  from_port                = 0
+  to_port                  = 0
+  protocol                 = "-1"
+  source_security_group_id = "${aws_security_group.emr-worker.id}"
+
+  security_group_id = "${aws_security_group.emr-master.id}"
+}
+
+resource "aws_security_group_rule" "ssh-all" {
+  type                     = "ingress"
+  from_port                = 22
+  to_port                  = 22
+  protocol                 = "tcp"
+  cidr_blocks              = ["0.0.0.0/0"]
+
+  security_group_id = "${aws_security_group.emr-master.id}"
+}
+
+resource "aws_security_group_rule" "outgoing-all" {
+  type                     = "egress"
+  from_port                = 0
+  to_port                  = 0
+  protocol                 = "-1"
+  cidr_blocks              = ["0.0.0.0/0"]
+
+  security_group_id = "${aws_security_group.emr-master.id}"
+}
+
+# EMR Worker
+resource "aws_security_group" "emr-worker" {
+  ingress {
+    from_port = 0
+    to_port   = 0
+    protocol  = "-1"
+    security_groups = ["${aws_security_group.emr-master.id}"]
+  }
+
+  ingress {
+    from_port = 0
+    to_port   = 0
+    protocol  = "-1"
+    self      = true
+  }
+
+  egress {
+    from_port   = 0
+    to_port     = 0
+    protocol    = "-1"
+    cidr_blocks = ["0.0.0.0/0"]
+  }
+
+  lifecycle {
+    create_before_destroy = true
+  }
+}

Attempt 2

Allowing the ECS instance and EMR workers to communicate produces a cyclic dependency.
Although mediated by security group rules, and therefore "grammatical" from Terraform's perspective, this fails when "terraform apply" is run. It gives a message which I recall being to the effect of "you have encountered a bug that used to exist in Terraform" (which is weird on a number of levels). This strategy produces mutually-interdependent security groups which Terraform cannot automatically remove and which must be removed by hand (which is why I have not pasted the error message into this issue verbatim -- I did not want to do the manual cleanup again).

(diagram 2)

diff --git a/terraform/emr.tf b/terraform/emr.tf
index 95ed648..693eddd 100644
--- a/terraform/emr.tf
+++ b/terraform/emr.tf
@@ -10,8 +10,8 @@ resource "aws_emr_cluster" "emr-spark-cluster" {
     key_name         = "${var.key_name}"
     subnet_id        = "${var.subnet}"
 
-    emr_managed_master_security_group = "${aws_security_group.security-group.id}"
-    emr_managed_slave_security_group  = "${aws_security_group.security-group.id}"
+    emr_managed_master_security_group = "${aws_security_group.emr-master.id}"
+    emr_managed_slave_security_group  = "${aws_security_group.emr-worker.id}"
   }
 
   instance_group {
diff --git a/terraform/jupyterhub.tf b/terraform/jupyterhub.tf
index e46c0e6..4bb5b28 100644
--- a/terraform/jupyterhub.tf
+++ b/terraform/jupyterhub.tf
@@ -3,7 +3,7 @@ resource "aws_spot_instance_request" "jupyterhub" {
   iam_instance_profile = "${var.ecs_instance_profile}"
   instance_type        = "m3.xlarge"
   key_name             = "${var.key_name}"
-  security_groups      = ["${aws_security_group.security-group.name}"]
+  security_groups      = ["${aws_security_group.jupyterhub.name}"]
   spot_price           = "0.05"
   wait_for_fulfillment = true
 
diff --git a/terraform/security-group.tf b/terraform/security-group.tf
index 727ffc3..010bb4f 100644
--- a/terraform/security-group.tf
+++ b/terraform/security-group.tf
@@ -1,9 +1,17 @@
-resource "aws_security_group" "security-group" {
+# ECS Instance
+resource "aws_security_group" "jupyterhub" {
   ingress {
     from_port = 0
     to_port   = 0
     protocol  = "-1"
-    self      = true
+    security_groups = ["${aws_security_group.emr-master.id}"]
+  }
+
+  ingress {
+    from_port = 0
+    to_port   = 0
+    protocol  = "-1"
+    security_groups = ["${aws_security_group.emr-worker.id}"]
   }
 
   ingress {
@@ -31,3 +39,125 @@ resource "aws_security_group" "security-group" {
     create_before_destroy = true
   }
 }
+
+# EMR Master
+resource "aws_security_group" "emr-master" {
+  lifecycle {
+    create_before_destroy = true
+  }
+}
+
+resource "aws_security_group_rule" "master-jupyterhub" {
+  type                     = "ingress"
+  from_port                = 0
+  to_port                  = 0
+  protocol                 = "-1"
+  source_security_group_id = "${aws_security_group.jupyterhub.id}"
+
+  security_group_id = "${aws_security_group.emr-master.id}"
+}
+
+resource "aws_security_group_rule" "master-workers" {
+  type                     = "ingress"
+  from_port                = 0
+  to_port                  = 0
+  protocol                 = "-1"
+  source_security_group_id = "${aws_security_group.emr-worker.id}"
+
+  security_group_id = "${aws_security_group.emr-master.id}"
+}
+
+resource "aws_security_group_rule" "master-ssh" {
+  type                     = "ingress"
+  from_port                = 22
+  to_port                  = 22
+  protocol                 = "tcp"
+  cidr_blocks              = ["0.0.0.0/0"]
+
+  security_group_id = "${aws_security_group.emr-master.id}"
+}
+
+resource "aws_security_group_rule" "master-outgoing" {
+  type                     = "egress"
+  from_port                = 0
+  to_port                  = 0
+  protocol                 = "-1"
+  cidr_blocks              = ["0.0.0.0/0"]
+
+  security_group_id = "${aws_security_group.emr-master.id}"
+}
+
+# EMR Worker
+resource "aws_security_group" "emr-worker" {
+  # ingress {
+  #   from_port = 0
+  #   to_port   = 0
+  #   protocol  = "-1"
+  #   security_groups = ["${aws_security_group.jupyterhub.id}"]
+  # }
+
+  # ingress {
+  #   from_port = 0
+  #   to_port   = 0
+  #   protocol  = "-1"
+  #   security_groups = ["${aws_security_group.emr-master.id}"]
+  # }
+
+  # ingress {
+  #   from_port = 0
+  #   to_port   = 0
+  #   protocol  = "-1"
+  #   self      = true
+  # }
+
+  # egress {
+  #   from_port   = 0
+  #   to_port     = 0
+  #   protocol    = "-1"
+  #   cidr_blocks = ["0.0.0.0/0"]
+  # }
+
+  lifecycle {
+    create_before_destroy = true
+  }
+}
+
+resource "aws_security_group_rule" "worker-jupyterhub" {
+  type                     = "ingress"
+  from_port                = 0
+  to_port                  = 0
+  protocol                 = "-1"
+  source_security_group_id = "${aws_security_group.jupyterhub.id}"
+
+  security_group_id = "${aws_security_group.emr-worker.id}"
+}
+
+resource "aws_security_group_rule" "worker-master" {
+  type                     = "ingress"
+  from_port                = 0
+  to_port                  = 0
+  protocol                 = "-1"
+  source_security_group_id = "${aws_security_group.emr-master.id}"
+
+  security_group_id = "${aws_security_group.emr-worker.id}"
+}
+
+resource "aws_security_group_rule" "workers-worker" {
+  type      = "ingress"
+  from_port = 0
+  to_port   = 0
+  protocol  = "-1"
+  self      = true
+
+  security_group_id = "${aws_security_group.emr-worker.id}"
+}
+
+resource "aws_security_group_rule" "worker-outgoing" {
+  type                     = "egress"
+  from_port                = 0
+  to_port                  = 0
+  protocol                 = "-1"
+  cidr_blocks              = ["0.0.0.0/0"]
+
+  security_group_id = "${aws_security_group.emr-worker.id}"
+}

Rename stage0 stage1 and others

Dockerfile.stage0 -> Dockerfile.build
Dockerfile.stage2 -> Dockerfile

blobs -> artifacts

scripts/blob-* -> scripts/artifact-<a name>

There should also be a write-up that explains the build process and the motivations for stage0 and stage2.

MIA: stage1, have you seen it?

libcurl Not Compiled Into GDAL

Evidently, libcurl is not compiled into GDAL. This prevents paths beginning with /vsicurl/ and /vsis3/ from being used.

Be Able to Run Script in the EMR Terminal

It would be nice if we could run scripts by sshing into EMR, uploading the desired script, and then running it. Right now, you need to go into JupyterHub and create a terminal there; once created, you'll need to export these variables before you can run the script.
