bigdl-ppml-azure-occlum-example's People

Contributors

hzjane, jenniew, liu-shaojun, patrickkz, qiyuangong, qzheng527, xiangyut


bigdl-ppml-azure-occlum-example's Issues

PPML Azure Image with Occlum

  • MAA Image
    • MAA PR from source
    • Spark occlum in image
    • Runnable occlum instance
  • Azure NYTaxi
    • NYTaxi Docker MAA
    • NYTaxi AKS MAA
    • Benchmark with Scone
  • Final Test on Azure
    • Azure VM test
    • Azure AKS test
    • Azure Simple Query test
    • TPC-H
    • TPC-DS

Cargo build failure in maa_init

Thank you for building this example repo. I am trying to run a modified version of AzureNytaxi.scala, and while building the Docker image from source with bash build-docker-image.sh, the Docker build hangs and then fails during cargo build --release with the error below.

Environment: Docker Desktop on Apple Mac M1/ARM

uname -a
Darwin sid-habu.local 20.6.0 Darwin Kernel Version 20.6.0: Mon Aug 30 06:12:20 PDT 2021; root:xnu-7195.141.6~3/RELEASE_ARM64_T8101 arm64
bash build-docker-image.sh
The JAVA_HOME environment variable is not defined correctly,
this environment variable is needed to run this program.
[+] Building 344.2s (15/19)
 => [internal] load build definition from Dockerfile                                                                                                    0.1s
 => => transferring dockerfile: 3.06kB                                                                                                                  0.0s
 => [internal] load .dockerignore                                                                                                                       0.0s
 => => transferring context: 2B                                                                                                                         0.0s
 => [internal] load metadata for docker.io/intelanalytics/bigdl-ppml-trusted-big-data-ml-scala-occlum:2.1.0                                             0.5s
 => [internal] load build context                                                                                                                       0.0s
 => => transferring context: 220B                                                                                                                       0.0s
 => [ 1/15] FROM docker.io/intelanalytics/bigdl-ppml-trusted-big-data-ml-scala-occlum:2.1.0@sha256:9c2fb08f5e53e8af66d269cff44eadaea83a86ce3b76faa5891  0.0s
 => CACHED [ 2/15] RUN mkdir -p /opt/src &&     cd /opt/src &&     git clone https://github.com/occlum/occlum.git &&     cd occlum &&     apt purge li  0.0s
 => CACHED [ 3/15] RUN echo "deb [arch=amd64] https://packages.microsoft.com/ubuntu/20.04/prod focal main" | sudo tee /etc/apt/sources.list.d/msprod.l  0.0s
 => CACHED [ 4/15] RUN cd /opt/src/occlum &&     git submodule update --init                                                                            0.0s
 => CACHED [ 5/15] RUN wget -P /opt/spark/jars/ https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-azure/3.2.0/hadoop-azure-3.2.0.jar &&     wget  0.0s
 => CACHED [ 6/15] RUN rm /opt/run_spark_on_occlum_glibc.sh &&     rm /opt/entrypoint.sh &&     rm -rf /opt/spark-source                                0.0s
 => CACHED [ 7/15] ADD ./run_spark_on_occlum_glibc.sh /opt/run_spark_on_occlum_glibc.sh                                                                 0.0s
 => CACHED [ 8/15] ADD ./entrypoint.sh /opt/entrypoint.sh                                                                                               0.0s
 => CACHED [ 9/15] ADD ./mount.sh /opt/mount.sh                                                                                                         0.0s
 => CACHED [10/15] ADD ./add_conf.sh /opt/occlum_spark/add_conf.sh                                                                                      0.0s
 => CACHED [11/15] RUN cp /opt/run_spark_on_occlum_glibc.sh /root/run_spark_on_occlum_glibc.sh                                                          0.0s
 => [12/15] RUN cd /opt/src/occlum/demos/remote_attestation/azure_attestation/maa_init/init &&     cargo clean &&     cargo build --release           343.6s
 => => #   process didn't exit successfully: `rustc --crate-name unicode_bidi --edition=2018 /root/.cargo/registry/src/github.com-1ecc6299db9ec823/unicode-b
 => => # idi-0.3.8/src/lib.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts --crate-type lib --emit=dep-info,metadata,link -C opt-level=3 -C
 => => #  embed-bitcode=no --cfg 'feature="default"' --cfg 'feature="hardcoded-data"' --cfg 'feature="std"' -C metadata=5c4faa67b27ab459 -C extra-filename=-
 => => # 5c4faa67b27ab459 --out-dir /opt/src/occlum/demos/remote_attestation/azure_attestation/maa_init/init/target/release/deps -L dependency=/opt/src/occl
 => => # um/demos/remote_attestation/azure_attestation/maa_init/init/target/release/deps --cap-lints allow` (signal: 11, SIGSEGV: invalid memory reference)
 => => # warning: build failed, waiting for other jobs to finish...
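
A likely cause here, offered as an assumption based on the environment rather than something verified against this repo, is that the base image is linux/amd64, so on an Apple M1 the entire build, including rustc, runs under QEMU emulation, where rustc segfaults are a known problem. One workaround is to run the amd64 build on a real x86_64 machine, for example through a remote buildx builder over SSH; the host name, user, and image tag below are placeholders:

# Sketch: delegate the amd64 build to an x86_64 host instead of QEMU emulation
docker buildx create --name amd64-builder ssh://user@x86-build-host
docker buildx build --builder amd64-builder --platform linux/amd64 \
    -t bigdl-ppml-azure-occlum:2.1.0 --load .

Running build-docker-image.sh directly on an x86_64 VM would achieve the same result without buildx.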

Error: AESM service is not started yet. Need to start it first

I tried to use an Alibaba Cloud ECS ecs.g7t.xlarge instance to run the BigDL-PPML-Azure-Occlum-Example Spark Pi example. I pulled the image and ran the following commands:

docker run --rm -it \
    --name=azure-ppml-example-with-occlum \
    --device=/dev/sgx/enclave \
    --device=/dev/sgx/provision \
    intelanalytics/bigdl-ppml-azure-occlum:2.1.0 bash 

cd /opt
bash run_spark_on_occlum_glibc.sh pi

but it returned:

root@67dcc332da34:/opt# bash run_spark_on_occlum_glibc.sh pi
+ BLUE='\033[1;34m'
+ NC='\033[0m'
+ occlum_glibc=/opt/occlum/glibc/lib
++ cat /etc/hosts
++ grep 67dcc332da34
++ awk '{print $1}'
+ HOST_IP=172.17.0.2
+ INSTANCE_DIR=/opt/occlum_spark
+ INIT_DIR=/opt/src/occlum/demos/remote_attestation/azure_attestation/maa_init/init
+ IMG_BOM=/opt/src/occlum/demos/remote_attestation/azure_attestation/maa_init/bom.yaml
+ INIT_BOM=/opt/src/occlum/demos/remote_attestation/azure_attestation/maa_init/init_maa.yaml
++ '[' -f '' ']'
++ echo 0
+ id=0
+ arg=pi
+ case "$arg" in
+ run_spark_pi
+ echo -e '\033[1;34mocclum run spark Pi\033[0m'
occlum run spark Pi
+ [[ -z '' ]]
+ echo 'META_SPACE not set, using default value 256m'
META_SPACE not set, using default value 256m
+ META_SPACE=256m
+ cd /opt/occlum_spark
+ bash /opt/mount.sh
+ occlum run /usr/lib/jvm/java-8-openjdk-amd64/bin/java -XX:-UseCompressedOops -XX:MaxMetaspaceSize=256m -XX:ActiveProcessorCount=4 -Divy.home=/tmp/.ivy -Dos.name=Linux -cp '/opt/spark/conf/:/opt/spark/jars/*:/bin/jars/*' -Xmx512m org.apache.spark.deploy.SparkSubmit --jars /opt/spark/examples/jars/spark-examples_2.12-3.1.2.jar,/opt/spark/examples/jars/scopt_2.12-3.7.1.jar --class org.apache.spark.examples.SparkPi spark-internal
Error: AESM service is not started yet. Need to start it first
+ cd ../

I found that this error message is emitted during the occlum init phase.
How can I solve this? Is it because I am using Alibaba Cloud ECS, or should I build the image myself instead of pulling it from Docker Hub?
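
For reference, this error usually means the Intel SGX AESM daemon (aesmd) is not reachable inside the container. A minimal sketch of two common fixes follows, assuming the ECS host runs aesmd with its socket at the default /var/run/aesmd and that the image ships the sgx-aesm-service package at its usual install path; both assumptions should be verified on this image:

# Option 1: mount the host's aesmd socket into the container
docker run --rm -it \
    --name=azure-ppml-example-with-occlum \
    --device=/dev/sgx/enclave \
    --device=/dev/sgx/provision \
    --volume=/var/run/aesmd:/var/run/aesmd \
    intelanalytics/bigdl-ppml-azure-occlum:2.1.0 bash

# Option 2: start the AESM service inside the container before running the demo
LD_LIBRARY_PATH=/opt/intel/sgx-aesm-service/aesm /opt/intel/sgx-aesm-service/aesm/aesm_service
bash /opt/run_spark_on_occlum_glibc.sh pi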

Avoid occlum build and init during deployment

Occlum init and build require the enclave signing key, which is not safe to mount in the deployment environment. This stage also takes 30-60 s out of a 100-200 s total run time, so it needs to be removed from the deployed image; see the sketch after the checklist below.

  • Base image for init and build
  • Deploy with occlum instance
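
A rough sketch of the second bullet, with illustrative paths and image tags (the exact directories inside this image are assumptions): run occlum build once in a throwaway builder container that holds the signing key, then copy only the built instance into the deployment image, which needs nothing but occlum run at start-up:

# Build the occlum instance once, outside the deployment environment
docker run --name occlum-builder \
    -v /path/to/enclave-key.pem:/root/enclave-key.pem \
    intelanalytics/bigdl-ppml-azure-occlum:2.1.0 \
    bash -c "cd /opt/occlum_spark && occlum build --sign-key /root/enclave-key.pem"

# Extract the built instance to bake into the deployment image;
# the signing key never leaves the build machine
docker cp occlum-builder:/opt/occlum_spark ./occlum_spark
docker rm occlum-builder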
