apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.

Home Page: https://gluten.apache.org/

License: Apache License 2.0

Java 5.80% Scala 67.03% Shell 1.27% C++ 24.52% CMake 1.10% Python 0.23% Dockerfile 0.03% C 0.01% Makefile 0.01% PowerShell 0.02%
clickhouse simd spark-sql vectorization velox arrow

incubator-gluten's Issues

Separate the base layer and backend layer

The base layer will contain the common code and configs used by every backend. The backend layer will contain the code and configs specific to a single backend. Each backend then builds its own layer on top of the base layer, so the computation for different backends is cleanly separated.
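
A minimal sketch of the proposed layering, assuming a trait for the base layer and one object per backend. All names and config keys here (BackendBase, example.batchSize, and so on) are hypothetical and are not taken from the project's code:

```scala
// Hypothetical names throughout; this is not the project's actual API.

// Base layer: code and config defaults shared by every backend.
trait BackendBase {
  def name: String
  // Configs common to all backends (keys are made up for illustration).
  def commonConfigs: Map[String, String] = Map("example.batchSize" -> "4096")
  // Backend-specific configs; empty unless a backend overrides them.
  def backendConfigs: Map[String, String] = Map.empty
  // Backend-specific values take precedence over the shared defaults.
  final def effectiveConfigs: Map[String, String] = commonConfigs ++ backendConfigs
}

// Backend layer: each backend adds only what is specific to it.
object VeloxBackend extends BackendBase {
  val name = "velox"
  override def backendConfigs: Map[String, String] =
    Map("example.velox.useMalloc" -> "false")
}

object ClickHouseBackend extends BackendBase {
  val name = "clickhouse"
  override def backendConfigs: Map[String, String] =
    Map("example.batchSize" -> "8192")
}
```

With this shape, a backend that overrides nothing still inherits the shared defaults, and an override in one backend cannot leak into another.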

Run spark-shell with gazelle-jni-jvm-1.2.0-snapshot-jar-with-dependencies.jar failed.

I ran this on Ubuntu 20.04. The command is shown below:
[image: 1647307968(1)]

Then I checked the library libspark_columnar_jni.so with ldd -r. There are some undefined symbol errors.

root@ubuntu:/home/gazelle/gazelle-jni/cpp/build/releases# ldd -r libspark_columnar_jni.so
        linux-vdso.so.1 (0x00007ffcd0ae1000)
        libprotobuf.so.17 => /lib/x86_64-linux-gnu/libprotobuf.so.17 (0x00007f03889ec000)
        libdouble-conversion.so.3 => /lib/x86_64-linux-gnu/libdouble-conversion.so.3 (0x00007f03889d6000)
        libsnappy.so.1 => /lib/x86_64-linux-gnu/libsnappy.so.1 (0x00007f03889cb000)
        libglog.so.0 => /usr/local/lib/libglog.so.0 (0x00007f0388984000)
        libarrow.so.400 (0x00007f038724f000)
        libgandiva.so.400 (0x00007f0384ff7000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f0384e13000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f0384cc4000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f0384ca9000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0384ab7000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f038b0b5000)
        libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f0384a9b000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f0384a78000)
        libgflags.so.2.2 => /usr/local/lib/libgflags.so.2.2 (0x00007f0384a49000)
        libunwind.so.8 => /lib/x86_64-linux-gnu/libunwind.so.8 (0x00007f0384a2c000)
        libcrypto.so.1.1 => /lib/x86_64-linux-gnu/libcrypto.so.1.1 (0x00007f0384756000)
        libssl.so.1.1 => /lib/x86_64-linux-gnu/libssl.so.1.1 (0x00007f03846c3000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f03846bd000)
        librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f03846b2000)
        libcurl.so.4 => /lib/x86_64-linux-gnu/libcurl.so.4 (0x00007f038461f000)
        liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f03845f6000)
        libnghttp2.so.14 => /lib/x86_64-linux-gnu/libnghttp2.so.14 (0x00007f03845cd000)
        libidn2.so.0 => /lib/x86_64-linux-gnu/libidn2.so.0 (0x00007f03845ac000)
        librtmp.so.1 => /lib/x86_64-linux-gnu/librtmp.so.1 (0x00007f038458c000)
        libssh.so.4 => /lib/x86_64-linux-gnu/libssh.so.4 (0x00007f038451c000)
        libpsl.so.5 => /lib/x86_64-linux-gnu/libpsl.so.5 (0x00007f0384509000)
        libgssapi_krb5.so.2 => /lib/x86_64-linux-gnu/libgssapi_krb5.so.2 (0x00007f03844bc000)
        libldap_r-2.4.so.2 => /lib/x86_64-linux-gnu/libldap_r-2.4.so.2 (0x00007f0384466000)
        liblber-2.4.so.2 => /lib/x86_64-linux-gnu/liblber-2.4.so.2 (0x00007f0384455000)
        libbrotlidec.so.1 => /lib/x86_64-linux-gnu/libbrotlidec.so.1 (0x00007f0384447000)
        libunistring.so.2 => /lib/x86_64-linux-gnu/libunistring.so.2 (0x00007f03842c3000)
        libgnutls.so.30 => /lib/x86_64-linux-gnu/libgnutls.so.30 (0x00007f03840ed000)
        libhogweed.so.5 => /lib/x86_64-linux-gnu/libhogweed.so.5 (0x00007f03840b6000)
        libnettle.so.7 => /lib/x86_64-linux-gnu/libnettle.so.7 (0x00007f038407c000)
        libgmp.so.10 => /lib/x86_64-linux-gnu/libgmp.so.10 (0x00007f0383ff8000)
        libkrb5.so.3 => /lib/x86_64-linux-gnu/libkrb5.so.3 (0x00007f0383f1b000)
        libk5crypto.so.3 => /lib/x86_64-linux-gnu/libk5crypto.so.3 (0x00007f0383ee8000)
        libcom_err.so.2 => /lib/x86_64-linux-gnu/libcom_err.so.2 (0x00007f0383ee1000)
        libkrb5support.so.0 => /lib/x86_64-linux-gnu/libkrb5support.so.0 (0x00007f0383ed2000)
        libresolv.so.2 => /lib/x86_64-linux-gnu/libresolv.so.2 (0x00007f0383eb6000)
        libsasl2.so.2 => /lib/x86_64-linux-gnu/libsasl2.so.2 (0x00007f0383e99000)
        libgssapi.so.3 => /lib/x86_64-linux-gnu/libgssapi.so.3 (0x00007f0383e54000)
        libbrotlicommon.so.1 => /lib/x86_64-linux-gnu/libbrotlicommon.so.1 (0x00007f0383e2f000)
        libp11-kit.so.0 => /lib/x86_64-linux-gnu/libp11-kit.so.0 (0x00007f0383cf9000)
        libtasn1.so.6 => /lib/x86_64-linux-gnu/libtasn1.so.6 (0x00007f0383ce3000)
        libkeyutils.so.1 => /lib/x86_64-linux-gnu/libkeyutils.so.1 (0x00007f0383cdc000)
        libheimntlm.so.0 => /lib/x86_64-linux-gnu/libheimntlm.so.0 (0x00007f0383cd0000)
        libkrb5.so.26 => /lib/x86_64-linux-gnu/libkrb5.so.26 (0x00007f0383c3b000)
        libasn1.so.8 => /lib/x86_64-linux-gnu/libasn1.so.8 (0x00007f0383b94000)
        libhcrypto.so.4 => /lib/x86_64-linux-gnu/libhcrypto.so.4 (0x00007f0383b5c000)
        libroken.so.18 => /lib/x86_64-linux-gnu/libroken.so.18 (0x00007f0383b43000)
        libffi.so.7 => /lib/x86_64-linux-gnu/libffi.so.7 (0x00007f0383b37000)
        libwind.so.0 => /lib/x86_64-linux-gnu/libwind.so.0 (0x00007f0383b0d000)
        libheimbase.so.1 => /lib/x86_64-linux-gnu/libheimbase.so.1 (0x00007f0383af9000)
        libhx509.so.5 => /lib/x86_64-linux-gnu/libhx509.so.5 (0x00007f0383aab000)
        libsqlite3.so.0 => /lib/x86_64-linux-gnu/libsqlite3.so.0 (0x00007f0383982000)
        libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1 (0x00007f0383947000)
undefined symbol: _ZN3fLB10FLAGS_avx2E  (./libspark_columnar_jni.so)
undefined symbol: _ZN3fLB32FLAGS_velox_exception_stacktraceE    (./libspark_columnar_jni.so)
undefined symbol: _ZN3fLB10FLAGS_bmi2E  (./libspark_columnar_jni.so)
undefined symbol: _ZN3fLI46FLAGS_velox_exception_stacktrace_rate_limit_msE      (./libspark_columnar_jni.so)
undefined symbol: _ZN3fLB22FLAGS_velox_use_mallocE      (./libspark_columnar_jni.so)
undefined symbol: _ZNK8facebook5velox7process10StackTrace8toStringB5cxx11Ev     (./libspark_columnar_jni.so)
undefined symbol: _ZN5boost16re_detail_10710013put_mem_blockEPv (./libspark_columnar_jni.so)
undefined symbol: _ZN5boost13match_resultsIN9__gnu_cxx17__normal_iteratorIPKcNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEESaINS_9sub_matchISB_EEEE12maybe_assignERKSF_  (./libspark_columnar_jni.so)
undefined symbol: _ZN8facebook5velox8encoding6Base646encodeB5cxx11EN5folly5RangeIPKcEE  (./libspark_columnar_jni.so)
undefined symbol: _ZN8facebook5velox4dwio6common10encryptioneqERKNS3_20EncryptionPropertiesES6_ (./libspark_columnar_jni.so)
undefined symbol: _ZN8facebook5velox8encoding6Base6420calculateDecodedSizeEPKcRmb       (./libspark_columnar_jni.so)
undefined symbol: event_base_new        (./libspark_columnar_jni.so)
undefined symbol: _ZN4date11locate_zoneESt17basic_string_viewIcSt11char_traitsIcEE      (./libspark_columnar_jni.so)
undefined symbol: event_active  (./libspark_columnar_jni.so)
undefined symbol: _ZN8facebook5velox4dwrf10ProtoUtils9writeTypeERKNS0_4TypeERNS1_5proto6FooterEPNS6_4TypeE      (./libspark_columnar_jni.so)
undefined symbol: _ZN8facebook5velox7process12TraceContext10statusLineB5cxx11Ev (./libspark_columnar_jni.so)
undefined symbol: jump_fcontext (./libspark_columnar_jni.so)
undefined symbol: event_add     (./libspark_columnar_jni.so)
undefined symbol: _ZN8facebook5velox7process10StackTraceC1Ei    (./libspark_columnar_jni.so)
undefined symbol: _ZN5boost13match_resultsIPKcSaINS_9sub_matchIS2_EEEE12maybe_assignERKS6_      (./libspark_columnar_jni.so)
undefined symbol: ZSTD_getErrorName     (./libspark_columnar_jni.so)
undefined symbol: _ZN8facebook5velox7process12TraceContextD1Ev  (./libspark_columnar_jni.so)
undefined symbol: event_base_set        (./libspark_columnar_jni.so)
undefined symbol: _ZN5boost16re_detail_10710012perl_matcherIN9__gnu_cxx17__normal_iteratorIPKcNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEESaINS_9sub_matchISC_EEENS_12regex_traitsIcNS_16cpp_regex_traitsIcEEEEE14construct_initERKNS_11basic_regexIcSJ_EENS_15regex_constants12_match_flagsE   (./libspark_columnar_jni.so)
undefined symbol: _ZN8facebook5velox8encoding6Base646decodeEPKcmPc      (./libspark_columnar_jni.so)
undefined symbol: _ZN8facebook5velox8encoding6Base646encodeEPKcmPc      (./libspark_columnar_jni.so)
undefined symbol: _ZN5boost16re_detail_10710019raise_runtime_errorERKSt13runtime_error  (./libspark_columnar_jni.so)
undefined symbol: ZSTD_decompress       (./libspark_columnar_jni.so)
undefined symbol: event_base_free       (./libspark_columnar_jni.so)
undefined symbol: _ZN8facebook5velox8encoding6Base6420calculateEncodedSizeEmb   (./libspark_columnar_jni.so)
undefined symbol: _ZN5boost11basic_regexIcNS_12regex_traitsIcNS_16cpp_regex_traitsIcEEEEE9do_assignEPKcS7_j     (./libspark_columnar_jni.so)
undefined symbol: _ZN5boost16re_detail_10710013get_mem_blockEv  (./libspark_columnar_jni.so)
undefined symbol: _ZN5boost16re_detail_10710014verify_optionsEjNS_15regex_constants12_match_flagsE      (./libspark_columnar_jni.so)
undefined symbol: event_set     (./libspark_columnar_jni.so)
undefined symbol: ZSTD_getFrameContentSize      (./libspark_columnar_jni.so)
undefined symbol: _ZN8facebook5velox8encoding6Base646decodeB5cxx11EN5folly5RangeIPKcEE  (./libspark_columnar_jni.so)
undefined symbol: event_base_loop       (./libspark_columnar_jni.so)
undefined symbol: _ZN5boost16re_detail_10710012perl_matcherIPKcSaINS_9sub_matchIS3_EEENS_12regex_traitsIcNS_16cpp_regex_traitsIcEEEEE14construct_initERKNS_11basic_regexIcSA_EENS_15regex_constants12_match_flagsE       (./libspark_columnar_jni.so)
undefined symbol: event_del     (./libspark_columnar_jni.so)
undefined symbol: _ZNK5boost16re_detail_10710031cpp_regex_traits_implementationIcE17transform_primaryB5cxx11EPKcS4_     (./libspark_columnar_jni.so)
undefined symbol: _ZN8facebook5velox7process12TraceContextC1ENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEb      (./libspark_columnar_jni.so)
undefined symbol: _ZNK4date9time_zone13get_info_implENSt6chrono10time_pointINS1_3_V212system_clockENS1_8durationIlSt5ratioILl1ELl1EEEEEE        (./libspark_columnar_jni.so)
undefined symbol: ZSTD_isError  (./libspark_columnar_jni.so)
undefined symbol: _ZNK5boost16re_detail_10710031cpp_regex_traits_implementationIcE9transformB5cxx11EPKcS4_      (./libspark_columnar_jni.so)
undefined symbol: LZ4_decompress_safe   (./libspark_columnar_jni.so)
undefined symbol: _ZN8facebook5velox4dwio6common9exception18getExceptionLoggerEv        (./libspark_columnar_jni.so)
undefined symbol: event_get_version     (./libspark_columnar_jni.so)
undefined symbol: _ZN5boost16re_detail_10710024get_default_error_stringENS_15regex_constants10error_typeE       (./libspark_columnar_jni.so)
undefined symbol: event_base_loopbreak  (./libspark_columnar_jni.so)
undefined symbol: make_fcontext (./libspark_columnar_jni.so)
undefined symbol: _ZN8facebook5velox4dwio6common11compression13lzoDecompressEPKcS5_PcS6_        (./libspark_columnar_jni.so)
undefined symbol: ZSTD_getErrorCode     (./libspark_columnar_jni.so)
undefined symbol: _ZNK4date9time_zone13get_info_implENSt6chrono10time_pointINS_7local_tENS1_8durationIlSt5ratioILl1ELl1EEEEEE   (./libspark_columnar_jni.so)
undefined symbol: ZSTD_compress (./libspark_columnar_jni.so)
undefined symbol: _ZN8facebook5velox35DeserializationRegistryForSharedPtrB5cxx11Ev      (./libspark_columnar_jni.so)
undefined symbol: event_base_get_method (./libspark_columnar_jni.so)

For example, for undefined symbol: _ZN3fLB10FLAGS_avx2E (./libspark_columnar_jni.so), I used c++filt to show the demangled name:

root@ubuntu:/home/gazelle/gazelle-jni/cpp/build/releases# c++filt _ZN3fLB10FLAGS_avx2E
fLB::FLAGS_avx2

The symbol FLAGS_avx2 is used by Velox, but I cannot find its definition.
I have no idea what to do next. Can someone help?
I compiled gazelle_jni on branch velox_dev and Velox on branch substrait. @rui-mo, can you give me some help?
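
One way to narrow this down is to group the undefined symbols by where they probably come from, which hints at the libraries missing from the link step. The sketch below parses lines of ldd -r output and buckets symbols by prefix. The prefix table is an illustrative guess, not an authoritative mapping. Symbols in the fL* namespaces (such as fLB::FLAGS_avx2) are gflags flag variables, which are defined by whichever library declares the flag, not by gflags itself:

```scala
object UndefinedSymbols {
  // Matches lines such as "undefined symbol: ZSTD_isError  (./lib.so)".
  private val Line = """undefined symbol:\s+(\S+)\s+\(.*\)""".r

  // Symbol prefix -> likely origin. A guess for illustration only.
  private val origins: Seq[(String, String)] = Seq(
    "ZSTD_" -> "zstd",
    "LZ4_" -> "lz4",
    "event_" -> "libevent",
    "_ZN5boost" -> "boost",
    "_ZNK5boost" -> "boost",
    "_ZN8facebook5velox" -> "velox",
    "_ZNK8facebook5velox" -> "velox",
    "_ZN3fL" -> "gflags flag definition"
  )

  // Extract the symbol name, or None for lines that are not symbol errors.
  def symbolOf(line: String): Option[String] = line.trim match {
    case Line(sym) => Some(sym)
    case _ => None
  }

  // First matching prefix wins; anything else is reported as "unknown".
  def originOf(symbol: String): String =
    origins
      .collectFirst { case (prefix, origin) if symbol.startsWith(prefix) => origin }
      .getOrElse("unknown")

  // Count undefined symbols per likely origin.
  def summarize(lddOutput: Seq[String]): Map[String, Int] =
    lddOutput.flatMap(symbolOf).groupBy(originOf).map { case (k, v) => k -> v.size }
}
```

Feeding the output above through summarize gives a per-origin count; symbols that match no prefix (for example jump_fcontext) land in "unknown" and need a manual look.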

Remove Alias

Currently, Spark has the Alias expression to assign a new name to a computation. But because Substrait is index-based, this expression is unnecessary. Do we need to remove Alias?
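
To illustrate why the name carries no information in an index-based plan, here is a minimal sketch using a toy expression tree. The classes are simplified stand-ins, not Spark's or Substrait's real types. The conversion resolves each attribute to its position in the input and passes straight through Alias:

```scala
object AliasRemoval {
  // A tiny name-based expression tree, loosely modelled on Spark's.
  sealed trait Expr
  case class Attr(name: String) extends Expr
  case class Literal(value: Double) extends Expr
  case class Multiply(left: Expr, right: Expr) extends Expr
  case class Alias(child: Expr, name: String) extends Expr

  // Index-based form: inputs are addressed by position, so no names are kept.
  sealed trait IndexedExpr
  case class FieldRef(index: Int) extends IndexedExpr
  case class Const(value: Double) extends IndexedExpr
  case class Mul(left: IndexedExpr, right: IndexedExpr) extends IndexedExpr

  // Convert to the index-based form; an Alias contributes only its child.
  def toIndexed(e: Expr, input: Seq[String]): IndexedExpr = e match {
    case Attr(n) =>
      val i = input.indexOf(n)
      require(i >= 0, s"unknown column: $n")
      FieldRef(i)
    case Literal(v) => Const(v)
    case Multiply(l, r) => Mul(toIndexed(l, input), toIndexed(r, input))
    case Alias(child, _) => toIndexed(child, input)
  }
}
```

In this sketch an aliased and an unaliased expression convert to the same indexed tree, which is the sense in which Alias is redundant.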

Do we need to exclude Pre-Projection from Aggregate?

  • TPC-H Q6's Aggregation includes:
    Pre-Projection (Multiply)
    Aggregate (Sum)
    Post-Projection (Cast to String)

In a local development branch, I have excluded Post-Projection from Aggregate on the Scala side by creating a new ProjectRel when needed. Do we need to do the same for Pre-Projection?
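
A rough sketch of what splitting both projections out could look like, assuming a fused aggregate node that carries optional pre- and post-projections. The node shapes are hypothetical and expressions are plain strings for brevity; this is not the code from the development branch:

```scala
object SplitAggregate {
  // Simplified relation nodes; expressions are strings to keep the sketch short.
  sealed trait Rel
  case class Read(columns: Seq[String]) extends Rel
  case class ProjectRel(input: Rel, exprs: Seq[String]) extends Rel
  case class AggregateRel(input: Rel, aggFuncs: Seq[String]) extends Rel

  // A fused aggregate carrying optional pre/post projections (hypothetical shape).
  case class FusedAggregate(
      input: Rel,
      preProject: Seq[String],
      aggFuncs: Seq[String],
      postProject: Seq[String])

  // Emit separate ProjectRel nodes only when there is something to project.
  def split(agg: FusedAggregate): Rel = {
    val pre =
      if (agg.preProject.isEmpty) agg.input
      else ProjectRel(agg.input, agg.preProject)
    val core = AggregateRel(pre, agg.aggFuncs)
    if (agg.postProject.isEmpty) core
    else ProjectRel(core, agg.postProject)
  }
}
```

For the Q6 shape above, split would yield ProjectRel (Cast) over AggregateRel (Sum) over ProjectRel (Multiply), leaving the aggregate itself free of projection logic.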

Use unified JNI interfaces

The following parts need to be cleaned up and unified:

  • ExpressionEvaluator
  • ExpressionEvaluatorJniWrapper
  • BatchIterator.java
  • JniUtils and JniInstance
  • createNativeKernelWithIterator
  • add a config to decide whether to load the Gandiva and Arrow libraries

Fix some fallback issues

Currently, there are some fallback issues when the SparkPlan is SerializeFromObjectExec, ObjectHashAggregateExec, or V2CommandExec. For example:

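// Assumes `spark` is an active SparkSession, as in spark-shell.
val tookTimeArr = Array(12, 23, 56, 100, 500, 20)
import spark.implicits._
val df = spark.sparkContext.parallelize(tookTimeArr.toSeq, 1).toDF("time")
// summary() computes count, mean, stddev, min, percentiles, and max.
df.summary().show(100, false)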

Executing the above code returns a 'null' result.

Use the unified function names with Substrait

We previously used self-defined function names, which made them difficult for the backends to use. Therefore, we need to switch to the unified names specified in the Substrait YAML files.
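
As a small illustration, the switch could be backed by a lookup table from the old names to the Substrait ones. The keys on the left are made-up examples of self-defined names (the issue does not list the real ones); the values on the right are names used in Substrait's function extension files:

```scala
object FunctionNames {
  // Left: hypothetical self-defined names. Right: Substrait function names.
  private val toSubstrait: Map[String, String] = Map(
    "ADD" -> "add",
    "MULTIPLY" -> "multiply",
    "LESS_THAN" -> "lt",
    "SUM" -> "sum"
  )

  // Returns None for unmapped names so the caller can decide to fall back.
  def substraitName(selfDefined: String): Option[String] =
    toSubstrait.get(selfDefined)
}
```

Returning an Option keeps unmapped functions visible to the caller, which can then fall back to vanilla Spark or report the gap.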
