impala-get-json-object-udf's People

Contributors

geetuji, illes, nazgul33

impala-get-json-object-udf's Issues

Segfault with Impala 2.6.0

When testing with Impala cdh5-2.6.0_5.8.0 on Debian 7 (wheezy) x64, I get a segfault on most calls:

> select json_get_object('{"name":"steven"}', '$.name');
Query: select json_get_object('{"name":"steven"}', '$.name')
Error communicating with impalad: TSocket read 0 bytes
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000000000811535, pid=8740, tid=140217639298816
#
# JRE version: Java(TM) SE Runtime Environment (7.0_80-b15) (build 1.7.0_80-b15)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [impalad+0x411535]  rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator>::Malloc(unsigned long)+0x15
...

However, the following does not fail:

> select json_get_object('42', '$');
Query: select json_get_object('42', '$')
+------------------------------------+
| default.json_get_object('42', '$') |
+------------------------------------+
| 42                                 |
+------------------------------------+

A similar segfault was believed to be caused by multiple versions of rapidjson being present in the same process. Impala includes the old version 0.11 of rapidjson, while impala-get-json-object-udf seems to ship version 1.0.2. If this really is the root cause, I wonder why I did not have the same problem with Impala 2.2.0, which also ships rapidjson 0.11.

get_json_object in where condition

The table data looks like this:

user_id   real_name  auth_status  extend_info
20005140  d3         3            {"kill": false, "memberType": 1}
20004911  d34        3            {"kill": false, "memberType": 1}
20005136  d44        3            {"kill": false, "killTime": "2018-02-10 10:10:54", "memberType": 3, "memberExpireTime": "2024-02-28 00:00:00"}
20004905  autotest   3            {"kill": false, "killTime": "2018-03-23 00:00:00", "memberType": 1}
20005133  autotest2  3            {"kill": false, "memberType": 1}

This SQL runs correctly:
select c1.username,c1.real_name,nvl2(c2.username,'0','1') as total,c2.user_id,c2.nn from consignor c1
left outer join
(select user_id,username, json_get_object(extend_info,'$.kill') as nn from consignor
) c2
on c1.user_id=c2.user_id where c2.username is NULL;

This SQL does not: when I run it, the Impala daemon crashes.

select c1.username,c1.real_name,nvl2(c2.username,'0','1') as total,c2.user_id,c2.nn from consignor c1
left outer join
(select user_id,username, json_get_object(extend_info,'$.kill') as nn from consignor
where json_get_object(extend_info,'$.kill')='false' ) c2
on c1.user_id=c2.user_id where c2.username is NULL;


Error message: Could not connect to AvatarTest2:21050 (code THRIFTTRANSPORT): TTransportException('Could not connect to AvatarTest2:21050',)

AvatarTest2 is my computer's hostname.

It seems that the JSON function cannot be used in a WHERE condition?

Environment:
CDH 14.2
Hue 3.9
Impala 2.11.0
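
For reference, here is a minimal Python sketch of what the WHERE predicate in the failing query is meant to select, evaluated on the client side against the sample rows above. It is not a fix for the crash and says nothing about how the UDF works internally; the helper name `kill_is_false` is hypothetical, and it assumes `extend_info` arrives as a plain JSON string.

```python
import json

# Sample rows copied from the table above: (user_id, real_name, extend_info).
rows = [
    (20005140, "d3", '{"kill": false, "memberType": 1}'),
    (20004911, "d34", '{"kill": false, "memberType": 1}'),
    (20005136, "d44", '{"kill": false, "killTime": "2018-02-10 10:10:54", '
                      '"memberType": 3, "memberExpireTime": "2024-02-28 00:00:00"}'),
    (20004905, "autotest", '{"kill": false, "killTime": "2018-03-23 00:00:00", "memberType": 1}'),
    (20005133, "autotest2", '{"kill": false, "memberType": 1}'),
]


def kill_is_false(extend_info):
    """Return True only when the JSON document is an object whose "kill" is false.

    Hypothetical helper: malformed JSON, non-object documents and NULL input
    are treated as "no match" instead of raising.
    """
    try:
        value = json.loads(extend_info).get("kill")
    except (ValueError, TypeError, AttributeError):
        return False
    return value is False


# Equivalent of: where json_get_object(extend_info, '$.kill') = 'false'
matching = [user_id for user_id, _name, info in rows if kill_is_false(info)]
print(matching)
```

Since every sample row has "kill": false, the subquery should return all five users, so the outer "c2.username is NULL" filter would be expected to return no rows if the query ran.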

Nested JSON breaks connection to impala

Impala version: 2.8
The UDF breaks the connection when it has to deal with nested arrays.
Example JSON:
{"customer_info":[{"field_name":"family_names","field_value":"Gonzalez"},{"field_name":"given_names","field_value":"Pablo"}],"phone":null}

This works:
select json_get_object('{"customer_info":[{"field_name":"family_names","field_value":"Gonzalez"},{"field_name":"given_names","field_value":"Pablo"}],"phone":null}','$.customer_info') ;

But this breaks Impala:
select json_get_object('{"customer_info":[{"field_name":"family_names","field_value":"Gonzalez"},{"field_name":"given_names","field_value":"Pablo"}],"phone":null}','$.customer_info.field_name') ;
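
The second path asks for a member (`field_name`) of a value that is an array, not an object. The sketch below, in Python, shows one defensive way a path evaluator could handle that step by returning no match instead of failing. It is only an illustration under that assumption: `json_get` is a hypothetical function, it is not the UDF's actual code, and whether a missing type check is really what crashes Impala has not been confirmed.

```python
import json


def json_get(doc, path):
    """Resolve a simple '$' or '$.a.b' path against a JSON string.

    Hypothetical sketch: returns None when a step cannot be applied,
    for example a member lookup on an array or a scalar.
    """
    node = json.loads(doc)
    if path == "$":
        return node
    if not path.startswith("$."):
        raise ValueError("path must be '$' or start with '$.'")
    for key in path[2:].split("."):
        # A member lookup is only valid on an object; anything else is "no match".
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


doc = ('{"customer_info":[{"field_name":"family_names","field_value":"Gonzalez"},'
       '{"field_name":"given_names","field_value":"Pablo"}],"phone":null}')

print(json_get(doc, "$.customer_info"))             # the array of two objects
print(json_get(doc, "$.customer_info.field_name"))  # None: customer_info is an array
```

With a check like this, the second query would be expected to return NULL rather than take down the connection.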
