Comments (6)
OK, will take a look tomorrow.
Thanks for being patient and continuing to test. Glad the rank function finally works.
Will take a look at the null value issue.
from sqlwindowing.
There were 2 issues with the query:
- you had 'r' in the select
- you have to explicitly specify the datatype of the expression; if you don't, it is assumed to be a double.
The query needed to be:
from emp
partition by department_id
order by department_id, salary desc
select department_id, employee_id, salary,
< lag('salary',1) - salary > as salary_gap[int]
into path='/tmp/wout';
But:
- You should have been given an error message about the job failing; in fact, in this case the parser should have caught the issue. I am looking into this.
- The engine should automatically cast from int to double; I will look into this issue as well.
from sqlwindowing.
I have fixed a couple of the issues:
- check for unknown columns in select list
- expose errors in MR job on console; you should see a FAIL status and message about the failure.
I have not fixed the issue about specifying the datatype explicitly. I want to first add code that analyzes Groovy expressions to infer the datatype of an expression. Just adding datatype conversion to the engine, to convert to the datatype of the output column, may add a lot of overhead. Let me think about this.
from sqlwindowing.
OK, so I have to specify the type. In fact, the salary column is a double column.
Anyway, the following query works as I expected:
from employees
partition by department_id
order by department_id, salary desc
select department_id, employee_id, salary,
< lag('salary',1) > as salary_pre[double]
into path='/tmp/wout'
But I am not sure why the first_value function is not working. When I run the following query there is no error, but still no final file is generated in /tmp/wout:
from employees
partition by department_id
order by department_id, salary desc
select department_id, employee_id, salary,
< first_value('salary') > as high_salary[double]
into path='/tmp/wout'
Both queries are almost the same, except that they use different functions.
from sqlwindowing.
first_value needs to be used in a windowing clause, so your query should be written as:
from employees
partition by department_id
order by department_id, salary desc
with
first_value('salary') as high_salary
select department_id, employee_id, salary, high_salary
Lead and Lag are usable in both windowing clauses and in select statements. In select statements you can use them to do delta computations, ratios etc. These are calculations that happen on the output after the windowing clauses are computed.
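To make the distinction concrete, here is a minimal Python sketch of what the two kinds of expression compute. It is not how SQL Windowing is implemented; the sample rows and the name `window_report` are hypothetical, and returning `None` for the first row of a partition (where there is no previous row for lag) is an assumption rather than confirmed engine behaviour.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows: (department_id, employee_id, salary)
rows = [
    (10, 1, 5000.0),
    (10, 2, 4000.0),
    (20, 3, 7000.0),
    (20, 4, 6500.0),
    (20, 5, 6000.0),
]

def window_report(rows):
    """Emulate: partition by department_id, order by department_id, salary desc."""
    ordered = sorted(rows, key=lambda r: (r[0], -r[2]))
    out = []
    for dept, part in groupby(ordered, key=itemgetter(0)):
        part = list(part)
        # Windowing-clause style: first_value('salary') is computed once
        # over the whole ordered partition.
        high_salary = part[0][2]
        for i, (d, emp, sal) in enumerate(part):
            # Select-clause style: lag('salary', 1) - salary is a per-row
            # delta computed on the already ordered output.
            # Assumption: no previous row yields None.
            prev = part[i - 1][2] if i > 0 else None
            gap = prev - sal if prev is not None else None
            out.append((d, emp, sal, high_salary, gap))
    return out

result = window_report(rows)
for row in result:
    print(row)
```

Each output row carries the partition-wide high_salary alongside a row-level salary gap, which mirrors the split between the with clause and the select list in the queries above.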
- Yes, since salary is a double, a lag expression shouldn't have to specify the type. I'm working on this.
- I haven't uploaded a new jar with the changes that print out the FAIL status. Let me do this. You will have to replace your jar with the new one.
from sqlwindowing.
Thanks. The first_value works fine. I will close this issue.
from sqlwindowing.