Comments (6)

hbutani commented on July 1, 2024

OK, I will take a look tomorrow.
Thanks for being patient and continuing to test. Glad the rank function finally works.
I will also take a look at the null value issue.

from sqlwindowing.

hbutani commented on July 1, 2024

There were 2 issues with the query:

  • You had 'r' in the select list.
  • You have to explicitly specify the datatype of the expression; if you don't, it is assumed to be a double.

The query needed to be:
from emp
partition by department_id
order by department_id, salary desc
select department_id, employee_id, salary,
< lag('salary',1) - salary > as salary_gap[int]
into path='/tmp/wout';

But:

  • It should have given you an error message about the job failing; in fact, in this case the parser should have caught the issue. I am looking into this.
  • It should automatically cast from int to double; I will look into this issue as well.
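
To make the corrected query's select expression concrete, here is a minimal Python sketch of what `lag('salary',1) - salary` computes within each department partition when rows are ordered by salary descending. It is not the sqlwindowing implementation. The sample rows and the function name `salary_gap` are hypothetical, and returning None for the first row of a partition (which has no previous row) is an assumption, not confirmed behavior of the tool.

```python
from itertools import groupby
from operator import itemgetter


def salary_gap(rows):
    """Emulate: partition by department_id, order by salary desc,
    then compute lag('salary', 1) - salary for every row."""
    # Sort by partition key first, then by salary descending.
    ordered = sorted(rows, key=lambda r: (r["department_id"], -r["salary"]))
    out = []
    for _, partition in groupby(ordered, key=itemgetter("department_id")):
        prev = None  # no previous row at the start of a partition
        for row in partition:
            # Assumption: the first row of a partition gets None.
            gap = None if prev is None else prev["salary"] - row["salary"]
            out.append({**row, "salary_gap": gap})
            prev = row
    return out


# Hypothetical sample data with int salaries, matching salary_gap[int].
emp = [
    {"department_id": 10, "employee_id": 1, "salary": 5000},
    {"department_id": 10, "employee_id": 2, "salary": 4200},
    {"department_id": 20, "employee_id": 3, "salary": 6000},
    {"department_id": 20, "employee_id": 4, "salary": 6000},
    {"department_id": 20, "employee_id": 5, "salary": 3500},
]

for row in salary_gap(emp):
    print(row["department_id"], row["employee_id"], row["salary"], row["salary_gap"])
```

Because the partition is sorted from highest to lowest salary, each gap is the distance to the next better-paid colleague in the same department.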

hbutani commented on July 1, 2024

I have fixed a couple of the issues:

  • check for unknown columns in the select list
  • expose errors in the MR job on the console; you should see a FAIL status and a message about the failure.

I have not fixed the issue about specifying the datatype explicitly. I want to first add code to analyze Groovy expressions to infer the datatype of an expression. Just adding datatype conversion to the engine to convert to the datatype of the output column may add a lot of overhead. Let me think about this.

java8964 commented on July 1, 2024

OK, so I have to specify the type. In fact, the salary column is a double column.

Anyway, the following query works as I expected:

from employees
partition by department_id
order by department_id, salary desc
select department_id, employee_id, salary,
< lag('salary',1) > as salary_pre[double]
into path='/tmp/wout'

But I am not sure why the first_value function is not working. When I run the following query, there is no error, but no final file is generated in /tmp/wout:

from employees
partition by department_id
order by department_id, salary desc
select department_id, employee_id, salary,
< first_value('salary') > as high_salary[double]
into path='/tmp/wout'

The two queries are almost the same, except that they use different functions.

hbutani commented on July 1, 2024

first_value needs to be used in a windowing clause, so your query should be written as:

from employees
partition by department_id
order by department_id, salary desc
with
first_value('salary') as high_salary
select department_id, employee_id, salary, high_salary

Lead and Lag are usable both in windowing clauses and in select statements. In select statements you can use them to do delta computations, ratios, etc. These are calculations that happen on the output after the windowing clauses are computed.
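
The two-phase evaluation described above can be sketched in Python: the windowing clause first attaches `high_salary` to every row of a partition, and the select list then projects columns from that output. This is only an illustration under stated assumptions, not the sqlwindowing engine. The helper names `with_first_value` and `select` and the sample rows are hypothetical, and the sketch assumes that first_value without a window specification looks at the whole partition.

```python
from itertools import groupby
from operator import itemgetter


def with_first_value(rows, column, alias):
    """Phase 1 (windowing clause): per department partition, ordered by
    `column` descending, attach the first row's value under `alias`."""
    ordered = sorted(rows, key=lambda r: (r["department_id"], -r[column]))
    out = []
    for _, partition in groupby(ordered, key=itemgetter("department_id")):
        partition = list(partition)
        # Assumption: the window covers the entire partition.
        first = partition[0][column]
        out.extend({**row, alias: first} for row in partition)
    return out


def select(rows, columns):
    """Phase 2 (select list): project columns from the windowing output."""
    return [{c: row[c] for c in columns} for row in rows]


# Hypothetical sample data; salary is a double column as in the thread.
employees = [
    {"department_id": 10, "employee_id": 1, "salary": 4200.0},
    {"department_id": 10, "employee_id": 2, "salary": 5000.0},
    {"department_id": 20, "employee_id": 3, "salary": 3500.0},
    {"department_id": 20, "employee_id": 4, "salary": 6000.0},
]

windowed = with_first_value(employees, "salary", "high_salary")
result = select(windowed, ["department_id", "employee_id", "salary", "high_salary"])
for row in result:
    print(row)
```

Keeping the two phases as separate functions mirrors the split between the `with` clause and the select list: anything in the select list only sees columns that the windowing phase has already produced.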

  • Yes, since salary is a double, a lag expression shouldn't have to specify the type. I'm working on this.
  • I haven't uploaded a new jar with the changes that print out the FAIL status. Let me do this. You will have to replace your jar with the new one.

java8964 commented on July 1, 2024

Thanks. first_value works fine now. I will close this issue.
