colpivot's Introduction

Colpivot

Dynamic row to column pivotation/transpose in Postgres made simple.

Author: Hannes Landeholm

colpivot.sql defines a single Postgres function:

create or replace function colpivot(
    out_table varchar, in_query varchar,
    key_cols varchar[], class_cols varchar[],
    value_e varchar, col_order varchar
) returns void

The colpivot() function groups data by a specific key and returns one column for every unique class, filled with the result of a specified value expression. The result is stored in a new temporary table.

This is similar to the problem solved by crosstab. However, colpivot() can be used with completely dynamic data. With crosstab you MUST know beforehand what categories/classes you have. The crosstab function is also incompatible with multiple key or category/class columns.

Overall, the benefits of colpivot() are:

  • Completely dynamic, no row specification required.
  • Supports multiple key and class/attribute columns.
  • Gives complete control over output column order and limit.
  • Easier to understand and use.
  • Does not require loading extra modules.
  • Does not require writing extra functions.

See also this thread on Stack Overflow.

Essentially the function transposes query results with columns like:

 key1, key2, ..., keyN, class1, class2, ... classN, *

to:

 key1, key2, ..., keyN, classC1, classC2, ..., classCN

The output is undefined if the input query has more than one row with the same (key1, key2, ..., keyN, class1, class2, ... classN) value. In most real world use cases of colpivot() it is sensible to guarantee uniqueness: either have a primary key or unique index over these columns when the input query selects from a table directly, or use a distinct/group by over these columns in the input query.
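The uniqueness requirement can be checked on the caller side before pivoting. The following is a minimal Python sketch, not part of colpivot itself; the helper name duplicate_combinations and the sample rows are hypothetical and only illustrate which inputs would make the output undefined.

```python
from collections import Counter


def duplicate_combinations(rows, key_cols, class_cols):
    """Return every (key..., class...) combination that occurs more than once."""
    counts = Counter(
        tuple(row[col] for col in key_cols + class_cols) for row in rows
    )
    return sorted(combo for combo, n in counts.items() if n > 1)


# Hypothetical sample rows; the third row repeats the first combination.
rows = [
    {"year": 1985, "month": 1, "country": "sweden", "state": "", "income": 10},
    {"year": 1985, "month": 1, "country": "denmark", "state": "", "income": 11},
    {"year": 1985, "month": 1, "country": "sweden", "state": "", "income": 12},
]

duplicates = duplicate_combinations(rows, ["year", "month"], ["country", "state"])
print(duplicates)  # [(1985, 1, 'sweden', '')]
```

An empty result means every (key, class) combination is unique, so each output cell has exactly one candidate value.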

  • The output will have one row per unique key combination.
  • The output will have one class column per unique class combination.
  • The value of the corresponding class column is the result of the specified expression or null if the key has no row for the corresponding class combination.
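The three rules above can be modelled in a few lines of Python. This is only an in-memory sketch of the described semantics under the assumption that the value expression is a plain column; it is not how the PL/pgSQL function is implemented, and the helper name pivot is hypothetical.

```python
def pivot(rows, key_cols, class_cols, value_col):
    """Model of the transposition: one row per key, one column per class."""
    # One output column per unique class combination, in sorted order.
    classes = sorted({tuple(r[c] for c in class_cols) for r in rows})
    out = {}
    for r in rows:
        key = tuple(r[c] for c in key_cols)
        cls = tuple(r[c] for c in class_cols)
        # A new key starts with None (null) in every class column.
        out.setdefault(key, dict.fromkeys(classes))[cls] = r[value_col]
    return classes, out


# A subset of the rows used in the Example / Test section.
rows = [
    {"year": 1985, "month": 1, "country": "sweden", "state": "", "income": 10},
    {"year": 1985, "month": 1, "country": "denmark", "state": "", "income": 11},
    {"year": 1985, "month": 1, "country": "usa", "state": "washington", "income": 13},
    {"year": 1985, "month": 2, "country": "sweden", "state": "", "income": 20},
    {"year": 1985, "month": 2, "country": "usa", "state": "washington", "income": 21},
]

classes, pivoted = pivot(rows, ["year", "month"], ["country", "state"], "income")
print(classes)             # [('denmark', ''), ('sweden', ''), ('usa', 'washington')]
print(pivoted[(1985, 2)])  # denmark is None because 1985-02 has no denmark row
```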

To avoid having to specify an output column definition (since there are many real world cases where it is not possible to know this beforehand), the result is stored in a temporary table with the specified name. It is impossible to return a result set with dynamic columns in Postgres without using temporary tables.

Parameter reference:

  • out_table - Unquoted new temporary table to create. The table is deleted when the transaction ends (on commit drop).
  • in_query [1] - Query to run that generates source data to colpivot().
  • key_cols - Array of unquoted key columns.
  • class_cols - Array of unquoted class columns.
  • value_e [1] - Value expression. You must use the # token as an alias if you are referencing a column in the input result. For example, specify '#.salary' instead of 'salary'.
  • col_order [1] - Column order. Specify as null to simply use the sorted classes. This is useful if you want another specific column order. For example, you may want to have the class with the highest salary as the first column. In that case you can specify sum(salary) desc. You can also (ab)use this parameter to limit the number of columns returned like: sum(salary) desc limit 10 to only get the 10 highest salaries.

[1] These parameters are concatenated directly into the evaluated queries to allow maximum flexibility for the caller and are therefore unsafe. Ensure that you are not feeding dirty/non-validated/unquoted data into these parameters, as that would allow a SQL injection exploit.
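One way to follow this advice from application code is to accept only column names from a fixed allowlist and to keep in_query a constant. The Python sketch below is an assumption-laden illustration, not part of colpivot: the helper names, the ALLOWED set and the psycopg-style %s placeholders are all hypothetical, and how your driver adapts Python lists to Postgres arrays should be verified separately.

```python
import re

# Conservative pattern for a plain SQL identifier (an assumption of this sketch).
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def checked_columns(requested, allowed):
    """Return the requested column names, rejecting any not in the allowlist."""
    for name in requested:
        if name not in allowed or not _IDENT.fullmatch(name):
            raise ValueError(f"column not allowed: {name!r}")
    return list(requested)


def colpivot_call(out_table, in_query, key_cols, class_cols, value_col, allowed):
    """Build a parameterized colpivot() call as (sql, params).

    in_query must be a constant written by the developer, never user input,
    because colpivot() concatenates it into the queries it evaluates.
    """
    keys = checked_columns(key_cols, allowed)
    classes = checked_columns(class_cols, allowed)
    (value,) = checked_columns([value_col], allowed)
    sql = "select colpivot(%s, %s, %s, %s, %s, null)"
    # The validated name contains no quotes, so wrapping it in "..." is safe.
    return sql, (out_table, in_query, keys, classes, f'#."{value}"')


ALLOWED = {"year", "month", "country", "state", "income"}  # hypothetical allowlist

sql, params = colpivot_call(
    "_test_pivoted", "select * from _test",
    ["year", "month"], ["country", "state"], "income", ALLOWED,
)
print(sql)
print(params)
```

The value expression is built from a validated column name here; a free-form value_e or col_order should only ever come from trusted code.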

Example / Test

begin;

create temp table _test (
    year int,
    month int,
    country varchar,
    state varchar,
    income int
) on commit drop;

insert into _test values
    (1985, 01, 'sweden', '', 10),
    (1985, 01, 'denmark', '', 11),
    (1985, 01, 'usa', 'washington', 13),
    (1985, 02, 'sweden', '', 20),
    (1985, 02, 'usa', 'washington', 21),
    (1985, 03, 'sweden', '', 34),
    (1985, 03, 'denmark', '', 31),
    (1985, 03, 'usa', 'washington', 39),
    (1990, 12, 'sweden', '', 42),
    (1990, 12, 'denmark', '', 43),
    (1990, 12, 'usa', 'washington', 49),
    (1990, 12, 'germany', '', 45);

select colpivot('_test_pivoted', 'select * from _test',
    array['year', 'month'], array['country', 'state'], '#.income', null);

select * from _test_pivoted order by year, month;

-- returns:
--  year | month | 'denmark', '' | 'germany', '' | 'sweden', '' | 'usa', 'washington'
-- ------+-------+---------------+---------------+--------------+---------------------
--  1985 |     1 |            11 |               |           10 |                  13
--  1985 |     2 |               |               |           20 |                  21
--  1985 |     3 |            31 |               |           34 |                  39
--  1990 |    12 |            43 |            45 |           42 |                  49
-- (4 rows)

rollback;

Licence

MPLv2 (https://www.mozilla.org/MPL/2.0/)

colpivot's Issues

Error when querying an empty table

Hi, I really appreciate your work since it has solved a problem at my company in which we also needed to pivot an undefined number of results to columns.

Right now I am trying to work out how I can avoid an ERROR when the table that I am querying is empty:

ERROR: upper bound of FOR loop cannot be null
CONTEXT: PL/pgSQL function colpivot(character varying,character varying,character varying[],character varying[],character varying,character varying) line 67 at FOR with integer loop variable

Can you give any help with this?

execute in one single statement

Given this code:

select * from colpivot('_output', $$
select "StudentEnrollmentID" as a,value, name, "ComponentID" as q, "order" as o
from "university"."CourseSchedulledAssessmentComponent"
LEFT JOIN "university"."AssessmentResultPerComponent" ON "AssessmentResultPerComponent"."ComponentID" = "university"."CourseSchedulledAssessmentComponent"."ID"
where "CourseSchedulledAssessmentComponent"."UniversityID" = 1001 AND "CourseSchedulledAssessmentComponent"."CourseSchedulledID" = 1008
$$, array['a'], array['q'], '#.value', 'sum(o) asc');

select * from _output;

Is it possible to execute this in a single statement?
Thanks

The temporary table doesn't create

Hi,

I tried the sample from the documentation, but it doesn't work for me: the table was not created.
PostgreSQL 9.5.5 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.2) 5.4.0 20160609, 64-bit

Returning the select statement or temp table vs. void

I am trying to use this package, but I am running into complications.

Mainly, the colpivot function returns multiple results. Is there a way to isolate the results that are returned?

i.e if I do this...

select colpivot(params) ---> returns temp_table
select * from temp_table
END TRANSACTION

It will return multiple results --- one for colpivot and one for the temp_table.

If I try this...

with cte as (SELECT colpivot(params), * from some_table 
LEFT JOIN some_table on temp_table)
END TRANSACTION

It will not allow the left join because the table does not exist before the query begins.

The only thing I can think of is to modify what is being returned from the function. However, I have no idea what the return type would be because the columns are dynamic.

Column / field names with blank spaces

Great function! If the column to be pivoted contains blank spaces the function returns an empty set. If you would include double quotation marks in line 30:

 query := query || 'quote_literal("' || quote_ident(col) || '")' 

this issue can be circumvented.

duplicates rows

Hi there. Nice function. Just found it on Stackoverflow...
However, I get duplicate rows in some cases. Unfortunately I could not create a test case... My query:

select colpivot('_test_pivoted', 'select * from input_data.building_elements',
array['type','constr_style'], array['year'], '#.market_share', null);

select * from _test_pivoted order by type, constr_style;

Any hints would be appreciated.

exception encountered Values [well100] cannot be coerced to []

Hi,
I'm using this function with the Vert.x reactive Postgres client (vertx-pg-client).

I'm trying to call this function with a parameter, but I get a "cannot be coerced to []" exception.

Can you guide me on how to resolve this issue?

public Completable testStreamWideTagHistoryv1(HistoryReqModel req) {
        logger.info("doing stream wide tag query");

        return txBegin()
                .flatMapCompletable(tx -> 
                    tx.rxPreparedQuery(
                        "select colpivot('test_output', 'select rt.v as agg_v, rt.t as bucket, rt.tag_id as tagged from rtdata rt where rt.asset_name = $1 ' "
                                + ", array['bucket'], array['tagged'], '#.agg_v', null)",
                        Tuple.of(req.getAssetName()))
                    .flatMapCompletable(result -> tx.rxCommit()));
    }

Mixed-case columns

It would be good to include a note in the README about mixed-case columns.

The code does double-quote column names, so mixed-case columns can be used.

However, there are two cases where the user has to do the quoting themselves.

  • value_e
  • col_order

Example cross-tabulating grading results by lesson:

SELECT colpivot(
	'_test_pivoted', 
	'select * FROM n_correct_for_course_by_lesson_s(''ANON'')', 
	array['userCode'], 
	array['lessonCode'],
	'#."percentCorrect"',
	'MAX("lessonOrder")'   -- bizarre syntactical requirement
);
select * from _test_pivoted ORDER BY "userCode";
