cityoftoronto / bdit_pgutils
Useful postgresql functions for our work
License: GNU General Public License v3.0
"columns_new_line" and "columns_no_new_line" in get_column_info_table
function don't include schema name.
"SELECT
v1.volume_uid,
v1.detector_id,
v1.datetime_bin,
v1.volume_15min,
v1.arterycode
FROM rescu. volumes_15min AS v1"
When trying to save the dependencies for traffic.arterydata, we encountered an infinite recursion.
I narrowed it down to VIEW open_data.volumes_atr_vehicles_shortterm and MATERIALIZED VIEW open_data_staging.volumes_atr_shortterm_exceptions, which refer to each other.
DROP VIEW open_data.volumes_atr_vehicles_shortterm;
CREATE OR REPLACE VIEW open_data.volumes_atr_vehicles_shortterm
AS
SELECT flow_atr.centreline_id,
flow_atr.direction,
flow_atr.location,
flow_atr.class_type,
flow_atr.datetime_bin,
flow_atr.volume_15min
FROM open_data.flow_atr
WHERE flow_atr.station_type = 'Short Term'::text AND flow_atr.volume_15min >= 0::numeric AND NOT (EXISTS ( SELECT exceptions.datetime_bin,
exceptions.location
FROM gwolofs.volumes_atr_shortterm_exceptions exceptions
WHERE exceptions.datetime_bin = flow_atr.datetime_bin::date AND exceptions.location::text = flow_atr.location::text));
CREATE MATERIALIZED VIEW IF NOT EXISTS open_data_staging.volumes_atr_shortterm_exceptions
TABLESPACE pg_default
AS
SELECT o1.datetime_bin::date AS datetime_bin,
o1.location
FROM open_data.volumes_atr_vehicles_shortterm o1
JOIN open_data.volumes_atr_vehicles_shortterm o2 ON o1.location::text = o2.location::text
WHERE
CASE
WHEN (o1.datetime_bin - o2.datetime_bin) < '00:00:00'::interval THEN - (o1.datetime_bin - o2.datetime_bin)
ELSE o1.datetime_bin - o2.datetime_bin
END <= '01:00:00'::interval AND (o1.volume_15min > 20::numeric AND o2.volume_15min = 0::numeric OR o1.volume_15min < 450::numeric AND o2.volume_15min > 1000::numeric)
UNION
SELECT o2.datetime_bin::date AS datetime_bin,
o2.location
FROM open_data.volumes_atr_vehicles_shortterm o1
JOIN open_data.volumes_atr_vehicles_shortterm o2 ON o1.location::text = o2.location::text
WHERE
CASE
WHEN (o1.datetime_bin - o2.datetime_bin) < '00:00:00'::interval THEN - (o1.datetime_bin - o2.datetime_bin)
ELSE o1.datetime_bin - o2.datetime_bin
END <= '01:00:00'::interval AND (o1.volume_15min > 20::numeric AND o2.volume_15min = 0::numeric OR o1.volume_15min < 450::numeric AND o2.volume_15min > 1000::numeric)
UNION
SELECT v.datetime_bin::date AS datetime_bin,
v.location
FROM open_data.volumes_atr_vehicles_shortterm v
GROUP BY (v.datetime_bin::date), v.location
HAVING count(*) <= 3 AND (max(v.volume_15min) > 1000::numeric OR min(v.volume_15min) <= 5::numeric) OR avg(v.volume_15min) = 0::numeric
WITH DATA;
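One way to keep a dependency walk from recursing forever is to carry the path visited so far and stop as soon as a relation repeats. The query below is only a sketch of that idea against the system catalogs (pg_depend and pg_rewrite), not the function from this repo; the column names dependent_oid, path and is_cycle are made up for illustration.

```sql
-- Sketch: walk view dependencies of traffic.arterydata with a cycle guard.
-- Assumption: dependencies are read from pg_depend via each view's rewrite rule.
WITH RECURSIVE deps AS (
    -- Views and materialized views that directly reference the starting relation
    SELECT
        r.ev_class AS dependent_oid,
        ARRAY[d.refobjid, r.ev_class] AS path,
        false AS is_cycle
    FROM pg_depend AS d
    JOIN pg_rewrite AS r ON r.oid = d.objid
    WHERE d.classid = 'pg_rewrite'::regclass
        AND d.refclassid = 'pg_class'::regclass
        AND d.refobjid = 'traffic.arterydata'::regclass
        AND r.ev_class <> d.refobjid -- skip the rule's link to its own view
    UNION
    -- Views that reference something already found
    SELECT
        r.ev_class,
        deps.path || r.ev_class,
        r.ev_class = ANY(deps.path) -- true when the relation was already visited
    FROM deps
    JOIN pg_depend AS d ON d.refobjid = deps.dependent_oid
    JOIN pg_rewrite AS r ON r.oid = d.objid
    WHERE d.classid = 'pg_rewrite'::regclass
        AND d.refclassid = 'pg_class'::regclass
        AND r.ev_class <> d.refobjid
        AND NOT deps.is_cycle -- do not expand past a detected cycle
)
SELECT
    dependent_oid::regclass AS dependent_relation,
    bool_or(is_cycle) AS part_of_cycle
FROM deps
GROUP BY dependent_oid;
```

Rows with part_of_cycle = true would point at the relations that loop back on themselves, which is where the recursion would otherwise never terminate.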
Create two scripts to export and import:
When applying this function on wys.mobile_sign_installations, the order of the SQL to execute was incorrect (it tried to create open_data.wys_mobile_detailed before wys.mobile_api_id).
This commit is at least partially to blame.
I've noticed that the public.harmean function can be around 3x slower than doing e.g.
COUNT(*) / SUM( 1 / var )
See also, on literally this exact same topic:
https://dba.stackexchange.com/questions/243804/user-defined-harmonic-mean-function-performs-worse-than-query-in-postgresql-9-6
Is it possible that the numeric type is to blame?
"However, calculations on numeric values are very slow compared to the integer types, or to the floating-point types described in the next section."
(from https://www.postgresql.org/docs/current/datatype-numeric.html#DATATYPE-NUMERIC-DECIMAL)
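A quick way to check the numeric hypothesis might be to time the same inline harmonic mean with both types. This is only a sketch: it uses generate_series as stand-in data rather than any table from this repo, and it does not call public.harmean itself.

```sql
-- In psql, turn on \timing first to compare the two statements.

-- Harmonic mean computed with numeric arithmetic
SELECT count(*) / sum(1 / g::numeric) AS harmean_numeric
FROM generate_series(1, 1000000) AS g;

-- Same calculation with double precision arithmetic
SELECT count(*) / sum(1 / g::double precision) AS harmean_float8
FROM generate_series(1, 1000000) AS g;
```

If the second statement is clearly faster, that would suggest the type is at least part of the gap. The two results can differ slightly in the last digits because double precision is inexact.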
Create a bash script to install/upgrade Airflow in a Python venv. The script should:
venv
Will help to simplify join conditions like this:
https://github.com/CityofToronto/bdit_data-sources/blob/22c329d5752df539b22183d52779a955461731a1/dags/sql/select-row_count_lookback.sql#L42C5-L47C6
Create functions that take a specific timestamp and return the corresponding timestamp for that time bin:
CREATE FUNCTION bin_5min (ts timestamp)
CREATE FUNCTION bin_15min (ts timestamp)
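A possible body for these functions, as a sketch only: truncate to the hour, then add back the whole number of bins that have elapsed. The return type, the IMMUTABLE marking and the choice of the bin start (rather than the bin end) are assumptions, since the issue only gives the function names and argument.

```sql
-- Sketch: floor a timestamp to the start of its 5-minute bin
CREATE OR REPLACE FUNCTION bin_5min(ts timestamp)
RETURNS timestamp
LANGUAGE sql
IMMUTABLE
AS $$
    SELECT date_trunc('hour', ts)
        + floor(date_part('minute', ts) / 5) * interval '5 minutes';
$$;

-- Sketch: floor a timestamp to the start of its 15-minute bin
CREATE OR REPLACE FUNCTION bin_15min(ts timestamp)
RETURNS timestamp
LANGUAGE sql
IMMUTABLE
AS $$
    SELECT date_trunc('hour', ts)
        + floor(date_part('minute', ts) / 15) * interval '15 minutes';
$$;

-- Usage: 10:37:12 falls in the 10:35 and 10:30 bins respectively
SELECT
    bin_5min('2024-01-01 10:37:12'::timestamp) AS bin_5,
    bin_15min('2024-01-01 10:37:12'::timestamp) AS bin_15;
```

On PostgreSQL 14 or later, the built-in date_bin('15 minutes', ts, timestamp '2001-01-01') should give the same bin start, so these could be thin wrappers around it.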