
unytics / bigfunctions

Supercharge BigQuery with BigFunctions

Home Page: https://unytics.io/bigfunctions/

License: MIT License

CSS 3.35% HTML 22.04% Python 66.46% Dockerfile 1.82% JavaScript 6.34%
bigquery data data-analytics data-visualization data-warehouse data-engineering

bigfunctions's Introduction

Hi there 👋, I am Paul, an open-source Data-Product builder 😃



From Head of Data to open-source Data-Product builder

As Head of Data at Nickel, I scaled the data organization from 3 to 100+ data practitioners.

My vision is to give data-power to data-analysts by making them autonomous on the whole data-chain: from data-collection to data-algorithm-deployments
👉 we build data products for that.

I created Unytics to go further as a personal project
👉 to provide open-source data products to the worldwide data-community. 🚀


bigfunctions's People

Contributors

3ska, anatolec, axelthevenot, batou9150, benjitab, cariflo, ewibrahim, faridasadoun, furcypin, gpivette, iasonastr, jihene-cherif, jprotin, lepinet, marcyves-dgc, matrousseau, pcasteran, qchuchu, shivam221098, sidalisadi, thomasellyatt, unytics, valentincordonnier

bigfunctions's Issues

[new]: `json2xml(json)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Takes JSON as a string and returns the equivalent XML as a string.

Examples of (arguments, expected output) as they would appear in the documentation

const json = JSON.stringify({
    "breakfast_menu": {
        "food": [
            {
                "name": "Belgian Waffles",
                "price": "$5.95",
                "description": "Two of our famous Belgian Waffles with plenty of real maple syrup",
                "calories": 650
            },
            {
                "name": "Strawberry Belgian Waffles",
                "price": "$7.95",
                "description": "Light Belgian waffles covered with strawberries and whipped cream",
                "calories": 900
            },
            {
                "name": "Berry-Berry Belgian Waffles",
                "price": "$8.95",
                "description": "Light Belgian waffles covered with an assortment of fresh berries and whipped cream",
                "calories": 900
            },
            {
                "name": "French Toast",
                "price": "$4.50",
                "description": "Thick slices made from our homemade sourdough bread",
                "calories": 600
            },
            {
                "name": "Homestyle Breakfast",
                "price": "$6.95",
                "description": "Two eggs, bacon or sausage, toast, and our ever-popular hash browns",
                "calories": 950
            }
        ]
    }
})

json2xml(json) -> "<breakfast_menu><food><name>Belgian Waffles</name><price>$5.95</price><description>Two of our famous Belgian Waffles with plenty of real maple syrup</description><calories>650</calories></food><food><name>Strawberry Belgian Waffles</name><price>$7.95</price><description>Light Belgian waffles covered with strawberries and whipped cream</description><calories>900</calories></food><food><name>Berry-Berry Belgian Waffles</name><price>$8.95</price><description>Light Belgian waffles covered with an assortment of fresh berries and whipped cream</description><calories>900</calories></food><food><name>French Toast</name><price>$4.50</price><description>Thick slices made from our homemade sourdough bread</description><calories>600</calories></food><food><name>Homestyle Breakfast</name><price>$6.95</price><description>Two eggs, bacon or sausage, toast, and our ever-popular hash browns</description><calories>950</calories></food></breakfast_menu>"
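The conversion in the example above could be sketched in Python as follows. This is only an illustration, not the proposed implementation: it assumes the top-level JSON value is an object, that a list under a key becomes one element per item with that key as tag, and that scalars are rendered with `str` (so `null` and booleans would need special handling).

```python
import json
from xml.sax.saxutils import escape


def json2xml(json_string):
    """Convert a JSON object string to XML (sketch, assumptions in the text above)."""

    def render(tag, value):
        if isinstance(value, list):
            # A list under key `tag` becomes one <tag> element per item.
            return "".join(render(tag, item) for item in value)
        if isinstance(value, dict):
            inner = "".join(render(k, v) for k, v in value.items())
        else:
            # Scalars are stringified and XML-escaped.
            inner = escape(str(value))
        return f"<{tag}>{inner}</{tag}>"

    data = json.loads(json_string)
    return "".join(render(key, value) for key, value in data.items())
```

Called on a one-item version of the menu above, it produces the same nesting as the expected output: `<breakfast_menu><food>...</food></breakfast_menu>`.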

[new]: `xml_extract(xml, xpath)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Extract part of xml using xpath.

Inspired from https://github.com/stankiewicz/bigquery-xml-parser-udf

Examples of (arguments, expected output) as they would appear in the documentation

xml_extract("<customer><name>Paul</name></customer>", "/customer/name")
--> Paul
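A minimal Python sketch of the example above, using only the standard library. It is an assumption-laden illustration rather than the linked UDF: it supports only simple absolute paths such as `/customer/name` and returns the text of the first match.

```python
import xml.etree.ElementTree as ET


def xml_extract(xml, xpath):
    """Return the text of the first element matching a simple absolute path."""
    root = ET.fromstring(xml)
    steps = xpath.strip("/").split("/")
    # ElementTree paths are relative to the root element, so the first step
    # of an absolute path has to match the root tag itself.
    if steps[0] != root.tag:
        return None
    if len(steps) == 1:
        return root.text
    return root.findtext("/".join(steps[1:]))
```

A path that matches nothing returns `None`, which would map to NULL in BigQuery.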

[bug]: `remove_strings`: it does not work

Check the bug has not already been reported

Edit function_name and the short error description in title above

  • I wrote the correct function name and a short error description in the title above

What happened and what did you expect?

The function remove_strings does not work with some strings: when asked to remove parentheses, it leaves them in place, and the result depends on the order of the strings to remove.

To be precise, this query:
select bigfunctions.eu.remove_strings('test_-test test()', ['_', '-',' ','(',')']) as cleaned_string
returns:
testtesttest()
And this query:
select bigfunctions.eu.remove_strings('test_-test test()', ['(',')','_', '-',' ']) as cleaned_string
returns:
test_-test test()
In both cases I expected:
testtesttest
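For reference, the expected behavior can be sketched in Python with plain literal replacement. This is not the function's actual code (which is not shown in this report); it only illustrates that the result should not depend on the order of the strings or on characters such as parentheses.

```python
def remove_strings(string, strings_to_remove):
    """Remove every literal occurrence of each given string."""
    for to_remove in strings_to_remove:
        # str.replace treats its argument literally, so "(" and ")" are
        # ordinary characters here and need no escaping.
        string = string.replace(to_remove, "")
    return string
```

Both calls from the report then give the same result, `testtesttest`.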

[new]: `send_teams_message(message, webhook_url)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Duplicate send_slack_message.yaml, then change the documentation and the assertion in the code.

Teams incoming webhook doc:
https://learn.microsoft.com/en-us/microsoftteams/platform/webhooks-and-connectors/how-to/add-incoming-webhook?tabs=javascript

Examples of (arguments, expected output) as they would appear in the documentation

[new]: `json_query(json_string, query)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

wrap https://github.com/jmespath/jmespath.js

Examples of (arguments, expected output) as they would appear in the documentation

json_query(
    '{"foo": [{"first": "a", "last": "b"}, {"first": "c", "last": "d"}]}',
    "foo[*].first"
)
--> [ 'a', 'c' ]

json_query(
    '{"foo": [{"age": 20}, {"age": 25}, {"age": 30}, {"age": 35}, {"age": 40}]}',
    "foo[?age > `30`]"
)
--> [ { age: 35 }, { age: 40 } ]

[new]: `sleep(time ms)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Returns NULL after waiting for the given time (in milliseconds).

Examples of (arguments, expected output) as they would appear in the documentation

  • sleep(1000) --> Returns NULL after 1 second
  • sleep(60000) --> Returns NULL after 60 seconds

[new]: `timestamp_from_unix_date_time(unix_date_time INT64, date_time_part STRING)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Interprets unix_date_time as the number of date_time_part since 1970-01-01 00:00:00 UTC. Truncates higher levels of precision by rounding down to the beginning of the date_time_part.

Examples of (arguments, expected output) as they would appear in the documentation

select bigfunctions.us.timestamp_from_unix_date_time(31, "YEAR") as year_from_unix
+--------------------------+
|  year_from_unix          |
+--------------------------+
|  2001-01-01 00:00:00 UTC |
+--------------------------+
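The description above can be sketched in Python as follows. The set of supported parts (YEAR, MONTH, DAY, HOUR, MINUTE, SECOND) is an assumption made for illustration; the issue itself only shows YEAR.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Parts with a fixed duration can be handled by simple multiplication.
FIXED_PARTS = {
    "DAY": timedelta(days=1),
    "HOUR": timedelta(hours=1),
    "MINUTE": timedelta(minutes=1),
    "SECOND": timedelta(seconds=1),
}


def timestamp_from_unix_date_time(unix_date_time, date_time_part):
    """Interpret unix_date_time as a count of date_time_part since the epoch."""
    part = date_time_part.upper()
    if part == "YEAR":
        return EPOCH.replace(year=1970 + unix_date_time)
    if part == "MONTH":
        # divmod keeps the month in 0..11, including for negative counts.
        years, months = divmod(unix_date_time, 12)
        return EPOCH.replace(year=1970 + years, month=1 + months)
    if part in FIXED_PARTS:
        return EPOCH + unix_date_time * FIXED_PARTS[part]
    raise ValueError(f"unsupported date_time_part: {date_time_part}")
```

With the arguments from the example, `timestamp_from_unix_date_time(31, "YEAR")` gives 2001-01-01 00:00:00 UTC.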

[new]: `dlp_detect(text)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Returns an array of the PII data detected in the text, each with a confidence score.

Examples of (arguments, expected output) as they would appear in the documentation

text="Write to [email protected] for any question"
--> ([email protected], email, confidence=0.999)

(inspired from https://github.com/GoogleCloudPlatform/bigquery-dlp-remote-function)

[new]: `deidentify(text, info_types)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Redacts sensitive data from text.

https://cloud.google.com/dlp/docs/redacting-sensitive-data#example_text_redaction

Examples of (arguments, expected output) as they would appear in the documentation

[new]: `find_shortest_path(point1, point2, geography)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Find shortest path between two points.

Inspired from https://github.com/francois-baptiste/bigquery-routing

Examples of (arguments, expected output) as they would appear in the documentation

see with Francois Baptiste?

[new] `json_items(json_string)`

Takes as input a json_string with flat (non-nested) key-value pairs and returns an array<struct<key string, value string>>.
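A small Python sketch of this idea, with each struct represented as a dict. How non-string values should be rendered is not stated in the issue; here they are assumed to be serialized back to their JSON text.

```python
import json


def json_items(json_string):
    """Return a list of {"key", "value"} items from a flat JSON object."""
    data = json.loads(json_string)
    return [
        # Strings are kept as-is; other scalars are rendered as JSON text
        # so that every value is a string (assumption, see above).
        {"key": key, "value": value if isinstance(value, str) else json.dumps(value)}
        for key, value in data.items()
    ]
```

For example, `json_items('{"a": "foo", "b": 1}')` yields one item for `a` with value `foo` and one for `b` with value `1` as a string.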

[new]: `drop_dataset(dataset)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Get inspired by the following script:


    execute immediate 'create or replace temp table tables as (select table_name as name from `' || dataset || '`.INFORMATION_SCHEMA.TABLES where table_type != "VIEW")';
    execute immediate 'create or replace temp table views as (select table_name as name from `' || dataset || '`.INFORMATION_SCHEMA.TABLES where table_type = "VIEW")';
    execute immediate 'create or replace temp table routines as (select routine_name, routine_type from `' || dataset || '`.INFORMATION_SCHEMA.ROUTINES)';

    for record in (select * from tables) do
        execute immediate 'drop table `' || dataset || '.' || record.name || '`';
    end for;

    for record in (select * from views) do
        execute immediate 'drop view `' || dataset || '.' || record.name || '`';
    end for;

    for record in (select * from routines) do
        execute immediate 'drop ' || record.routine_type || ' `' || dataset || '.' || record.routine_name || '`';
    end for;

execute immediate "drop schema " || dataset;

Examples of (arguments, expected output) as they would appear in the documentation

  • my_dataset

[bug]: `get_latest_partition_timestamp`: has an undeclared variable

Check the bug has not already been reported

Edit function_name and the short error description in title above

  • I wrote the correct function name and a short error description in the title above

What happened and what did you expect?

No matter what table I pass into this procedure, it shows me this error.
image

As you can see, the bigfunction_result variable is used without being declared.
image

[new]: `faker(what, localization, options)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Use https://faker.readthedocs.io/ to generate fake data.

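Code should look something like:

import json
from faker import Faker

def faker(what, localization, options):
    fake = Faker(localization)
    # options is a JSON object of keyword arguments; null means no arguments
    kwargs = json.loads(options) if options else {}
    # call the provider method named by `what`, e.g. fake.name()
    return getattr(fake, what)(**kwargs)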

Examples of (arguments, expected output) as they would appear in the documentation

  • what='name', localization='it_IT', options=null --> Elda Palumbo

[feature]: add `test` command

Check your idea has not already been reported

Edit the title above

  • I wrote clear short description of my idea in the title above

Tell us everything

The function should be deployed in a test dataset, and its examples should be tested against the deployed function.

[bug]: a statement isn't working with new procedure

Check the bug has not already been reported

Edit function_name and the short error description in title above

  • I wrote the correct function name and a short error description in the title above

What happened and what did you expect?

When I create a new procedure and deploy it into my project, I see that an insert statement is appended to each new procedure, but my procedure is unable to find the logs table.

image

[new]: `detect_anomaly(query_or_table_or_view, timestamp_column, metric_column)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Return temporal anomalies of metric column in table

Get inspired by https://github.com/cuebook/CueObserve

Examples of (arguments, expected output) as they would appear in the documentation

a public table with anomalies --> detected anomalies

[new]: `timestamp_to_unix_date_time(timestamp_expression TIMESTAMP, date_time_part STRING)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Returns the number of date_time_part since 1970-01-01 00:00:00 UTC. Truncates higher levels of precision by rounding down to the beginning of the date_time_part.

Examples of (arguments, expected output) as they would appear in the documentation

select bigfunctions.us.timestamp_to_unix_date_time(timestamp("2001-01-01"), "YEAR") as unix_year

+------------+
|  unix_year |
+------------+
|  31        |
+------------+
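A Python sketch of the description above. As an assumption for illustration, it supports YEAR, MONTH, DAY, HOUR, MINUTE and SECOND, and treats a timestamp without a time zone as UTC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

FIXED_PARTS = {
    "DAY": timedelta(days=1),
    "HOUR": timedelta(hours=1),
    "MINUTE": timedelta(minutes=1),
    "SECOND": timedelta(seconds=1),
}


def timestamp_to_unix_date_time(timestamp_expression, date_time_part):
    """Count whole date_time_part units between the epoch and the timestamp."""
    ts = timestamp_expression
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    ts = ts.astimezone(timezone.utc)
    part = date_time_part.upper()
    if part == "YEAR":
        return ts.year - 1970
    if part == "MONTH":
        return (ts.year - 1970) * 12 + ts.month - 1
    if part in FIXED_PARTS:
        # Floor division of timedeltas rounds down to the start of the part.
        return (ts - EPOCH) // FIXED_PARTS[part]
    raise ValueError(f"unsupported date_time_part: {date_time_part}")
```

With the arguments from the example, a timestamp of 2001-01-01 and the part "YEAR" give 31.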

[new]: `json_merge(json_string1, json_string2)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Inspired by https://github.com/djdhyun/bigquery-typescript-udf

Examples of (arguments, expected output) as they would appear in the documentation

json_merge('{"k1": "v1"}', '{"k2": "v2"}')
--> {"k1":"v1","k2":"v2"}
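The example above could be sketched in Python like this. It assumes a shallow merge in which keys from the second object win on conflict; the issue does not say how nested objects or conflicts should be handled.

```python
import json


def json_merge(json_string1, json_string2):
    """Shallow-merge two JSON objects; keys of the second override the first."""
    merged = {**json.loads(json_string1), **json.loads(json_string2)}
    # Compact separators reproduce the formatting of the expected output.
    return json.dumps(merged, separators=(",", ":"))
```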

[new]: `array_intersect(arr1, arr2)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Returns the intersection of two arrays.

Examples of (arguments, expected output) as they would appear in the documentation

array_intersect([1, 2, 3], [2, 6, 7]) -> [2]
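A short Python sketch of the intended result. Keeping the order of the first array and dropping duplicates are assumptions; the issue only shows the single example above.

```python
def array_intersect(arr1, arr2):
    """Return the elements of arr1 that also appear in arr2, without duplicates."""
    members = set(arr2)
    seen = set()
    result = []
    for element in arr1:
        if element in members and element not in seen:
            seen.add(element)
            result.append(element)
    return result
```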

[new]: `is_table_row_number_anomalous(fully_qualified_table_name)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Gets the table row count for each of the last 7 days (using the time travel feature of BigQuery) and checks that the current row count is not unusual compared to yesterday's row count and to the evolution over the last 7 days.

Examples of (arguments, expected output) as they would appear in the documentation

one_public_table_name --> false

[new]: `get_value(key_value_items, key)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Returns the value associated with key in the key_value_items array, or null if the key is not present.

Examples of (arguments, expected output) as they would appear in the documentation

  • get_value([struct("k" as key, 1 as value)], "k") --> 1
  • get_value([struct("k" as key, 1 as value)], "a") --> null
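The two examples above correspond to a simple lookup, sketched here in Python with each struct represented as a dict. Returning the first match when a key appears more than once is an assumption.

```python
def get_value(key_value_items, key):
    """Return the value of the first item whose key matches, else None."""
    for item in key_value_items:
        if item["key"] == key:
            return item["value"]
    return None
```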

[new] `send_google_chat_message(url, message)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Send `message` to a Google Chat channel using the incoming webhook `url`.

see documentation here: https://developers.google.com/chat/how-tos/webhooks
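As a rough sketch of what the function could do, the Python below posts a simple text message as JSON to the webhook URL using only the standard library. The helper name `build_google_chat_request` is hypothetical and exists only to keep the payload construction separate from the network call.

```python
import json
import urllib.request


def build_google_chat_request(url, message):
    """Build the POST request for a simple text message (hypothetical helper)."""
    body = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json; charset=UTF-8"},
        method="POST",
    )


def send_google_chat_message(url, message):
    """Send the message and return the HTTP status code of the response."""
    with urllib.request.urlopen(build_google_chat_request(url, message)) as response:
        return response.status
```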

Examples of (arguments, expected output) as they would appear in the documentation

[new]: `xml2json(xml)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Converts XML to a JSON string using a JS library.

Inspired from https://github.com/salrashid123/bq-udf-xml

Examples of (arguments, expected output) as they would appear in the documentation

<a><b>foo</b></a> --> {"a": {"b": "foo"}}
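The issue proposes a JS library; the Python sketch below only illustrates the mapping shown in the example. As assumptions, attributes are ignored, repeated child tags are collected into a list, and text mixed with child elements is dropped.

```python
import json
import xml.etree.ElementTree as ET


def xml2json(xml):
    """Convert an XML string to a JSON string (sketch, assumptions in the text above)."""

    def to_value(element):
        children = list(element)
        if not children:
            return element.text
        result = {}
        for child in children:
            value = to_value(child)
            if child.tag in result:
                # A repeated tag turns the entry into a list of values.
                if not isinstance(result[child.tag], list):
                    result[child.tag] = [result[child.tag]]
                result[child.tag].append(value)
            else:
                result[child.tag] = value
        return result

    root = ET.fromstring(xml)
    return json.dumps({root.tag: to_value(root)})
```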

[new]: `quantize_into_fixed_width_bins(value, min_value, max_value, nb_bins)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Quantize value into a bin.

Examples of (arguments, expected output) as they would appear in the documentation

quantize_into_fixed_width_bins(-4, 0, 100, 10)
--> ]-∞, 0[

quantize_into_fixed_width_bins(5, 0, 100, 10)
--> [0, 10[

quantize_into_fixed_width_bins(97, 0, 100, 10)
--> [90, 100]

quantize_into_fixed_width_bins(123, 0, 100, 10)
--> [100, + ∞[
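The four examples above can be reproduced with the Python sketch below. It assumes bins of equal width `(max_value - min_value) / nb_bins`, a last bin that is closed on the right, and the same labels as in the examples; very small or non-round widths may show floating-point artifacts.

```python
def quantize_into_fixed_width_bins(value, min_value, max_value, nb_bins):
    """Return the label of the fixed-width bin that contains value."""
    if value < min_value:
        return f"]-∞, {min_value:g}["
    if value > max_value:
        return f"[{max_value:g}, + ∞["
    width = (max_value - min_value) / nb_bins
    # max_value itself falls into the last bin, which is closed on the right.
    index = min(int((value - min_value) // width), nb_bins - 1)
    lower = min_value + index * width
    upper = lower + width
    closing = "]" if index == nb_bins - 1 else "["
    return f"[{lower:g}, {upper:g}{closing}"
```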

[new]: `get_ga4_param(param_name, event_params)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Objective: simplify the following:
https://developers.google.com/analytics/bigquery/basic-queries#values_for_a_specific_event_name

Alejandro Zielinsky proposed the following.

From my point of view, we could return the value directly and not a struct with type and value.
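Returning the value directly could look like the Python sketch below. It assumes each entry of event_params has a `key` and a `value` with the fields `string_value`, `int_value`, `float_value` and `double_value`, as in the GA4 BigQuery export, and returns the first of those that is not null.

```python
VALUE_FIELDS = ("string_value", "int_value", "float_value", "double_value")


def get_ga4_param(param_name, event_params):
    """Return the value of the named event parameter, or None if absent."""
    for param in event_params:
        if param["key"] == param_name:
            for field in VALUE_FIELDS:
                value = param["value"].get(field)
                if value is not None:
                    return value
    return None
```

In SQL the return type would have to be fixed (for example by casting everything to string), which this sketch leaves open.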

Examples of (arguments, expected output) as they would appear in the documentation

[feature]: add command `lint`

Check your idea has not already been reported

Edit the title above

  • I wrote clear short description of my idea in the title above

Tell us everything

Add a `lint` command that lowercases the argument types, function name, etc.,
and lints the code using sqlfluff for SQL and black for Python.

[new]: `array_concat(array1, array2)`

Check the idea has not already been suggested

Edit the title above with self-explanatory function name and argument names

  • The function name and the argument names I entered in the title above seems self explanatory to me.

BigFunction Description as it would appear in the documentation

Examples of (arguments, expected output) as they would appear in the documentation

array_concat([1], [2, 3]) --> [1, 2, 3]
