snowflake-labs / snowpark-python-demos

This repository provides various demos/examples of using Snowpark for Python.

License: Apache License 2.0

Jupyter Notebook 95.59% Python 4.39% PLpgSQL 0.01%
python snowpark dataengineering datascience machine-learning

snowpark-python-demos's Issues

batch_predict_roi UDF input type error

Hello All,
In the Snowpark_For_Python.ipynb demo, calling the batch_predict_roi UDF fails with the error below. I assume this is because the function
expects a pandas DataFrame:
def batch_predict_roi(budget_allocations_df: PandasDataFrame[int, int, int, int]) -> PandasSeries[float]:
but it receives an array instead. Any help in clearing this up would be appreciated.

https://github.com/Snowflake-Labs/snowpark-python-demos/blob/main/Advertising-Spend-ROI-Prediction/Snowpark_For_Python.ipynb

Failed to execute query [queryID: 01a80045-0004-3087-002b-9e87000e507e]
SELECT "SEARCH_ENGINE", "SOCIAL_MEDIA", "VIDEO", "EMAIL", batch_predict_roi(array_construct("SEARCH_ENGINE", "SOCIAL_MEDIA", "VIDEO", "EMAIL")) AS "PREDICTED_ROI"
FROM ( SELECT * FROM ( VALUES (250000 :: INT, 250000 :: INT, 200000 :: INT, 450000 :: INT), (500000 :: INT, 500000 :: INT, 500000 :: INT, 500000 :: INT), (8500 :: INT, 9500 :: INT, 2000 :: INT, 500 :: INT) AS SNOWPARK_TEMP_TABLE_H7ZD6LZPZ4("SEARCH_ENGINE", "SOCIAL_MEDIA", "VIDEO", "EMAIL"))) LIMIT 10
001044 (42P13): SQL compilation error: error line 1 at position 58
Invalid argument types for function 'BATCH_PREDICT_ROI': (ARRAY)

Retail churn analytics data source is incorrect

The data source specified does not contain the fields referenced in the code:

CREATE OR REPLACE EXTERNAL TABLE SRC_CUSTOMER
(CUSTOMER_ID VARCHAR(40) as (value:c1::varchar),
CREATED_DT DATE as (value:c2::date),
CITY VARCHAR(40) as (value:c3::varchar),
STATE VARCHAR(2) as (value:c4::varchar),
FAV_DELIVERY_DAY VARCHAR(40) as (value:c5::varchar),
REFILL NUMBER(38,0) as (value:c6::integer),
DOOR_DELIVERY NUMBER(38,0) as (value:c7::integer),
PAPERLESS NUMBER(38,0) as (value:c8::integer),
CUSTOMER_NAME VARCHAR(40) as (value:c9::varchar),
RETAINED NUMBER(38,0) as (value:c10::integer)
)

These datasets were generated for this demo from the Kaggle dataset below.

Reference: https://www.kaggle.com/uttamp/store-data
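One way to check a source file against the table definition above: an external table over CSV files exposes each row's fields by position, as c1, c2, and so on, so the file needs at least ten fields in the order the DDL lists them. The sketch below mimics that positional mapping with the standard csv module. The function names and the sample text are hypothetical, the sample is not taken from the real data set, and the file is assumed to have a header row.

```python
import csv
import io

# Column order taken from the SRC_CUSTOMER definition above (c1..c10).
EXPECTED_COLUMNS = [
    "CUSTOMER_ID", "CREATED_DT", "CITY", "STATE", "FAV_DELIVERY_DAY",
    "REFILL", "DOOR_DELIVERY", "PAPERLESS", "CUSTOMER_NAME", "RETAINED",
]


def map_row_to_positions(row):
    """Mimic the positional keys (c1, c2, ...) used for CSV fields."""
    return {f"c{i}": field for i, field in enumerate(row, start=1)}


def missing_columns(csv_text):
    """Return the expected columns that have no field in the first data row."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # assumption: the file starts with a header row
    positions = map_row_to_positions(next(reader))
    return [
        name
        for i, name in enumerate(EXPECTED_COLUMNS, start=1)
        if f"c{i}" not in positions
    ]


# Hypothetical sample with only four fields per row.
sample = "id,city,state,retained\nC001,Austin,TX,1\n"
print(missing_columns(sample))
```

A non-empty result means the file has fewer fields than the DDL expects. The check is purely positional, so a file with ten fields in a different order would still pass it.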

Argument mismatch error

I am trying to wrap the model training in a stored procedure in the customer spend prediction demo.
However, I get the error below and have been unable to debug it.
[image]

Below is the code for the model training definition and SP registration.
[image]

Snowflake BUILD 2022: Sentiment Analysis Demo notebook is incomplete

@sfc-gh-scoombes

After creating the various stages and uploading 2 files to them, the notebook goes straight to querying the table TRAINING_DATA, but this table was never created. Or am I missing something?

# create the stage for python and model data
session.sql('create stage if not exists scratch.raw_data').collect()
session.sql('create stage if not exists scratch.model_data').collect()
session.sql('create stage if not exists scratch.python_load').collect()

# create the directory stage for the data
session.sql('create stage if not exists scratch.raw_data_stage directory = (enable = true)').collect()

# upload the unstructured file and stop words to the stages
session.file.put('reviews__0_0_0.dat','@scratch.raw_data_stage',auto_compress=False)
session.file.put('en_core_web_sm.zip','@scratch.model_data')

# refresh the stage
session.sql('alter stage scratch.raw_data_stage refresh').collect()

session.table("TRAINING_DATA").show(30)

Thanks!

No Loading of Data in External table

For the retail-churn-analytics demo, the data is not loaded from the S3 bucket into the external tables created.

CREATE OR REPLACE EXTERNAL TABLE SRC_CUSTOMER
(CUSTOMER_ID VARCHAR(40) as (value:c1::varchar),
CREATED_DT DATE as (value:c2::date),
CITY VARCHAR(40) as (value:c3::varchar),
STATE VARCHAR(2) as (value:c4::varchar),
FAV_DELIVERY_DAY VARCHAR(40) as (value:c5::varchar),
REFILL NUMBER(38,0) as (value:c6::integer),
DOOR_DELIVERY NUMBER(38,0) as (value:c7::integer),
PAPERLESS NUMBER(38,0) as (value:c8::integer),
CUSTOMER_NAME VARCHAR(40) as (value:c9::varchar),
RETAINED NUMBER(38,0) as (value:c10::integer)
)
LOCATION = @churn_source_data/customer/
REFRESH_ON_CREATE = TRUE
AUTO_REFRESH = TRUE
FILE_FORMAT = ( TYPE = CSV SKIP_HEADER=1);

This is the exact syntax I used.
@iamontheinet, please advise.

Readme setup instructions `conda`

For https://github.com/Snowflake-Labs/snowpark-python-demos/tree/main/Advertising-Spend-ROI-Prediction
The setup instructions say to use pip install conda and then run conda ..., but this produces the following error:

conda
ERROR: The install method you used for conda--probably either `pip install conda`
or `easy_install conda`--is not compatible with using conda as an application.
If your intention is to install conda as a standalone application, currently
supported install methods include the Anaconda installer and the miniconda
installer.  You can download the miniconda installer from
https://conda.io/miniconda.html.

The instructions should therefore be updated to remove pip install conda and to install conda via Miniconda instead.

UDF Error with Credit Card Fraud Detection

I am working through the credit card fraud detection Snowpark exercises. Everything looks good, except that I get an error when I run a query with the new UDF.

Failed Query In Snowsight:
SELECT TRANSACTION_ID, TX_DATETIME, CUSTOMER_ID, TERMINAL_ID, TX_AMOUNT ,detect_fraud_batch_udf(TX_AMOUNT,TX_DURING_WEEKEND, TX_DURING_NIGHT, CUST_CNT_TX_1, CUST_AVG_AMOUNT_1, CUST_CNT_TX_7, CUST_AVG_AMOUNT_7, CUST_CNT_TX_30,CUST_AVG_AMOUNT_30, NB_TX_WINDOW_1, TERM_RISK_1, NB_TX_WINDOW_7,TERM_RISK_7, NB_TX_WINDOW_30,TERM_RISK_30) AS FRAUD_PROB
FROM CUSTOMER_TRX_FRAUD_FEATURES
WHERE TX_DATETIME > '2019-07-15 00:00:00' LIMIT 10;

100357 (P0000): Python Interpreter Error:
Traceback (most recent call last):
File "_udf_code.py", line 32, in compute
File "_udf_code.py", line 21, in wrapper
File "/var/folders/ck/ll2bz1_s3ng7w67zf6mdvqh40000gn/T/ipykernel_77660/546210058.py", line 17, in detect_fraud_batch
File "/Users/hayan/opt/anaconda3/envs/snowpark_070/lib/python3.8/site-packages/cachetools/__init__.py", line 641, in wrapper
File "/var/folders/ck/ll2bz1_s3ng7w67zf6mdvqh40000gn/T/ipykernel_77660/546210058.py", line 10, in read_file
File "/usr/lib/python_udf/5dd4a97c20bf6b6243e739c66c4fbfa80ab53172655f7f15cf1c55d0f462ae66/lib/python3.8/site-packages/joblib/numpy_pickle.py", line 577, in load
obj = _unpickle(fobj)
File "/usr/lib/python_udf/5dd4a97c20bf6b6243e739c66c4fbfa80ab53172655f7f15cf1c55d0f462ae66/lib/python3.8/site-packages/joblib/numpy_pickle.py", line 506, in _unpickle
obj = unpickler.load()
File "/usr/lib/python_udf/5dd4a97c20bf6b6243e739c66c4fbfa80ab53172655f7f15cf1c55d0f462ae66/lib/python3.8/pickle.py", line 1212, in load
dispatch[key[0]](self)
KeyError: 255
in function DETECT_FRAUD_BATCH_UDF with handler compute
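For context on the last frames of the traceback: it ends in pickle.py, in the pure-Python unpickler's opcode dispatch. KeyError: 255 there means the byte read at that point (0xFF) is not a valid pickle opcode, so the bytes being loaded are not a readable pickle stream. That alone does not identify why. A minimal standard-library sketch that reproduces the same exception follows. The helper name first_bad_opcode is hypothetical, and pickle._Unpickler is a private CPython class, used here only for illustration.

```python
import io
import pickle


def first_bad_opcode(data: bytes):
    """Return the opcode byte the pure-Python unpickler rejects, or None."""
    try:
        # pickle._Unpickler is the pure-Python implementation; it looks up
        # each opcode byte in a dispatch dict and raises KeyError on a miss.
        pickle._Unpickler(io.BytesIO(data)).load()
    except KeyError as exc:
        return exc.args[0]
    return None


# A valid pickle loads cleanly.
print(first_bad_opcode(pickle.dumps({"a": 1})))

# A stream whose first byte is 0xFF fails in the same way as the traceback.
print(first_bad_opcode(b"\xff"))
```

The same bytes passed to pickle.loads would normally raise UnpicklingError instead, because that path uses the C implementation.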
