amazon-redshift-developer-guide's Introduction

amazon-redshift-developer-guide's People

Contributors

adam-tokarski, angchow-aws, armaseg, calleo, danpjac, dpnsh, franbulax, hyandell, jimgaws, jimmyboyle, joshbean, keithhc2, marcfowler, mjalkio, olegfridman, sal-aws, sasanahmadi, snolan-uturn, snuyanzin, szuckerman, tommagnus, zacharyrsmith, zxwvrblv

amazon-redshift-developer-guide's Issues

Small Spelling issue

The following sentence here:

The following example loads data from as Amazon EMR cluster

Did we mean to say:

The following example loads data from an Amazon EMR cluster

Thank you

queues and concurrency level

Hey, is the following statement true? I was told that the total concurrency for WLM, regardless of the number of queues, is 50. Based on this statement, can WLM support 400 (= 8 * 50) concurrent queries?

"You can define up to eight queues. Each queue can be configured with a maximum concurrency level of 50"
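The arithmetic behind the question can be made explicit. The sketch below is only an illustration in Python: the queue count and per-queue limit come from the quoted sentence, while the cluster-wide cap of 50 is the figure the poster was told, treated here as an assumption rather than confirmed behavior. The function and constant names are hypothetical.

```python
from typing import List

# Figures taken from the quoted documentation sentence.
MAX_QUEUES = 8
MAX_CONCURRENCY_PER_QUEUE = 50

# Assumption: the shared cap the poster was told about.
ASSUMED_TOTAL_CAP = 50


def naive_total_slots(queues: int = MAX_QUEUES,
                      per_queue: int = MAX_CONCURRENCY_PER_QUEUE) -> int:
    """Total slots if every queue could reach its maximum independently."""
    return queues * per_queue


def capped_total_slots(queue_levels: List[int],
                       total_cap: int = ASSUMED_TOTAL_CAP) -> int:
    """Total slots when the sum across all queues is limited by a shared cap."""
    return min(sum(queue_levels), total_cap)


if __name__ == "__main__":
    print(naive_total_slots())           # reading the sentence as 8 * 50
    print(capped_total_slots([50] * 8))  # limited by the assumed shared cap
```

If the shared cap applies, the quoted sentence describes the limit for any single queue, not a value every queue can reach at the same time.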

Using COPY to Load Data and Permissions

From reading the documentation, I see the following:

The role must have, at a minimum, the permissions listed in IAM Permissions for COPY, UNLOAD, and CREATE LIBRARY.

It would be nice to have a note or pointer on each of these pages mentioning that, if the data is encrypted, the KMS key policy must also grant access and the IAM user or role needs KMS permissions as well.

https://docs.aws.amazon.com/redshift/latest/dg/loading-data-access-permissions.html
https://docs.aws.amazon.com/redshift/latest/dg/c_loading-encrypted-files.html
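As an illustration of the kind of note being requested, the sketch below builds an identity-based IAM policy that grants a role access to one KMS key. The key ARN is a placeholder, the helper name is hypothetical, and the action list (`kms:Decrypt`, `kms:GenerateDataKey`) is an assumption: the actions actually required depend on whether the role only reads encrypted data or also writes it. The key policy on the KMS key itself would also need to allow the role.

```python
import json


def kms_policy_for_key(key_arn: str) -> dict:
    """Build an identity-based policy allowing use of a single KMS key.

    The action list is an assumption for illustration; adjust it to what
    the COPY or UNLOAD workload actually requires.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
                "Resource": key_arn,
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN, not a real key.
    arn = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"
    print(json.dumps(kms_policy_for_key(arn), indent=2))
```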

Vacuum Delete not working even when background vacuum and manual vacuum commands are executed

We have found that the vacuum delete operation, whether run as part of a full vacuum or through the vacuum command, does not work when some orphan transaction IDs are still active in the cluster.
If these transactions were already running before the vacuum was executed, the vacuum skips the delete operation in order to maintain the integrity of the data that the active transactions might be using.
Using the SQL below, we can find the PIDs and transactions that might still be active:

select *,
       datediff(s, txn_start, getdate()) / 86400 || ' days '
       || datediff(s, txn_start, getdate()) % 86400 / 3600 || ' hrs '
       || datediff(s, txn_start, getdate()) % 3600 / 60 || ' mins '
       || datediff(s, txn_start, getdate()) % 60 || ' secs' as txn_age
from svv_transactions
where lockable_object_type = 'transactionid'
  and pid <> pg_backend_pid()
order by 3;

This is important because if we run vacuum on a table after a delete operation while other transactions are already running in the cluster, the vacuum delete phase will not execute and the rows marked for deletion are not removed. As a result, the next phase of the vacuum, the vacuum sort, will run longer, since it re-sorts all the rows, including those marked for deletion. This delays the ETLs.
One way to avoid this is to use truncate and load. If that is not possible, make sure nothing is running in parallel when the vacuum is started.
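The duration arithmetic in the query above (whole days, then the remaining hours, minutes, and seconds) can be checked outside the database. The following is a minimal Python sketch of the same integer arithmetic; the function name is hypothetical and the input is assumed to be a non-negative number of seconds.

```python
def format_elapsed(total_seconds: int) -> str:
    """Mirror the query's expression: days, hours, minutes, seconds."""
    days = total_seconds // 86400             # whole days
    hours = (total_seconds % 86400) // 3600   # hours left after whole days
    mins = (total_seconds % 3600) // 60       # minutes left after whole hours
    secs = total_seconds % 60                 # seconds left after whole minutes
    return f"{days} days {hours} hrs {mins} mins {secs} secs"


if __name__ == "__main__":
    # 1 day + 1 hour + 1 minute + 1 second = 90061 seconds
    print(format_elapsed(90061))
```

A transaction showing an age of several days in this format is a likely candidate for the kind of orphan transaction described above.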

Switch data type name and alias for numeric and decimal

This page: https://docs.aws.amazon.com/redshift/latest/dg/c_Supported_data_types.html (GitHub: https://github.com/awsdocs/amazon-redshift-developer-guide/blob/master/doc_source/c_Supported_data_types.md ) shows the DECIMAL data type having alias NUMERIC. However, when you create decimal / numeric columns in the DB, the canonical name returned by the DB is "NUMERIC". Maybe switch the name and alias for this type in the docs?

dev=# create table brad_loves_numbers (dec DECIMAL(8,3), num DECIMAL(10,5));
CREATE TABLE

dev=# \d brad_loves_numbers
 Table "public.brad_loves_numbers"
 Column |     Type      | Modifiers
--------+---------------+-----------
 dec    | numeric(8,3)  |
 num    | numeric(10,5) |

dev=# select column_name, data_type from information_schema.columns where table_name LIKE 'brad_loves_numbers';
 column_name | data_type
-------------+-----------
 num         | numeric
 dec         | numeric
(2 rows)

Unsupported Type

I'm getting an error while creating a table with the SUPER data type.

[Amazon](500310) Invalid operation: Column "table_name.column_name" has unsupported type "super".

I'm running the same script I previously used to create my database, but as of cluster version 1.0.23412 the type is no longer reported as supported when I try to use it.

Note about DECIMAL type limitation not clear for NUMERIC

Hi,
The Note included in this section states a limitation for the DECIMAL data type, but it is unclear that the limitation also applies to the NUMERIC data type.
Even though the docs indicate that both types are equivalent, a reader could conclude that the limitation can be avoided by using the NUMERIC data type.
Regards

Is_integer function documentation needs update

According to the is_integer function's documentation, the function accepts a column or a SUPER value. We ran into an issue when we passed it a character-type column. Here is the code we ran:

CREATE TEMP TABLE t(s char(100));
INSERT INTO t VALUES ('5');
SELECT s, is_integer(s) FROM t;

description for stl_query.starttime is wrong

The description is "Time in UTC that the query started executing", but as the query below shows, the time is actually earlier than when the query is put into a service class, let alone when it started executing:

select stl_query.starttime, stl_wlm_query.*
from stl_query
join stl_wlm_query
    using(query)
limit 1

Please change to "Time in UTC that the query started processing", "Time in UTC that the query was accepted", or be vague like other Redshift doc pages: "Time in UTC that the query started".

Similarly for endtime.

SHOW PROCEDURE not displaying procedure definition

I can create (and replace) a stored procedure correctly, but the definition can't be retrieved by using SHOW PROCEDURE sp_name or SHOW PROCEDURE sp_name(arg_name arg_type,...)

Only the following is displayed:
0 rows affected
SHOW executed successfully

I can confirm that it exists with the following query, but how can I see the definition?

SELECT *
FROM PG_PROC_INFO
WHERE proowner<>1

`ALTER FUNCTION` is missing

The following SQL seems to be working, but it is undocumented.

ALTER FUNCTION function_name OWNER TO new_owner
