
capitalone / giraffez

Stars: 110 · Forks: 21 · Open issues: 35 · Size: 7.49 MB

User-friendly Teradata client for Python

Home Page: https://capitalone.github.io/giraffez

License: Apache License 2.0

Makefile 0.16% Shell 1.33% Python 52.12% C 39.18% C++ 7.21%
Topics: teradata, database, kawaii, python, command-line-tool

giraffez's People

Contributors

chrisrx, del82, evanfwelch, jar-b, krishanbhasin, ktravis, mend-bolt-for-github[bot], ospo-capitalone, theandrew168, tmbjmu


giraffez's Issues

option to retain natural capitalization

Is there a way to maintain column-name capitalization when querying and storing the results of a SELECT statement? This becomes an issue because Python is case sensitive, so the response from the query ends up in a DataFrame where a column expected to be fiscalYear is instead stored as fiscalyear.

Would there be a way to add this as an overall option to set for the package or for specific modules, whichever is easier?
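Until such an option exists, one workaround is to rename the lowercased keys after the fact. A minimal sketch, assuming you know the casing you expect (the `restore_case` helper and the example column names are hypothetical, not part of giraffez):

```python
# Hypothetical workaround: giraffez lowercases column names, so map the
# lowercased keys back to the capitalization you expect after querying.
def restore_case(rows, expected_names):
    """Rename lowercased dict keys back to their intended capitalization."""
    mapping = {name.lower(): name for name in expected_names}
    return [{mapping.get(k, k): v for k, v in row.items()} for row in rows]

rows = [{"fiscalyear": 2020, "revenue": 1.5}]
fixed = restore_case(rows, ["fiscalYear", "revenue"])
print(fixed)  # [{'fiscalYear': 2020, 'revenue': 1.5}]
```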

BulkExport.to_str() writes when delimiter is present in table

If I have a table with a pipe (|) in a character field and try to export and write to a string, there's no checking for whether the delimiter exists in any of the fields:

import giraffez

sql = """SELECT * FROM 
(
SELECT
CAST('first|field' AS VARCHAR(20)) AS first_char,
CAST('second field' AS VARCHAR(20)) AS second_char
) AS tbl
"""

with giraffez.BulkExport(sql) as export:
    r = list(export.to_str(delimiter='|'))

print(r)
['first|field|second field']

I could export with a different delimiter, or roll my own to_str that takes a dict from to_dict/to_list, but I'm wondering whether functionality should be pushed up into to_str to error out, warn, or strip the field when the delimiter exists in a column value?
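For illustration, the kind of guard being proposed could look like this. A sketch only; `join_row` is a hypothetical stand-in for the joining step inside to_str:

```python
# Sketch: refuse to join a row when the chosen delimiter appears inside a
# field value, instead of silently producing an ambiguous line.
def join_row(fields, delimiter="|"):
    for f in fields:
        if delimiter in f:
            raise ValueError(
                "delimiter {!r} found in field {!r}".format(delimiter, f))
    return delimiter.join(fields)

print(join_row(["second field", "ok"]))  # second field|ok
# join_row(["first|field", "ok"]) would raise ValueError
```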

Add support for BTEQ Syntax

Hi,

This is a good project; in my current job I'm working with Teradata and Python. It would be great if we could use `giraffez bteq interactive` and `giraffez bteq <bteq file> <querybands>`.

Parameterized Queries

Is there a way to parameterize queries with giraffez? I want to avoid any possible SQL injection when using the library.
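The giraffez docs don't appear to cover parameter binding, so a stopgap is careful manual quoting. This sketch only escapes simple literals and is no substitute for true driver-level parameterization; `sql_literal` is a hypothetical helper:

```python
# Sketch: build a statement with conservatively quoted literals. Handles only
# str/int/float/None; anything fancier needs real parameter binding.
def sql_literal(value):
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    # double embedded single quotes per standard SQL string-literal rules
    return "'{}'".format(str(value).replace("'", "''"))

sql = "SELECT * FROM t WHERE name = {}".format(sql_literal("O'Brien"))
print(sql)  # SELECT * FROM t WHERE name = 'O''Brien'
```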

Expose or document more options for MLoad and Export

Would it be possible to expose or document the add_attribute method of MLoad and Export objects?

For context, I'm working with a Teradata instance that severely limits the number of active sessions. After some research I found this was possible by using the add_attribute method of the mload object belonging to the BulkLoad operation, passing the appropriate constants:

with giraffez.BulkLoad(table=table_name, **giraffez_opts) as load:
    load.mload.add_attribute(giraffez.constants.TD_MIN_SESSIONS, 1)
    load.mload.add_attribute(giraffez.constants.TD_MAX_SESSIONS, 2)
    [...]

Failure to load libcliv2.dylib on Mac with miniconda

Attempting to run import giraffez produces the following error

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
~/miniconda3/envs/saul/lib/python3.6/site-packages/giraffez/__init__.py in <module>()
     38 try:
---> 39     from . import _teradata
     40     from . import _teradatapt

ImportError: dlopen(/Users/****/miniconda3/envs/saul/lib/python3.6/site-packages/giraffez/_teradata.cpython-36m-darwin.so, 2): Library not loaded: libcliv2.dylib
  Referenced from: /Users/****/miniconda3/envs/saul/lib/python3.6/site-packages/giraffez/_teradata.cpython-36m-darwin.so
  Reason: image not found

During handling of the above exception, another exception occurred:

~/github/rock/rock/upload_conn.py in <module>()
      8 camelCase indicates function
      9 """
---> 10 import giraffez
     11 
     12 class upload_connector:

~/miniconda3/envs/saul/lib/python3.6/site-packages/giraffez/__init__.py in <module>()
     52 documentation:
     53     http://www.capitalone.io/giraffez/intro.html#environment.
---> 54 """.format(error))
     55 
     56 from ._teradata import TeradataError

Exception: dlopen(/Users/****/miniconda3/envs/saul/lib/python3.6/site-packages/giraffez/_teradata.cpython-36m-darwin.so, 2): Library not loaded: libcliv2.dylib
  Referenced from: /Users/****/miniconda3/envs/saul/lib/python3.6/site-packages/giraffez/_teradata.cpython-36m-darwin.so
  Reason: image not found.

This indicates that either the giraffez C extensions did not compile
correctly, or more likely, there is an issue with the environment or
installation of the Teradata dependencies. Both the Teradata Call-Level
Interface Version 2 and Teradata Parallel Transporter API require
several environment variables to be set to find the shared library
files and error message catalog.

For more information, refer to this section in the giraffez
documentation:
    http://www.capitalone.io/giraffez/intro.html#environment.

Using Teradata 15.10 with the following environment variables set in my .bash_profile:

export TERADATA_HOME=/Library/Application\ Support/teradata/client/15.10
export LD_LIBRARY_PATH=$TERADATA_HOME/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$TERADATA_HOME/tbuild/lib:$LD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$LD_LIBRARY_PATH
export NLSPATH=$TERADATA_HOME/msg/%N:$NLSPATH

Looking at the lib folder inside the teradata folder I can see libcliv2.dylib, so I'm having a hard time interpreting the above error.

Version Info:

giraffez: 2.0.17
OS: macOS Sierra Version 10.12.6
Python Version: 3.6.1

How do I set encoding to utf8?

I have a DataFrame with columns holding non-ASCII characters. I would like to change the session encoding to something similar to udaExec.connect(...., charset='utf8') in teradata and cnxn.setencoding(encoding='utf-8') in pyodbc.

I am able to load the data, but the non-ASCII characters come through garbled with:

with giraffez.BulkLoad("BDA.Table", print_error_table=True) as load:
    load.columns = df.columns.tolist()
    for row in df.values.tolist():
        load.put(row)

How do I set the encoding to UTF-8? Is it done via giraffez.encoders? Is there a how-to code example?
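One thing worth trying in the meantime: pre-encode string values to UTF-8 bytes before calling put(). This is purely an assumption about what BulkLoad will accept; the `encode_row` helper below is hypothetical:

```python
# Assumption-laden sketch: if the session encoding can't be changed, encode
# each str value to UTF-8 bytes before handing the row to load.put().
def encode_row(row, encoding="utf-8"):
    return [v.encode(encoding) if isinstance(v, str) else v for v in row]

print(encode_row(["café", 42]))  # [b'caf\xc3\xa9', 42]
```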

Duplicate column titles/names will only return one of the columns

This is a corner case and I'm open to a "won't fix", but I came across this today and didn't expect the behaviour: here are two examples where selecting columns that render out as having the same name will silently squash one of them in the output.

Example 1 - Titles:

import giraffez

table_name = 'schema.tester'

drop_sql = f"DROP TABLE {table_name}"
create_sql = f"""CREATE MULTISET TABLE {table_name} (
    field_1 DECIMAL(18,0) TITLE 'Number',
    field_2 DECIMAL(12,0) TITLE 'Number')"""
insert_sql = f"INSERT INTO {table_name} SELECT 1 AS field_1, 2 AS field_2"
read_sql = f"SELECT field_1, field_2 FROM {table_name}"

with giraffez.Cmd() as cmd:
    if cmd.exists(table_name):
        cmd.execute(drop_sql)
    cmd.execute(create_sql)
    cmd.execute(insert_sql)
    res = list(cmd.execute(read_sql).to_dict())
    print(res)
    cmd.execute(drop_sql)

This code results in [{'number': 2.0}], because Column.name uses self.title over self.original_name (here). Note that if you BulkExport the data, export.to_str() still works fine, but to_dict() will overwrite (because it's a dict).

Example 2 - duplicate column names in multiple tables:

create_a = "CREATE MULTISET VOLATILE TABLE tbl_a AS (SELECT 1 AS fld_a) WITH DATA ON COMMIT PRESERVE ROWS"
create_b = "CREATE MULTISET VOLATILE TABLE tbl_b AS (SELECT 1 AS fld_a) WITH DATA ON COMMIT PRESERVE ROWS"
read_sql = "SELECT tbl_a.fld_a, tbl_b.fld_a FROM tbl_a INNER JOIN tbl_b ON tbl_a.fld_a = tbl_b.fld_a"

with giraffez.Cmd() as cmd:
    cmd.execute(create_a)
    cmd.execute(create_b)
    res = cmd.execute(read_sql)
    for row in res:
        print(row)

This code results in {'fld_a': 1}.

If you select the same field name in one query/table (e.g. SELECT 1 AS field_1, 2 AS field_1), Teradata will error out with 3863: Duplicate definition of field.

I'm wondering if giraffez should raise an exception when there are duplicates in the list of column names? Or maybe just when calling to_dict() or to_list() on Cmd or Export, which produce unexpected results (especially when exporting: the fields will still be in export.columns.names, but to_dict() items will have fewer records). Both of these could be fixed by the user providing an alias, but as a user I'd like giraffez to be "noisier" when there is going to be an issue.
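The "noisier" behaviour suggested above could be as simple as a duplicate check on the column-name list before building dicts. A sketch; `check_duplicate_columns` is hypothetical, not giraffez API:

```python
# Sketch: detect duplicates in the column-name list up front, instead of
# silently squashing dict keys in to_dict()/to_list().
from collections import Counter

def check_duplicate_columns(names):
    dupes = [n for n, c in Counter(names).items() if c > 1]
    if dupes:
        raise ValueError("duplicate column names: {}".format(dupes))
    return names

check_duplicate_columns(["field_1", "field_2"])  # fine
# check_duplicate_columns(["number", "number"]) would raise ValueError
```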

[SLES 11 SP4 (x86_64)] not recognizing teradata libs

Hi

I am still facing an exception saying libpxicu.so: cannot open shared object file [...], even though the required libs are present and the environment variables have been set up as follows:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/opt/teradata/client/15.00/lib64/libcliv2.so"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/opt/teradata/client/15.00/tbuild/lib64/libtelapi.so"
export TERADATA_HOME="/opt/teradata/client"


Where should I look to solve this issue? Any hints?

Peter

Add rowcount to Cursors returned by TeradataCmd.execute

Hello,

First off, thanks for the great tool! My company has a definite interest in utilizing giraffez for our interactions with Teradata. It has been much more pleasant to work with than Teradata's native tooling.

One feature that we desire (and are willing to write a PR for) is to be able to know how many rows were affected by a given operation. We would use this for logging / analytics purposes. Ideally, this would be accomplished by reading a rowcount attribute of the Cursor object(s) returned by the TeradataCmd.execute function. The DB API 2.0 mentions such a feature though I know giraffez isn't designed to be PEP-249 compliant.

In browsing Teradata's CLIv2 documentation, I found that ActivityCount gets returned as a field in the SUCCESS parcel. With some basic testing, I found that this parcel does come through the results processed in the teradata_handle_record function. As for how to propagate this value up to a Cursor object, I'm still looking for the best way.

After that bit of background info, here are the questions I hope to explore with this issue:

  • Does this capability already exist in giraffez? I've checked both the docs and the source but could have missed something.
  • Are you (as the giraffez maintainers) open to this sort of feature?
  • If so, do you have any recommendations on how to implement it? I wonder if teradata_handle_record would even be a good location given the "purpose" of that function.

Thanks again for your time and the amazing tool!

Incorrect decimal(x,y) behaviour with large x, negative numbers

I think this may be related to a previously closed internal issue (144). This is on 1.1.7.

from decimal import Decimal
import giraffez

test_cases = [
    {'n': -0.08, 'd1': 32, 'd2': 2},
    {'n': -0.08, 'd1': 30, 'd2': 2},
    {'n': -0.08, 'd1': 32, 'd2': 4},
    {'n': -0.1, 'd1': 28, 'd2': 1},
    {'n': -0.1, 'd1': 20, 'd2': 0},
    {'n': -999, 'd1': 32, 'd2': 2},
    {'n': -999, 'd1': 30, 'd2': 2},
    {'n': -99.9, 'd1': 30, 'd2': 2},
    {'n': -0.08, 'd1': 2, 'd2': 2},
    {'n': -0.08, 'd1': 3, 'd2': 2},
    {'n': -0.08, 'd1': 4, 'd2': 2},
    {'n': -0.08, 'd1': 16, 'd2': 2},
    {'n': -0.08, 'd1': 17, 'd2': 2},
    {'n': -0.08, 'd1': 18, 'd2': 2},
    {'n': -0.08, 'd1': 19, 'd2': 2},
    {'n': -0.08, 'd1': 20, 'd2': 2},
    {'n': -0.08, 'd1': 21, 'd2': 2},
    {'n': -0.08, 'd1': 22, 'd2': 2},
    {'n': -0.08, 'd1': 23, 'd2': 2},
    {'n': -0.08, 'd1': 24, 'd2': 2},
    {'n': -0.08, 'd1': 31, 'd2': 2},
    {'n': -0.08, 'd1': 32, 'd2': 2},
    {'n': -0.08, 'd1': 33, 'd2': 2},
    {'n': 0, 'd1': 33, 'd2': 2},
    {'n': -0.0001, 'd1': 33, 'd2': 4},
    {'n': -1000000000, 'd1': 33, 'd2': 4},
    {'n': -100, 'd1': 20, 'd2': 2},
    {'n': -99, 'd1': 20, 'd2': 2},
    {'n': -99.1, 'd1': 20, 'd2': 2},
    {'n': -99.1, 'd1': 19, 'd2': 2},
    {'n': -99.1, 'd1': 18, 'd2': 2}
    ]

with giraffez.Cmd() as cmd:
    for test_case in test_cases:
        sql = 'SELECT CAST({n} AS DECIMAL({d1},{d2})) AS a'.format(**test_case)
        res = cmd.execute(sql).one().first()
        print(
            'SQL:',
            sql,
            'Results:',
            '{} == {}'.format(test_case['n'], res['a']),
            res['a'] == Decimal(str(test_case['n'])))

Gives:

SQL: SELECT CAST(-0.08 AS DECIMAL(32,2)) AS a Results: -0.08 == -1.92 False
SQL: SELECT CAST(-0.08 AS DECIMAL(30,2)) AS a Results: -0.08 == -1.92 False
SQL: SELECT CAST(-0.08 AS DECIMAL(32,4)) AS a Results: -0.08 == -1.9200 False
SQL: SELECT CAST(-0.1 AS DECIMAL(28,1)) AS a Results: -0.1 == -1.9 False
SQL: SELECT CAST(-0.1 AS DECIMAL(20,0)) AS a Results: -0.1 == 0 False
SQL: SELECT CAST(-999 AS DECIMAL(32,2)) AS a Results: -999 == -999.00 True
SQL: SELECT CAST(-999 AS DECIMAL(30,2)) AS a Results: -999 == -999.00 True
SQL: SELECT CAST(-99.9 AS DECIMAL(30,2)) AS a Results: -99.9 == -100.10 False
SQL: SELECT CAST(-0.08 AS DECIMAL(2,2)) AS a Results: -0.08 == -0.08 True
SQL: SELECT CAST(-0.08 AS DECIMAL(3,2)) AS a Results: -0.08 == -0.08 True
SQL: SELECT CAST(-0.08 AS DECIMAL(4,2)) AS a Results: -0.08 == -0.08 True
SQL: SELECT CAST(-0.08 AS DECIMAL(16,2)) AS a Results: -0.08 == -0.08 True
SQL: SELECT CAST(-0.08 AS DECIMAL(17,2)) AS a Results: -0.08 == -0.08 True
SQL: SELECT CAST(-0.08 AS DECIMAL(18,2)) AS a Results: -0.08 == -0.08 True
SQL: SELECT CAST(-0.08 AS DECIMAL(19,2)) AS a Results: -0.08 == -1.92 False
SQL: SELECT CAST(-0.08 AS DECIMAL(20,2)) AS a Results: -0.08 == -1.92 False
SQL: SELECT CAST(-0.08 AS DECIMAL(21,2)) AS a Results: -0.08 == -1.92 False
SQL: SELECT CAST(-0.08 AS DECIMAL(22,2)) AS a Results: -0.08 == -1.92 False
SQL: SELECT CAST(-0.08 AS DECIMAL(23,2)) AS a Results: -0.08 == -1.92 False
SQL: SELECT CAST(-0.08 AS DECIMAL(24,2)) AS a Results: -0.08 == -1.92 False
SQL: SELECT CAST(-0.08 AS DECIMAL(31,2)) AS a Results: -0.08 == -1.92 False
SQL: SELECT CAST(-0.08 AS DECIMAL(32,2)) AS a Results: -0.08 == -1.92 False
SQL: SELECT CAST(-0.08 AS DECIMAL(33,2)) AS a Results: -0.08 == -1.92 False
SQL: SELECT CAST(0 AS DECIMAL(33,2)) AS a Results: 0 == 0.00 True
SQL: SELECT CAST(-0.0001 AS DECIMAL(33,4)) AS a Results: -0.0001 == -1.9999 False
SQL: SELECT CAST(-1000000000 AS DECIMAL(33,4)) AS a Results: -1000000000 == -1000000000.0000 True
SQL: SELECT CAST(-100 AS DECIMAL(20,2)) AS a Results: -100 == -100.00 True
SQL: SELECT CAST(-99 AS DECIMAL(20,2)) AS a Results: -99 == -99.00 True
SQL: SELECT CAST(-99.1 AS DECIMAL(20,2)) AS a Results: -99.1 == -100.90 False
SQL: SELECT CAST(-99.1 AS DECIMAL(19,2)) AS a Results: -99.1 == -100.90 False
SQL: SELECT CAST(-99.1 AS DECIMAL(18,2)) AS a Results: -99.1 == -99.10 True

I'm not sure what the issue is here, but these values come back as expected through ODBC/JDBC.
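Not a diagnosis, but for comparison: decoding Teradata's scaled-integer DECIMAL storage is straightforward in Python when the raw bytes are interpreted as a single signed integer, which is exactly where hand-rolled wide-integer decoders often slip on negative values at the 8/16-byte widths (precision 19+, where the failures above begin). A sketch under the assumption of little-endian client byte order:

```python
# Reference decode of a scaled-integer decimal: treat the whole buffer as one
# signed two's-complement integer, then apply the scale.
from decimal import Decimal

def decode_decimal(raw_bytes, scale):
    unscaled = int.from_bytes(raw_bytes, byteorder="little", signed=True)
    return Decimal(unscaled).scaleb(-scale)

# -0.08 at DECIMAL(32,2) occupies 16 bytes with unscaled value -8
raw = (-8).to_bytes(16, byteorder="little", signed=True)
print(decode_decimal(raw, 2))  # -0.08
```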

config.Config() doesn't create a .giraffepg if it doesn't exist, contrary to documentation

The documentation for config.Config says that Config() "Creates an encryption key at ~/.giraffepg if one does not exist"

But it doesn't, instead raising a KeyNotFound error:

In [1]: import giraffez.config

In [2]: cfg = giraffez.config.Config(mode='w')

---------------------------------------------------------------------------
KeyNotFound                               Traceback (most recent call last)
<ipython-input-2-27c6bd581e19> in <module>()
----> 1 cfg = giraffez.config.Config(mode='w')

/opt/anaconda/lib/python3.5/site-packages/giraffez/config.py in __init__(self, conf, mode, key_file)
    127         if key_file is None:
    128             self._key_file = home_file(".giraffepg")
--> 129         self._key_file = _get_key_file(key_file)
    130         self.mode = mode
    131         _check_file_permissions(self._config_file, 0o600)

/opt/anaconda/lib/python3.5/site-packages/giraffez/config.py in _get_key_file(key_file)
     62         key_file = os.path.abspath(key_file)
     63     if not file_exists(key_file):
---> 64         raise KeyNotFound("Key file is required to use giraffez config.")
     65     return key_file
     66

KeyNotFound: Key file is required to use giraffez config.

This is either a bug in the code or a bug in the docs, and I'd advocate for it being a bug in the code. It would be great if it were possible to use this config API to create an initial configuration for a new user, which would require the ability to construct a Config object without a pre-existing .giraffepg, maybe with the option to create one as part of the constructor.

Teradata JSON column load error

Using giraffez CLI load command to put data in a table with a JSON column I get this error.

# giraffez -vvv load mycsv.txt "mydb.mytable"
/usr/local/lib/python3.6/dist-packages/giraffez/utils.py:129: UserWarning: USING MLOAD TO INSERT LESS THAN 100000 ROWS IS NOT RECOMMENDED - USE LOAD INSTEAD!
Connection: Connecting to data source 'other' ...
Connection: Connection to 'other' established successfully.
BulkLoad: Executing ...
  source      => "mycsv.txt"
  output      => "mydb.mytable"
BulkLoad: Initiating Teradata PT request (awaiting server)  ...
Connection: Closing Teradata connection ...
Connection: Connection to 'other' closed.
Traceback (most recent call last):
  File "/usr/local/bin/giraffez", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/giraffez/__main__.py", line 25, in main
    MainCommand().run()
  File "/usr/local/lib/python3.6/dist-packages/giraffez/core.py", line 80, in run
    args.run(args)
  File "/usr/local/lib/python3.6/dist-packages/giraffez/commandline.py", line 431, in run
    exit_code = load.from_file(args.input_file, delimiter=args.delimiter, null=args.null, quotechar=args.quote_char, parse_dates=args.parse_dates)
  File "/usr/local/lib/python3.6/dist-packages/giraffez/load.py", line 254, in from_file
    self._initiate()
  File "/usr/local/lib/python3.6/dist-packages/giraffez/load.py", line 430, in _initiate
    self.mload.initiate(self.table, self.columns)
giraffez.TeradataError: 5674: LOBs are not allowed to be selected in Record or Indicator modes.

Looking online it seems like if I were using ODBC I could make this go away by adding this:

EnableLobSupport=Yes

Unfortunately I can't see how to pass connection string options to Giraffez. Is there a way to do this? Or is the JSON column not supported?

Thanks!

Install failure upgrading to 1.0.1 - missing requirements.txt

I'm able to install 1.0.0.dev without issue, but getting a traceback related to requirements.txt when upgrading:

$ pip install giraffez --upgrade --user
Collecting giraffez
  Downloading giraffez-1.0.1.tar.gz (78kB)
    100% || 81kB 1.6MB/s
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-m5rtWa/giraffez/setup.py", line 366, in <module>
        with open('requirements.txt') as f:
    IOError: [Errno 2] No such file or directory: 'requirements.txt'

From what I can tell, requirements.txt was not included in the 1.0.1 tarball, but it was in 1.0.0.dev:

$ wget ...giraffez-1.0.1...
$ tar -zxvf giraffez-1.0.1.tar.gz
$ cd giraffez-1.0.1 && ls

Version Release?

Is this a stable release? If so, why isn't there a release version in the releases section?

Using TPT Update driver on macOS causes segfault

While the TPT Export driver works as expected on macOS, the Update driver fails to complete the Initiate method with Segmentation Fault: 11. It has something to do with the lifetime of objects for the TPT, mostly teradata::client::API::Connection, and how the compiler and/or platform handles the code differently.

`load` fails when delimiter occurs in quoted field

load detects the delimiter in the source file, but fails on rows that contain the delimiter in a quoted field.

For example, for the following source file:

one|two|three
four|five|six
seven|"eight|nine"|ten
$ giraffe load testfile.txt test_container.test_table
ERROR:root:Mismatch length for row
        ['seven', '"eight', 'nine"', 'ten']
        ['one', 'two', 'three']
Traceback (most recent call last):
  File "build/bdist.linux-x86_64/egg/giraffe/command.py", line 376, in insert
    raise FormatError("Mismatch length for row\n\t{0}\n\t{1}".format(row, fields))
FormatError: Mismatch length for row
        ['seven', '"eight', 'nine"', 'ten']
        ['one', 'two', 'three']
Results
 ---> Rows successful: 1
 ---> Rows unsuccessful: 1
 ---> Total: 2
==========================
run took 4.9091s to finish

Expected results: delimiter is ignored within quotes, and load of seven|"eight|nine"|ten succeeds with "eight|nine" as a single field.

Actual results: load of seven|"eight|nine"|ten fails.
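For reference, the expected behaviour described above matches what Python's csv module already implements, which could serve as the parsing layer:

```python
# Parse pipe-delimited rows while respecting double-quoted fields, exactly
# the semantics the issue expects from load.
import csv
import io

data = 'one|two|three\nfour|five|six\nseven|"eight|nine"|ten\n'
rows = list(csv.reader(io.StringIO(data), delimiter="|", quotechar='"'))
print(rows[2])  # ['seven', 'eight|nine', 'ten']
```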

pip install fails on mac os

In file included from giraffez/cli/cmdobject.c:18:
giraffez/cli/cmdobject.h:37:10: fatal error: 'dbcarea.h' file not found
#include <dbcarea.h>
^
1 error generated.
error: command 'clang' failed with exit status 1

Giraffez compatibility with Python 3.7

Hi!
I'm testing Giraffez on Python 3.7 and have the following error:

Traceback (most recent call last):
  File "/Users/flolas/anaconda3/bin/giraffez", line 11, in <module>
    sys.exit(main())
  File "/Users/flolas/anaconda3/lib/python3.7/site-packages/giraffez/__main__.py", line 25, in main
    MainCommand().run()
  File "/Users/flolas/anaconda3/lib/python3.7/site-packages/giraffez/core.py", line 80, in run
    args.run(args)
  File "/Users/flolas/anaconda3/lib/python3.7/site-packages/giraffez/commandline.py", line 363, in run
    if FileReader.check_length(args.input_file, MLOAD_THRESHOLD):
  File "/Users/flolas/anaconda3/lib/python3.7/site-packages/giraffez/io.py", line 122, in check_length
    for i, line in enumerate(f, 1):
RuntimeError: generator raised StopIteration

This is when executing:

giraffez insert com.csv PROD.CODES
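The RuntimeError is PEP 479 behaviour: from Python 3.7 on, a StopIteration that escapes a generator body is converted to RuntimeError, so somewhere in giraffez's file reader a StopIteration is leaking out of a generator. The usual fix pattern, illustrated with a stand-in generator (not giraffez code):

```python
# Pre-3.7 code could let a StopIteration from next() terminate the generator
# implicitly; under PEP 479 the generator must return instead.
def lines(f):
    while True:
        try:
            line = next(f)
        except StopIteration:
            return  # explicit return; letting it propagate raises RuntimeError
        yield line

print(list(lines(iter(["a", "b"]))))  # ['a', 'b']
```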

BulkExport.to_str() rounds floats

I have data in a table in float format, but when I use .to_str(), the value gets rounded.

Specifically the value is 9 digits after the decimal like:
0.#########

and gets rounded to 6 after the decimal like:
0.######

Is this intended behavior?
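A likely mechanism, though this is an assumption about how to_str formats floats: C's %f and Python's '%f' both default to six digits after the decimal point, whereas repr() round-trips the full value:

```python
# '%f' truncates to six fractional digits by default; repr() does not.
x = 0.123456789
print("%f" % x)  # 0.123457  <- six digits, rounded
print(repr(x))   # 0.123456789
```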

decimal value (1/tenth position) is dropped when the digit is 0

Reading a decimal into a pandas DataFrame has issues when the tenths digit is a zero.

import pandas as pd
import giraffe
slq1='select 10.01;'
sql_results=giraffe.Cmd().execute(slq1)
r1=sql_results.one()
df1 = pd.DataFrame(r1.rows, columns=r1.columns.safe_names)
print df1

0 10.1

slq2='select 10.08;'
sql_results=giraffe.Cmd().execute(slq2)
r1=sql_results.one()
df1 = pd.DataFrame(r1.rows, columns=r1.columns.safe_names)
print df1

0 10.8

How can I use LDAP authentication?

I want LDAP authentication to connect to Teradata with giraffez.

Is such a functionality available with giraffez?
What parameters should be specified to use LDAP with giraffez config --set command?

Thank you

Datatype NUMBER(22,4) is not recognized as a valid data type while exporting data from Teradata

Currently, while exporting data from a table that has a NUMBER data type, the export fails with the error: Cannot determine data type. Are there any plans to add it to the export as a Decimal type?


Here is a sample test table:

CREATE MULTISET TABLE dwp_sandbox.DW_PO_TUX_HDR_test2 ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT,
DEFAULT MERGEBLOCKRATIO
(
Q_PO_HDR_ID VARCHAR(50) CHARACTER SET LATIN NOT CASESPECIFIC,
PO_TOTAL_AMT NUMBER(22,4)
)
PRIMARY INDEX ( Q_PO_HDR_ID )

coperr.h file not found

I have Mac OS X, Python 3.5

I cloned the repo and checked out 2.0.0. After running python setup.py install I got this error:

gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -I/Users/frw169/anaconda/include -arch x86_64 -I/Users/frw169/CODE/giraffez/giraffez -I/Users/frw169/anaconda/include/python3.5m -c giraffez/src/columns.c -o /Users/frw169/CODE/giraffez/build/giraffez/src/columns.o -Wfatal-errors
In file included from giraffez/src/columns.c:18:
In file included from giraffez/src/giraffez.h:30:
giraffez/src/teradata.h:44:10: fatal error: 'coperr.h' file not found
#include <coperr.h>
         ^
1 error generated.
error: command 'gcc' failed with exit status 1

For me, the coperr.h file is at /opt/teradata/client/16.00/include/coperr.h. Are there any environment variables that need to be set? I'm currently having no trouble running the master branch (v1.1.8).

cmd, load requests failing with CLIv2: initiate request failed

When executing back-to-back requests, particularly in quick succession, giraffez commands can fail with the error message "TeradataError: CLIv2: initiate request failed".

Versions

  • python 3.5.2 from a conda environment, with
  • giraffez 1.1.5 (pip installed to the conda environment, without --user)
  • teradata client library 15.00, 64-bit

I have noticed that executing requests back-to-back can sometimes lead the first to succeed and the second to fail; further, if the cmd calls are split between two Jupyter notebook cells, both succeed.

---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
.../python/conda_envs/py3k/lib/python3.5/site-packages/giraffez/cmd.py in _execute(self, command)
    105         try:
--> 106             results, error = self.cmd.execute(command)
    107             if error is not None:

error: CLIv2: initiate request failed

During handling of the above exception, another exception occurred:

TeradataError                             Traceback (most recent call last)
<ipython-input-23-60624f138025> in <module>()
      2 # a simple giraffez command execution
      3 with gr.Cmd() as cmd:
----> 4     results = cmd.execute_one('select * sample 5 from padw.pl_appn_root')
      5 samp_data = pd.DataFrame(results.items())
      6 samp_data

.../python/conda_envs/py3k/lib/python3.5/site-packages/giraffez/cmd.py in execute_one(self, command, sanitize, silent)
    210         :raises `giraffez.errors.GiraffeError`: if the return data could not be decoded
    211         """
--> 212         return self.execute(command, sanitize, silent).one()
    213 
    214     def execute_many(self, command, sanitize=True, parallel=False, silent=False):

.../python/conda_envs/py3k/lib/python3.5/site-packages/giraffez/cmd.py in execute(self, command, sanitize, silent)
    196             command = prepare_statement(command) # accounts for comments and newlines
    197             log.debug("Debug[2]", "Command (sanitized): {!r}".format(command))
--> 198         return Results(self._execute(command))
    199 
    200     def execute_one(self, command, sanitize=True, silent=False):

.../python/conda_envs/py3k/lib/python3.5/site-packages/giraffez/cmd.py in _execute(self, command)
    109             log.debug("Debug[2]", repr(results))
    110         except _cli.error as error:
--> 111             raise TeradataError(error)
    112         for result in results:
    113             stmt_info = result.get('stmtinfo', None)

TeradataError: CLIv2: initiate request failed

Encrypting and pycryptodome

I've been running into an issue in an environment where I have pycryptodome installed (dependency of another package) when I try to encrypt passwords in .girafferc:

± giraffez config --init
colorful: true
connections:
  db1:
    host: null
    password: null
    username: null
  default: db1

Successfully created file '/Users/username/.girafferc'
Set the username and password using the config module:

    giraffez config --set connections.db1.host <host>
    giraffez config --set connections.db1.username <username>
    giraffez config --set connections.db1.password <password>
Key file '/Users/username/.giraffepg' created successfully.


± giraffez config --set connections.db1.password some_pass
Traceback (most recent call last):
  File "/Users/username/miniconda3/envs/env/bin/giraffez", line 11, in <module>
    sys.exit(main())
  File "/Users/username/miniconda3/envs/env/lib/python3.6/site-packages/giraffez/__main__.py", line 25, in main
    MainCommand().run()
  File "/Users/username/miniconda3/envs/env/lib/python3.6/site-packages/giraffez/core.py", line 80, in run
    args.run(args)
  File "/Users/username/miniconda3/envs/env/lib/python3.6/site-packages/giraffez/commandline.py", line 177, in run
    c.set_value(key, value)
  File "/Users/username/miniconda3/envs/env/lib/python3.6/site-packages/giraffez/config.py", line 298, in set_value
    value = self.encrypt(value, path)
  File "/Users/username/miniconda3/envs/env/lib/python3.6/site-packages/giraffez/config.py", line 171, in encrypt
    value = "ENCRYPTED:{}".format(ensure_str(self.cipher.encrypt(value)))
  File "/Users/username/miniconda3/envs/env/lib/python3.6/site-packages/giraffez/encrypt.py", line 64, in encrypt
    return base64.b64encode(iv + cipher.encrypt(raw))
  File "/Users/username/miniconda3/envs/env/lib/python3.6/site-packages/Crypto/Cipher/_mode_cbc.py", line 160, in encrypt
    expect_byte_string(plaintext)
  File "/Users/username/miniconda3/envs/env/lib/python3.6/site-packages/Crypto/Util/_raw_api.py", line 194, in expect_byte_string
    raise TypeError("Only byte strings can be passed to C code")
TypeError: Only byte strings can be passed to C code

I think that if we change the line here from raw = self.pad(raw) to raw = self.pad(raw).encode("utf-8") that would fix the issue. In the decrypt function it's already assumed that the string is utf-8 encoded, so I don't think this would introduce any downstream issues. That being said I've only done a little bit of testing.

From some reading it sounds like pycryptodome is more supported than pycrypto going forward, but I'm not familiar with the arguments there.
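The proposed one-line fix in miniature: padding a str yields a str, and pycryptodome's cipher.encrypt rejects anything but bytes, hence the TypeError. Here `pad()` is a stand-in for giraffez's own padding helper, not its actual implementation:

```python
# PKCS7-style padding on a str produces a str; pycryptodome needs bytes, so
# the fix is to .encode("utf-8") after padding (mirroring the proposed patch).
def pad(raw, block=16):
    n = block - len(raw) % block
    return raw + chr(n) * n

padded = pad("some_pass").encode("utf-8")  # now bytes, safe to pass to C code
print(type(padded).__name__)  # bytes
```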

Support for transactions

This might be a silly question since I'm unfamiliar with Teradata, but I didn't see any mention of transactions in the docs or in the repo.

Does giraffez support transactions that span multiple statements? Is the only way to do that by placing all the statements within a single transaction as part of a single TeradataCmd.execute() call? i.e., there is no way to define transactions at the connection level.
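If the single-execute route is indeed the answer, Teradata-mode explicit transactions are bracketed with BT; and ET;. A sketch that only assembles such a statement string (the table names are made up); the result would then be handed to TeradataCmd.execute() as described above:

```python
# Bundle several statements into one Teradata-mode explicit transaction:
# everything between BT; and ET; commits or rolls back together.
statements = [
    "UPDATE t SET x = 1 WHERE id = 10",
    "DELETE FROM audit WHERE id = 10",
]
sql = "BT;\n" + ";\n".join(statements) + ";\nET;"
print(sql)
```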

error during executing giraffez load out_file.txt database.table

Hi

I faced a problem trying to load data from a previously exported file. It is reported as "giraffez.errors.GiraffeTypeError: Column 'cif' does not exist", which is quite weird, since the file was created via giraffez export database.table_name output_file.txt -d '\t'.

Here is the definition of the table:

CREATE MULTISET TABLE DDHDP01P.AG_KHD_4463_2_BIK_NOWA ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT,
DEFAULT MERGEBLOCKRATIO
(
cif CHAR(19) CHARACTER SET LATIN NOT CASESPECIFIC,
nr_rachunku_bik INTEGER,
typ_finansowania CHAR(2) CHARACTER SET LATIN NOT CASESPECIFIC,
data_otwarcia DATE FORMAT 'YYYY-MM-DD',
przewid_data_zakonczenia DATE FORMAT 'YYYY-MM-DD',
okres INTEGER,
waluta CHAR(3) CHARACTER SET LATIN NOT CASESPECIFIC,
kurs CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC,
status CHAR(2) CHARACTER SET LATIN NOT CASESPECIFIC,
data_zamkniecia DATE FORMAT 'YYYY-MM-DD',
data_salda_rachunku DATE FORMAT 'YYYY-MM-DD',
kwota_do_splaty DECIMAL(18,2),
kwota_kredytu DECIMAL(18,2),
kwota_raty DECIMAL(18,2),
data_raportu DATE FORMAT 'YYYY-MM-DD',
kwota_limitu DECIMAL(18,2),
kwota_gwarancji DECIMAL(18,2),
liczba_rat_kapitalowych INTEGER,
czestotliwosc_splat_kapitalu VARCHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC)
PRIMARY INDEX ( cif );

Peter


giraffez version: 1.1.8
Python version: 3.5.2
OS: Windows 7 Enterprise
Teradata Tools and Utilities version: 15.10.05.00

Export right-pads CHAR types with extra spaces

In using the giraffez.Export module, I've found that CHAR(x) types get padded on the right with a number of spaces equal to the length of the field (in addition to any blank characters that pad the value out to the full x characters). I believe the expected behavior here would be to trim the value, as happens with VARCHAR types; the additional spaces equal to the length of the field are clearly unwelcome.

Here's an example that I think should be reproducible:

>>> with giraffez.Load(host=host, username=username, password=password) as load:
...     load.execute("""create multiset table giraffez_issue (
...         id_var int,
...         variable_field varchar(10),
...         fixed_field char(4),
...         floaty decimal(6, 2)
...     ) primary index (id_var)""")
...     
...     load.insert('giraffez_issue', [
...         (1, 'alpha', 'A1', 1.1),
...         (2, 'beta', 'B', 2.2),
...         (3, 'gamma', 'C3PO', 3.0)
...     ])
... 

defaultdict(<class 'int'>, {'count': 3})
>>> with giraffez.Export('giraffez_issue', host=host, username=username,
...                      password=password, delimiter='|', null='') as export:
...     for row in export.results():
...         print('<<{row}>>'.format(row=row))
... 
<<1|alpha|A1      |1.10>>
<<3|gamma|C3PO    |3.00>>
<<2|beta|B       |2.20>>
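Until the padding behavior is addressed upstream, a post-processing workaround is to strip trailing spaces from each delimited field. Note this sketch also removes any legitimate trailing blanks in CHAR values:

```python
# Workaround sketch: strip the extra right-padding from fixed-width
# CHAR fields after a delimited export.
def trim_fields(row, delimiter="|"):
    return delimiter.join(field.rstrip() for field in row.split(delimiter))
```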

Loading Pandas Data to Teradata Problems

I am experiencing a problem loading a Pandas DataFrame to Teradata via MLoad. First I got an error on NULL values, since len() fails on them. I dropped all rows with null values, but still no luck. Now I get "Teradata PT request failed with code '2583'".

This is my code:

with giraffez.MLoad("DBC.TestTable") as mload:
    mload.columns = df.columns.tolist()
    rows = list(zip(*[df[i].values.tolist() for i in df]))
    for row in rows:
        mload.load_row(row)
    exit_code = mload.finish()
    print("finished load with code {0}".format(exit_code))
    if exit_code == 0:
        mload.cleanup()

Is there a better and working way to mload Pandas DataFrame to Teradata?
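One approach that may help (a hedged sketch, not a giraffez-specific API): convert pandas NaN floats to None before load_row(), since None is the usual representation of NULL, rather than dropping rows or blanket-casting everything to str:

```python
import math

# Hypothetical pre-processing step: pandas represents missing values
# as NaN floats, which break len()-based encoding; map them to None.
def to_loadable(row):
    return tuple(None if isinstance(v, float) and math.isnan(v) else v
                 for v in row)

# Usage sketch against a DataFrame df:
# for row in df.itertuples(index=False, name=None):
#     mload.load_row(to_loadable(row))
```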

Tools:
Python Version: 3.6
Teradata CLIv2 16.10.0.2
Windows 7

Export.encoding documentation missing 'dict' option

The docstring for the encoding attribute in the Export class documentation only covers 'text', 'json', and 'archive'. However, judging from example code and the export._initiate function, 'dict' is also supported, for the dict Python object type.

The 'dict' type is apparently needed for exporting smoothly to DataFrame objects, and so could see wide usage.

Please add SPDX lines in license headers.

Include the following lines in the license headers within the scripts. This is to ensure that the automated license scanning tools can identify the license.

SPDX-Copyright: Copyright (c) Capital One Services, LLC
SPDX-License-Identifier: Apache-2.0

Substring error

I seem to be having a problem using the substring command. Simplified code as follows:

import giraffez
query1 = ''' select SUBSTRING('world' ,1,3) as hello'''
with giraffez.Cmd() as cmd:
    results = cmd.execute_one(query1)

Any help or insight would be greatly appreciated!
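Without knowing the root cause, two alternative spellings may be worth trying; Teradata accepts both the ANSI FROM/FOR form of SUBSTRING and the SUBSTR function:

```python
# Two equivalent spellings of the same expression (a guess at a
# workaround, not a confirmed fix for this error):
query_ansi = "SELECT SUBSTRING('world' FROM 1 FOR 3) AS hello"
query_substr = "SELECT SUBSTR('world', 1, 3) AS hello"
# with giraffez.Cmd() as cmd:
#     results = cmd.execute_one(query_ansi)
```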

Aliases not coming through in giraffez.Cmd.execute

When running the following in 2.0.7:

sql = """
SELECT
    field_1 AS alias_1,
    field_2 AS alias_2
FROM
    db.table
"""

with giraffez.Cmd() as cmd:
    res = cmd.execute(sql)
    all_recs = list(res.to_dict())

all_recs looks like [{'FIELD_1': 123, 'FIELD_2': None}, {'FIELD_1': 456, 'FIELD_2': 123}, ...]. Contrast with behaviour in version 1.1.7 on the same SQL:

with giraffez.Cmd() as cmd:
    res = cmd.execute_one(sql)
    all_recs = list(res.to_json())

and all_recs had the "safe" column names, and lower-cased to boot, like [{'alias_1': 123, 'alias_2': None}, {'alias_1': 456, 'alias_2': 123}, ...]
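As an interim workaround, the 1.x-style keys can be restored by post-processing the result dicts. This sketch only lower-cases the names 2.0.7 returns; it cannot recover the aliases if the server-side metadata truly drops them:

```python
# Workaround sketch: map 2.0.7's upper-cased column names back to the
# lower-cased style that 1.1.7 produced.
def lower_keys(records):
    return [{k.lower(): v for k, v in rec.items()} for rec in records]
```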

config init NameError - MESSAGE_WRITE_DEFAULT is not defined

giraffez config --init throws a NameError in 1.1.2:

Traceback (most recent call last):
  File "/home/ec2-user/.local/bin/giraffez", line 9, in <module>
    load_entry_point('giraffez==1.1.2', 'console_scripts', 'giraffez')()
  File "/home/ec2-user/.local/lib/python2.7/site-packages/giraffez/__main__.py", line 25, in main
    MainCommand().run()
  File "/home/ec2-user/.local/lib/python2.7/site-packages/giraffez/core.py", line 91, in run
    args.run(args)
  File "/home/ec2-user/.local/lib/python2.7/site-packages/giraffez/commandline.py", line 147, in run
    log.write(MESSAGE_WRITE_DEFAULT.format(args.conf))
NameError: global name 'MESSAGE_WRITE_DEFAULT' is not defined

commandline.py (line 147) just needs to reference the new variable name from this commit b37054e:

From:
log.write(MESSAGE_WRITE_DEFAULT.format(args.conf))
To:
log.write(message_write_default.format(args.conf))

TeradataError: 3130: Response limit exceeded with cmd.insert

When I try to insert ~1 million rows using the following syntax:

with giraffez.Cmd(panic=True) as cmd:
    cmd.insert('db.table_name', 'data.csv')

it runs for about a minute and inserts ~8,000 rows of data, and then stops with the following error:
giraffez.TeradataError: 3130: Response limit exceeded.

I believe this is caused by not closing the responses from Teradata (max allowed to keep open is 16):
https://stackoverflow.com/questions/18903284/teradata-database-3130-response-limit-exceeded
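Until the responses are closed properly, a hedged workaround is to insert in batches and open a fresh Cmd session per batch, so open responses never accumulate toward the limit of 16 described in the linked answer:

```python
# Sketch of batching; sizes and the per-batch session are assumptions,
# not giraffez recommendations.
def chunked(rows, size=5000):
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch

# Usage sketch (read_rows and field_names are hypothetical; check the
# Cmd.insert signature for your giraffez version):
# for batch in chunked(read_rows('data.csv')):
#     with giraffez.Cmd(panic=True) as cmd:
#         cmd.insert('db.table_name', batch, fields=field_names)
```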

thank you

Import Data From Pandas DataFrame Data Type Issue

When loading data from Pandas columns that are not string objects, giraffez raises this error:

Connection: Connecting to data source 'db1' ...
Connection: Connection to 'db1' established successfully.
MLoad: Requesting column info for table 'DBC.TestTable' ...
MLoad: Initiating Teradata PT request (awaiting server)  ...
MLoad: Teradata PT request accepted.
Connection: Closing Teradata connection ...
MLoad: Closing Teradata PT connection ...
MLoad: Acquisition phase was not called before closing.
MLoad: Ending acquisition phase ...
MLoad: Acquisition phase ended.
MLoad: Teradata PT request complete.
Connection: Connection to 'db1' closed.
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-4282e9006bed> in <module>()
      3     rows = list(zip(*[df[i].values.tolist() for i in df]))
      4     for row in rows:
----> 5        mload.load_row(row)
      6     exit_code = mload.finish()
      7     print("finished load with code {0}".format(exit_code))

C:\Users\MYUSERNAME\AppData\Local\Continuum\Anaconda3\lib\site-packages\giraffez\mload.py in load_row(self, items, panic)
    372             self._initiate()
    373         try:
--> 374             data = self.processor(items)
    375             row_status = self.load.put_row(data)
    376             if not self._handle_error(row_status):

C:\Users\MYUSERNAME\AppData\Local\Continuum\Anaconda3\lib\site-packages\giraffez\utils.py in _pipeline(x)
     66 def pipeline(funcs):
     67     def _pipeline(x):
---> 68         return reduce(lambda a, f: f(a), funcs, x)
     69     return _pipeline
     70 

C:\Users\MYUSERNAME\AppData\Local\Continuum\Anaconda3\lib\site-packages\giraffez\utils.py in <lambda>(a, f)
     66 def pipeline(funcs):
     67     def _pipeline(x):
---> 68         return reduce(lambda a, f: f(a), funcs, x)
     69     return _pipeline
     70 

C:\Users\MYUSERNAME\AppData\Local\Continuum\Anaconda3\lib\site-packages\giraffez\encoders.py in _encoder(items)
    278                 data.append(value)
    279             elif column.type in VAR_TYPES:
--> 280                 data.append(struct.pack("H", len(item)) + ensure_bytes(item))
    281             elif column.type in DATE_TYPES:
    282                 date_value = Date.from_string(item)

TypeError: object of type 'float' has no len() 

Non-string objects have no len(); I had to convert my data to str to bypass this issue.

for i in df.columns.tolist():
    df[i] = df[i].values.astype(str)

before passing it to mload:

with giraffez.MLoad("DBC.TestTable") as mload:
    mload.columns = df.columns.tolist()
    rows = list(zip(*[df[i].values.tolist() for i in df]))
    for row in rows:
        mload.load_row(row)
    exit_code = mload.finish()
    print("finished load with code {0}".format(exit_code))
    if exit_code == 0:
        mload.cleanup()

Is this a known issue?

Problem with install on Windows

When I try to upgrade giraffez via pip:
pip install --upgrade giraffez

I've got an error:
c:\temp\pip-build-sjcz246m\giraffez\giraffez\src\common.h(34): fatal error C1083: Couldn't open file: '_strptime.c': No such file or directory.

It is not happening when I clone repo from git.

UnicodeDecodeError while installing on windows using python3.5

bash-3.1$ pip install giraffez
Collecting giraffez
Using cached giraffez-1.0.0.dev0.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "", line 1, in
File "C:\Users\qze841\AppData\Local\Temp\pip-build-agnajbb9\giraffez\setup.py", line 371, in
long_description=open("README.rst").read(),
File "c:\Users\qze841\AppData\Local\Continuum\Anaconda3\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 2590: character maps to <undefined>

----------------------------------------

Command "python setup.py egg_info" failed with error code 1 in C:\Users\qze841\AppData\Local\Temp\pip-build-agnajbb9\giraffez
bash-3.1$ pwd
/c/temp/giraffez
bash-3.1$ python setup.py install
Traceback (most recent call last):
File "setup.py", line 371, in
long_description=open("README.rst").read(),
File "c:\Users\qze841\AppData\Local\Continuum\Anaconda3\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 2636: character maps to <undefined>
bash-3.1$ python --version
Python 3.5.2 :: Anaconda 4.1.1 (64-bit)

Giraffez build fails with Python 3.7

Python 3.7 updated PyUnicode_AsUTF8 to return a const char* instead of a plain char*. This breaks the current version of Giraffez due to teradatapt.hpp#L114 since TConn::AddAttribute expects a plain char*.

The TPT definition for that method is shown here:
TELAPI_EXPORT void AddAttribute(TD_Attribute, char* value);

I believe that the simplest fix for this would be to cast away the const-ness of the result returned from PyUnicode_AsUTF8, though a better (and potentially safer, depending on CPython's implementation) approach might be to copy the string into a non-const location before passing to AddAttribute. I suppose that ideally the AddAttribute method would already expect a const char*, but that's not really worth fussing about.

Regardless of the approach that is decided upon, I'm willing to draft up and test a PR.

Thanks,
Andrew D.

'bigint' data type overflows on Windows platforms

Hi - When I export bigint fields the exported value does not match the table value.

The steps to recreate the problem are below.

Teradata Code:

create table nztemp.contains_a_bigint( bigint_field bigint not null) no primary index;
insert into nztemp.contains_a_bigint (bigint_field) values (9111200007015236);

Python Code:

import giraffez

with giraffez.Export('nztemp.contains_a_bigint',
                     host='my_host',
                     username='my_user',
                     password='my_password',
                     delimiter='\t',
                     null='',
                     encoding='dict') as export:
    for row in export.results():
        print row
        
>>> {'bigint_field': -1880798396L}

Of course I was expecting 9111200007015236.

Any help would be appreciated.
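The wrong value is consistent with 32-bit truncation: on Windows the C long is 32 bits even on 64-bit builds, so decoding an 8-byte BIGINT through it keeps only the low word. That this is where giraffez truncates is an assumption, but the arithmetic reproduces the reported value exactly:

```python
import struct

# Decode the 8-byte BIGINT through a 32-bit type, as a 32-bit C long
# would, and compare with a proper 64-bit decode.
value = 9111200007015236
packed = struct.pack("<q", value)               # 8-byte little-endian BIGINT
truncated = struct.unpack("<i", packed[:4])[0]  # -> -1880798396, as reported
correct = struct.unpack("<q", packed)[0]        # full 64-bit value
```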

UnicodeDecodeError: 'utf-8' codec can't decode byte invalid start byte

my environment version info:

giraffez 2.0.19
macOS 10.13.2
python 3.5
Teradata ODBC / TTU 16.10
Teradata Server 15.10

When I run this against any number of tables I get the same result. I do get a legitimate response when I use giraffez cmd "select top 1 * from mydb.my_table"; the output prints to stdout properly. But giraffez export seems to be causing issues and only spits out the column headers:

myComp:~|⇒  giraffez export "select top 1 * from mydb.my_table"
Connection: Connecting to data source 'data-server' ...
Connection: Connection to 'data-server' established successfully.
Export: Initiating Teradata PT request (awaiting server)  ...
Export: Teradata PT request accepted.
Export: Executing ...
  delimiter => "|"
  null      => "NULL"
  output    => "stdout"
  encoding  => "str"
  query     => "select top 1 * from mydb.my_table"
--------------------------------
version|id|resid|tid|parentid|off|type|prod|status|dtl|flagb|flagd
Connection: Closing Teradata connection ...
Export: Closing Teradata PT connection ...
Export: Teradata PT request complete.
Connection: Connection to 'data-server' closed.
Traceback (most recent call last):
  File "/Users/ash/miniconda3/envs/py35/bin/giraffez", line 11, in <module>
    sys.exit(main())
  File "/Users/ash/miniconda3/envs/py35/lib/python3.5/site-packages/giraffez/__main__.py", line 25, in main
    MainCommand().run()
  File "/Users/ash/miniconda3/envs/py35/lib/python3.5/site-packages/giraffez/core.py", line 80, in run
    args.run(args)
  File "/Users/ash/miniconda3/envs/py35/lib/python3.5/site-packages/giraffez/commandline.py", line 243, in run
    for i, row in enumerate(exportfn(), 1):
  File "/Users/ash/miniconda3/envs/py35/lib/python3.5/site-packages/giraffez/export.py", line 256, in _fetchall
    data = self.export.get_buffer()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8a in position 376: invalid start byte

When I connect to our Teradata instance the typical way, using a SQL editor like DBeaver, the JDBC connection string has charset=UTF8, and my ODBC settings in the odbc.ini file are set the same way. sys.getdefaultencoding() also returns 'utf-8'.
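Byte 0x8a is an invalid UTF-8 start byte but a valid cp1252 'Š', which suggests the export payload isn't UTF-8 in the first place. A fallback decode like this sketch keeps things running while the session charset is sorted out (which encoding actually applies here is an assumption):

```python
# Demonstrate the failure and a fallback decode for a byte that is
# legal cp1252 but illegal UTF-8.
raw = b"\x8a"
try:
    text = raw.decode("utf-8")
except UnicodeDecodeError:
    text = raw.decode("cp1252")   # or raw.decode("utf-8", errors="replace")
```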

UTF8 / Unicode encoding for csv/dataframe load not storing correctly in DB

I believe this is the same or a similar issue as #47, as I'm having the same problem.

giraffez 2.0.24
macOS 10.14.2 / Ubuntu 16
python 3.5 (on macOS) / python 3.6 (on Ubuntu box)
Teradata ODBC / TTU 16.10
Teradata Server 15.10
SQL client: dbeaver / Aqua Data Studio (tried both)
JDBC for SQL client: TeraJDBC_16.20.00.02

import giraffez as g
import pandas as pd

with g.Cmd() as cmd:
    cmd.execute("""
CREATE SET TABLE mydb.mytable ,NO FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO
     (
      location VARCHAR(250) CHARACTER SET UNICODE NOT CASESPECIFIC
     )
PRIMARY INDEX ( location );""")

df = pd.DataFrame({'location': ['Düsseldorf','Marché Saint-Germain','İstanbul']})
df.to_csv('location.csv', index=False)

with g.BulkLoad(table='mydb.mytable') as load:
    load.from_file('location.csv', table='mydb.mytable', delimiter=",", null='')
select *
from mydb.mytable

location             
---------------------
Düsseldorf          
Marché Saint-Germain
Ä°stanbul            

I get the same result if I use the DataFrame's to_list and iterate over the rows as the OP had done (from #47). The character set of my connection to TD when I run the select * is utf-8, and I'm using a GUI client. I am able to see these specific Unicode and non-Latin characters in other tables on the same database where the column's character set is UNICODE in the CREATE TABLE statement. So I believe this is isolated, and not related to either my SQL client or the TD server itself.
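The garbled row looks like classic mojibake: UTF-8 bytes re-decoded as Latin-1. Round-tripping 'İstanbul' that way reproduces exactly the 'Ä°stanbul' seen in the SELECT output, which points at the load path writing UTF-8 bytes into a session that treats them as Latin-1 (the exact layer where this happens is an assumption):

```python
# Reproduce the symptom: encode as UTF-8, then mis-decode as Latin-1.
mojibake = "İstanbul".encode("utf-8").decode("latin-1")
```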

Originally posted by @ausiddiqui in #47 (comment)
