maybelinot / df2gspread

Manage Google Spreadsheets in Pandas DataFrame with Python

License: GNU General Public License v3.0

Python 100.00%

pandas pandas-dataframe gspread google-sheets google-spreadsheet python

df2gspread's Introduction

df2gspread

Transfer data between Google Spreadsheets and Pandas DataFrame.

Description

A Python library for moving tabular data between Google Spreadsheets and Pandas DataFrames for further management or processing. It is useful whenever you need to work with data stored in Google Drive.

Install

Example install, using VirtualEnv:

# create a python virtual environment
# (--no-site-packages has been the default since virtualenv 1.7 and was removed in virtualenv 20)
virtualenv ~/virtenv_scratch

# activate the virtual environment
source ~/virtenv_scratch/bin/activate

# upgrade pip in the new virtenv
pip install -U pip setuptools

# install this package in DEVELOPMENT mode
# python setup.py develop

# simply install
# python setup.py install

# or install via pip
pip install df2gspread

Access Credentials

To let a script use the Google Drive API, we need to authenticate ourselves with Google. To do so, we create a project that describes the tool and generate credentials. Open the Google console in your web browser and:

Choose "Create Project" in the popup menu at the top.
A dialog box appears; give your project a name and click the "Create" button.
In the left-side menu, click "API Manager".
A table of available APIs is shown. Select "Drive API" and click the "Enable API" button. The other APIs can stay disabled for our purpose.
In the left-side menu, click "Credentials".
In the "OAuth consent screen" section, select your email address and give your product a name. Then click the "Save" button.
In the "Credentials" section, click "Add credentials" and select "OAuth 2.0 client ID".
A "Create Client ID" dialog box appears. Set "Application type" to "Other".
Click the "Create" button.
Click the "Download JSON" icon to the right of the newly created entry under "OAuth 2.0 client IDs" and store the downloaded file on your file system. Be aware that the file contains your private credentials, so protect it the same way you protect your private SSH key; i.e., move the downloaded JSON file to ~/.gdrive_private.
The first time you run the library, a browser window will open a Google authorization request page. Approve the authorization, and the credentials will then work as expected.
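
Several issues further down this page come from the secrets file being missing or from ~/.gdrive_private having been created as a directory. The following is a minimal pre-flight check, assuming the default path named above; check_client_secret is a hypothetical helper and not part of df2gspread.

```python
import json
from pathlib import Path

# Default location named in the README; the helper below is hypothetical.
DEFAULT_SECRET = Path.home() / ".gdrive_private"


def check_client_secret(path=DEFAULT_SECRET):
    """Return a short diagnostic string describing the state of the secrets file."""
    path = Path(path)
    if not path.exists():
        return "missing"
    if path.is_dir():
        # The path must be the JSON file itself, not a folder containing it.
        return "is a directory"
    try:
        json.loads(path.read_text())
    except (OSError, ValueError):
        return "unreadable or not JSON"
    return "ok"


if __name__ == "__main__":
    print(check_client_secret())
```

Running it before the first upload or download gives a clearer hint than the InvalidClientSecretsError raised deep inside oauth2client.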

Usage

Run df2gspread like:

from df2gspread import df2gspread as d2g
import pandas as pd
d = [pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
    pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])]
df = pd.DataFrame(d)

# use full path to spreadsheet file
spreadsheet = '/some/folder/New Spreadsheet'
# or spreadsheet file id
# spreadsheet = '1cIOgi90...'

wks_name = 'New Sheet'

d2g.upload(df, spreadsheet, wks_name)
# if the spreadsheet already exists, all data in the given worksheet (or the first one by default)
# will be replaced with the data of the given DataFrame, so make sure that this is what you want!

Run gspread2df like:

from df2gspread import gspread2df as g2d

# use full path to spreadsheet file
spreadsheet = '/some/folder/New Spreadsheet'
# or spreadsheet file id
# spreadsheet = '1cIOgi90...'
wks_name = 'New Sheet'

df = g2d.download(spreadsheet, wks_name, col_names=True, row_names=True)

Documentation

Documentation is available here.

Testing

Testing is py.test based. Run with:

py.test tests/ -v

Or with coverage:

coverage run --source df2gspread -m py.test
coverage report

Development

Install the supplied githooks, e.g.:

ln -s ~/repos/df2gspread/_githooks/commit-msg ~/repos/df2gspread/.git/hooks/commit-msg
ln -s ~/repos/df2gspread/_githooks/pre-commit ~/repos/df2gspread/.git/hooks/pre-commit

df2gspread's Issues

pycrypto is dead, raises problems on Windows.

Hi, sorry to create a second issue, but I have hit a dead end. I am testing an app I created with this amazing library on Windows. Unfortunately, one of the core dependencies listed in setup.py (pycrypto) is no longer being updated, and building it on Windows with the Visual Studio Community edition compiler raises all sorts of errors. Would it be possible to switch to PyCryptodome instead?

Getting authentication issues after Auth Flow Browser Popup

If your browser is on a different machine then exit and re-run this
application with the command-line parameter

  --noauth_local_webserver

---------------------------------------------------------------------------
IOError                                   Traceback (most recent call last)
<ipython-input-265-3278267de75b> in <module>()
----> 1 d2g.upload(df, 'OTAT Donor Database', 'New2')

/Users/xxxx/anaconda/lib/python2.7/site-packages/df2gspread/df2gspread.pyc in upload(df, gfile, wks_name, chunk_size, col_names, row_names, clean, credentials, start_cell, df_size, new_sheet_dimensions)
     74     '''
     75     # access credentials
---> 76     credentials = get_credentials(credentials)
     77     # auth for gspread
     78     gc = gspread.authorize(credentials)

/Users/xxxx/anaconda/lib/python2.7/site-packages/df2gspread/utils.pyc in get_credentials(credentials, client_secret_file, refresh_token)
     78         flow.redirect_uri = client.OOB_CALLBACK_URN
     79         if flags:
---> 80             credentials = tools.run_flow(flow, store, flags)
     81         else:  # Needed only for compatability with Python 2.6
     82             credentials = tools.run(flow, store)

/Users/xxxx/anaconda/lib/python2.7/site-packages/oauth2client/util.pyc in positional_wrapper(*args, **kwargs)
    133                 elif positional_parameters_enforcement == POSITIONAL_WARNING:
    134                     logger.warning(message)
--> 135             return wrapped(*args, **kwargs)
    136         return positional_wrapper
    137

/Users/xxxxx/anaconda/lib/python2.7/site-packages/oauth2client/tools.pyc in run_flow(flow, storage, flags, http)
    237         sys.exit('Authentication has failed: %s' % e)
    238
--> 239     storage.put(credential)
    240     credential.set_store(storage)
    241     print('Authentication successful.')

/Users/xxxxx/anaconda/lib/python2.7/site-packages/oauth2client/client.pyc in put(self, credentials)
    432         self.acquire_lock()
    433         try:
--> 434             self.locked_put(credentials)
    435         finally:
    436             self.release_lock()

/Users/xxxxx/anaconda/lib/python2.7/site-packages/oauth2client/file.pyc in locked_put(self, credentials)
     93             CredentialsFileSymbolicLinkError if the file is a symbolic link.
     94         """
---> 95         self._create_file_if_needed()
     96         self._validate_file()
     97         f = open(self._filename, 'w')

/Users/xxxxxx/anaconda/lib/python2.7/site-packages/oauth2client/file.pyc in _create_file_if_needed(self)
     80             old_umask = os.umask(0o177)
     81             try:
---> 82                 open(self._filename, 'a+b').close()
     83             finally:
     84                 os.umask(old_umask)

IOError: [Errno 2] No such file or directory: '/Users/xxxxx/.oauth/drive.json'

I get the above error after completing the browser authentication step (so I get through the auth flow, it uses gdrive_private, etc.).
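
The traceback suggests that the token file could not be created because its parent directory did not exist. A minimal workaround sketch, assuming the default token path shown in the error message; ensure_token_dir is a hypothetical helper, not something the library provides.

```python
import os


def ensure_token_dir(token_path="~/.oauth/drive.json"):
    """Create the parent directory of the token file if it is missing."""
    token_path = os.path.expanduser(token_path)
    parent = os.path.dirname(token_path)
    # 0o700 keeps the directory private to the current user.
    os.makedirs(parent, mode=0o700, exist_ok=True)
    return parent
```

Calling ensure_token_dir() once before d2g.upload should let oauth2client write the token file on the first run.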

Release to conda-forge

Would you be interested in releasing your package on conda-forge? It is widely used in the Python scientific community.

I can help with that if needed.

Do not ask permission to rewrite worksheet in new created spreadsheet.

Clarification on authentication process

My goal is to upload a pandas dataframe to google sheets. I've been struggling with authentication.

My code is:


spreadsheet = '/Users/aschharwood/Google_Drive/gspread/acled_5.gsheet'
wks_name = "Sheet1"

d2g.upload(df, spreadsheet, wks_name)

The first error I got was:

---------------------------------------------------------------------------
InvalidClientSecretsError                 Traceback (most recent call last)
<ipython-input-33-85103973c811> in <module>()
----> 1 d2g.upload(df, spreadsheet, wks_name)

/Users/aschharwood/anaconda2/lib/python2.7/site-packages/df2gspread/df2gspread.pyc in upload(df, gfile, wks_name, chunk_size, col_names, row_names, clean, credentials, start_cell, df_size, new_sheet_dimensions)
     74     '''
     75     # access credentials
---> 76     credentials = get_credentials(credentials)
     77     # auth for gspread
     78     gc = gspread.authorize(credentials)

/Users/aschharwood/anaconda2/lib/python2.7/site-packages/df2gspread/utils.pyc in get_credentials(credentials, client_secret_file, refresh_token)
     81 
     82         flow = client.flow_from_clientsecrets(
---> 83             client_secret_file, SCOPES)
     84         flow.redirect_uri = client.OOB_CALLBACK_URN
     85         if flags:

/Users/aschharwood/anaconda2/lib/python2.7/site-packages/oauth2client/util.pyc in positional_wrapper(*args, **kwargs)
    135                 elif positional_parameters_enforcement == POSITIONAL_WARNING:
    136                     logger.warning(message)
--> 137             return wrapped(*args, **kwargs)
    138         return positional_wrapper
    139 

/Users/aschharwood/anaconda2/lib/python2.7/site-packages/oauth2client/client.pyc in flow_from_clientsecrets(filename, scope, redirect_uri, message, cache, login_hint, device_uri)
   2103     try:
   2104         client_type, client_info = clientsecrets.loadfile(filename,
-> 2105                                                           cache=cache)
   2106         if client_type in (clientsecrets.TYPE_WEB,
   2107                            clientsecrets.TYPE_INSTALLED):

/Users/aschharwood/anaconda2/lib/python2.7/site-packages/oauth2client/clientsecrets.pyc in loadfile(filename, cache)
    164 
    165     if not cache:
--> 166         return _loadfile(filename)
    167 
    168     obj = cache.get(filename, namespace=_SECRET_NAMESPACE)

/Users/aschharwood/anaconda2/lib/python2.7/site-packages/oauth2client/clientsecrets.pyc in _loadfile(filename)
    124     except IOError as exc:
    125         raise InvalidClientSecretsError('Error opening file', exc.filename,
--> 126                                         exc.strerror, exc.errno)
    127     return _validate_clientsecrets(obj)
    128 

InvalidClientSecretsError: ('Error opening file', '/Users/aschharwood/.gdrive_private', 'No such file or directory', 2)

When I created the directory, I got an error saying that .gdrive_private IS a directory. I've also tried renaming the file and putting it in my working directory.

Can you please clarify exactly where I should store the client_secrets file for authorization?

Thanks!

type question

When uploading to a Google sheet, all data types become strings.
How can I preserve the types?

Make worksheet argument optional if only one worksheet exists

exit silently if no raw_data

df2gspread/df2gspread/gspread2df.py

Lines 84 to 85 in 01bc46d

if not raw_data:
    sys.exit()

Exiting silently when raw_data is empty seems like unreasonable behaviour.

This should either return an empty dataframe, or raise pandas.errors.EmptyDataError.

Context: I use this library with Luigi as part of an ETL workflow. When raw_data is empty, the whole thing exits without giving any signal as to why.

Happy to draw up a PR if you'd prefer.
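
The two alternatives proposed above could look roughly like this. It is a sketch under assumptions: frame_from_raw and its strict flag are hypothetical names, and raw_data is assumed to be a list of rows.

```python
import pandas as pd


def frame_from_raw(raw_data, strict=False):
    """Build a DataFrame from a list of rows, handling the empty case explicitly."""
    if not raw_data:
        if strict:
            # Signal the empty worksheet to the caller instead of exiting.
            raise pd.errors.EmptyDataError("worksheet returned no data")
        return pd.DataFrame()
    return pd.DataFrame(raw_data)
```

Either branch keeps control with the calling workflow, so a scheduler such as Luigi can log or handle the empty case.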

Add flexibility in handling credentials

I was trying to use this on a server and was having issues with the credentials. It might be useful to support either dependency injection of the google authentication object as an option, or directly support the --noauth_local_webserver option, and/or the service account credentials type.

I believe, at the moment, that the only way to authenticate with the current setup is by having a web browser to open the page up on the machine that is accessing the data.

Updating data is slow

It looks like each call to df2gspread.upload can take up to a few minutes. Is there a reason it might be inefficient, such as using an older version of the API? For a longer script that updates a few thousand cells, it can take up to 10 minutes.

unclosed socket warning while using g2d.download

ResourceWarning: unclosed <ssl.SSLSocket fd=xxxx family=AddressFamily.AF_INET6, ...

Hi everyone,

g2d and d2g work great, but this warning appears when using g2d.download in a unittest.

Any idea how to remove or fix it?
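
One common way to silence the message in tests is to filter ResourceWarning inside setUp. This is only a sketch: it hides the warning rather than closing the socket, and DownloadTest is a placeholder test case.

```python
import unittest
import warnings


class DownloadTest(unittest.TestCase):
    def setUp(self):
        # The unittest runner installs its own warning filters when a run
        # starts, so the filter is set here rather than at import time.
        warnings.simplefilter("ignore", ResourceWarning)

    def test_placeholder(self):
        # A real test would call g2d.download(...) here.
        self.assertTrue(True)
```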

The published version on PyPI is outdated. Is this library abandoned or moved somewhere else?

Uploading empty DataFrame setting clean=True & col_names=True writes two rows with headers

    d2g.upload(df,
               spreadsheet_key,
               wks_name,
               credentials=credentials,
               col_names=True,
               row_names=False,
               clean=True)

As per the title: when uploading an empty DataFrame with clean=True, the sheet is correctly emptied, but two rows of data corresponding to the DataFrame headers are still written.

clean argument of upload function not compatible with providing own credentials

Hi. I think I've worked out a problem I was getting.

If you need to provide your own credentials for uploading a data frame to a gsheet, and you also set the clean parameter on the upload function to True, then you'll get an error, e.g.

oauth2client.clientsecrets.InvalidClientSecretsError: ('Error opening file', '/home/user/.gdrive_private', 'No such file or directory', 2)

... and I think this is because the clean_worksheet function calls upload without providing the credentials argument.

But, you know, I'm not sure if the clean argument is needed anyway, is it? It seems as though an upload replaces the whole gsheet tab anyway (and I think I've seen this mentioned somewhere in the documentation).

Thanks

Kev

"Installed application" option no more available from Google Developer Console?

Thank you for sharing this library. As promised, I'm testing the basics ASAP :)

dependencies

I installed these dependencies on top of my Anaconda Python 2.7 distribution (you may want to add this to your readme; are there more dependencies?):

pip install gspread
pip install --upgrade google-api-python-client

The latter one installed:

Successfully installed 
google-api-python-client-1.4.2 
oauth2client-1.5.1 
simplejson-3.8.0 
uritemplate-0.6

First question: are these the same versions you are working with?

oauth2 id creation problems

Tried to follow your detailed steps how to create a ClientId for "Installed application".
All your steps fit the Google documentation I found here:

But I'm not able to find the "Installed application" option in this step:

Is it still available for you?
Or was there a change, which is not reflected yet in Google docs?

`oauth2client.file` module problems

In parallel I've tested the basic steps in an interactive IPython session, and stumbled over this problem:

In [1]: import oauth2client, os
In [2]: CLIENT_SECRET_FILE = os.path.expanduser('~/.gdrive_private')
In [3]: DEFAULT_TOKEN = os.path.expanduser('~/.oauth/drive.json')
In [4]: store = oauth2client.file.Storage(DEFAULT_TOKEN)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-28c0bcf4ba8d> in <module>()
----> 1 store = oauth2client.file.Storage(DEFAULT_TOKEN)

AttributeError: 'module' object has no attribute 'file'

This irritates me, because "file.py" is in the lib and can be imported like this:

In [5]: from oauth2client import file
In [6]: oauth2client.file
Out[6]: <module 'oauth2client.file' from 'C:\Anaconda230-64bit\lib\site-packages\oauth2client\file.pyc'>

next steps

It would be great if we could figure out these basics together.
I'm really interested in the Google Drive / Pandas integration, but I struggled with OAuth2 four months ago, and now it is blocking me again.

requests.exceptions.ChunkedEncodingError

With really big data sheets (25k rows) I consistently get the error:

requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(6075 bytes read, 71 more expected)', IncompleteRead(6075 bytes read, 71 more expected))

Any thoughts on this?

Can't upload df with non-string column object types

Set up dataframe like so:

import pandas as pd
from datetime import datetime
df = pd.DataFrame([[datetime.today().date(), 
                   datetime.today(),
                   "2018-05-07",
                   20180507]], 
                  columns=["date_object","timestamp","string","int"])

Check object dtypes:

print(df.dtypes)

date_object            object
timestamp      datetime64[ns]
string                 object
int                     int64
dtype: object

Uploading the date_object, timestamp, and int64 columns all yield the below error. Probably should convert everything to string before upload I guess, but maybe there is a better solution.

d2g.upload(df[['date_object']],sh,"temp")


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-34-1982f2c8ec17> in <module>()
----> 1 d2g.upload(df,xxxxx,"temp")

/usr/local/lib/python3.6/dist-packages/df2gspread/df2gspread.py in upload(df, gfile, wks_name, chunk_size, col_names, row_names, clean, credentials, start_cell, df_size, new_sheet_dimensions)
    146                 cell_list[i + j * len(df.columns.values)].value = df[col][idx]
    147 
--> 148     wks.update_cells(cell_list)
    149     return wks
    150 

/usr/local/lib/python3.6/dist-packages/gspread/models.py in update_cells(self, cell_list, value_input_option)
    607             },
    608             body={
--> 609                 'values': values_rect
    610             }
    611         )

/usr/local/lib/python3.6/dist-packages/gspread/models.py in values_update(self, range, params, body)
    113     def values_update(self, range, params=None, body=None):
    114         url = SPREADSHEET_VALUES_URL % (self.id, quote(range, safe=''))
--> 115         r = self.client.request('put', url, params=params, json=body)
    116         return r.json()
    117 

/usr/local/lib/python3.6/dist-packages/gspread/client.py in request(self, method, endpoint, params, data, json, files, headers)
     71             data=data,
     72             files=files,
---> 73             headers=headers
     74         )
     75 

/usr/local/lib/python3.6/dist-packages/requests/sessions.py in put(self, url, data, **kwargs)
    564         """
    565 
--> 566         return self.request('PUT', url, data=data, **kwargs)
    567 
    568     def patch(self, url, data=None, **kwargs):

/usr/local/lib/python3.6/dist-packages/requests/sessions.py in request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    492             hooks=hooks,
    493         )
--> 494         prep = self.prepare_request(req)
    495 
    496         proxies = proxies or {}

/usr/local/lib/python3.6/dist-packages/requests/sessions.py in prepare_request(self, request)
    435             auth=merge_setting(auth, self.auth),
    436             cookies=merged_cookies,
--> 437             hooks=merge_hooks(request.hooks, self.hooks),
    438         )
    439         return p

/usr/local/lib/python3.6/dist-packages/requests/models.py in prepare(self, method, url, headers, files, data, params, auth, cookies, hooks, json)
    306         self.prepare_headers(headers)
    307         self.prepare_cookies(cookies)
--> 308         self.prepare_body(data, files, json)
    309         self.prepare_auth(auth, url)
    310 

/usr/local/lib/python3.6/dist-packages/requests/models.py in prepare_body(self, data, files, json)
    456             # provides this natively, but Python 3 gives a Unicode string.
    457             content_type = 'application/json'
--> 458             body = complexjson.dumps(json)
    459             if not isinstance(body, bytes):
    460                 body = body.encode('utf-8')

/usr/lib/python3.6/json/__init__.py in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    229         cls is None and indent is None and separators is None and
    230         default is None and not sort_keys and not kw):
--> 231         return _default_encoder.encode(obj)
    232     if cls is None:
    233         cls = JSONEncoder

/usr/lib/python3.6/json/encoder.py in encode(self, o)
    197         # exceptions aren't as detailed.  The list call should be roughly
    198         # equivalent to the PySequence_Fast that ''.join() would do.
--> 199         chunks = self.iterencode(o, _one_shot=True)
    200         if not isinstance(chunks, (list, tuple)):
    201             chunks = list(chunks)

/usr/lib/python3.6/json/encoder.py in iterencode(self, o, _one_shot)
    255                 self.key_separator, self.item_separator, self.sort_keys,
    256                 self.skipkeys, _one_shot)
--> 257         return _iterencode(o, 0)
    258 
    259 def _make_iterencode(markers, _default, _encoder, _indent, _floatstr,

/usr/lib/python3.6/json/encoder.py in default(self, o)
    178         """
    179         raise TypeError("Object of type '%s' is not JSON serializable" %
--> 180                         o.__class__.__name__)
    181 
    182     def encode(self, o):

TypeError: Object of type 'Timestamp' is not JSON serializable

Uploading the string (d2g.upload(df[['string']],sh,"temp")) works fine.
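
The string conversion suggested above can be sketched as follows. stringify_for_upload is a hypothetical helper, the sample frame mirrors the one in this issue with fixed dates, and unique column labels are assumed.

```python
import json
from datetime import datetime

import pandas as pd


def stringify_for_upload(df):
    """Return a copy whose cells are all plain strings (missing values become '')."""
    out = df.copy()
    for col in out.columns:  # assumes unique column labels
        out[col] = out[col].map(lambda v: "" if pd.isnull(v) else str(v))
    return out


sample = pd.DataFrame(
    [[datetime(2018, 5, 7).date(), datetime(2018, 5, 7, 12, 0), "2018-05-07", 20180507]],
    columns=["date_object", "timestamp", "string", "int"],
)

# After conversion the values serialize to JSON without a TypeError.
payload = json.dumps(stringify_for_upload(sample).values.tolist())
```

The trade-off is that every cell reaches the sheet as text, so numeric or date formatting has to be applied on the sheet side.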

AttributeError: 'Worksheet' object has no attribute 'get_int_addr'

Hi, I have been using this package happily, but when I tested my application on a new computer I ran into the error above.

To test I created a new environment called Transparency_Italy_TRAC using miniconda and installed df2gspread using conda install -c conda-forge df2gspread. When running python app.py I get the following full error:

Traceback (most recent call last):
  File "app.py", line 138, in <module>
    d2g.upload(value, SPREADSHEET_ID, key, credentials=credentials, row_names=True)
  File "/home/gabriele/anaconda3/envs/Transparency_Italy_TRAC/lib/python3.6/site-packages/df2gspread/df2gspread.py", line 101, in upload
    start_row_int, start_col_int = wks.get_int_addr(start_cell)
AttributeError: 'Worksheet' object has no attribute 'get_int_addr'

So I went to check "/home/gabriele/anaconda3/envs/Transparency_Italy_TRAC/lib/python3.6/site-packages/df2gspread/df2gspread.py".

The thing is that running app.py in the Anaconda3 base environment, where I had installed df2gspread earlier, caused no problem. So I compared the source code of the package in both environments and found that in the environment I created to test the project the files were last updated in 2016, while in the Anaconda3 base environment some files of the library were updated in 2018.

The conda install -c conda-forge df2gspread seems to be installing an older version - any idea how to solve this?

New Sheet in first position

Hi,

df2gspread is awesome!

By default, a newly created sheet's position is at the end.

It would be great to have a parameter for adding a new sheet in the first position.

Cache mechanism

This lib is super useful ! Thanks for that.

What do you think of a cache mechanism that will store the df as an .h5 file in a standard path such as $HOME/.df2gspread or similar ?

Google spreadsheets do not allow having duplicated column names

Needs solution for duplicated columns in Pandas DataFrame.
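
One possible approach is to rename repeated labels before uploading. The sketch below is an illustration only; dedupe_labels is a hypothetical helper and the suffix scheme (_1, _2, ...) is an arbitrary choice.

```python
def dedupe_labels(labels):
    """Append _1, _2, ... to repeated labels so that every label is unique."""
    used = set()
    result = []
    for label in labels:
        candidate, n = str(label), 0
        # Keep trying suffixes until the name has not been used yet.
        while candidate in used:
            n += 1
            candidate = f"{label}_{n}"
        used.add(candidate)
        result.append(candidate)
    return result
```

With a DataFrame this would be applied as df.columns = dedupe_labels(df.columns) before the upload call.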

Having issues writing a "Note" column with various encodings

The specific sentence the upload stops working on is here:

u'Love what you do to help the most kindest souls on the planet!!\x1a\x1a Thank you so so much!!!!'

What happens is basically that the grouped object (this is row 1100 of 7000) shows it as writing, and the cells object ( wks.update_cells(list(cells)) ) looks totally fine, but it just doesn't get written correctly to the Google doc. The ~20 or so rows in the group end up blank, while the previously written index is still there.

I've tried using:
d['Note'] = d['Note'].str.decode('utf8')

Along with encoding to utf8. Encoding to ascii/others doesn't work on the series as I get errors.

Is there a specific encoding I should be using, or some kind of safe method to get this to write?
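
The sample sentence contains \x1a, an ASCII control character. Whether that character is what breaks the write is an assumption, but stripping control characters before uploading is a cheap thing to try. strip_control_chars is a hypothetical helper.

```python
import re

# C0 control characters except tab, newline and carriage return, plus DEL.
_CONTROL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def strip_control_chars(text):
    """Remove control characters that spreadsheets and XML/JSON layers may reject."""
    return _CONTROL.sub("", text)
```

On a pandas column this could be applied with d['Note'] = d['Note'].map(strip_control_chars), assuming every value is already a text string.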

Polluting global logging space.

Hello!

Importing df2gspread always changes the global logging level to ERROR. I'm pretty sure it's due to https://github.com/maybelinot/df2gspread/blob/master/df2gspread/utils.py#L19

Maybe it would be better to set the basicConfig with the logr object instead of the global namespace.
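
A sketch of that suggestion: configure a named logger and leave the root logger alone. The logger name df2gspread_demo is a placeholder, not the name the library uses.

```python
import logging

# Remember the root level so the effect (or lack of it) can be checked.
root_level_before = logging.getLogger().level

# Configure only the library's own logger instead of calling basicConfig.
logr = logging.getLogger("df2gspread_demo")
logr.setLevel(logging.ERROR)
# NullHandler avoids "no handler" noise without touching global config.
logr.addHandler(logging.NullHandler())
```

With this pattern, importing the module no longer changes the logging level seen by the application's other loggers.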

Authorization via email and password

Secrets Error from oauth2client deprecation

Google recently deprecated oauth2client: https://google-auth.readthedocs.io/en/latest/oauth2client-deprecation.html
As a result, d2g.upload and g2d.download fail with the error message below.

The error can probably be reproduced by running the code below in Google Colab.

(Please replace 'XXXXXX' in gfile.)

from google.colab import auth
auth.authenticate_user()

from google.auth import default
credentials, _ = default()

from df2gspread import df2gspread as d2g
from df2gspread import gspread2df as g2d
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(16).reshape(4, 4))

d2g.upload(df=df, gfile='XXXXXXXXXXXXXXXXXXXXXXXXXX', wks_name='test', credentials=credentials, row_names=False)

↓ Error message

Invalid credentials supplied. Will generate from default token.
/usr/local/lib/python3.7/dist-packages/oauth2client/_helpers.py:255: UserWarning: Cannot access /root/.oauth/drive.json: No such file or directory
  warnings.warn(_MISSING_FILE_MESSAGE.format(filename))
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
[/usr/local/lib/python3.7/dist-packages/oauth2client/clientsecrets.py](https://localhost:8080/#) in _loadfile(filename)
    120     try:
--> 121         with open(filename, 'r') as fp:
    122             obj = json.load(fp)

FileNotFoundError: [Errno 2] No such file or directory: '/root/.gdrive_private'

During handling of the above exception, another exception occurred:

InvalidClientSecretsError                 Traceback (most recent call last)
6 frames
[/usr/local/lib/python3.7/dist-packages/oauth2client/clientsecrets.py](https://localhost:8080/#) in _loadfile(filename)
    123     except IOError as exc:
    124         raise InvalidClientSecretsError('Error opening file', exc.filename,
--> 125                                         exc.strerror, exc.errno)
    126     return _validate_clientsecrets(obj)
    127 

InvalidClientSecretsError: ('Error opening file', '/root/.gdrive_private', 'No such file or directory', 2)

oauth2client.clientsecrets.InvalidClientSecretsError: ('Error opening file',.......) in df2gspread

Hi, I am trying to upload my sample dataframe to my Google sheet using df2gspread (I also tried gspread-pandas), but in both cases the problem I am facing is storing the Google credentials file in the right place. For example:

from df2gspread import df2gspread as d2g
import pandas as pd

d = [pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
     pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])]
df = pd.DataFrame(d)
print(df)

spreadsheet = '1DxR8u57A8ASMJ4re_4coheYz0Q2xgMlb-4_5lDjy3w8'
wks_name = 'Sheet2'
d2g.upload(df, spreadsheet, wks_name)

This is my actual code, and the following is the error I keep getting:
oauth2client.clientsecrets.InvalidClientSecretsError: ('Error opening file', 'C:\Users\RAJ\.gdrive_private', 'No such file or directory', 2)

The credentials file is stored in "C:\Users\RAJ" and its filename is .gdrive_private, but I still get that error. Please help.

Gsheets API v4 429 Error

I'm attempting to load a dataframe to a gsheet with code that has previously worked. The Google API is responding with an error code 429. Here is the full error:

APIError: {
  "error": {
    "code": 429,
    "message": "Insufficient tokens for quota 'ReadGroup' and limit 'USER-100s' of service 'sheets.googleapis.com' for consumer 'project_number:REMOVED.",
    "status": "RESOURCE_EXHAUSTED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.Help",
        "links": [
          {
            "description": "Google developer console API key",
            "url": "https://console.developers.google.com/project/REMOVED/apiui/credential"
          }

Researching the error here (https://developers.google.com/analytics/devguides/reporting/core/v4/errors), the suggested fix appears to be: "Retry using exponential back-off. You need to slow down the rate at which you are sending the requests."

It's not apparent from the df2gspread documentation whether or how this can be accomplished.

Here is the command being sent (tmp is a pandas dataframe):

d2g.upload(tmp, spreadsheet, accel, credentials=credentials,
                   row_names=False, df_size=True, new_sheet_dimensions=(tmp.shape))
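
Exponential back-off can be added around the call from the outside. The sketch below is generic; call_with_backoff and its parameters are hypothetical names, and it retries on any exception unless retry_on is narrowed.

```python
import random
import time


def call_with_backoff(func, retries=5, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func(), retrying with exponentially growing delays plus jitter."""
    for attempt in range(retries):
        try:
            return func()
        except retry_on:
            if attempt == retries - 1:
                # Out of attempts: let the last error propagate.
                raise
            # Delay doubles each attempt: 1s, 2s, 4s, ... plus up to 1s of jitter.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 1))
```

The upload above could then be wrapped as call_with_backoff(lambda: d2g.upload(tmp, spreadsheet, accel, credentials=credentials)), ideally with retry_on set to the API error class your gspread version raises.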

Permission denied

Hi!

I'm facing a problem. There is an error that I cannot fix:
InvalidClientSecretsError: ('Error opening file', 'C:\Users\***/.gdrive_private', 'Permission denied', 13)

I granted access to everyone for this folder, but it does not help.
Any ideas how to fix it?

gspread authentication on df2gspread

Hi,

I'm trying to upload to drive using an already authenticated session with gspread:

scope = ['https://spreadsheets.google.com/feeds']
creds = ServiceAccountCredentials.from_json_keyfile_name('client_secret.json', scope)
client = gspread.authorize(creds)

I can add rows just fine, which is awesome BTW.
sheet = client.open("file_name")
worksheet = sheet.worksheet(sheet_name)
row = something. ..
worksheet.append_row(row)

Now I'm trying to use pandas, with something like this:

df = pd.read_csv(file_name)
d2g.upload(df, sheet, worksheet)

But I get file-does-not-exist errors. Is it possible to reuse the authentication that is already working?

I tried copying my client_secret.json to ~/.gdrive_private and to ~/.oauth/drive.json, as I'm not sure which is the correct one and it seems to be asking for both. I only have the one file, client_secret.json, and I got this error:

File "/usr/local/lib/python2.7/site-packages/oauth2client/client.py", line 302, in new_from_json
module_name = data['_module']
KeyError: '_module'

Thanks!

Authentication on headless server

I'm trying to run some code using df2gspread on a headless server (using Xvfb), and running into a challenge for the authentication, since I cannot open a browser to authorize my application.

The code suggests:

If your browser is on a different machine then exit and re-run this
application with the command-line parameter

  --noauth_local_webserver

but I'm not sure how I would specify that parameter from my df2gspread code. Any suggestions/alternatives?

apiclient should be changed to googleapiclient

While installing and running this package for the first time, I encountered an error:

AttributeError: module 'googleapiclient' has no attribute '__version__' df2gspread

I fixed it by changing the following import in gfiles.py from:

from apiclient import discovery, errors

to:

from googleapiclient import discovery, errors

Key error when uploading dataframe with two columns with same label

If df[col] is not a series (because df has more than one column named col), then pd.isnull(df[col][idx]) throws a KeyError. This occurs on line 144 of df2gspread.py.

for j, idx in enumerate(df.index):
    for i, col in enumerate(df.columns.values):
        if not pd.isnull(df[col][idx]):
            cell_list[i + j * len(df.columns.values)].value = df[col][idx]
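
Addressing cells by position instead of by label avoids the ambiguity. The following is a sketch of that idea rather than the library's code; iter_cells is a hypothetical helper that yields the same flat index used for cell_list above.

```python
import pandas as pd


def iter_cells(df):
    """Yield (flat_index, value) for every non-null cell, addressed by position."""
    n_cols = df.shape[1]
    for j in range(df.shape[0]):
        for i in range(n_cols):
            # .iat uses integer positions, so duplicate labels are harmless.
            value = df.iat[j, i]
            if not pd.isnull(value):
                yield i + j * n_cols, value
```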

Version 1.0.5 is not available on PyPi

Hi, I noticed that the latest version of this package is not available on PyPI, is this project still active?

https://pypi.org/project/df2gspread/

Can't install package on anaconda

Hi,

I tried to install df2gspread on Anaconda, but unfortunately it doesn't work.

I ran the following command in conda:
conda install -c conda-forge df2gspread

This gave me the following error:

UnsatisfiableError: The following specifications were found
to be incompatible with the existing python installation in your environment:

Specifications:

  - df2gspread -> python[version='2.7.*|3.5.*|3.6.*']

Your python: python=3.7

If python is on the left-most side of the chain, that's the version you've asked for.
When python appears to the right, that indicates that the thing on the left is somehow
not available for the python version you are constrained to. Note that conda will not
change your python version to a different minor version unless you explicitly specify
that.

How can I solve this? (Maybe a beginner's question? I've only just started programming.)

Looking forward to your response.

Jeroen

Update df2gspread to use gspread >= 2.0 in order to speed up insertions

@maybelinot Looks like gspread was updated last month to use the new Google Sheets API v4, which is lightning fast compared to the API used by the gspread version we currently depend on: https://github.com/burnash/gspread/releases/tag/v2.0.0. There are likely breaking changes that we'll have to resolve, but it should solve or mitigate issues like #30 and #26, which I can confirm I also hit when I try to upload large dataframes to sheets.

Pass GoogleService Object as Credentials

Hey

Sometimes I use a service account to connect to Google services, which has different credentials; at other times I use regular credentials. So I wrote a function that returns a service object, which I can use to read from a Google Sheet, like this:

import apiclient
from google.oauth2 import service_account as google_service_account
...
credentials = google_service_account.Credentials.from_service_account_file(
                            credentialPath, scopes = scopes )
service     = apiclient.discovery.build(apiName, apiVersion, credentials = credentials)

Could you please change the df2gspread.upload function so that I can pass it a service object?

Thanks in advance

Allow us to set how uploaded data should be interpreted.

From what I understand, when we use upload() from df2gspread, dataframe values are converted to strings by default before the upload is performed.

I struggled with this when I had to upload prices. The same applies whenever I want values to remain numbers after the update, without having to fix them by hand in Google Docs.

Simple fix:
In upload(), add the argument value_input_opt='RAW'

df2gspread/df2gspread/df2gspread.py

Lines 24 to 26 in 01bc46d

 def upload(df, gfile="/New Spreadsheet", wks_name=None,
            col_names=True, row_names=True, clean=True, credentials=None,
            start_cell = 'A1', df_size = False, new_sheet_dimensions = (1000,100)):

Then pass value_input_option=value_input_opt in these three function calls:

df2gspread/df2gspread/df2gspread.py

Line 126 in 01bc46d

wks.update_cells(cell_list)

df2gspread/df2gspread/df2gspread.py

Line 134 in 01bc46d

wks.update_cells(cell_list)

df2gspread/df2gspread/df2gspread.py

Line 147 in 01bc46d

wks.update_cells(cell_list)

From: developers.google.com
What do we get? In upload() we could now set value_input_opt='USER_ENTERED', so that uploaded digits are automatically parsed as numbers.
If value_input_opt is not specified in upload(), it stays at the default 'RAW', so data is uploaded as it is now.

Not sure whether this is related: #41

This is my first time opening an issue on GitHub. I will gladly accept advice on anything I should improve.
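
The proposal above can be sketched as a thin wrapper that forwards the option to update_cells. This assumes a gspread version whose Worksheet.update_cells accepts a value_input_option keyword; the wrapper and the recording stand-in class are hypothetical and exist only to show the forwarding without touching a real spreadsheet.

```python
VALID_INPUT_OPTIONS = ("RAW", "USER_ENTERED")


def update_cells_with_option(wks, cell_list, value_input_opt="RAW"):
    """Forward the chosen input option to the worksheet's update_cells."""
    if value_input_opt not in VALID_INPUT_OPTIONS:
        raise ValueError("value_input_opt must be one of %r" % (VALID_INPUT_OPTIONS,))
    return wks.update_cells(cell_list, value_input_option=value_input_opt)


class RecordingWorksheet:
    """Stand-in for a gspread worksheet that only records the calls it gets."""

    def __init__(self):
        self.calls = []

    def update_cells(self, cell_list, value_input_option="RAW"):
        self.calls.append((list(cell_list), value_input_option))


# Demonstration with the stand-in: the option reaches update_cells unchanged.
demo_wks = RecordingWorksheet()
update_cells_with_option(demo_wks, ["1.50"], value_input_opt="USER_ENTERED")
```

Keeping 'RAW' as the default preserves the current behaviour for callers that do not pass the new argument.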

oauth2client is deprecated

df2gspread/df2gspread/df2gspread.py

Line 55 in f14da35

:type credentials: class 'oauth2client.client.OAuth2Credentials'

Google has deprecated oauth2client.

According to the gspread docs we should use google-auth instead. Shouldn't the df2gspread function allow passing in a gspread.client.Client object instead of handling credentials itself? That way users can manage their credentials and authorization with gspread, and this package doesn't have to worry about the deprecation.

File ~/miniconda3/envs/py39/lib/python3.9/site-packages/df2gspread/df2gspread.py:125, in upload(df, gfile, wks_name, col_names, row_names, clean, credentials, start_cell, df_size, new_sheet_dimensions)
    123     cell_list = wks.range('%s%s:%s%s' % (first_col, start_row, last_col, start_row))
    124     for idx, cell in enumerate(cell_list):
--> 125         cell.value = df.columns.astype(str)[idx]
    126     wks.update_cells(cell_list)
    128 # Addition of row names

File ~/miniconda3/envs/py39/lib/python3.9/site-packages/pandas/core/indexes/multi.py:3733, in MultiIndex.astype(self, dtype, copy)
   3731     raise NotImplementedError(msg)
   3732 elif not is_object_dtype(dtype):
-> 3733     raise TypeError(
   3734         "Setting a MultiIndex dtype to anything other than object "
   3735         "is not supported"
   3736     )
   3737 elif copy is True:
   3738     return self._view()

TypeError: Setting a MultiIndex dtype to anything other than object is not supported
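
The traceback above shows upload() calling df.columns.astype(str) on a MultiIndex, which pandas rejects. A possible workaround on the caller's side, sketched here with a hypothetical helper, is to flatten the column levels into plain strings before uploading. The separator is an arbitrary choice.

```python
import pandas as pd


def flatten_columns(df, sep="_"):
    """Return a copy of df whose MultiIndex columns are joined into strings.

    Frames with ordinary (single-level) columns are returned unchanged.
    """
    if not isinstance(df.columns, pd.MultiIndex):
        return df
    flat = df.copy()
    flat.columns = [sep.join(str(level) for level in col) for col in df.columns]
    return flat
```

A frame with columns ("a", "x") and ("a", "y") would come back with columns "a_x" and "a_y", which astype(str) can handle.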

	def upload(df, gfile="/New Spreadsheet", wks_name=None,
	col_names=True, row_names=True, clean=True, credentials=None,
	start_cell = 'A1', df_size = False, new_sheet_dimensions = (1000,100)):

`oauth2client.file` module problems