
replicate_gdrive's Introduction

Replicate Google Drive

Content

  1. Abstract
  2. Setup Python Environment
  3. Install the Google APIs client library for Python
  4. Specifying project in Google Cloud Console
  5. Authorize API requests (user authorization)
  6. Running the Code
  7. Future Scopes
  8. Bibliography

Abstract

When you try to download multiple files, or a folder with multiple sub-folders and files, Google Drive doesn't simply download the files one after another. It first attempts to zip all of them, and then downloads the zip file.

Here's the problem with this approach. When you have a LOT of files, or a big folder (with multiple sub-folders and files), the zipping takes a lot of time, and it often fails to zip altogether.

So I created a program that, given a folder, downloads all the files in that folder, and then recursively goes into each sub-folder and downloads all the files it finds there.
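The recursive idea described above can be sketched as follows. This is a minimal illustration, not the code in main.py: `replicate_folder`, `list_children`, and `download_file` are hypothetical names, and the two callables stand in for whatever Drive API calls the real program makes. The only Drive-specific fact used is that folders carry the MIME type `application/vnd.google-apps.folder`.

```python
import os

# Drive marks folders with this MIME type; everything else is treated as a file.
FOLDER_MIME = "application/vnd.google-apps.folder"


def replicate_folder(folder_id, dest_dir, list_children, download_file):
    """Download every file in folder_id into dest_dir, then recurse into sub-folders.

    list_children(folder_id) -> iterable of dicts with "id", "name", "mimeType"
    download_file(file_id, target_path) -> saves one file (hypothetical helper)
    """
    os.makedirs(dest_dir, exist_ok=True)
    subfolders = []
    for item in list_children(folder_id):
        target = os.path.join(dest_dir, item["name"])
        if item["mimeType"] == FOLDER_MIME:
            # Remember sub-folders and visit them after this folder's files.
            subfolders.append((item["id"], target))
        else:
            download_file(item["id"], target)
    for sub_id, sub_dir in subfolders:
        replicate_folder(sub_id, sub_dir, list_children, download_file)
```

Passing the listing and download steps in as functions keeps the recursion easy to try out with a fake in-memory folder tree before pointing it at a real Drive account.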

Setup Python Environment

Open the terminal on your Mac and type python. This shows the version of Python installed on your computer.

image

As you can see from the above image, I'm working with Python version 3.11.5.

I'm assuming that you already have a relatively recent version of Python installed on your machine, and that pip is installed as well.

Install the Google APIs client library for Python

If you just typed python in your terminal to check the version of your Python installation, exit the interpreter by typing exit() and pressing Enter. We'll now use pip to install the dependencies. Type the following commands in your terminal and run them one at a time. The second command checks that the libraries can be imported; if it prints nothing, the installation worked.

pip install -U pip google-api-python-client oauth2client
python3 -c "import googleapiclient, httplib2, oauth2client"
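If the import check fails, it only reports the first missing module. A small sketch like the one below lists every missing module at once. The helper name `missing_modules` is hypothetical; the three module names are the ones imported in the command above.

```python
import importlib.util

# The modules imported by the one-line check above.
REQUIRED = ("googleapiclient", "httplib2", "oauth2client")


def missing_modules(names=REQUIRED):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```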

Specifying project in Google Cloud Console

An application using Google APIs requires a project. Those are managed in the Google Cloud Developers Console or simply, "devconsole." In this project, we're only going to use the Google Drive API, so we have a magic link (below in Step 1) that:

  • Takes you to the devconsole
  • Walks you through creating a new project (or choosing an existing one), and
  • Automagically enables the Drive API

Let's do it!

  1. Navigate to console.developers.google.com/start/api?id=drive and log in to your Google account.
  2. If you don't have any projects yet, you'll see this screen to accept the Google APIs Terms of Service:

image

Once you accept the terms, a new project named "My Project" will be created, and the Drive API automatically enabled.

  3. If you've already created a project, you'll get this screen instead:

image

When you click the Create a project pulldown, choose an existing project or create a new one.

image

Once you've made your selection (new or existing project), the Drive API will be automatically enabled for you.

  4. You'll know the Drive API has been enabled when you see this confirmation:

image

  5. Click Go to credentials to move to the next step.

Authorize API requests (user authorization)

To get OAuth2 credentials for user authorization, go back to the API manager and select the "Credentials" tab on the left-nav:

image

When you get there, you'll see all your credentials in three separate sections:

image

The first is for API keys, the second for OAuth 2.0 client IDs, and the last for OAuth2 service accounts. We're using the one in the middle.

From the Credentials page, click the + Create Credentials button at the top, which opens a dialog where you choose "OAuth client ID":

image

On the next screen, you have 2 actions: configuring your app's authorization "consent screen" and choosing the application type:

image

If you have not set up a consent screen, you will see a warning in the console and need to do so now. (Skip these next steps if your consent screen has already been set up.)

Click on "Configure consent screen", then select an "External" app (or "Internal" if you're a Google Workspace [formerly "G Suite"] customer):

image

It doesn't matter which you pick because you're not publishing this app. Most people will select "External", which leads to a more complex screen, but you really only need to complete the "Application name" field at the top:

image

The only thing you need at this time is an application name, so pick one that reflects what you're building, then click Save.

Now go back to the Credentials tab to create an OAuth2 client ID. Here you'll see a variety of OAuth client IDs you can create:

image

We're developing a command-line tool, which falls under Other, so choose that and click the Create button. Choose a client ID name that reflects the app you're creating, or simply take the default name, which is usually "Other client N".

  1. A dialog with the new credentials appears; click OK to close it.

image

  2. Back on the Credentials page, scroll down to the "OAuth2 Client IDs" section, then find and click the download icon at the far right of your newly created client ID.

image

  3. This opens a dialog to save a file named client_secret-LONG-HASH-STRING.apps.googleusercontent.com.json, likely to your Downloads folder. We recommend renaming it to something shorter like credentials.json (which is what this app uses), then saving it in the same directory as the main.py app.
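Before running the program, it can help to confirm that the renamed file is really an OAuth client file. The sketch below is a hypothetical helper (`load_client_config` is not part of main.py). It assumes the usual layout of a downloaded client secrets file: a single top-level "installed" or "web" section containing at least `client_id` and `client_secret`.

```python
import json


def load_client_config(path="credentials.json"):
    """Return (app_type, config) from a downloaded OAuth client secrets file.

    Raises ValueError if the file does not look like a client secrets file.
    """
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)

    # Assumed layout: one top-level section named "installed" or "web".
    for app_type in ("installed", "web"):
        if app_type in data:
            config = data[app_type]
            break
    else:
        raise ValueError("no 'installed' or 'web' section found in " + path)

    missing = [key for key in ("client_id", "client_secret") if key not in config]
    if missing:
        raise ValueError("missing keys in " + path + ": " + ", ".join(missing))
    return app_type, config
```

Calling `load_client_config()` from the folder that holds main.py should return without raising if the file was downloaded and renamed as described.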

Running the Code

Open the main.py file and scroll to the last line of code. You'll see the following function call:

download_all_files_in_this_folder("12e1Ll_AOK_RyfgjVy8DiS7ckHrvOxrRa", "/Volumes/WD/HDD backup/")

As you've probably noticed, it takes 2 arguments:

  1. The first argument is the folder ID of the Google Drive folder you want to replicate. To find it, open that folder in your browser and look at the URL. For example, to replicate the folder "HDD backup" from my Google Drive to my system, I would open "HDD backup" in my browser; the folder ID is the highlighted part of the link:
image
  2. The second argument is the path where you want to save the contents of the folder (in my case, I am saving it to my external HDD).
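If you'd rather not copy the ID by hand, it can be pulled out of the URL. This is a small sketch with a hypothetical helper, `folder_id_from_url`. It assumes the folder URL contains a `/folders/<id>` segment, which is what Drive folder links in the browser typically look like.

```python
import re

# Assumption: the ID is the path segment that follows "/folders/".
_FOLDER_ID = re.compile(r"/folders/([A-Za-z0-9_-]+)")


def folder_id_from_url(url):
    """Extract the folder ID from a Google Drive folder URL."""
    match = _FOLDER_ID.search(url)
    if match is None:
        raise ValueError("no '/folders/<id>' segment found in: " + url)
    return match.group(1)
```

For example, `folder_id_from_url("https://drive.google.com/drive/folders/12e1Ll_AOK_RyfgjVy8DiS7ckHrvOxrRa")` gives back the ID used in the call above, which you can then pass as the first argument.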

IMPORTANT NOTE

The first time you execute the script, it won't have the authorization to access the user's files on Drive (yours). The command-line script is paused as a browser window opens and presents you with the OAuth2 permissions dialog:

image

This is where the application asks the user for the permissions the code is requesting (via the SCOPES variable). In this case, it's the ability to view the file metadata from the user's Google Drive. In your code these permission scopes appear as URIs, but the OAuth2 dialog translates them into the language specified by your locale. The user must explicitly authorize the requested permission(s); otherwise the "run flow" part of the code throws an exception and the script does not proceed.
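For orientation, the authorization step typically looks like the sketch below when written with the oauth2client library installed earlier. This is not copied from main.py: the function name `get_drive_service`, the token file name storage.json, and the scope value are all assumptions. In particular, the scope shown is the read-only Drive scope, chosen here because downloading file contents needs more than metadata access; check the SCOPES variable in main.py for the value the program actually uses.

```python
# Assumed scope for illustration; the real value lives in main.py's SCOPES variable.
SCOPES = "https://www.googleapis.com/auth/drive.readonly"


def get_drive_service(credentials_path="credentials.json", token_path="storage.json"):
    """Return an authorized Drive v3 service, opening the browser dialog if needed."""
    # Imported here so the file can be loaded before the libraries are installed.
    from googleapiclient import discovery
    from httplib2 import Http
    from oauth2client import client, file, tools

    store = file.Storage(token_path)  # caches the token between runs
    creds = store.get()
    if not creds or creds.invalid:
        # First run (or expired token): this is the "run flow" step that
        # pauses the script and opens the permissions dialog in the browser.
        flow = client.flow_from_clientsecrets(credentials_path, SCOPES)
        creds = tools.run_flow(flow, store)
    return discovery.build("drive", "v3", http=creds.authorize(Http()))
```

Because the token is cached in the storage file, the browser dialog should only appear on the first run or after the saved token stops being valid.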

THAT'S IT! After you've changed the arguments and authorized the program to use your Google Drive account, you should be able to run the program without any errors, and it will download everything present in that folder to whichever path you specified in the second argument.

Future Scopes

Based on whatever ideas I got lying in bed before sleeping, here are a few improvements I'm thinking of making sometime in the future:

  1. Implement multiprocessing for faster downloads
  2. Make this into a web app that connects to your Google Drive account and then saves everything using this program (to whichever folder you specify)

Bibliography

  1. Python Codelab
  2. Drive API Documentation

replicate_gdrive's People

Contributors

saatweek
