- Abstract
- Setup Python Environment
- Install the Google APIs client library for Python
- Specifying project in Google Cloud Console
- Authorize API requests (user authorization)
- Running the Code
- Future Scopes
- Bibliography
When you try and download multiple files, or a folder with multiple sub-folders and files, you'll see that Google Drive doesn't simply download all the files sequentially. It will first attempt to zip all the files, and then download the zip file.
Here's the problem with this approach. When you have a LOT of files, or a big folder (with multiple sub-folders and files), the zipping takes a lot of time, and it often fails to zip altogether.
So I created a program that, given a folder, downloads all the files in that folder, then recursively goes into each sub-folder and downloads all the files it finds there too.
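To make the recursive idea concrete, here is a minimal sketch of the traversal on its own, separated from any Drive calls. The names plan_downloads and list_children are hypothetical and are not taken from main.py; list_children stands in for whatever code asks Drive for the contents of one folder.

```python
import os

# MIME type the Drive API reports for folders.
FOLDER_MIME = "application/vnd.google-apps.folder"


def plan_downloads(list_children, folder_id, local_dir):
    """Yield (file_id, local_path) for every file under folder_id.

    list_children is a hypothetical callable that takes a folder id and
    returns (child_id, name, mime_type) tuples for that folder only.
    """
    for child_id, name, mime_type in list_children(folder_id):
        target = os.path.join(local_dir, name)
        if mime_type == FOLDER_MIME:
            # A sub-folder: recurse, mirroring the folder name locally.
            yield from plan_downloads(list_children, child_id, target)
        else:
            # A regular file: report where it should be saved.
            yield child_id, target
```

Keeping the traversal separate from the network code makes it easy to try out against an in-memory folder tree before pointing it at a real Drive account.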
Opening the terminal on a Mac and typing python tells you which version of Python is installed on your computer.
As you can see from the above image, I'm working with Python version 3.11.5.
I'm assuming that you already have a relatively recent version of Python installed on your laptop, and that you have pip installed as well.
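If you'd rather check both assumptions from a script, something like the following should work. The minimum version (3, 7) is my own assumption for illustration, not a documented requirement of this program, and the helper names are hypothetical.

```python
import shutil
import sys

# Assumed minimum version, for illustration only.
MINIMUM = (3, 7)


def python_is_recent(version=None, minimum=MINIMUM):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


def pip_on_path():
    """Return True if a pip or pip3 executable can be found on PATH."""
    return shutil.which("pip") is not None or shutil.which("pip3") is not None


if __name__ == "__main__":
    print("Python recent enough:", python_is_recent())
    print("pip found on PATH:", pip_on_path())
```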
Assuming you just typed python in your terminal to check the version of your Python installation, I now want you to exit out of that by typing exit() and then hitting the Enter key. We'll now use pip to install all the dependencies. Type these commands in your terminal and execute each one of them:
pip install -U pip google-api-python-client oauth2client
python3 -c "import googleapiclient, httplib2, oauth2client"
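The second command fails with an ImportError if any of the three packages is missing, but it only names the first one it trips over. As a small alternative sketch (the missing_modules helper is hypothetical), you can list every missing module at once without importing them:

```python
import importlib.util

# The three top-level modules the install command should provide.
REQUIRED = ("googleapiclient", "httplib2", "oauth2client")


def missing_modules(names=REQUIRED):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies found.")
```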
An application using Google APIs requires a project. Those are managed in the Google Cloud Developers Console or simply, "devconsole." In this project, we're only going to use the Google Drive API, so we have a magic link (below in Step 1) that:
- Takes you to the devconsole
- Walks you through creating a new project (or choosing an existing one), and
- Automagically enables the Drive API
Let's do it!
- Navigate to console.developers.google.com/start/api?id=drive and log in to your Google account.
- If you don't have any projects yet, you'll see this screen to accept the Google APIs Terms of Service:
Once you accept the terms, a new project named "My Project" will be created, and the Drive API automatically enabled.
- If you've already created a project, you'll get this screen instead:
When you click the Create a project pulldown, choose an existing project or really create a new project.
Once you've made your selection (new or existing project), the Drive API will be automatically enabled for you.
- You'll know the Drive API has been enabled with this confirmation:
- Click Go to credentials to move to the next step.
To get OAuth2 credentials for user authorization, go back to the API manager and select the "Credentials" tab on the left-nav:
When you get there, you'll see all your credentials in three separate sections:
The first is for API keys, the second for OAuth 2.0 client IDs, and the last for OAuth2 service accounts. We're using the one in the middle.
From the Credentials page, click on the + Create Credentials button at the top, which then gives you a dialog where you'd choose "OAuth client ID:"
On the next screen, you have 2 actions: configuring your app's authorization "consent screen" and choosing the application type:
If you have not set up a consent screen, you will see a warning in the console and will need to do so now. (Skip these next steps if your consent screen has already been set up.)
Click on "Configure consent screen", where you select an "External" app (or "Internal" if you're a Google Workspace [formerly "G Suite"] customer):
It doesn't matter which you pick because you're not publishing your code sample. Most people will select "External" to be taken to a more complex screen, but you really only need to complete the "Application name" field at the top:
The only thing you need at this time is an application name, so pick one that reflects the codelab you're doing, then click Save.
Now go back to the Credentials tab to create an OAuth2 client ID. Here you'll see a variety of OAuth client IDs you can create:
We're developing a command-line tool, which is Other, so choose that then click the Create button. Choose a client ID name reflecting the app you're creating or simply take the default name, which is usually, "Other client N".
- A dialog with the new credentials appears; click OK to close
- Back on the Credentials page, scroll down to the "OAuth2 Client IDs" section, then find and click the download icon at the far right of your newly created client ID.
- This opens a dialog to save a file named client_secret-LONG-HASH-STRING.apps.googleusercontent.com.json, likely to your Downloads folder. We recommend shortening it to an easier name like credentials.json (which is what this app uses), then saving it to the directory/folder where you'll be saving this main.py app.
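Before running anything, it can help to confirm that the file you saved really is an OAuth client secrets file. Here is a minimal sketch, assuming the usual layout in which the downloaded JSON has a top-level "installed" section (or "web" for web apps) holding client_id and client_secret. The function names are hypothetical.

```python
import json


def client_type(config):
    """Return "installed" or "web" for a parsed client secrets dict.

    Raises ValueError if the dict does not look like a client secrets file.
    """
    for app_type in ("installed", "web"):
        section = config.get(app_type)
        if section is None:
            continue
        missing = [key for key in ("client_id", "client_secret") if key not in section]
        if missing:
            raise ValueError("client secrets missing keys: " + ", ".join(missing))
        return app_type
    raise ValueError("no 'installed' or 'web' section found")


def check_credentials_file(path="credentials.json"):
    """Load the JSON file at `path` and return its client type."""
    with open(path, encoding="utf-8") as handle:
        return client_type(json.load(handle))
```

For a command-line tool like this one you would expect check_credentials_file() to return "installed".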
Open the main.py Python file and scroll to the last line of code. You'll see the following function call:
download_all_files_in_this_folder("12e1Ll_AOK_RyfgjVy8DiS7ckHrvOxrRa", "/Volumes/WD/HDD backup/")
As you can see, it takes 2 arguments:
- The first argument is the folder_id of the Google Drive folder you want to replicate. How do you find the folder ID? Simply open the Google Drive folder that you want to replicate in your browser and check the URL. For example, if I want to replicate the folder "HDD backup" from my Google Drive to my system, I would open the "HDD backup" folder in my browser; the folder ID is the highlighted part of the link (the last segment of the URL, after /folders/)
- The second argument is the path where you want to save all the components of the folder (in my case, I am saving it to my external HDD)
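If you'd rather paste the whole link than pick the ID out by hand, a small helper can extract it. This is a sketch that assumes the common URL shape https://drive.google.com/drive/folders/<id>, possibly with a /u/0/ segment or a ?usp=sharing suffix. The function name is hypothetical and is not part of main.py.

```python
import re

# Folder IDs are made of letters, digits, underscores and hyphens.
_FOLDER_URL = re.compile(r"/folders/([A-Za-z0-9_-]+)")


def folder_id_from_url(url):
    """Return the folder ID found after "/folders/" in a Drive folder URL."""
    match = _FOLDER_URL.search(url)
    if match is None:
        raise ValueError("no /folders/<id> segment in: " + url)
    return match.group(1)


if __name__ == "__main__":
    example = "https://drive.google.com/drive/folders/12e1Ll_AOK_RyfgjVy8DiS7ckHrvOxrRa"
    print(folder_id_from_url(example))
```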
The first time you execute the script, it won't have the authorization to access the user's files on Drive (yours). The command-line script is paused as a browser window opens and presents you with the OAuth2 permissions dialog:
This is where the application asks the user for the permissions the code is requesting (via the SCOPES variable). In this case, it's the ability to view the file metadata from the user's Google Drive. Yes, in your code, these permission scopes appear as URIs, but they're translated into the language specified by your locale in the OAuth2 flow dialog window. The user must give explicit authorization for the requested permission(s), else the "run flow" part of the code will throw an exception, and the script will not proceed further.
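For reference, the authorization step with oauth2client typically looks something like the sketch below. It is not copied from main.py: the function name get_drive_service and the token cache file storage.json are assumptions, and so is the scope string (a read-only Drive scope is used here on the assumption that downloading file content needs more than metadata access). Check the SCOPES variable in main.py for the value the program really requests.

```python
# Assumed scope; main.py may request a different one.
SCOPES = "https://www.googleapis.com/auth/drive.readonly"


def get_drive_service(client_secrets="credentials.json", token_cache="storage.json"):
    """Return an authorized Drive v3 service, opening the browser if needed."""
    # Imported here so the sketch can be loaded even before the
    # dependencies are installed.
    from googleapiclient import discovery
    from httplib2 import Http
    from oauth2client import client, file, tools

    store = file.Storage(token_cache)
    creds = store.get()  # None when no token has been cached yet
    if not creds or creds.invalid:
        # First run (or expired/revoked token): show the consent dialog.
        flow = client.flow_from_clientsecrets(client_secrets, SCOPES)
        creds = tools.run_flow(flow, store)
    return discovery.build("drive", "v3", http=creds.authorize(Http()))
```

Because the token is cached in the storage file, the browser dialog should only appear on the first run, or again if the cached token becomes invalid.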
THAT'S IT! After you've changed the arguments and authorized the program to use your Google Drive account, you should be able to run the program without any errors, and it will download everything present in that folder to whichever path you specified in the second argument.
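If you're curious what happens under the hood, here is a rough sketch of the two Drive API calls a program like this relies on: listing a folder's children (page by page) and downloading one file in chunks. The function names are hypothetical and main.py may be organized differently. Note that Google-native files such as Docs and Sheets cannot be fetched with get_media and need an export call instead, which this sketch does not handle.

```python
def list_children(service, folder_id, page_size=100):
    """Yield dicts with id, name and mimeType for each item in a folder."""
    query = f"'{folder_id}' in parents and trashed = false"
    page_token = None
    while True:
        response = service.files().list(
            q=query,
            fields="nextPageToken, files(id, name, mimeType)",
            pageSize=page_size,
            pageToken=page_token,
        ).execute()
        yield from response.get("files", [])
        # Drive returns results in pages; keep going until no token is left.
        page_token = response.get("nextPageToken")
        if not page_token:
            break


def download_file(service, file_id, target_path):
    """Download one binary (non Google-native) file to target_path."""
    from googleapiclient.http import MediaIoBaseDownload

    request = service.files().get_media(fileId=file_id)
    with open(target_path, "wb") as handle:
        downloader = MediaIoBaseDownload(handle, request)
        done = False
        while not done:
            _status, done = downloader.next_chunk()
```

Following nextPageToken matters for big folders: a single list call only returns one page of results, so stopping after the first response would silently skip files.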
Based on whatever ideas I got lying in bed before sleeping, here are a few improvements I'm thinking I'll make sometime in the future:
- Implement multiprocessing for faster downloads
- Make this into a webapp which connects to your Google Drive account and then saves everything using this program (to whichever folder you specify)
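As a starting point for the first idea, here is one possible sketch. It swaps in a thread pool rather than multiprocessing, on the assumption that downloads spend most of their time waiting on the network. The names download_in_parallel and download_one are hypothetical. One caveat: the google-api-python-client documentation notes that httplib2 is not thread-safe, so each worker would likely need its own service object rather than a shared one.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_in_parallel(tasks, download_one, max_workers=4):
    """Run download_one(file_id, target_path) for every task concurrently.

    tasks is an iterable of (file_id, target_path) pairs. Returns a list of
    ((file_id, target_path), exception) for the downloads that failed, so
    one bad file does not stop the rest.
    """
    failures = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(download_one, file_id, path): (file_id, path)
            for file_id, path in tasks
        }
        for future in as_completed(futures):
            try:
                future.result()
            except Exception as exc:  # collect the failure and keep going
                failures.append((futures[future], exc))
    return failures
```

Returning the failures instead of raising makes it possible to retry only the files that did not come through.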