smartondev / gwbackupy

Open source Google Workspace™ backup solution written in python. (gmvault alternative)

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%
backup gmail google-cloud-platform google-workspace python tool gmvault google-api restore cli gcp oauth2 package pip service-account versioned-backups g-suite

gwbackupy's Introduction

gwbackupy: Google Workspace™ backup and restore solution.

What is it?

gwbackupy is an open source Google Workspace™ backup and restore solution written in Python.

Currently, only Gmail messages and labels are supported.

Why?

Because of gmvault's limitations:

  • it appears to be abandoned
  • its authentication method cannot be used across a whole Google Workspace
  • it is designed only for Gmail messages
  • it supports only the IMAP protocol (slow and less secure)

Details

  • Run from the CLI or directly from Python code

  • Authentication

    • OAuth for free or paid plans (not recommended for paid plans)
    • Service account key file (JSON or P12) for paid plans (can be configured to access all accounts in the workspace)
  • Version controlled storage interface

    Allows restoring the state at a specific moment without an external snapshot system (e.g. zip archives or a file system with snapshots). The interface is not tied to file storage or to a database server. Currently, only file system based storage is available.

  • Dry mode (does not write to local storage and does not modify anything on the server)

  • API communication (no need for special IMAP and other settings): secure and fast

  • Gmail backup

    • full backup (download all messages and labels)

    • continuous full backup (rerun periodically)

      Scans the full mailbox, but downloads only the new messages and marks the deleted messages.

    • quick backup (sync the last N days)

  • Gmail restore

    • restore deleted messages in a specified interval
    • full restore of messages and labels to an empty mailbox (e.g. to another Gmail account)
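
The version controlled storage idea listed above can be illustrated with a minimal sketch. This is not gwbackupy's storage implementation: the `Version` record, the `VersionedStore` class and its methods are hypothetical names chosen for illustration. The assumption is that every write or deletion appends a timestamped version, so a restore only has to pick the newest version at or before the requested moment.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Version:
    """One stored state of an item (hypothetical record layout)."""
    at: datetime
    payload: Optional[bytes]  # None marks the item as deleted at this time


class VersionedStore:
    """Append-only store: nothing is overwritten, so any past moment can be read."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[Version]] = {}

    def put(self, item_id: str, payload: bytes, at: datetime) -> None:
        self._versions.setdefault(item_id, []).append(Version(at, payload))

    def mark_deleted(self, item_id: str, at: datetime) -> None:
        self._versions.setdefault(item_id, []).append(Version(at, None))

    def snapshot(self, moment: datetime) -> Dict[str, bytes]:
        """Return the items that existed at `moment`, with their content at that time."""
        result: Dict[str, bytes] = {}
        for item_id, versions in self._versions.items():
            past = [v for v in versions if v.at <= moment]
            if not past:
                continue  # the item did not exist yet
            latest = max(past, key=lambda v: v.at)
            if latest.payload is not None:
                result[item_id] = latest.payload
        return result


store = VersionedStore()
store.put("msg-1", b"hello", datetime(2023, 1, 1))
store.mark_deleted("msg-1", datetime(2023, 2, 1))
print(sorted(store.snapshot(datetime(2023, 1, 15))))  # ['msg-1']
print(sorted(store.snapshot(datetime(2023, 3, 1))))   # []
```

Because deletions are recorded as versions rather than applied destructively, the same structure also supports restoring deleted messages within an interval.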

Paid plans are the following: here. Google One or additional storage purchases are not considered paid plans.

Requirements

  • python3 and pip

  • A Google Cloud account and access files that you create yourself. For security reasons, this software does not ship with access files.

    A credit card is required during registration, but the use of Workspace APIs is free.

Install

The easiest way to install:

pip install gwbackupy
# and run...
gwbackupy ...

or

# clone this repository
# install requirements
pip install -r requirements.txt
# and run...
python3 -m gwbackupy ...

The project also has an official Docker image, gwbackupy-docker, which is under development. The Docker image has scheduled backup runs and also supports managing multiple email accounts.

Instructions

Usage

Example usage Gmail

Backup run in CLI:

gwbackupy \
  --service-account-key-filepath <service-account-json-key-file> \
  --batch-size 5 \
  gmail backup \
  --email <mailbox email address>

Restore run in CLI:

gwbackupy \
  --service-account-key-filepath <service-account-json-key-file> \
  --batch-size 5 \
  gmail restore \
  --add-label "backup-restore-1231" \
  --add-label "more-restore-label" \
  --filter-date-from <date or datetime eg. "2023-01-01"> \
  --filter-date-to <date or datetime eg. "2023-02-02 03:00:00"> \
  --restore-deleted \
  --email <source backup mailbox email address> \
  --to-email <destination mailbox email address> # only if you want to restore to a different destination account
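
When the restore is driven from a script, building the argument list in code avoids shell quoting problems with values such as dates that contain spaces. The sketch below uses only the flags shown in the commands above; the helper name `build_restore_command`, the key file name and the example addresses are assumptions made for illustration.

```python
import shlex
from typing import List, Optional, Sequence


def build_restore_command(
    key_file: str,
    email: str,
    to_email: Optional[str] = None,
    labels: Sequence[str] = (),
    date_from: Optional[str] = None,
    date_to: Optional[str] = None,
    restore_deleted: bool = False,
    batch_size: int = 5,
) -> List[str]:
    """Assemble the argv list for a `gwbackupy ... gmail restore` run."""
    cmd = [
        "gwbackupy",
        "--service-account-key-filepath", key_file,
        "--batch-size", str(batch_size),
        "gmail", "restore",
    ]
    for label in labels:
        cmd += ["--add-label", label]
    if date_from:
        cmd += ["--filter-date-from", date_from]
    if date_to:
        cmd += ["--filter-date-to", date_to]
    if restore_deleted:
        cmd.append("--restore-deleted")
    cmd += ["--email", email]
    if to_email:  # only needed for a different destination account
        cmd += ["--to-email", to_email]
    return cmd


command = build_restore_command(
    "sa-key.json",                      # placeholder key file name
    "source@example.com",               # placeholder source mailbox
    to_email="target@example.com",      # placeholder destination mailbox
    labels=["backup-restore-1231"],
    date_to="2023-02-02 03:00:00",
    restore_deleted=True,
)
# Print a shell-safe rendering; pass `command` itself to subprocess.run to execute it.
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` runs the same command without going through a shell.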

Backup run from python code:

# WARNING: The Python API is still changing actively at the current stage of development.

from gwbackupy.gmail import Gmail
from gwbackupy.storage.file_storage import FileStorage
from gwbackupy.providers.gmail_service_provider import GmailServiceProvider
from gwbackupy.providers.gapi_gmail_service_wrapper import GapiGmailServiceWrapper

storage = FileStorage('./data/<mailbox email address>')
service_provider = GmailServiceProvider(
    service_account_file_path='serviceacc.json',
    storage=storage,
)
service_wrapper = GapiGmailServiceWrapper()
gmail = Gmail(email='<mailbox email address>',
              service_provider=service_provider,
              service_wrapper=service_wrapper,
              batch_size=3,
              storage=storage)
if gmail.backup():
    print('Yeah!')
else:
    print(':(')

Security

See SECURITY.md

Contributing

Welcome! I am happy that you want to make the project better.

There is currently no written documentation for the contribution process. In the meantime, please use issues and pull requests.

Changelog

The changes are contained in CHANGELOG.md.

About

Márton Somogyi

gwbackupy's People

Contributors

kamarton

gwbackupy's Issues

SA not working

Describe the bug
I created the service account (SA) JSON key on a paid account as described. It doesn't work.

To Reproduce
Steps to reproduce the behavior:

Run

gwbackupy --service-account-key-filepath sa.json gmail backup --email [email protected]
INFO 2023-08-08 13:33:10,814 - Starting backup for [email protected]
INFO 2023-08-08 13:33:10,814 - Scanning backup storage...
INFO 2023-08-08 13:33:10,814 - Stored items: 0
INFO 2023-08-08 13:33:10,814 - Backing up labels...
INFO 2023-08-08 13:33:10,814 - Getting labels from server ([email protected])
INFO 2023-08-08 13:33:10,816 - file_cache is only supported with oauth2client<4.0.0
INFO 2023-08-08 13:33:10,818 - Attempting refresh to obtain initial access_token
INFO 2023-08-08 13:33:10,820 - Refreshing access_token
INFO 2023-08-08 13:33:10,981 - Failed to retrieve access token: {
  "error": "unauthorized_client",
  "error_description": "Client is unauthorized to retrieve access tokens using this method, or client not authorized for any of the scopes requested."
}

Desktop (please complete the following information):
Ubuntu Linux CLI
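
An `unauthorized_client` response at the token step is commonly seen when a service account has not been granted domain-wide delegation for the requested scopes in the Workspace Admin console. Whether that is the cause here is an assumption, since the issue does not say. The sketch below only inspects the key file with the standard library and returns the client ID that an administrator would authorize. The helper name `describe_service_account_key` is hypothetical, and the scopes gwbackupy requests are not stated in this text, so they have to be taken from the project's own documentation.

```python
import json
from typing import Dict


def describe_service_account_key(path: str) -> Dict[str, str]:
    """Validate a JSON service account key and return its identifying fields."""
    with open(path, encoding="utf-8") as fh:
        key = json.load(fh)
    if key.get("type") != "service_account":
        raise ValueError(
            f"{path} is not a service account key (type={key.get('type')!r})"
        )
    required = ("client_email", "client_id", "private_key")
    missing = [field for field in required if not key.get(field)]
    if missing:
        raise ValueError(f"{path} is missing fields: {', '.join(missing)}")
    # client_id is the value entered when granting domain-wide delegation.
    return {"client_email": key["client_email"], "client_id": key["client_id"]}
```

Calling `describe_service_account_key("sa.json")` on the key from the report would show which client ID to check in the Admin console. An OAuth client file passed by mistake would fail the `type` check.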

After OAuth authentication check email matches

Currently, it can easily happen that the browser opens in a different profile and the user approves the request with another account. The system does not check the email address afterwards, so the wrong account is backed up.
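
A minimal sketch of such a check follows. It assumes the address of the authorized account has already been fetched, for example from the `emailAddress` field that the Gmail API's `users.getProfile` call returns. The function and exception names are hypothetical and not part of gwbackupy.

```python
class MailboxMismatchError(Exception):
    """Raised when the authorized account differs from the requested mailbox."""


def ensure_same_mailbox(requested: str, authenticated: str) -> None:
    """Abort early if OAuth was approved with a different account."""
    if requested.strip().casefold() != authenticated.strip().casefold():
        raise MailboxMismatchError(
            f"requested mailbox {requested!r} but OAuth was approved as {authenticated!r}"
        )


ensure_same_mailbox("User@example.com", "user@example.com")  # passes silently
try:
    ensure_same_mailbox("user@example.com", "other@example.com")
except MailboxMismatchError as exc:
    print(f"refusing to back up: {exc}")
```

The comparison is deliberately strict. Address aliases would be rejected by it, so a real implementation may need a more lenient rule.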

Can't restore to a different mailbox

Hi - I'm using the latest pip-installed version on Ubuntu. I've just taken my first full backup and I'm trying to restore it to a different (new) Gmail account. I've got the credentials etc. for both.

I'm attempting to do the restore using the following command:

gwbackupy --credentials-filepath restore-account-creds.json --workdir /mnt/gmail gmail restore --restore-deleted --email [email protected] --to-email [email protected]

I get the following output, but nothing is restored. Am I doing something wrong?

Thanks!

        INFO 2023-04-18 14:04:05,019 - Filter options: deleted
        INFO 2023-04-18 14:04:05,019 - Scanning backup storage...
        INFO 2023-04-18 14:04:10,497 - Stored items: 476545
        INFO 2023-04-18 14:04:10,497 - Loading labels...
        INFO 2023-04-18 14:04:10,695 - Labels loaded successfully (23)
        INFO 2023-04-18 14:04:10,695 - Getting labels from server ([email protected])
        INFO 2023-04-18 14:04:10,703 - file_cache is only supported with oauth2client<4.0.0
        INFO 2023-04-18 14:04:10,922 - Filtering messages...
        INFO 2023-04-18 14:04:12,848 - Number of potentially affected messages: 0
        INFO 2023-04-18 14:04:12,848 - Upload messages...
        INFO 2023-04-18 14:04:13,852 - Messages uploaded successfully

Separated OAuth token storage

Currently, each service stores its own tokens, so sharing them between several services will become a problem in the future.

Tasks:

  • store tokens in a common storage
  • add scopes to the token incrementally
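
The two tasks can be sketched as a small shared store keyed by account. This is only an illustration under stated assumptions: the class name `SharedTokenStore`, the JSON file layout and the method names are hypothetical, and the token is treated as an opaque dictionary. A real implementation would also need to restrict file permissions.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, List


class SharedTokenStore:
    """One token entry per account, shared by every service (Gmail, others later)."""

    def __init__(self, path: Path) -> None:
        self._path = path

    def _load(self) -> Dict[str, dict]:
        if not self._path.exists():
            return {}
        return json.loads(self._path.read_text(encoding="utf-8"))

    def scopes(self, email: str) -> List[str]:
        """Scopes already granted for this account."""
        return self._load().get(email, {}).get("scopes", [])

    def missing_scopes(self, email: str, wanted: Iterable[str]) -> List[str]:
        """Scopes a service needs that the stored token does not cover yet."""
        granted = set(self.scopes(email))
        return sorted(scope for scope in set(wanted) if scope not in granted)

    def save(self, email: str, token: dict, scopes: Iterable[str]) -> None:
        """Store the new token and merge its scopes with the earlier grants."""
        data = self._load()
        merged = sorted(set(data.get(email, {}).get("scopes", [])) | set(scopes))
        data[email] = {"token": token, "scopes": merged}
        self._path.write_text(json.dumps(data, indent=2), encoding="utf-8")
```

A service would call `missing_scopes` first and run the OAuth flow only when the list is non-empty, then call `save` with the refreshed token.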

More logging

  • add a tag for items (e.g. Gmail message ID)
  • Gmail local file scanning

gmail: deleted email storing forever problem

Mails are stored indefinitely. This is unnecessary for spam, large messages and other cases.

  • Storage management, e.g. remove items older than 30 days, or remove based on label.
  • While syncing, e.g. remove emails older than 30 days.
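
One possible reading of these two rules is sketched below. It assumes purging applies only to messages already marked as deleted on the server and that each stored item records its deletion time and labels. The `StoredMessage` record and `select_for_purge` function are hypothetical, and the 30-day limit and the `SPAM` label are just the examples from the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class StoredMessage:
    message_id: str
    deleted_at: Optional[datetime]  # None: the message still exists on the server
    labels: FrozenSet[str]


def select_for_purge(
    messages: Iterable[StoredMessage],
    now: datetime,
    max_age: timedelta = timedelta(days=30),
    purge_labels: FrozenSet[str] = frozenset({"SPAM"}),
) -> List[str]:
    """Return IDs of deleted messages that are too old or carry a purge label."""
    purge: List[str] = []
    for message in messages:
        if message.deleted_at is None:
            continue  # never purge messages that are still live on the server
        too_old = now - message.deleted_at > max_age
        if too_old or (message.labels & purge_labels):
            purge.append(message.message_id)
    return purge
```

Returning IDs instead of deleting directly keeps the rule easy to combine with a dry mode.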

exception if cli started without parameters

$ gwbackupy
...
ERROR:root:CLI startup/run failed
Traceback (most recent call last):
  File "/home/smarton/.local/lib/python3.10/site-packages/gwbackupy/gwbackupy.py", line 161, in cli_startup
    args = parse_arguments()
  File "/home/smarton/.local/lib/python3.10/site-packages/gwbackupy/gwbackupy.py", line 137, in parse_arguments
    args = parser.parse_args(args=None if sys.argv[1:] else ["--help"])
  File "/usr/lib/python3.10/argparse.py", line 1838, in parse_args
    args, argv = self.parse_known_args(args, namespace)
  File "/usr/lib/python3.10/argparse.py", line 1871, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/usr/lib/python3.10/argparse.py", line 2080, in _parse_known_args
    start_index = consume_optional(start_index)
  File "/usr/lib/python3.10/argparse.py", line 2020, in consume_optional
    take_action(action, args, option_string)
  File "/usr/lib/python3.10/argparse.py", line 1948, in take_action
    action(self, namespace, argument_values, option_string)
  File "/usr/lib/python3.10/argparse.py", line 1112, in __call__
    parser.exit()
  File "/usr/lib/python3.10/argparse.py", line 2582, in exit
    _sys.exit(status)
SystemExit: 0

args = parser.parse_args(args=None if sys.argv[1:] else ["--help"])
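
The traceback suggests that the startup code catches the `SystemExit` that argparse raises after printing `--help` and logs it as a failure. A minimal sketch of one way to avoid that is shown below. It is not the project's actual code: the parser contents, the `--dry` flag and the function bodies are placeholders. Only the exception handling is the point.

```python
import argparse
import logging
from typing import List


def parse_arguments(argv: List[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(prog="gwbackupy-sketch")
    parser.add_argument("--dry", action="store_true", help="placeholder option")
    # Show the help text when no arguments are given, as in the original line.
    return parser.parse_args(argv if argv else ["--help"])


def cli_startup(argv: List[str]) -> int:
    """Return a process exit code instead of logging argparse exits as errors."""
    try:
        parse_arguments(argv)
        return 0
    except SystemExit as exc:
        # argparse exits with 0 after --help and with 2 on a usage error.
        if exc.code is None:
            return 0
        return exc.code if isinstance(exc.code, int) else 1
    except Exception:
        logging.exception("CLI startup/run failed")
        return 1
```

With this structure `cli_startup([])` prints the help text and returns 0 without an error log, while real failures are still logged.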

web interface for stored item listing

A web interface where the user can easily view the locally stored items. This will probably be included in a separate package, because it is not necessarily part of this CLI project.

Error if python 3.7 used

Debian uses python 3.7.2

$ gwbackupy
Traceback (most recent call last):
  File "/usr/local/bin/gwbackupy", line 3, in <module>
    from gwbackupy import gwbackupy_cli
  File "/usr/local/lib/python3.7/dist-packages/gwbackupy/gwbackupy_cli.py", line 10, in <module>
    from gwbackupy.filters.gmail_filter import GmailFilter
  File "/usr/local/lib/python3.7/dist-packages/gwbackupy/filters/gmail_filter.py", line 7, in <module>
    from gwbackupy.storage.storage_interface import LinkInterface
  File "/usr/local/lib/python3.7/dist-packages/gwbackupy/storage/storage_interface.py", line 53, in <module>
    LinkGroupBy = Callable[[LinkInterface], list[Union[str, int]]]
TypeError: 'type' object is not subscriptable
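
Subscripting the built-in `list` in a type alias is only supported from Python 3.9 (PEP 585), which is why the alias on line 53 fails on 3.7. A sketch of a compatible alias uses `typing.List` instead. The `LinkInterface` class here is an empty stand-in for the project's own class, and `example_group_by` is a hypothetical function added only to show the alias in use.

```python
from typing import Callable, List, Union


class LinkInterface:
    """Empty stand-in for gwbackupy's storage link interface."""


# typing.List is subscriptable on Python 3.7, unlike the built-in list.
LinkGroupBy = Callable[[LinkInterface], List[Union[str, int]]]


def example_group_by(link: LinkInterface) -> List[Union[str, int]]:
    return ["example-group", 1]


grouper: LinkGroupBy = example_group_by
print(grouper(LinkInterface()))  # ['example-group', 1]
```

The alternative is to declare Python 3.9 or newer as the minimum supported version in the package metadata.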

Service account setup doc

  • service account key file creation
  • Google Workspace wide application settings
  • free plan oauth screen and usage

restore missing messages (gmail)

If a backup has not been run beforehand, the missing emails have not yet been marked as deleted.

Currently, missing messages can only be restored after running a backup.
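
Why a backup run has to come first can be seen from a sketch of the comparison step. It assumes that a backup compares the message IDs on the server with the IDs in local storage and marks the difference. The `SyncPlan` type and `plan_sync` function are hypothetical and not gwbackupy's API.

```python
from typing import Iterable, NamedTuple, Set


class SyncPlan(NamedTuple):
    to_download: Set[str]      # on the server, not yet stored locally
    to_mark_deleted: Set[str]  # stored locally, no longer on the server


def plan_sync(
    server_ids: Iterable[str],
    stored_ids: Iterable[str],
    already_deleted: Iterable[str] = (),
) -> SyncPlan:
    """Compute what one backup run would download and what it would mark as deleted."""
    server = set(server_ids)
    stored = set(stored_ids)
    return SyncPlan(
        to_download=server - stored,
        to_mark_deleted=stored - server - set(already_deleted),
    )


plan = plan_sync(server_ids=["m2", "m3"], stored_ids=["m1", "m2"])
print(sorted(plan.to_download))      # ['m3']
print(sorted(plan.to_mark_deleted))  # ['m1']
```

Until such a run has happened, `m1` in this example is stored but not flagged, so a restore that filters on deleted messages would not select it.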
