
glacier-cli's Introduction


glacier-cli

This tool provides a sysadmin-friendly command line interface to Amazon Glacier, turning Glacier into an easy-to-use storage backend. It automates tasks which would otherwise require a number of separate steps (job submission, polling for job completion and retrieving the results of jobs). It provides integration with git-annex, making Glacier even more useful.

glacier-cli uses Amazon Glacier's archive description field to keep friendly archive names, although you can also address archives directly by their IDs. It keeps a local cache of archive IDs and their corresponding names, as well as housekeeping data to keep the cache up to date. This will save you time, because you won't have to spend hours retrieving inventories all the time, and will save you mental effort, because you won't have to keep track of the opaque archive IDs yourself.

glacier-cli is fully interoperable with other applications using the same Glacier vaults. It deals gracefully with vault changes made from other machines and/or by other applications, and introduces no new special formats from the point of view of a vault.

Example with git-annex

$ echo 42 > example-content
$ git annex add example-content
add example-content (checksum...) ok
(Recording state in git...)
$ git commit -m'Add example-content'
[master cc632d1] Add example-content
 1 file changed, 1 insertion(+)
 create mode 120000 example/example-content

(the local annex now stores example-content)

$ git annex copy --to glacier example-content
copy example-content (gpg) (checking glacier...) (to glacier...) 
ok

(copying content to Amazon Glacier is straightforward)

$ git annex drop example-content
drop example-content (gpg) (checking glacier...) ok

(now the only copy of the data is in Amazon Glacier)

$ git annex get --from glacier example-content
get example-content (from glacier...) (gpg) 
glacier: queued retrieval job for archive 'GPGHMACSHA1--2945f64be96ccbb9feb4d8ff44ac9692fdbe654e'

  retrieve hook exited nonzero!
failed
git-annex: get: 1 failed

(this fails on the first attempt since the data isn't immediately available; but it does submit a job to Amazon Glacier requesting the data, so a later retry will work)

(...four hours later...)

$ git annex get --from glacier example-content
get example-content (from glacier...) (gpg) 
ok
$ cat example-content
42

(content successfully retrieved from Glacier)

Example without git-annex

$ glacier vault list

(empty result with zero exit status)

$ glacier vault create example-vault

(silently successful: like other Unix commands, only errors are noisy)

$ glacier vault list
example-vault

(this list is retrieved from Glacier; a relatively quick operation)

$ glacier archive list example-vault

(empty result with zero exit status; nothing is in our vault yet)

$ echo 42 > example-content
$ glacier archive upload example-vault example-content

(Glacier has now stored example-content in an archive with description example-content and in a vault called example-vault)

$ glacier archive list example-vault
example-content

(this happens instantly, since glacier-cli maintains a cached inventory)

$ rm example-content

(now the only place the content is stored is in Glacier)

$ glacier archive retrieve example-vault example-content
glacier: queued retrieval job for archive 'example-content'
$ glacier archive retrieve example-vault example-content
glacier: job still pending for archive 'example-content'
$ glacier job list
a/p 2012-09-19T21:41:35.238Z example-vault example-content
$ glacier archive retrieve --wait example-vault example-content

(...hours pass while Amazon retrieves the content...)

$ cat example-content
42

(content successfully retrieved from Glacier)

Costs

Before you use Amazon Glacier, you should make yourself familiar with how much it costs. Note that archive retrieval pricing is complicated and may be much higher than you expect. Files are uploaded in parts, so uploading a single archive can generate many billable requests. The default part size is determined by the boto library; check DefaultPartSize in its documentation. Changes have been proposed to this tool to let the user specify the part size, but they have not been merged yet.
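For example, if DefaultPartSize were 4 MiB, a 32 GiB archive would be uploaded as roughly 8,192 parts, each a separate (and separately billed) request, in addition to the requests needed to initiate and complete the multipart upload.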

Installation

glacier-cli and its dependencies are pure Python packages and it should be straightforward to install them on a system with standard Python tools. It was originally developed for Python 2, but it is possible to get it working with Python 3 too.

Installation with pip

It is recommended (but not required) to install glacier-cli in a virtual environment (e.g. using something like virtualenv, venv, conda, ...). This allows managing glacier-cli and its dependencies in isolation and avoids conflicts with other packages and tools.

These virtual environment systems come with pip (the standard Python package manager) out of the box, which is the easiest way to install glacier-cli and its dependencies. If you don't use a virtual environment, it is still recommended to at least use pip.
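
For example, a virtual environment could be created and activated with Python's built-in venv module (the path is just an example; any of the alternatives mentioned above work equally well):

      python3 -m venv ~/glacier-cli-venv
      . ~/glacier-cli-venv/bin/activate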

Install glacier-cli using pip (preferably in a virtual environment) with one of these methods:

  • Installation from source: make a local clone of this repository and install it:

      git clone git://github.com/basak/glacier-cli.git
      cd glacier-cli
      pip install .
    
  • Installation directly from GitHub repo:

      pip install git+https://github.com/basak/glacier-cli.git
    

Symlinks

Installing with pip also creates a command-line tool named glacier-cli in the bin directory of the virtual environment. However, you probably want to be able to use it without first activating the virtual environment. Also, if you want to use it with git-annex (see below), it should be available under the name glacier in your PATH.

Create appropriate symlinks to achieve this. The details of this largely depend on your setup or workflow, but here are some examples as inspiration (to be executed from within the virtual environment):

  • if ~/bin is in your PATH:

      ln -s $(which glacier-cli) ~/bin/glacier
    
  • to make it available globally to all users:

      sudo ln -s $(which glacier-cli) /usr/local/bin/glacier
    

Integration with git-annex

Using glacier-cli via git-annex is the easiest way to use Amazon Glacier from the CLI.

git-annex now has native glacier-cli integration. See the git-annex Glacier documentation and tutorial for details.

You probably want to set git-annex to only use glacier as a last resort in order to control your costs:

git config remote.glacier.annex-cost 1000

Copying to the remote works as normal. Retrieving from the remote fails on the first attempt, but queues a retrieval job. If you try again after the job is complete (usually around four hours later), then retrieval should succeed. You can monitor the status of the jobs using glacier job list; when a job's status changes from p (pending) to d (done), a retrieval should work. Note that jobs expire from Amazon Glacier after around 24 hours.
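
For example, assuming the same output format as the glacier job list example earlier, a completed retrieval job shows d rather than p in the status field:

$ glacier job list
a/d 2012-09-19T21:41:35.238Z example-vault example-content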

glacier checkpresent cannot always check for certain that an archive definitely exists within Glacier. Vault inventories take hours to retrieve, and even when retrieved do not necessarily represent an up-to-date state. For this reason and as a compromise, glacier checkpresent will confirm to git-annex that an archive exists if it is known to have existed less than 60 hours ago. You may override this permitted lag interval with the --max-age option to glacier checkpresent.
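
git-annex normally runs this check itself, but for illustration a manual invocation might look something like the following; the subcommand and argument order here are an assumption based on the invocations visible in the issue reports further down, so check glacier --help for the exact syntax:

$ glacier archive checkpresent --max-age=12 example-vault example-content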

Commands

  • glacier vault list
  • glacier vault create vault-name
  • glacier vault sync [--wait] [--fix] [--max-age hours] vault-name
  • glacier archive list vault-name
  • glacier archive upload [--name archive-name] vault-name filename
  • glacier archive retrieve [--wait] [-o filename] [--multipart-size bytes] vault-name archive-name
  • glacier archive retrieve [--wait] [--multipart-size bytes] vault-name archive-name [archive-name...]
  • glacier archive delete vault-name archive-name
  • glacier job list
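
For example, to upload a file but store it under a different archive name, using the --name option listed above (the local path is just a placeholder):

$ glacier archive upload --name backup-2012-09-19 example-vault /tmp/backup.tar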

Delayed Completion

If you request an archive retrieval, this requires a job which takes some hours to complete. You have two options:

  1. If the command fails with a temporary failure—printed to stderr and with an exit status of EX_TEMPFAIL (75)—then a job is pending, and you must retry the command until it succeeds.
  2. If you prefer to just wait, then use --wait (or retry with --wait if you didn't use it the first time). This waits for the job to complete, finishes the operation and then exits. Amazon Glacier jobs typically take around four hours to complete.

Without --wait, glacier-cli will follow this logic (a polling sketch follows the list):

  1. Look for a suitable existing archive retrieval job.
  2. If such a job exists and it is pending, then exit with a temporary failure.
  3. If such a job exists and it has finished, then retrieve the data and exit with success.
  4. Otherwise, submit a new job to retrieve the archive and exit with a temporary failure. Subsequent calls requesting the same archive will find this job and follow these same four steps with it, resulting in a downloaded archive when the job is complete.
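
A minimal polling sketch of option 1 above, relying only on the EX_TEMPFAIL (75) exit status just described (the vault and archive names are the ones from the earlier examples; adjust the sleep interval to taste):

    until glacier archive retrieve example-vault example-content; do
        [ $? -eq 75 ] || break    # give up on anything other than a temporary failure
        sleep 1800                # job still pending; check again in 30 minutes
    done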

Cache Reconstruction

glacier-cli follows the XDG Base Directory Specification and keeps its cache in ${XDG_CACHE_HOME:-$HOME/.cache}/glacier-cli/db.

After a disaster, or if you have modified a vault from another machine, you can reconstruct your cache by running:

$ glacier vault sync example-vault

This will set off an inventory job if required. This command is subject to delayed completion semantics as above but will also respond to --wait as needed.

By default, existing inventory jobs that completed more than 24 hours ago are ignored, since they may be out of date. You can override this with --max-age=hours. Specify --max-age=0 to force a new inventory job request.
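
For example, to ignore any cached inventory jobs, request a fresh inventory and block until it arrives:

$ glacier vault sync --wait --max-age=0 example-vault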

Note that there is a lag between creation or deletion of an archive and the archive's corresponding appearance or disappearance in a subsequent inventory, since Amazon only periodically regenerates vault inventories. glacier-cli will show you newer information if it knows about it, but if you perform vault operations that do not update the cache (e.g. on another machine, as another user, or from another program), then updates may take a while to show up here. You will need to run a vault sync operation after Amazon have updated your vault's inventory, which could be a good day or two after the operation took place.

If something doesn't go as expected (e.g. an archive that glacier-cli knows it created fails to appear in the inventory after a couple of days, or an archive disappears from the inventory after having shown up there), then vault sync will warn you about it. You can use --fix to accept the correction and update the cache to match the official inventory.
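
For example, once you have reviewed the warnings and want the cache updated to match the inventory:

$ glacier vault sync --fix example-vault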

Addressing Archives

Normally, you can just address an archive by its name (which, from Amazon's perspective, is the Glacier archive description).

However, you may end up with multiple archives with the same name, or archives with no name, since Amazon allows this. In this case, you can refer to an archive by ID instead by prefixing your reference with id:.

To avoid ambiguity, prefixing a reference with name: works as you would expect. If you end up with archive names or IDs that start with name: or id:, then you must use a prefix to disambiguate.
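
For example (the archive ID below is just a made-up placeholder):

$ glacier archive retrieve example-vault name:example-content
$ glacier archive delete example-vault id:EXAMPLE-ARCHIVE-ID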

Using Pipes

Use glacier archive upload <vault> --name=<name> - to upload data from standard input. In this case you must use --name, since there is no filename from which to take the archive name.

Use glacier archive retrieve <vault> <name> -o- to download data to standard output. glacier-cli will not output any data to standard output apart from the archive data in order to prevent corrupting the output data stream.
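
For example, a backup pipeline might look like this (the tar stages are only placeholders for whatever produces and consumes your data stream; note that the retrieve step is still subject to the delayed-completion semantics described above):

$ tar -czf - photos | glacier archive upload example-vault --name=photos.tar.gz -
$ glacier archive retrieve --wait example-vault photos.tar.gz -o- | tar -xzf -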

Future Directions

  • Add resume functionality for uploads and downloads

Contact

glacier-cli's People

Contributors

basak, mhubig, moaxey, soxofaan, xsuchy


glacier-cli's Issues

glacier-cli wrongly conflates identically-named vaults from different AWS regions

Description of problem:
glacier-cli keeps a local cache of vaults and archives. However, the cache does not seem to distinguish between vaults in different AWS regions. So different AWS vaults with the same name in different AWS regions are assumed to be the same vault.

Version-Release number of selected component (if applicable):
glacier-cli.noarch-0-8.20131113gite8a2536.fc20
Originally reported to Fedora as
https://bugzilla.redhat.com/show_bug.cgi?id=1167186

How reproducible:
Always

Steps to Reproduce:

$ glacier-cli --region=us-east-1 vault create somevault
$ glacier-cli --region=us-east-1 archive upload somevault somearchive

$ glacier-cli --region=eu-west-1 archive list somevault
(note different region specified)

Actual results:
wrongly returns 'somearchive' as an archive in region eu-west-1

Expected results:
No output (there is no vault or archive in region eu-west-1)

Additional information:
Obvious workaround is to pick one region and stick to it, or to use unique vault names across all regions.

glacier vault sync: sqlalchemy.exc.IntegrityError: (IntegrityError) column id is not unique

I get this error when I do a vault sync:

$ glacier vault sync **********
Traceback (most recent call last):
  File "/root/bin/glacier", line 730, in <module>
    App().main()
  File "/root/bin/glacier", line 716, in main
    self.args.func()
  File "/root/bin/glacier", line 469, in vault_sync
    wait=self.args.wait)
  File "/root/bin/glacier", line 448, in _vault_sync
    self._vault_sync_reconcile(vault, complete_job, fix=fix)
  File "/root/bin/glacier", line 435, in _vault_sync_reconcile
    fix=fix)
  File "/root/bin/glacier", line 257, in mark_seen_upstream
    key=self.key, vault=vault, id=id).one()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2184, in one
    ret = list(self)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2226, in __iter__
    self.session._autoflush()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 1012, in _autoflush
    self.flush()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 1583, in flush
    self._flush(objects)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 1654, in _flush
    flush_context.execute()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/unitofwork.py", line 331, in execute
    rec.execute(self)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/unitofwork.py", line 475, in execute
    uow
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 64, in save_obj
    table, insert)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 530, in _emit_insert_statements
    execute(statement, multiparams)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1449, in execute
    params)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1584, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1698, in _execute_context
    context)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1691, in _execute_context
    context)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 331, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (IntegrityError) column id is not unique u'INSERT INTO archive (id, name, vault, "key", last_seen_upstream, created_here, deleted_here) VALUES (?, ?, ?, ?, ?, ?, ?)' (u'***********', u'***********', '**********', '**********', 1433336619, 1433423949.613394, None)

How do I get a unique column id again?

"archive not found" error from checkpresent

checkpresent prints "archive not found" to stderr if a key is not found. This clutters up the git-annex display, in both the hook special remote, and the native glacier git-annex remote that I am currently writing. Could that instead be sent to stdout, or not printed at all?

Feature Request: --delete-local option on "archive upload"

It would be great to have an option on the "upload" command to delete the local copy of the file after a successful upload. Maybe a "--delete-local" option?

I am encrypting my files before uploading to Glacier (like everybody). Therefore, after the file is uploaded I don't need the local file anymore. The deletion of the file also indicates that the file has been uploaded. This means I can interrupt an upload script and restart without duplicating uploads.

I need to be able to interrupt my upload script sometimes because I need my uplink bandwidth for other purposes.

Further info: I am uploading my 80GB of photos as individual encrypted files via a slow 2Mbps link. I am using a basic 'find' to recurse down my directory tree, with an -exec to encrypt, upload, delete encrypted file. Interrupting my simple 'find' and restarting leads to duplicated files. Adding the above option, would allow me to create an encrypted cache of my files, and upload when convenient. If it's deleted then it was successfully uploaded.

SQLAlchemy database is locked error

Not sure if I have my configuration right, but I'm seeing the stack trace below. Not sure if this is relevant, but I'm on a shared server and have installed everything I needed (including boto and sqlalchemy with pip install --user).

[ djoshea@host ~ ]: glacier vault list
Traceback (most recent call last):
  File "/net/home/djoshea/bin/glacier", line 730, in <module>
    App().main()
  File "/net/home/djoshea/bin/glacier", line 708, in __init__
    cache = Cache(get_connection_account(connection))
  File "/net/home/djoshea/bin/glacier", line 124, in __init__
    self.Base.metadata.create_all(self.engine)
  File "/net/home/djoshea/.local/lib/python2.7/site-packages/sqlalchemy/sql/schema.py", line 3687, in create_all
    tables=tables)
  File "/net/home/djoshea/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1856, in _run_visitor
    conn._run_visitor(visitorcallable, element, **kwargs)
  File "/net/home/djoshea/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1481, in _run_visitor
    **kwargs).traverse_single(element)
  File "/net/home/djoshea/.local/lib/python2.7/site-packages/sqlalchemy/sql/visitors.py", line 121, in traverse_single
    return meth(obj, **kw)
  File "/net/home/djoshea/.local/lib/python2.7/site-packages/sqlalchemy/sql/ddl.py", line 709, in visit_metadata
    [t for t in tables if self._can_create_table(t)])
  File "/net/home/djoshea/.local/lib/python2.7/site-packages/sqlalchemy/sql/ddl.py", line 686, in _can_create_table
    table.name, schema=table.schema)
  File "/net/home/djoshea/.local/lib/python2.7/site-packages/sqlalchemy/dialects/sqlite/base.py", line 1127, in has_table
    connection, "table_info", table_name, schema=schema)
  File "/net/home/djoshea/.local/lib/python2.7/site-packages/sqlalchemy/dialects/sqlite/base.py", line 1468, in _get_table_pragma
    cursor = connection.execute(statement)
  File "/net/home/djoshea/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 906, in execute
    return self._execute_text(object, multiparams, params)
  File "/net/home/djoshea/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1054, in _execute_text
    statement, parameters
  File "/net/home/djoshea/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1146, in _execute_context
    context)
  File "/net/home/djoshea/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1341, in _handle_dbapi_exception
    exc_info
  File "/net/home/djoshea/.local/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb)
  File "/net/home/djoshea/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
    context)
  File "/net/home/djoshea/.local/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 450, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) database is locked [SQL: u'PRAGMA table_info("archive")']

my config looks like this:

[ djoshea@index ~ ]: cat .boto
[Credentials]
aws_access_key_id = ...
aws_secret_access_key = ...

[s3]
calling_format = boto.s3.connection.OrdinaryCallingFormat
host = s3-us-west-1.amazonaws.com

Understanding error messages

Hi

I am using your amazing tool to upload files to AWS Glacier vaults, but every time I upload a new file I get an error traceback

Traceback (most recent call last):
  File "/usr/local/bin/glacier", line 730, in <module>
  File "/usr/local/bin/glacier", line 716, in main
  File "/usr/local/bin/glacier", line 498, in archive_upload
  File "/root/scripts/glacier-cli/boto/glacier/vault.py", line 141, in create_archive_from_file
  File "/root/scripts/glacier-cli/boto/glacier/writer.py", line 152, in write
  File "/root/scripts/glacier-cli/boto/glacier/writer.py", line 141, in send_part
  File "/root/scripts/glacier-cli/boto/glacier/layer1.py", line 626, in upload_part
  File "/root/scripts/glacier-cli/boto/glacier/layer1.py", line 83, in make_request
  File "/root/scripts/glacier-cli/boto/connection.py", line 913, in make_request
  File "/root/scripts/glacier-cli/boto/connection.py", line 859, in _mexe
socket.gaierror: [Errno -2] Name or service not known

The thing is, after some time I check the vault with glacier-cli and it seems all the files were uploaded OK. So what about these messages? Are they something I have to worry about?

Thanks!!

comparison with aws glacier CLI?

#30 led me to boto/boto#1548 (comment) which in turn led me to the official aws glacier command. So now I'm a bit confused why glacier-cli exists at all, when there is an officially supported and apparently comprehensive upstream command which does the same thing. Is it simply that glacier-cli was created before aws glacier? Or does glacier-cli offer something which aws glacier doesn't? (besides git-annex support of course, which for me personally is the whole point of using glacier-cli)

I think this should be clarified in the README...

Wait hours also for delete?

Hi

I'm just wondering what happens when you use glacier-cli to request deletion of a file.

Should I also expect Glacier to take some hours to process the request and delete the file?

I have done two tests requesting deletion of a file, and when I then run

$ glacier --region eu-west-1 archive list myvault

the file I asked to delete is no longer shown. So? :)

What's going on? You have to wait hours to retrieve a file, but deletion is instantaneous?

This is important to me because I don't want to archive different versions of the same files, so I'm thinking of running a script that first deletes all the previously uploaded files and then uploads the new ones.

Thanks!

Error when using ca-central-1 region (stacktrace included in description)

Installation is working correctly for us-west-1 (and others), but not for ca-central-1:

./glacier.py --region ca-central-1 vault list
Traceback (most recent call last):
  File "./glacier.py", line 736, in <module>
    main()
  File "./glacier.py", line 732, in main
    App().main()
  File "./glacier.py", line 710, in __init__
    cache = Cache(get_connection_account(connection))
  File "./glacier.py", line 330, in get_connection_account
    return connection.layer1.aws_access_key_id
AttributeError: 'NoneType' object has no attribute 'layer1'

sqlalchemy.orm.exc.MultipleResultsFound: Multiple rows were found for one()

I've been trying to use glacier-cli with git-annex for a while and currently getting this on sync:

move SHA3_512-s78968--58aa331db254fd53cc422b6b8ba22b492cccec00b888497eb0bab8763d371b6232a854e1419190ab61685f5be82fa
be9b614cafd30f79f09044a3daea1291cfa (checking glacier...) Traceback (most recent call last):
  File "/home/annex/bin/glacier", line 737, in <module>
    main()
  File "/home/annex/bin/glacier", line 733, in main
    App().main()
  File "/home/annex/bin/glacier", line 719, in main
    self.args.func()
  File "/home/annex/bin/glacier", line 600, in archive_checkpresent
    self.args.vault, self.args.name)
  File "/home/annex/bin/glacier", line 161, in get_archive_last_seen
    result = self._get_archive_query_by_ref(vault, ref).one()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2404, in one
    "Multiple rows were found for one()")
sqlalchemy.orm.exc.MultipleResultsFound: Multiple rows were found for one()
(user error (glacier ["--region=us-east-1","archive","checkpresent","glacier-c22e72f3-62a4-4c74-afe1-ad946906e3e1","--quiet","SHA3_512-s78968--58aa331db254fd53cc422b6b8ba22b492cccec00b888497eb0bab8763d371b6232a854e1419190ab61685f5be82fabe9b614cafd30f79f09044a3daea1291cfa"] exited 1)) failed

Poor error message on duplicate archive name errors

I uploaded the same file twice. I got this error when I wanted to delete it.

Traceback (most recent call last):
  File "/usr/local/bin/glacier", line 730, in <module>
    App().main()
  File "/usr/local/bin/glacier", line 716, in main
    self.args.func()
  File "/usr/local/bin/glacier", line 587, in archive_delete
    self.args.vault, self.args.name)
  File "/usr/local/bin/glacier", line 145, in get_archive_id
    result = self._get_archive_query_by_ref(vault, ref).one()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2023, in one
    "Multiple rows were found for one()")
sqlalchemy.orm.exc.MultipleResultsFound: Multiple rows were found for one()

SSL certificate verify failed on upload of 6.6 GiB file

I uploaded a 6.6 GiB file and after some time I got this error:

Traceback (most recent call last):
  File "/home/amedee/bin/glacier", line 730, in <module>
    App().main()
  File "/home/amedee/bin/glacier", line 716, in main
    self.args.func()
  File "/home/amedee/bin/glacier", line 498, in archive_upload
    file_obj=self.args.file, description=name)
  File "/home/amedee/bin/glacier-cli/boto/glacier/vault.py", line 141, in create_archive_from_file
    writer.write(data)
  File "/home/amedee/bin/glacier-cli/boto/glacier/writer.py", line 152, in write
    self.send_part()
  File "/home/amedee/bin/glacier-cli/boto/glacier/writer.py", line 141, in send_part
    content_range, part)
  File "/home/amedee/bin/glacier-cli/boto/glacier/layer1.py", line 626, in upload_part
    response_headers=response_headers)
  File "/home/amedee/bin/glacier-cli/boto/glacier/layer1.py", line 83, in make_request
    data=data)
  File "/home/amedee/bin/glacier-cli/boto/connection.py", line 913, in make_request
    return self._mexe(http_request, sender, override_num_retries)
  File "/home/amedee/bin/glacier-cli/boto/connection.py", line 859, in _mexe
    raise e
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)

I don't know if the file was uploaded correctly, but the time was long enough to expect an upload to complete. The glacier archive list command doesn't show the file.

I have previously used glacier archive retrieve and glacier vault sync on this system, and they do not return an error.

EDIT: I uploaded a 1.2 GiB file and didn't get an error. The smaller file is listed with glacier archive list.

Credentials?

HI

I don't understand how to pass AWS credentials to glacier-cli

glacier vault list
Traceback (most recent call last):
  File "/usr/local/bin/glacier", line 730, in <module>
    App().main()
  File "/usr/local/bin/glacier", line 705, in __init__
    connection = boto.glacier.connect_to_region(args.region)
  File "/root/scripts/glacier-cli/boto/glacier/__init__.py", line 56, in connect_to_region
    return region.connect(**kw_params)
  File "/root/scripts/glacier-cli/boto/regioninfo.py", line 62, in connect
    return self.connection_cls(region=self, **kw_params)
  File "/root/scripts/glacier-cli/boto/glacier/layer2.py", line 38, in __init__
    self.layer1 = Layer1(*args, **kwargs)
  File "/root/scripts/glacier-cli/boto/glacier/layer1.py", line 63, in __init__
    suppress_consec_slashes)
  File "/root/scripts/glacier-cli/boto/connection.py", line 525, in __init__
    host, config, self.provider, self._required_auth_capability())
  File "/root/scripts/glacier-cli/boto/auth.py", line 621, in get_auth_handler
    'Check your credentials' % (len(names), str(names)))
boto.exception.NoAuthHandlerFound: No handler was ready to authenticate. 1 handlers were checked. ['HmacAuthV4Handler'] Check your credentials

Does anybody have any clue?

thanks

Could not upload files with diacritics in name

I have filename "./2009/Agátka ve školce/PC090374.JPG"
and I'm trying to upload it using:
glacier archive upload --name "./2009/Agátka ve školce/PC090374.JPG" Photos "./2009/Agátka ve školce/PC090374.JPG"

I end up with traceback:

Traceback (most recent call last):
  File "/usr/local/bin/glacier", line 618, in <module>
    App().main()
  File "/usr/local/bin/glacier", line 604, in main
    args.func(args)
  File "/usr/local/bin/glacier", line 416, in archive_upload
    archive_id = vault.create_archive_from_file(file_obj=args.file, description=name)
  File "/home/mirek/glacier-cli/boto/glacier/vault.py", line 163, in create_archive_from_file
    part_size=part_size)
  File "/home/mirek/glacier-cli/boto/glacier/vault.py", line 126, in create_archive_writer
    description)
  File "/home/mirek/glacier-cli/boto/glacier/layer1.py", line 479, in initiate_multipart_upload
    response_headers=response_headers)
  File "/home/mirek/glacier-cli/boto/glacier/layer1.py", line 83, in make_request
    raise UnexpectedHTTPResponseError(ok_responses, response)
boto.glacier.exceptions.UnexpectedHTTPResponseError

Not sure if this is a problem with boto or with glacier-cli.

Will investigate later.

TypeError: create_archive_from_file() got an unexpected keyword argument 'description'

I'm getting a TypeError whenever I try to upload a file to a vault:

glacier.py archive upload Backup file

where Backup is the vault name and file is a file on my fs.

Full stack trace:

Traceback (most recent call last):
  File "/Users/mark/bin/glacier.py", line 618, in <module>
    App().main()
  File "/Users/mark/bin/glacier.py", line 604, in main
    args.func(args)
  File "/Users/mark/bin/glacier.py", line 416, in archive_upload
    archive_id = vault.create_archive_from_file(file_obj=args.file, description=name)
TypeError: create_archive_from_file() got an unexpected keyword argument 'description'

fsck "light"

Running git annex fsck against the Glacier remote requests files from glacier. Would it be possible to have a "light" fsck which just compares the Glacier inventory hash with what is expected? This would be a really nice way to do the occasional check without having to retrieve all of that data.

Current project status?

There have been no commits for almost 2 years, and the installation instructions seem out of date, as reported in #49. This leads me to wonder whether the project is still maintained. It seems that other forks are more active. Thanks for any light you can shed.

Warn about multipart

Hi,

I just uploaded 32GB of data to Glacier with your tool (thanks for it), but was afterwards surprised by the 8500 requests this caused, which cost extra money. This was caused by a multipart size of 8MB in glacier-cli, it seemed. For some reason I had expected glacier to upload my files in one go, unless I explicitly specify a multipart size.

You should mention this issue more prominently, e.g. in README.md, and possibly change the default to uploading everything in one go.

Thanks,
Joachim

please provide a glacier-cli command

As noted in #30 , boto also provides a glacier command. Worse, this command exits 0 when passed glacier-cli parameters, without uploading anything. boto/boto#2942

This makes it very unsafe for git-annex to run "glacier" and expect it to do anything sane. I think it could best be dealt with by glacier-cli providing a glacier-cli command, which git-annex could then run. (It could continue providing a glacier command too, or not.. no opinion here.)

Empty file getting error

When I try to upload an empty file, I get the error IndexError: list index out of range.
This breaks any automation!

alp ▶ ws207 ▶ ~ ▶ $ ▶ python /opt/glacier-cli/glacier.py --region eu-west-1 archive upload live example-content2222
it's hashes on input: []
len(hashes): 0
[]
Traceback (most recent call last):
  File "/opt/glacier-cli/glacier.py", line 730, in <module>
    App().main()
  File "/opt/glacier-cli/glacier.py", line 716, in main
    self.args.func()
  File "/opt/glacier-cli/glacier.py", line 498, in archive_upload
    file_obj=self.args.file, description=name)
  File "/usr/lib/python2.7/dist-packages/boto/glacier/vault.py", line 179, in create_archive_from_file
    writer.close()
  File "/usr/lib/python2.7/dist-packages/boto/glacier/writer.py", line 229, in close
    self.uploader.close()
  File "/usr/lib/python2.7/dist-packages/boto/glacier/writer.py", line 156, in close
    hex_tree_hash = bytes_to_hex(tree_hash(self._tree_hashes))
  File "/usr/lib/python2.7/dist-packages/boto/glacier/utils.py", line 108, in tree_hash
    return hashes[0]
IndexError: list index out of range

Any plan to incorporate packaging improvements into the master branch?

In issue #50 you said "Packaging and documentation could do with improvement". Markus Hubig has added requirements.txt and setup.py in his branch here. His changes permit installation of glacier-cli with pip, which is always a good thing, using that URL. Will you please consider pulling Markus' changes into the master branch?

ctrl-c catching

See http://git-annex.branchable.com/todo/ctrl_c_handling/

Basically:
When I am transferring a large number of files to Glacier with git-annex, it'd be nice to ctrl-c and have the transfers killed cleanly (i.e. stop after the current transfer is completed, or, if it is a huge file, allow subsequent ctrl-c's to kill it). In the end, a ctrl-c would not result in ugly backtraces but instead useful messages ("All pending transfers stopped" or "All active and pending transfers killed", whatever makes sense).

FreeBSD glacier-cli empty archive list

Hi!

I've used glacier-cli on Linux and it's working great. I'm now trying to use glacier-cli on FreeBSD (FreeNAS 9.2): it will correctly list vaults, but when I try to list archives in a vault, I get nothing. I wiresharked traffic and I see no web traffic for 'glacier-cli archive list valuename' but I do capture traffic for 'glacier-cli vault list'.

On my FreeNAS system, I have Python 2.7.6, the latest glacier-cli, iso8601 0.1.11, and I've tried various versions of boto downloaded from GitHub.

Is there a way to debug boto and/or glacier-cli to see why it's not connecting to AWS Glacier when I ask it to query archives?

Thanks,

Bobby

UnicodeDecodeError due to non-ASCII chars in key

I've encountered this issue with glacier-cli failing due to git-annex mistakenly adding things that look like a file extension to the key when using the SHA256E backend. Essentially, certain files end up with characters appended to the key that look like a file extension, even when they might not actually be part of the extension.

Example:

 % ls 12.\ Change\ The\ World\ \(feat.\ 웅산\).mp3 
12. Change The World (feat. 웅산).mp3
 % git annex info 12.\ Change\ The\ World\ \(feat.\ 웅산\).mp3
file: 12. Change The World (feat. 웅산).mp3
size: 7.48 megabytes
key: SHA256E-s7479642--957208748ae03fe4fc8d7877b2c9d82b7f31be0726e4a3dec9063b84cc64cf09.웅산.mp3
present: true
 % git annex calckey 12.\ Change\ The\ World\ \(feat.\ 웅산\).mp3
SHA256E-s7479642--957208748ae03fe4fc8d7877b2c9d82b7f31be0726e4a3dec9063b84cc64cf09.웅산.mp3

I've opened an issue with git-annex here:
https://git-annex.branchable.com/bugs/git-annex_adds_unicode_characters_at_end_of_checksum/

And there will be a fix for the case with brackets, but there are other cases in which a file extension might not be pure ASCII. And then this is what happens:

% git annex copy 12.\ Change\ The\ World\ \(feat.\ 웅산\).mp3 --to glacier
copy 12. Change The World (feat. 웅산).mp3 (checking glacier...) Traceback (most recent call last):
  File "/usr/local/bin/glacier", line 737, in <module>
    main() 
  File "/usr/local/bin/glacier", line 733, in main
    App().main()
  File "/usr/local/bin/glacier", line 719, in main
    self.args.func()
  File "/usr/local/bin/glacier", line 600, in archive_checkpresent
    self.args.vault, self.args.name)
  File "/usr/local/bin/glacier", line 161, in get_archive_last_seen
    result = self._get_archive_query_by_ref(vault, ref).one()
  File "/usr/local/bin/glacier", line 136, in _get_archive_query_by_ref
    if ref.startswith('id:'):
UnicodeDecodeError: 'ascii' codec can't decode byte 0xec in position 83: ordinal not in range(128)
(user error (glacier ["--region=eu-west-1","archive","checkpresent","music","--quiet","SHA256E-s7479642--957208748ae03fe4fc8d7877b2c9d82b7f31be0726e4a3dec9063b84cc64cf09.\50885\49328.mp3"] exited 1)) failed
git-annex: copy: 1 failed

Now, as the bug report says, you can avoid this issue by changing your backend from SHA256E to SHA256 to avoid adding extensions. But I think addressing this issue would be good anyway.

Exception ends upload of very large file

Attempts to upload a very large (288.64 GB) file run for about an hour or two, then fail with the following output:

frontier:glacier-cli khagler$ ./glacier.py archive upload Photos ~/Documents/photos.tgz
Traceback (most recent call last):
  File "./glacier.py", line 618, in <module>
    App().main()
  File "./glacier.py", line 604, in main
    args.func(args)
  File "./glacier.py", line 416, in archive_upload
    archive_id = vault.create_archive_from_file(file_obj=args.file, description=name)
  File "/Users/khagler/glacier/glacier-cli/boto/glacier/vault.py", line 141, in create_archive_from_file
    writer.write(data)
  File "/Users/khagler/glacier/glacier-cli/boto/glacier/writer.py", line 152, in write
    self.send_part()
  File "/Users/khagler/glacier/glacier-cli/boto/glacier/writer.py", line 141, in send_part
    content_range, part)
  File "/Users/khagler/glacier/glacier-cli/boto/glacier/layer1.py", line 626, in upload_part
    response_headers=response_headers)
  File "/Users/khagler/glacier/glacier-cli/boto/glacier/layer1.py", line 83, in make_request
    data=data)
  File "/Users/khagler/glacier/glacier-cli/boto/connection.py", line 913, in make_request
    return self._mexe(http_request, sender, override_num_retries)
  File "/Users/khagler/glacier/glacier-cli/boto/connection.py", line 859, in _mexe
    raise e
socket.gaierror: [Errno 8] nodename nor servname provided, or not known

OS: Mac OS X 10.7.5 Server
Python 2.7.1

I tried uploading the same file using FastGlacier with the part size set to 1 GB. It would upload some of each part before failing with a message about the remote host dropping the connection. After setting the part size to 256 MB, it was able to upload individual parts successfully.

Addendum:

After a bit more investigation, I think I've figured out what might be going on. According to the Amazon documentation, the maximum number of parts for a multi-part upload is 10,000. For this (very large) archive to be split evenly into 10,000 parts, each part would have to be about 27.5 MB--or, given the limits on allowable part sizes, 32 MB. It looks like you're using a default part size (which I didn't realize at the time I could change) of 8 MB. If I'm right about that, then an 80 GB file would be a (marginally less painful) valid test.

Can't list when the upload was done with another tool

I'm opening a new issue related to #10. I've done more tests and concluded that when you upload with another tool (like FastGlacier), you can't download the file back using glacier-cli.

But everything works fine when you upload and download using glacier-cli. I don't think archives should be tied to the tool that was used to upload them.

Please let me know what I can provide to help you debug this.

Thank you

OS X 10.8: Missing third party modules - iso8601 and sqlalchemy

OS X 10.8.2 does not include the python modules iso8601 and sqlalchemy by default. Consider adding checks for third party modules.

$ ./glacier.py vault list
Traceback (most recent call last):
  File "./glacier.py", line 36, in <module>
    import iso8601
ImportError: No module named iso8601

Mac users can run the following commands to fix these dependencies:

sudo easy_install iso8601
sudo easy_install sqlalchemy

It's not clear how to provide authentication details

  1. Install the tool
  2. Execute glacier vaults list
  3. Exception happens
Traceback (most recent call last):
  File "C:/Python37-32/Scripts/glacier", line 161, in <module>
    main()
  File "C:/Python37-32/Scripts/glacier", line 149, in main
    list_vaults(region, access_key, secret_key)
  File "C:/Python37-32/Scripts/glacier", line 98, in list_vaults
    layer2 = connect(region, access_key = access_key, secret_key = secret_key)
  File "C:/Python37-32/Scripts/glacier", line 90, in connect
    debug=debug_level)
  File "C:\python37-32\lib\site-packages\boto\glacier\__init__.py", line 41, in connect_to_region
    return connect('glacier', region_name, connection_cls=Layer2, **kw_params)
  File "C:\python37-32\lib\site-packages\boto\regioninfo.py", line 220, in connect
    return region.connect(**kw_params)
  File "C:\python37-32\lib\site-packages\boto\regioninfo.py", line 290, in connect
    return self.connection_cls(region=self, **kw_params)
  File "C:\python37-32\lib\site-packages\boto\glacier\layer2.py", line 38, in __init__
    self.layer1 = Layer1(*args, **kwargs)
  File "C:\python37-32\lib\site-packages\boto\glacier\layer1.py", line 98, in __init__
    profile_name=profile_name)
  File "C:\python37-32\lib\site-packages\boto\connection.py", line 569, in __init__
    host, config, self.provider, self._required_auth_capability())
  File "C:\python37-32\lib\site-packages\boto\auth.py", line 1021, in get_auth_handler
    'Check your credentials' % (len(names), str(names)))
boto.exception.NoAuthHandlerFound: No handler was ready to authenticate. 1 handlers were checked. ['HmacAuthV4Handler'] Check your credentials

Please add a section to the README about storing credentials, and add proper error handling to the tool.

Vault list only report the first 10 vaults

I think something changed on Amazon's side, as the command

vault list

only reports the first 10 vaults.

Eg:

$binary --region $region vault list | wc -l
10

and I have about 200 vaults remotely.

The same happens in their AWS Glacier UI: they show you the first 10 and then you have to push "load more".

Can this be fixed somehow?

Thanks
A.

piping files to and from glacier-cli

It'd be very useful to be able to pipe data into glacier-cli to upload it, as well as allow it to download data and output it to stdout. I'd suggest using - as the filename to enable these modes.

One reason I'd like to be able to do this is because git-annex is encrypting files before sending them. Which means it has to write the encrypted file to a temp file right now, which is a waste of disk space and IO.

The other reason I'd like this is that I want to display progress when uploading & downloading, and if I can pipe data, it's easy to update my progress bar as I send/receive data.

Note: Some of this may already work, but I hesitate to use it unless it's documented. Especially because -o- might send a file to stdout, but for all I know, glacier-cli could write other messages to stdout and corrupt the file.

no archives listed

I can see the vault names and the number of objects in each vault. I can list the vault names in whatever region I choose:

./glacier.py --region us-west-1 vault list
testvault-2
testvault-3
testvault-4
testvault-5
testvault-6
testvault-7
testvault1
testvault2
testvault3
vault_test

vault_test has 1 archive in it according to what I uploaded yesterday and viewed in AWS web.

./glacier.py --region us-west-1 archive list vault_test
glacier-cli [master +0 ~0 -0]$

nothing. Am I missing something? I get the same result back for other region vaults and archives.

latest code as of this writing

(screenshot attached to the original issue: "screen shot 2014-01-28 at 4 16 49 pm")

Another UnicodeDecode Error

  File "/usr/local/bin/glacier", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/glacier.py", line 729, in main
    App().main()
  File "/usr/local/lib/python3.6/dist-packages/glacier.py", line 715, in main
    self.args.func()
  File "/usr/local/lib/python3.6/dist-packages/glacier.py", line 497, in archive_upload
    file_obj=self.args.file, description=name)
  File "/usr/local/lib/python3.6/dist-packages/boto/glacier/vault.py", line 174, in create_archive_from_file
    data = file_obj.read(part_size)
  File "/usr/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb7 in position 14: invalid start byte
failed                         

with git-annex context - reading from stdin with python3 requires .buffer

Thank you for the project.

I have been using it in the context of git-annex. My system has python3 installed and that seems to cause problems when reading from sys.stdin as provided by git-annex:

git annex move text_file.txt --to glacier
move text_file.txt (checking glacier...) (to glacier...)
100%  13 B               27 B/s 0sTraceback (most recent call last):
  File "/Users/username/.local/bin/glacier", line 740, in <module>
    main()
  File "/Users/username/.local/bin/glacier", line 736, in main
    App().main()
  File "/Users/username/.local/bin/glacier", line 722, in main
    self.args.func()
  File "/Users/username/.local/bin/glacier", line 504, in archive_upload
    file_obj=self.args.file, description=name)
  File "/Users/username/.pyenv/versions/3.7.3/lib/python3.7/site-packages/boto/glacier/vault.py", line 178, in create_archive_from_file
    writer.close()
  File "/Users/username/.pyenv/versions/3.7.3/lib/python3.7/site-packages/boto/glacier/writer.py", line 228, in close
    self.partitioner.flush()
  File "/Users/username/.pyenv/versions/3.7.3/lib/python3.7/site-packages/boto/glacier/writer.py", line 79, in flush
    self._send_part()
  File "/Users/username/.pyenv/versions/3.7.3/lib/python3.7/site-packages/boto/glacier/writer.py", line 64, in _send_part
    data = b''.join(self._buffer)
TypeError: sequence item 0: expected a bytes-like object, str found

100%  13 B               17 B/s 0sTraceback (most recent call last):
  File "/Users/username/.local/bin/glacier", line 740, in <module>
    main()
  File "/Users/username/.local/bin/glacier", line 736, in main
    App().main()
  File "/Users/username/.local/bin/glacier", line 722, in main
    self.args.func()
  File "/Users/username/.local/bin/glacier", line 504, in archive_upload
    file_obj=self.args.file, description=name)
  File "/Users/username/.pyenv/versions/3.7.3/lib/python3.7/site-packages/boto/glacier/vault.py", line 178, in create_archive_from_file
    writer.close()
  File "/Users/username/.pyenv/versions/3.7.3/lib/python3.7/site-packages/boto/glacier/writer.py", line 228, in close
    self.partitioner.flush()
  File "/Users/username/.pyenv/versions/3.7.3/lib/python3.7/site-packages/boto/glacier/writer.py", line 79, in flush
    self._send_part()
  File "/Users/username/.pyenv/versions/3.7.3/lib/python3.7/site-packages/boto/glacier/writer.py", line 64, in _send_part
    data = b''.join(self._buffer)
TypeError: sequence item 0: expected a bytes-like object, str found
failed
git-annex: move: 1 failed

When reading from stdin under Python 3, one needs to use sys.stdin.buffer.read() instead of sys.stdin.read(). I got it to work with that change.

git annex move text_file.txt --to glacier
move text_file.txt (checking glacier...) (to glacier...)
ok
(recording state in git...)

I will create a pull request with a patch.

Invalid Signature exception

hi

I have exported my AWS credentials to environment variables

and now I'm getting

boto.glacier.exceptions.UnexpectedHTTPResponseError: Expected (200,), got (403, {"message":"Signature not yet current: 20131120T113529Z is still later than 20131120T104538Z (20131120T104038Z + 5 min.)","code":"InvalidSignatureException","type":"Client"})

Do you know what is happening? I have googled a lot without success :S

UnexpectedHTTPResponseError when uploading

I have a 2.7G file that fails to upload every time with the following error:

Traceback (most recent call last):
  File "/home/jarno/bin/glacier", line 694, in <module>
    App().main()
  File "/home/jarno/bin/glacier", line 680, in main
    args.func(args)
  File "/home/jarno/bin/glacier", line 482, in archive_upload
    archive_id = vault.create_archive_from_file(file_obj=args.file, description=name)
  File "/home/jarno/bin/glacier-cli/glacier-cli/boto/glacier/vault.py", line 141, in create_archive_from_file
    writer.write(data)
  File "/home/jarno/bin/glacier-cli/glacier-cli/boto/glacier/writer.py", line 152, in write
    self.send_part()
  File "/home/jarno/bin/glacier-cli/glacier-cli/boto/glacier/writer.py", line 141, in send_part
    content_range, part)
  File "/home/jarno/bin/glacier-cli/glacier-cli/boto/glacier/layer1.py", line 626, in upload_part
    response_headers=response_headers)
  File "/home/jarno/bin/glacier-cli/glacier-cli/boto/glacier/layer1.py", line 88, in make_request
    raise UnexpectedHTTPResponseError(ok_responses, response)
boto.glacier.exceptions.UnexpectedHTTPResponseError: Expected (204,), got (403, {"message":"The value passed in as x-amz-content-sha256 does not match the computed payload hash. Computed digest: e5c9cf7330ec9629b5db363c84e0da85e8267ad3b4cc646135fa1450c78c7e7b expected hash: dd4e39e6d6e0dc191af09d36db834dd25026a97b066956a09e2664033142f253","code":"InvalidSignatureException","type":"Client"})

Any ideas what to try?

Check if upload fails from shell script

From a shell script, how do I check whether the file was successfully uploaded to the vault or not? Does glacier.py return something I can check to see if the upload succeeded or failed? I am using a shell script to schedule backups to Glacier, and I want to be notified by e-mail if an upload fails.

Documentation Missing: Encryption

Hi

It would be nice if you could drop a line about encryption into the readme. Are files encrypted when used directly? Are they encrypted when using git-annex? Am I right that using git-annex makes the backup unreadable by other tools?

Regards, Peter

UnicodeDecodeError while archiving files with git-annex to glacier

Hi, I'm trying git-annex with Glacier archiving, but unfortunately I get a strange UnicodeDecodeError when trying to archive a file. First I thought it was related to #16, but even if I rename the file to test.txt I get the same error message:

Traceback (most recent call last):
  File "/usr/local/bin/glacier", line 9, in <module>
    load_entry_point('glacier-cli==0.1.0', 'console_scripts', 'glacier-cli')()
  File "build/bdist.macosx-10.11-x86_64/egg/glacier.py", line 732, in entry_point
  File "build/bdist.macosx-10.11-x86_64/egg/glacier.py", line 718, in main
  File "build/bdist.macosx-10.11-x86_64/egg/glacier.py", line 500, in archive_upload
  File "/Users/markus/.virtualenvs/glacier-cli/lib/python2.7/site-packages/boto-2.38.0-py2.7.egg/boto/glacier/vault.py", line 178, in create_archive_from_file
    writer.close()
  File "/Users/markus/.virtualenvs/glacier-cli/lib/python2.7/site-packages/boto-2.38.0-py2.7.egg/boto/glacier/writer.py", line 228, in close
    self.partitioner.flush()
  File "/Users/markus/.virtualenvs/glacier-cli/lib/python2.7/site-packages/boto-2.38.0-py2.7.egg/boto/glacier/writer.py", line 79, in flush
    self._send_part()
  File "/Users/markus/.virtualenvs/glacier-cli/lib/python2.7/site-packages/boto-2.38.0-py2.7.egg/boto/glacier/writer.py", line 75, in _send_part
    self.send_fn(part)
  File "/Users/markus/.virtualenvs/glacier-cli/lib/python2.7/site-packages/boto-2.38.0-py2.7.egg/boto/glacier/writer.py", line 222, in _upload_part
    self.uploader.upload_part(self.next_part_index, part_data)
  File "/Users/markus/.virtualenvs/glacier-cli/lib/python2.7/site-packages/boto-2.38.0-py2.7.egg/boto/glacier/writer.py", line 129, in upload_part
    content_range, part_data)
  File "/Users/markus/.virtualenvs/glacier-cli/lib/python2.7/site-packages/boto-2.38.0-py2.7.egg/boto/glacier/layer1.py", line 1279, in upload_part
    response_headers=response_headers)
  File "/Users/markus/.virtualenvs/glacier-cli/lib/python2.7/site-packages/boto-2.38.0-py2.7.egg/boto/glacier/layer1.py", line 114, in make_request
    data=data)
  File "/Users/markus/.virtualenvs/glacier-cli/lib/python2.7/site-packages/boto-2.38.0-py2.7.egg/boto/connection.py", line 1071, in make_request
    retry_handler=retry_handler)
  File "/Users/markus/.virtualenvs/glacier-cli/lib/python2.7/site-packages/boto-2.38.0-py2.7.egg/boto/connection.py", line 943, in _mexe
    request.body, request.headers)
  File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1053, in request
    self._send_request(method, url, body, headers)
  File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1093, in _send_request
    self.endheaders(body)
  File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 1049, in endheaders
    self._send_output(message_body)
  File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 891, in _send_output
    msg += message_body
UnicodeDecodeError: 'ascii' codec can't decode byte 0x8c in position 0: ordinal not in range(128)

Need a way to access glacier ids in event of dupe archive or non-named archives

Glacier allows the uploading of multiple archives with the same name (also archives with no name), so the only way to retrieve or delete the dupes is by id. Currently, glacier-cli provides no way to retrieve the ids from the database.

From the command line, we can access them from the local cache DB:

sqlite3 ~/.cache/glacier-cli/db "select name, id from archive"

From the code, the error condition can be reported when the sqlalchemy.orm.exc.MultipleResultsFound exception is thrown. For example, see http://pastebin.com/Y6WybtJH

Limit hourly retrieval rate

Glacier can be very expensive when downloading files too fast, but free when spreading retrievals across a month.

Do you have any plans to allow setting an hourly retrieval limit? User would need to calculate this by

total storage * 5% / 720 hours
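
(As a rough worked example of that formula: with 1 TiB stored, 5% is about 51 GiB per month, which works out to roughly 73 MiB per hour.)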

Can't list when the upload was done with another tool

Hi.
I've uploaded all my data using FastGlacier and I can't get an archive list using glacier-cli, even though I'm using the same access key ID and secret access key.

Everything goes OK when I create a new vault using glacier-cli and upload files using the command line.

What more information can I give to you?

UnicodeDecodeError on upload

Unfortunately this is a blocker for me, rendering git-annex's glacier special remote useless - the whole point of trying it was to be able to upload big binary files.

$ glacier --region eu-west-1 archive upload music myiso.exe
Traceback (most recent call last):
  File "/home/adam/bin/glacier", line 730, in <module>
    App().main()
  File "/home/adam/bin/glacier", line 716, in main
    self.args.func()
  File "/home/adam/bin/glacier", line 498, in archive_upload
    file_obj=self.args.file, description=name)
  File "/usr/lib/python2.7/site-packages/boto/glacier/vault.py", line 177, in create_archive_from_file
    writer.write(data)
  File "/usr/lib/python2.7/site-packages/boto/glacier/writer.py", line 219, in write
    self.partitioner.write(data)
  File "/usr/lib/python2.7/site-packages/boto/glacier/writer.py", line 61, in write
    self._send_part()
  File "/usr/lib/python2.7/site-packages/boto/glacier/writer.py", line 75, in _send_part
    self.send_fn(part)
  File "/usr/lib/python2.7/site-packages/boto/glacier/writer.py", line 222, in _upload_part
    self.uploader.upload_part(self.next_part_index, part_data)
  File "/usr/lib/python2.7/site-packages/boto/glacier/writer.py", line 129, in upload_part
    content_range, part_data)
  File "/usr/lib/python2.7/site-packages/boto/glacier/layer1.py", line 1279, in upload_part
    response_headers=response_headers)
  File "/usr/lib/python2.7/site-packages/boto/glacier/layer1.py", line 114, in make_request
    data=data)
  File "/usr/lib/python2.7/site-packages/boto/connection.py", line 1068, in make_request
    retry_handler=retry_handler)
  File "/usr/lib/python2.7/site-packages/boto/connection.py", line 942, in _mexe
    request.body, request.headers)
  File "/usr/lib64/python2.7/httplib.py", line 973, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib64/python2.7/httplib.py", line 1007, in _send_request
    self.endheaders(body)
  File "/usr/lib64/python2.7/httplib.py", line 969, in endheaders
    self._send_output(message_body)
  File "/usr/lib64/python2.7/httplib.py", line 827, in _send_output
    msg += message_body
UnicodeDecodeError: 'ascii' codec can't decode byte 0xff in position 12: ordinal not in range(128)
