dogsheep / dogsheep-photos

Upload your photos to S3 and import metadata about them into a SQLite database

License: Apache License 2.0

dogsheep datasette-io datasette sqlite datasette-tool

dogsheep-photos's Introduction

dogsheep-photos


Save details of your photos to a SQLite database and upload them to S3.

See Using SQL to find my best photo of a pelican according to Apple Photos for background information on this project.

What these tools do

These tools are a work-in-progress mechanism for taking full ownership of your photos. The core idea is to help implement the following:

  • Every photo you have taken lives in a single, private Amazon S3 bucket
  • You have a single SQLite database file which stores metadata about those photos - potentially pulled from multiple different places. This may include EXIF data, Apple Photos, the results of running machine learning APIs against photos and much more besides.
  • You can then use Datasette to explore your own photos.

I'm a heavy user of Apple Photos so the initial releases of this tool will have a bias towards that, but ideally I would like a subset of these tools to be useful to people no matter which core photo solution they are using.

Installation

$ pip install dogsheep-photos

Authentication (if using S3)

If you want to use S3 to store your photos, you will need to first create S3 credentials for a new, dedicated bucket.

You may find the s3-credentials tool useful for this.

Run this command and paste in your credentials. You will need three values: the name of your S3 bucket, your Access key ID and your Secret access key.

$ dogsheep-photos s3-auth

This will create a file called auth.json in your current directory containing the required values. To save the file at a different path or filename, use the --auth=myauth.json option.

Uploading photos

Run this command to upload every photo in a specific directory to your S3 bucket:

$ dogsheep-photos upload photos.db \
    ~/Pictures/Photos\ Library.photoslibrary/original

The command will only upload photos that have not yet been uploaded, based on their sha256 hash.

photos.db will be created with an uploads table containing details of which files were uploaded.

To see what the command would do without uploading any files, use the --dry-run option.

The sha256 hash of the photo contents will be used as the name of the file in the bucket, with an extension matching the type of file. This is an implementation of the Content addressable storage pattern.
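
For illustration, here is a minimal sketch of how such a content-addressed key can be derived in Python. This is not the tool's internal code; the chunk size and helper name are arbitrary choices:

import hashlib
import pathlib

def content_addressed_key(path):
    # Hash the file contents in chunks so large photos never need to fit in memory
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    # Keep the original extension so the object type stays obvious in the bucket
    extension = pathlib.Path(path).suffix.lstrip(".").lower()
    return "{}.{}".format(digest.hexdigest(), extension)

# The same photo always maps to the same key, so repeated uploads are no-ops
print(content_addressed_key("IMG_0001.jpeg"))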

Importing Apple Photos metadata

The apple-photos command imports metadata from your Apple Photos library.

$ dogsheep-photos apple-photos photos.db

Imported metadata includes places, people, albums, quality scores and machine learning labels for the photo contents.

Creating a subset database

You can create a new, subset database of photos using the create-subset command.

This is useful for creating a shareable SQLite database that only contains metadata for a selected set of photos.

Since photo metadata contains latitude and longitude you may not want to share a database that includes photos taken at your home address.

create-subset takes three arguments: an existing database file created using the apple-photos command, the name of the new, shareable database file you would like to create and a SQL query that returns the sha256 hash values of the photos you would like to include in that database.

For example, here's how to create a shareable database of just the photos that have been added to albums containing the word "Public":

$ dogsheep-photos create-subset \
    photos.db \
    public.db \
    "select sha256 from apple_photos where albums like '%Public%'"

Serving photos locally with datasette-media

If you don't want to upload your photos to S3 but you still want to browse them using Datasette you can do so using the datasette-media plugin. This plugin adds the ability to serve images and other static files directly from disk, configured using a SQL query.

To use it, first install Datasette and the plugin:

$ pip install datasette datasette-media

If any of your photos are .HEIC images taken by an iPhone you should also install the optional pyheif dependency:

$ pip install pyheif

Now create a metadata.yaml file configuring the plugin:

plugins:
  datasette-media:
    thumbnail:
      sql: |-
        select path as filepath, 200 as resize_height from apple_photos where uuid = :key
    large:
      sql: |-
        select path as filepath, 1024 as resize_height from apple_photos where uuid = :key

This will configure two URL endpoints: one for 200-pixel-high thumbnails and one for 1024-pixel-high larger images.

Create your photos.db database using the apple-photos command, then run Datasette like this:

$ datasette -m metadata.yaml

Your photos will be served on URLs that look like this:

http://127.0.0.1:8001/-/media/thumbnail/F4469918-13F3-43D8-9EC1-734C0E6B60AD
http://127.0.0.1:8001/-/media/large/F4469918-13F3-43D8-9EC1-734C0E6B60AD

You can find the UUIDs for use in these URLs by running select uuid from photos_with_apple_metadata.
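
As a rough illustration, this sketch uses Python's sqlite3 module to pull a few UUIDs out of photos.db and print the corresponding datasette-media URLs. The table name comes from the paragraph above; the host and port assume Datasette's defaults:

import sqlite3

conn = sqlite3.connect("photos.db")
uuids = [
    row[0]
    for row in conn.execute("select uuid from photos_with_apple_metadata limit 5")
]

for uuid in uuids:
    # These paths match the 'thumbnail' and 'large' keys configured in metadata.yaml
    print("http://127.0.0.1:8001/-/media/thumbnail/" + uuid)
    print("http://127.0.0.1:8001/-/media/large/" + uuid)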

Displaying images using datasette-json-html

If you are using datasette-media to serve photos you can include images directly in Datasette query results using the datasette-json-html plugin.

Run pip install datasette-json-html to install the plugin, then use queries like this to view your images:

select
    json_object(
        'img_src',
        '/-/media/thumbnail/' || uuid
    ) as photo,
    uuid,
    date
from
    apple_photos
order by
    date desc
limit 10;

The photo column returned by this query should render as image tags that display the correct images.

Displaying images using custom template pages

Datasette's custom pages feature lets you create custom pages for a Datasette instance by dropping HTML templates into a templates/pages directory and then running Datasette using datasette --template-dir=templates/.

You can combine that ability with the datasette-template-sql plugin to create custom template pages that directly display photos served by datasette-media.

Install the plugin using pip install datasette-template-sql.

Create a templates/pages folder and add the following files:

recent-photos.html

<h1>Recent photos</h1>

<div>
{% for photo in sql("select * from apple_photos order by date desc limit 20") %}
    <img src="/-/media/photo/{{ photo['uuid'] }}">
{% endfor %}
</div>

random-photos.html

<h1>Random photos</h1>

<div>
{% for photo in sql("with foo as (select * from apple_photos order by date desc limit 5000) select * from foo order by random() limit 20") %}
    <img src="/-/media/photo/{{ photo['uuid'] }}">
{% endfor %}
</div>

Now run Datasette like this:

$ datasette photos.db -m metadata.yaml --template-dir=templates/

Visiting http://localhost:8001/recent-photos will display 20 recent photos. Visiting http://localhost:8001/random-photos will display 20 photos randomly selected from your 5,000 most recent.

dogsheep-photos's People

Contributors

rhettbull, simonw

dogsheep-photos's Issues

bucket name

I followed the instructions to set up credentials but I am getting an invalid bucket name error. Can you put a sample auth.json file in the repo that shows the correct format for this? Thanks

ImportError: cannot import name 'formatargspec' from 'inspect'

I get the following error when running pip3 install dogsheep-photos:

from inspect import ismethod, isclass, formatargspec
ImportError: cannot import name 'formatargspec' from 'inspect' (/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/inspect.py). Did you mean: 'formatargvalues'?

Python 3.12.0rc1
sqlite 3.43.0
datasette, version 0.64.3

photo-to-sqlite: command not found

Having installed in a venv I get:

(venv) (base) Robins-MacBook:datasette robin$ photo-to-sqlite apple-photos photos.db

-bash: photo-to-sqlite: command not found

Integrate image content hashing

To spot duplicate images (where the file content differs such that the sha256 is no longer a match) it would be useful to calculate and store perceptual hashes of some sort.
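
As one possible approach (the issue does not commit to a specific algorithm), a simple average hash can be computed with Pillow; visually similar photos then differ in only a few bits:

from PIL import Image  # pip install Pillow

def average_hash(path, hash_size=8):
    # Shrink to a hash_size x hash_size greyscale image, then compare each pixel to the mean
    image = Image.open(path).convert("L").resize((hash_size, hash_size))
    pixels = list(image.getdata())
    mean = sum(pixels) / len(pixels)
    bits = "".join("1" if pixel > mean else "0" for pixel in pixels)
    return int(bits, 2)

def hamming_distance(a, b):
    # Count differing bits; a small distance suggests near-duplicate images
    return bin(a ^ b).count("1")

# A distance of 0-5 out of 64 bits usually indicates the same photo re-encoded
# hamming_distance(average_hash("a.jpg"), average_hash("b.jpg"))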

create-subset command for creating a publishable subset of a photos database

I want to share a subset of my photos, without sharing everything. Idea:

$ photos-to-sqlite create-subset photos.db public.db "select sha256 from ... where ..."

So the command takes a SQL query that returns sha256 hashes, then creates a new file called public.db containing just the data corresponding to those photos.

apple-photos command should work even if upload has not run

I want people to be able to query their Apple Photos metadata without having to first run upload to upload all of their files to their own S3 bucket.

To do this I can have apple-photos calculate SHA256 hashes of each photo if the uploads table does not yet exist (or does not contain that photo).
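
A minimal sketch of that fallback, using Python's sqlite3 module directly (the uploads column names here are assumptions, not the tool's confirmed schema):

import hashlib
import sqlite3

def sha256_for_photo(db_path, photo_path):
    # Reuse a hash recorded by the upload command if one exists
    conn = sqlite3.connect(db_path)
    has_uploads = conn.execute(
        "select count(*) from sqlite_master where type = 'table' and name = 'uploads'"
    ).fetchone()[0]
    if has_uploads:
        # Assumed columns: path and sha256
        row = conn.execute(
            "select sha256 from uploads where path = ?", (photo_path,)
        ).fetchone()
        if row:
            return row[0]
    # Otherwise hash the file locally so apple-photos works without a prior upload
    digest = hashlib.sha256()
    with open(photo_path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()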

KeyError: 'Contents' on running upload

Following the README on Big Sur, having entered my auth credentials via dogsheep-photos s3-auth:

(venv) (base) Robins-MacBook:datasette robin$ dogsheep-photos upload photos.db     ~/Pictures/Photos\ /Users/robin/Pictures/Library.photoslibrary --dry-run

Fetching existing keys from S3...
Traceback (most recent call last):
  File "/Users/robin/datasette/venv/bin/dogsheep-photos", line 8, in <module>
    sys.exit(cli())
  File "/Users/robin/datasette/venv/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/robin/datasette/venv/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/robin/datasette/venv/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/robin/datasette/venv/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/robin/datasette/venv/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/robin/datasette/venv/lib/python3.8/site-packages/dogsheep_photos/cli.py", line 96, in upload
    key.split(".")[0] for key in get_all_keys(client, creds["photos_s3_bucket"])
  File "/Users/robin/datasette/venv/lib/python3.8/site-packages/dogsheep_photos/utils.py", line 46, in get_all_keys
    for row in page["Contents"]:
KeyError: 'Contents'

Possibly because the bucket is in EU (London) eu-west-2 and this info is not requested?
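
The traceback comes from indexing page["Contents"] directly; with boto3's list_objects_v2 paginator that key is simply absent when a bucket (or a page) is empty. A hedged sketch of a more defensive helper, not necessarily how the project resolved it:

import boto3

def get_all_keys(client, bucket):
    # Empty buckets return pages without a "Contents" key, so default to an empty list
    keys = []
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        for row in page.get("Contents", []):
            keys.append(row["Key"])
    return keys

# client = boto3.client("s3")
# print(len(get_all_keys(client, "my-photos-bucket")))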

bpylist.archiver.CircularReference: archive has a cycle with uid(13)

% python -i $(which photos-to-sqlite) apple-photos photos.db                     
Traceback (most recent call last):
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/osxphotos/photoinfo.py", line 611, in place
    return self._place  # pylint: disable=access-member-before-definition
AttributeError: 'PhotoInfo' object has no attribute '_place'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/bin/photos-to-sqlite", line 11, in <module>
    load_entry_point('photos-to-sqlite', 'console_scripts', 'photos-to-sqlite')()
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/Users/simon/Dropbox/Development/photos-to-sqlite/photos_to_sqlite/cli.py", line 249, in apple_photos
    photo_row = osxphoto_to_row(sha256, photo)
  File "/Users/simon/Dropbox/Development/photos-to-sqlite/photos_to_sqlite/utils.py", line 91, in osxphoto_to_row
    place = photo.place
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/osxphotos/photoinfo.py", line 614, in place
    self._place = PlaceInfo5(self._info["reverse_geolocation"])
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/osxphotos/placeinfo.py", line 505, in __init__
    self._plrevgeoloc = archiver.unarchive(revgeoloc_bplist)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 16, in unarchive
    return Unarchive(plist).top_object()
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 256, in top_object
    return self.decode_object(self.top_uid)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 247, in decode_object
    obj = klass.decode_archive(ArchivedObject(raw_obj, self))
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/osxphotos/placeinfo.py", line 126, in decode_archive
    mapItem = archive.decode("mapItem")
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 140, in decode
    return self._unarchiver.decode_key(self._object, key)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 216, in decode_key
    return self.decode_object(val)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 247, in decode_object
    obj = klass.decode_archive(ArchivedObject(raw_obj, self))
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/osxphotos/placeinfo.py", line 180, in decode_archive
    sortedPlaceInfos = archive.decode("sortedPlaceInfos")
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 140, in decode
    return self._unarchiver.decode_key(self._object, key)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 216, in decode_key
    return self.decode_object(val)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 247, in decode_object
    obj = klass.decode_archive(ArchivedObject(raw_obj, self))
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 112, in decode_archive
    return [archive._decode_index(index) for index in uids]
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 112, in <listcomp>
    return [archive._decode_index(index) for index in uids]
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 137, in _decode_index
    return self._unarchiver.decode_object(index)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 247, in decode_object
    obj = klass.decode_archive(ArchivedObject(raw_obj, self))
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/osxphotos/placeinfo.py", line 217, in decode_archive
    placeType = archive.decode("placeType")
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 140, in decode
    return self._unarchiver.decode_key(self._object, key)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 216, in decode_key
    return self.decode_object(val)
  File "/Users/simon/.local/share/virtualenvs/photos-to-sqlite-0uGSHd6e/lib/python3.8/site-packages/bpylist/archiver.py", line 227, in decode_object
    raise CircularReference(index)
bpylist.archiver.CircularReference: archive has a cycle with uid(13)

In the debugger I traced this back to:

178  	    @staticmethod
179  	    def decode_archive(archive):
180  ->	        sortedPlaceInfos = archive.decode("sortedPlaceInfos")
181  	        finalPlaceInfos = archive.decode("finalPlaceInfos")
182  	        return PLRevGeoMapItem(sortedPlaceInfos, finalPlaceInfos)
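
Since the crash happens while osxphotos reads photo.place, one possible workaround (an illustration only, not the project's actual fix) is to treat place metadata as optional and skip records that bpylist cannot unarchive:

from bpylist import archiver

def safe_place(photo):
    # photo.place triggers the bpylist unarchiving shown above; fall back to None
    # if the embedded reverse-geolocation plist contains a cycle
    try:
        return photo.place
    except archiver.CircularReference:
        return None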
