anarcat / fatcat

This project forked from internetarchive/fatcat

MIRROR of upstream IA repository

Home Page: https://guide.fatcat.wiki

License: Other

fatcat's Introduction

  __       _            _   
 / _| __ _| |_ ___ __ _| |_ 
| |_ / _` | __/ __/ _` | __|
|  _| (_| | || (_| (_| | |_ 
|_|  \__,_|\__\___\__,_|\__|

                                    ... catalog all the things!

The RFC is the original design document, and the best place to start for background. There is a work-in-progress "guide" at https://guide.fatcat.wiki; the canonical public location of this repository is https://github.com/internetarchive/fatcat.

There are four main components:

  • backend API server and database
  • elasticsearch index
  • API client libraries and bots (eg, ingesters)
  • front-end web interface (built on API and library)

The API server was prototyped in Python. The "real" implementation started in Go but shifted to Rust, and is a work in progress. The beginnings of a client library, web interface, and data ingesters exist in Python. The Elasticsearch index is currently just a Crossref metadata dump and doesn't match entities in the database/API (but is useful for paper lookups).

See the LICENSE file for detailed permissions and licensing of both the Python and Rust code. In short, the auto-generated client libraries are permissively released, while the API server and web interface are strong copyleft (AGPLv3).

Status

  • HTTP API
    • base32 encoding of UUID identifiers
    • inverse many-to-many helpers (files-by-release, release-by-creator)
  • SQL Schema
    • Basic entities
    • one-to-many and many-to-many entities
    • JSON(B) "extra" metadata fields
    • full rev1 schema for all entities
    • editgroup review: comments? actions?
  • Web Interface
    • Migrate Python codebase
    • Creation and editing of all entities
  • Other
    • Basic logging
    • Swagger-UI
    • Sentry (error reporting)
    • Metrics
    • Authentication (eg, accounts, OAuth2, JWT)
    • Authorization (aka, roles)
    • bot vs. editor

Identifiers

Fatcat entity identifiers are 128-bit UUIDs encoded in base32 format (26 characters, with the padding stripped). Revision ids are also UUIDs, but are encoded in the standard hyphenated UUID form to distinguish them from entity identifiers.

Python helpers for conversion:

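import base64
import uuid

def fcid2uuid(s):
    # Keep only the text after the last '_' and uppercase it for b32decode
    s = s.split('_')[-1].upper().encode('utf-8')
    assert len(s) == 26
    # Restore the 6 '=' padding characters that the short form strips
    raw = base64.b32decode(s + b"======")
    return str(uuid.UUID(bytes=raw)).lower()

def uuid2fcid(s):
    raw = uuid.UUID(s).bytes
    # 16 bytes encode to 26 base32 characters plus 6 '=' of padding; drop the padding
    return base64.b32encode(raw)[:26].lower().decode('utf-8')

# Round-trip check: UUID -> fatcat identifier -> UUID
test_uuid = '00000000-0000-0000-3333-000000000001'
assert test_uuid == fcid2uuid(uuid2fcid(test_uuid))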
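
Because entity identifiers and revision ids use different encodings, the form of a string is enough to tell them apart. The following is a minimal sketch of that check, not part of fatcat itself: the function name classify_identifier and its return values are hypothetical, and it assumes entity identifiers are passed without any prefix.

```python
import base64
import uuid

def classify_identifier(s):
    """Hypothetical helper: return 'entity', 'revision', or None."""
    # Entity identifiers: 26 base32 characters encoding 16 bytes
    if len(s) == 26:
        try:
            raw = base64.b32decode(s.upper() + "======")
        except ValueError:
            # Raised for characters outside the base32 alphabet
            return None
        return "entity" if len(raw) == 16 else None
    # Revision ids: standard 36-character hyphenated UUID form
    if len(s) == 36:
        try:
            parsed = uuid.UUID(s)
        except ValueError:
            return None
        # Only accept the canonical hyphenated layout
        return "revision" if str(parsed) == s.lower() else None
    return None
```

Passing the test UUID above should yield 'revision', while the output of uuid2fcid for the same UUID should yield 'entity'; anything else returns None.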

fatcat's People

Contributors

bnewbold, vinaygo, espertus, mekarpeles
