This issue is to brainstorm the design of the API endpoints and responses. I'll start with a couple of points on shorter URLs, HATEOAS and the folder structure.
Shorter URLs
I propose maintaining numeric IDs for each author, corpus, text, etc. and using those to construct the REST endpoints.
So, for example, the endpoint GET /lang/latin/corpus/perseus/author/tacitus/text/germania becomes GET /lang/latin/corpus/1/author/6/text/8.
This keeps the URLs short, while the names that the IDs map to can be as long as needed.
A problem with this (assuming an external API consumer) is figuring out the ID of a specific author/corpus/text.
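To make the mapping concrete, here is a minimal sketch of how the short URL could be built from names. The lookup tables and the short_url helper are hypothetical; only the IDs 1, 6 and 8 come from the example above.

```python
# Hypothetical name-to-ID tables; the values follow the example URL above.
CORPORA = {"perseus": 1}
AUTHORS = {"tacitus": 6}
TEXTS = {"germania": 8}


def short_url(lang, corpus, author, text):
    """Build the ID-based URL from human-readable names."""
    return "/lang/{}/corpus/{}/author/{}/text/{}".format(
        lang, CORPORA[corpus], AUTHORS[author], TEXTS[text]
    )


print(short_url("latin", "perseus", "tacitus", "germania"))
# /lang/latin/corpus/1/author/6/text/8
```

An external consumer does not have these tables, which is exactly the lookup problem described above.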
API Discoverability
The formal term for this is HATEOAS (Hypermedia as the Engine of Application State). It means a user should be able to browse and discover all the endpoints of the REST API using the REST API itself.
Towards this, we should define endpoints like GET /lang/latin/corpus/ that return a response such as:
{"corpora": [ {"name": "perseus", "id": "1"}, ... ]}
This way, the user will be able to query for all the available corpora and figure out the ID.
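A rough sketch of both sides of that exchange, assuming an in-memory store: list_corpora stands in for the handler behind GET /lang/latin/corpus/, and find_corpus_id is what a client might do with the response. Both function names and the store are assumptions for illustration, not existing code.

```python
import json

# Hypothetical in-memory store; a real implementation would read from the corpus data.
CORPORA = {"latin": [{"name": "perseus", "id": "1"}]}


def list_corpora(lang):
    """Response body for GET /lang/<lang>/corpus/ (empty list for unknown languages)."""
    return {"corpora": CORPORA.get(lang, [])}


def find_corpus_id(response_body, name):
    """Client side: resolve a corpus name to its ID from the listing response."""
    for corpus in response_body["corpora"]:
        if corpus["name"] == name:
            return corpus["id"]
    return None


# Round-trip through JSON to mimic what a client would receive over HTTP.
body = json.loads(json.dumps(list_corpora("latin")))
print(find_corpus_id(body, "perseus"))  # 1
```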
Another example of this is from my POS tagger implementation. It is possible to view the list of languages and the POS tagging methods they support via GET /core/pos, and to perform the actual POS tagging for a string via POST /core/pos.
In general, adding a GET request handler to endpoints like /lang, /lang/<int:lang_id>/corpus, etc. should make the API discoverable.
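One possible shape for these handlers, written as plain functions so the idea is independent of the web framework. Everything here is an assumption for illustration: the data, the ID 1 for Latin, and in particular the corpora_url field, which is one way to let each response point at the next endpoint to query.

```python
# Hypothetical data: lang_id -> name, and lang_id -> {corpus_id: name}.
LANGS = {1: "latin"}
CORPORA = {1: {1: "perseus"}}


def get_langs():
    """Body for GET /lang: every language plus a link to its corpus listing."""
    return {
        "languages": [
            {
                "id": lang_id,
                "name": name,
                "corpora_url": "/lang/{}/corpus".format(lang_id),
            }
            for lang_id, name in sorted(LANGS.items())
        ]
    }


def get_corpora(lang_id):
    """Body for GET /lang/<int:lang_id>/corpus: corpora available for one language."""
    corpora = CORPORA.get(lang_id, {})
    return {
        "corpora": [
            {"id": corpus_id, "name": name}
            for corpus_id, name in sorted(corpora.items())
        ]
    }


# A client starts at /lang and follows the returned link without knowing any IDs.
print(get_langs()["languages"][0]["corpora_url"])  # /lang/1/corpus
print(get_corpora(1))
```

Starting from /lang alone, a consumer can walk down to any corpus, author or text by repeating the same pattern at each level.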
Folder Structure
Right now all the resources are defined in a single file (api_json.py), and so are the tests (tests.py). There is also no distinction between files containing utility functions and actual REST resources.
I briefly mentioned this in my #20 (comment).
An example of my proposed organisation is in #27. Inside the folder for a specific function (/pos), the resources will be in views.py, the database stuff (if any) in models.py, utility functions in utils.py and parameters in constants.py.
(It may be better to keep constants.py at the root of the API folder structure, so that it is easy to find and change.)