review theme module structure

Is the __init__.py file needed? I assume yes, because it is a Python module. Next question: is the version import needed? I think it is not; shall it be removed?

from ..version import __version__

__all__ = ('__version__', )

datamodel: create datacite model based on zenodo

InvenioRDM data model

DataCite JSON Schema and PDF version.
Translate all camel to snake case.
Let's shorten common structures like schemeURI/valueURI -> scheme/value
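
The two renaming rules above could be sketched as follows. This is only an illustration: `camel_to_snake`, `translate_key` and the `SHORT_NAMES` map are hypothetical names, and reading "schemeURI/valueURI -> scheme/value" as a plain key rename is an assumption.

```python
import re

# Hypothetical shortenings, following the schemeURI/valueURI -> scheme/value idea.
SHORT_NAMES = {"scheme_uri": "scheme", "value_uri": "value"}


def camel_to_snake(name):
    """Convert a camelCase DataCite key to snake_case."""
    # Split an acronym from a following capitalised word: "URIScheme" -> "URI_Scheme".
    name = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", name)
    # Split a lowercase letter or digit from a following capital: "schemeURI" -> "scheme_URI".
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    return name.lower()


def translate_key(name):
    """Snake-case a key, then apply the optional shortening."""
    snake = camel_to_snake(name)
    return SHORT_NAMES.get(snake, snake)
```

For example, `translate_key("nameIdentifierScheme")` yields `name_identifier_scheme`, while `translate_key("schemeURI")` yields `scheme`.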

Metadata fields

Extras for all

Add references (need to check whether the Zenodo structure is OK). Not all references have related identifiers; sometimes a reference is just free text.
internal notes (discuss with ILS what they have). "curators notes", non-public notes.
Access right. ~~Discuss _access~~ TBD in the future.
Access condition

How to handle custom fields?

See Zenodo implementation (custom)

Zenodo custom fields

Imprint
Journal
Part of?
Thesis

TODO

Check against ILS schema.

Discussion points for community:

Keywords or only subjects?

*CV: Controlled Vocabulary

publication_date: support for EDTF lvl 0

~~Currently only publication_date is specified to support EDTF.~~

~~However, all the rest of fields have date-time as format. Should they only be of format date? date was only introduced in JSONSchema draft 7 (more info here).~~

Redefinition of this task ;)

As a depositor, I may be unsure about the publication_date of a record (e.g. sometime in World War II), but I want to convey this range as the publication date, because a publication date is needed to mint a DOI (DataCite) and it is more informative than nothing.

technical implications
Allow publication_date field to be an EDTF of lvl 0

marshmallow
jsonschema
elasticsearch + create a lower end date for sorting purposes
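
The "lower end date for sorting" could be derived roughly like this. It is a minimal sketch using only the standard library; `edtf_lower_bound` is a hypothetical helper, and it assumes EDTF level 0 values with positive four-digit years (a date at year, month or day precision, optionally with a time, or a "start/end" interval).

```python
from datetime import date


def edtf_lower_bound(value):
    """Return the earliest calendar date covered by an EDTF level 0 date or interval."""
    # An interval is "start/end"; a plain date has no slash.
    start = value.split("/")[0]
    # Drop any time component ("1985-04-12T23:20:30").
    start = start.split("T")[0]
    parts = [int(p) for p in start.split("-")]
    # A missing month or day defaults to 1, the start of the period.
    year, month, day = (parts + [1, 1])[:3]
    return date(year, month, day)
```

The returned date could then be indexed in a separate sort field next to the original EDTF string.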

Add access_condition to data model

[Shelved in favour of access levels] Implement permissions metadata fields

This issue tracks implementation of permissions metadata fields in the schema. Different approaches were considered.

See pending RFC 7 for permissions mechanics and pending RFC 12 for permissions schema.

Potential simplification of additional titles/descriptions

We could potentially collapse additional_descriptions and additional_titles fields into a descriptions and titles fields resp. if we build an indexing hook (or other means) to allow description=mySearchTerm and title=mySearchTerm to still be valid search box searches.

Align codebase with cookiecutter-invenio-instance

The cookiecutter-invenio-instance includes a data model. This data model should align with the structure of the cookiecutter package, adding marshmallow, serializers, etc.

Change preview box filename when previewed file is changed

Steps to reproduce:
1- visit a record page with multiple attachments
2- click to preview a different file
expect
3- previewed filename to be shown at top of box
get
3- initially previewed file's name (it doesn't change)

Support less specific dates

When dealing with historic material, dates are not always known to an accuracy of a specific day.

Would it be possible to support YYYY / YYYY-MM (and even, dare I ask -MM-DD) both in publication_date, and the general dates?

datamodel: reference sheet

This is a reference sheet for the core metadata shared by InvenioRDM records:

Jsonschema as of 2019-12-20 :

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "id": "http://localhost/schemas/records/record-v1.0.0.json",
  "title": "Invenio Datacite based Record Schema v1.0.0",
  "type": "object",
  "additionalProperties": false,
  "properties": {
    "_access": {
      "metadata_restricted": {
        "default": false,
        "description": "Record metadata accessibility. Public by default (False).",
        "type": "boolean"
      },
      "files_restricted": {
        "default": false,
        "description": "Record associated files accessibility. Public by default (False).",
        "type": "boolean"
      }
    },
    "_bucket": {
      "description": "Record bucket.",
      "type": "string"
    },
    "access_right": {
      "default": "open",
      "description": "Access right for record.",
      "type": "string"
    },
    "additional_descriptions": {
      "type": "array",
      "items": {
          "type": "object",
          "properties": {
              "description": {
                "description": "Description/abstract for record.",
                "type": "string"
              },
              "description_type": {
                "description": "Type of description.",
                "type": "string"
              },
              "lang": {
                "description": "Language of the description. ISO 639-3 language code.",
                "type": "string",
                "maxLength": 3
              }
          },
          "required": ["description", "description_type"]
      },
      "uniqueItems": true
    },
    "additional_titles": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
            "title": {
              "description": "Title of the record.",
              "type": "string"
            },
            "title_type": {
              "description": "Type of title.",
              "type": "string"
            },
            "lang": {
              "description": "Language of the title. ISO 639-3 language code.",
              "type": "string",
              "maxLength": 3
            }
        },
        "required": ["title"]
      },
      "uniqueItems": true
    },
    "contributors": {
      "description": "Contributors in order of importance.",
      "minItems": 1,
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "properties": {
          "ids": {
            "description": "List of IDs related with the person.",
            "type": "array",
            "uniqueItems": true,
            "items": {
              "additionalProperties": false,
              "type": "object",
              "properties": {
                "source": {
                  "type": "string"
                },
                "value": {
                  "type": "string"
                }
              }
            }
          },
          "name": {
            "description": "Full name of person or organisation. Personal name format: family, given.",
            "type": "string"
          },
          "affiliations": {
            "description": "Affiliation(s) for the purpose of this specific record.",
            "type": "array",
            "uniqueItems": true,
            "items": {
              "type": "string"
            }
          },
          "email": {
            "type": "string",
            "description": "Contact email for the purpose of this specific record.",
            "format": "email"
          },
          "role": {
            "description": "",
            "type": "string"
          }
        },
        "required": [
          "name"
        ]
      }
    },
    "dates": {
      "description": "Date interval.",
      "items": {
        "additionalProperties": false,
        "properties": {
          "description": {
            "description": "Description of the date interval.",
            "type": "string"
          },
          "end": {
            "description": "End date.",
            "type": "string",
            "format": "date-time"
          },
          "start": {
            "description": "Start date.",
            "type": "string",
            "format": "date-time"
          },
          "type": {
            "description": "Type of the date interval."
          }
        },
        "required": [
          "type"
        ],
        "type": "object"
      },
      "type": "array"
    },
    "description": {
      "description": "Description for record.",
      "type": "string"
    },
    "embargo_date": {
      "description": "Embargo date of record (ISO8601 formatted date).",
      "type": "string",
      "format": "date-time"
    },
    "keywords": {
      "description": "Free text keywords.",
      "items": {
        "type": "string"
      },
      "type": "array"
    },
    "language": {
      "description": "Primary language of the resource. ISO 639-3 language code.",
      "type": "string",
      "maxLength": 3
    },
    "owners": {
      "description": "List of user IDs that are owners of the record.",
      "items": {
        "type": "number"
      },
      "type": "array",
      "minItems": 1,
      "uniqueItems": true
    },
    "publication_date": {
      "description": "Record publication date (ISO8601-formatted). EDTF support to be added for field.",
      "type": "string",
      "format": "date-time"
    },
    "recid": {
      "description": "Invenio record identifier (alphanumeric).",
      "type": "string"
    },
    "resource_type": {
      "additionalProperties": false,
      "description": "Record resource type.",
      "properties": {
        "subtype": {
          "description": "Specific resource type.",
          "type": "string"
        },
        "type": {
          "default": "publication",
          "description": "General resource type.",
          "type": "string"
        }
      },
      "required": [
        "type",
        "subtype"
      ],
      "type": "object"
    },
    "rights": {
      "description": "Any rights information for this resource.",
      "type": "array",
      "items": {
          "type": "object",
          "properties": {
              "rights": {
                "description": "The right itself. Free text.",
                "type": "string"
              },
              "uri": {
                "description": "The URI of the license.",
                "type": "string",
                "format": "uri"
              },
              "identifier": {
                "description": "A short, standardized version of the license name.",
                "type": "string"
              },
              "identifier_scheme": {
                "description": "The name of the scheme.",
                "type": "string"
              },
              "scheme_uri": {
                "description": "The URI of the identifier_scheme.",
                "type": "string",
                "format": "uri"
              },
              "lang": {
                "description": "Language of the right information. ISO 639-3 language code.",
                "type": "string",
                "maxLength": 3
              }
          }
      },
      "uniqueItems": true
    },
    "title": {
      "description": "Record title.",
      "type": "string"
    },
    "version": {
      "description": "Record version tag.",
      "type": "string"
    }
  },
  "required": [
    "_access",
    "access_right",
    "contributors",
    "description",
    "owners",
    "publication_date",
    "resource_type",
    "title"
  ]
}

Fields

Extras for all

Add references (need to check whether the Zenodo structure is OK). Not all references have related identifiers; sometimes a reference is just free text.
internal notes (discuss with ILS what they have). "curators notes", non-public notes.
Access condition
"internal fields" : See #38

How to handle custom fields?

See Zenodo implementation (custom)
See #2

Zenodo custom fields

Imprint
Journal
Part of?
Thesis

TODO

Check against ILS schema.

Discussion points for community:

Keywords or only subjects?

*CV: Controlled Vocabulary
updated 2019-08-16 with comments below.
updated 2019-12-20 with issues.

global: migrate records config from invenio-app-rdm

Configuration about records, rest and files is currently in invenio-app-rdm. It should be migrated to invenio-rdm-records.

ui: implement citation box

identifiers field (and subfields) as dict

The identifiers field is an array of scheme and identifier.

Pros:

matches up well with a potentially reusable UI component that adds an element to a list.
allows for preference among different identifiers (the first one is the preferred one to be displayed)

Cons: (in contrast with identifiers as a dict of scheme key and identifier value)

no structural enforcement of unique scheme / identifier
no reason we could not create a reusable UI component that does the trick for the dict approach
no reason we could not add preference as a field (but it would be finicky, to be honest)

Perhaps we should switch it to a dict.

After In video life (IVL, I am coining this :) ) conversation

Use dict for identifiers in top level identifiers and creators/contributors identifiers. Other cases have more data. To be seen on a case by case basis.
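
To make the trade-off concrete, here is a minimal sketch of converting the list form into the dict form. `identifiers_to_dict` is a hypothetical helper, and the `scheme` / `identifier` key names are taken from the description above rather than from the actual schema.

```python
def identifiers_to_dict(identifiers):
    """Turn [{"scheme": ..., "identifier": ...}, ...] into {scheme: identifier}."""
    result = {}
    for entry in identifiers:
        scheme = entry["scheme"]
        if scheme in result:
            # The list form cannot prevent this; the dict form makes it impossible.
            raise ValueError("duplicate identifier scheme: " + scheme)
        result[scheme] = entry["identifier"]
    return result
```

With the dict form, uniqueness per scheme is structural, at the cost of losing the ordering that expressed preference in the list form.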

Implement OAI-PMH serialization

customization: resource types

Support custom resource types. There are different approaches:

Add the list as a configuration variable in invenio.cfg.
Create a command in the CLI (in rdm-records or in the scripts) to load, search, remove, and edit the types.

Note: We must consider the maximum size we expect this list to grow, as in the second case we need to store them in DB. Consider the use of controlled vocabularies.

demo: create and load records

As a hoster/developer I want to be able to test my instance with demo data.

Create demo dataset.
Create a script to load the data in the repository. (Note: A fixture approach might be easier)

global: automate releases

Automate pushing releases to PyPI

Add files to records

Invenio 3.2 brings a solid Files bundle that we want to rely on to attach files to records.

This task includes integration of the Files upload API.

We will use prior files' fields for the data model.

files upload permissions inveniosoftware/invenio-records-permissions#32
files upload
files list display inveniosoftware/invenio-app-rdm#42
files download permission inveniosoftware/invenio-records-permissions#35
files download
files previewer (if time) inveniosoftware/invenio-app-rdm#43

global: implement record versioning

global: integrate with invenio-stats

This issue is twofold:

Integration with invenio-stats per se
Displaying the stats in the UI (Metrics)

datamodel: treating internal fields with `_`

Currently the internal fields are prefixed with an underscore. However, the user submits the record without them (e.g. access).

An issue also arises when displaying the record. Currently the record view shows the record with _access.

Marshmallow?
Record API class?
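
Wherever the logic ends up (Marshmallow schema or Record API class), the display side amounts to hiding underscore-prefixed keys. A minimal sketch, assuming only top-level keys are internal; `public_view` is a hypothetical name:

```python
def public_view(record):
    """Return a copy of the record without top-level keys starting with "_"."""
    return {key: value for key, value in record.items() if not key.startswith("_")}
```

For example, a record containing `_access`, `_bucket` and `title` would be displayed with `title` only.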

Suggestion: rename community to communities

For consistency and clarity.

Mint records with DataCite-like provider

Confirmation is needed on this task:

Implement and integrate minters to generate internal record persistent identifiers (pid_type=recid). Rather than legacy record identifiers, random 10-character alphanumeric string (with checksum) should be used.

implement provider, minter and fetcher in invenio-pidstore: inveniosoftware/invenio-pidstore#125
integrate in invenio-rdm-records fake records: #31

This is the task that will finally connect base32-lib functionality with record minting (via invenio-pidstore).
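
To illustrate what "random 10-character alphanumeric string (with checksum)" can mean, here is a self-contained sketch. It is not the base32-lib or invenio-pidstore implementation: the alphabet choice, the modulo-97 checksum and the function names are assumptions made only for this example.

```python
import secrets

# Crockford-style base32 alphabet: digits plus lowercase letters without i, l, o, u.
ALPHABET = "0123456789abcdefghjkmnpqrstvwxyz"


def _checksum(body):
    """Two decimal digits derived from the base32 value of body, modulo 97."""
    value = 0
    for char in body:
        value = value * 32 + ALPHABET.index(char)
    return "%02d" % (value % 97)


def generate_recid(length=10):
    """Random identifier of `length` characters, the last two being a checksum."""
    body = "".join(secrets.choice(ALPHABET) for _ in range(length - 2))
    return body + _checksum(body)


def is_valid_recid(recid):
    """Check alphabet membership and recompute the checksum."""
    body, check = recid[:-2], recid[-2:]
    return all(c in ALPHABET for c in body) and _checksum(body) == check
```

The checksum lets a mistyped identifier be rejected before any database lookup is made.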

Data model extension

As a developer-hoster I want to be able to extend the core metadata schema with fields specific to my domain so that my metadata schema addresses my needs.

These custom fields are limited to the following types: array, string, integer and date.
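
As a rough illustration of that restriction, the sketch below checks a hypothetical extension declaration against the four allowed types. The declaration format (`{field: type_name}`) and the name `check_extension` are assumptions, not an existing API.

```python
from datetime import date

# The four types named above, mapped to the Python types a loaded value would have.
ALLOWED_TYPES = {"array": list, "string": str, "integer": int, "date": date}


def check_extension(declared, values):
    """Return error messages for `values` against a {field: type_name} declaration."""
    errors = []
    for field, type_name in declared.items():
        if type_name not in ALLOWED_TYPES:
            errors.append("%s: unsupported type %r" % (field, type_name))
        elif field in values and not isinstance(values[field], ALLOWED_TYPES[type_name]):
            errors.append("%s: expected %s" % (field, type_name))
    return errors
```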

Links of interest
Marshmallow schema

https://github.com/zenodo/zenodo/blob/d56a7d5005d9982ebcaa7d7ea0091376000f017f/zenodo/modules/records/serializers/schemas/common.py#L498

jsonschema

https://github.com/zenodo/zenodo/blob/d56a7d5005d9982ebcaa7d7ea0091376000f017f/zenodo/modules/records/jsonschemas/records/record-v1.0.0.json#L611

Inner filling for ES indexing

An example of such an extension would be the added biomedical profile fields:

The "biomedical" profile will have the following extensions to the core metadata (see #1)

| Field or equivalent | Notes | Why | Implemented |
| --- | --- | --- | --- |
| ~~Language `language`~~ | Language of the content, optional. Just 1 for now. | filtering, legacy | in main model as `language` |
| Presentation location `presentation_location` | Location where content was presented, optional. Applies to exhibits and presentations. Geolocation values or controlled vocabulary values or both? | filtering, legacy | |
| ~~Content location `content_location`~~ | Location pertaining to the content itself. (e.g., `Uganda` for dataset of vaccinated population in Uganda). Optional. Geolocation values or controlled vocabulary values (Feed from MeSH) or both? We may or may not be able to stick this in `terms` | searching, filtering, legacy | in main model as `location` |
| Number in sequence `number_in_sequence` | Indicates page or order of record in an ensemble. Integer, optional | sorting, in collection record ordering, synergy with `part` relation type | |
| ~~Private Note `private_note`~~ | Free text, optional. Used internally for repo managers. SUPER_USER, librarian, owner, proxy can see it. | Need to understand use case better? | in main model as `internal_notes` |
| ~~Subject: Name (re-use `terms`)~~ | Name of person/organization referred in content (e.g. book about someone). Optional. Fed from controlled vocabulary | legacy, searching, filtering | in main model as `subjects` |

Note that acknowledgements are not included for now.
"Abstract" and "content date" will be addressed by core metadata.

mappings: support elasticsearch v7

Fix assumption that creators/contributors' identifiers are present in templates

If no identifiers field for creators/contributors, the record page breaks because they are expected.

dependencies: pin invenio-records-permissions

setup.py includes an unpinned reference to invenio-records-permissions, so that breaking changes do not go unnoticed. Once a stable release of invenio-records-permissions is out, it should be pinned.

marshmallow: validate Organizational creators/contributors

If the contributor/creator is of type "Organizational" it should be checked that "affiliation", "given_name" and "family_name" are not filled in.
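
The check itself is small. Below is a library-free sketch of the rule; in the Marshmallow schema it would typically live in a method decorated with `validates_schema` and raise `ValidationError`. The field names come from the sentence above, and `organizational_errors` is a hypothetical helper.

```python
PERSONAL_ONLY_FIELDS = ("affiliation", "given_name", "family_name")


def organizational_errors(creator):
    """Return the personal-only fields wrongly filled in on an Organizational entry."""
    if creator.get("type") != "Organizational":
        return []
    return [field for field in PERSONAL_ONLY_FIELDS if creator.get(field)]
```

An empty list means the entry passes; otherwise the list names the offending fields, which can be turned into per-field error messages.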

Make resource_type a required field

Resource type should be required.

jsonschema: removed fields

The following fields have been removed in order to use DataCite's schema, as decided in #18.

contributors:
- ids

Validation errors don't get passed for resource_type

Typically, when one creates a record with an incorrect field, a validation error with a helpful message is returned:

e.g. 'titles': [{'title': 'A Romans story', 'type': 'Otherss', 'lang': 'eng'}] returns an errors field that includes Invalid title type. Otherss not one of MainTitle, AlternativeTitle, Subtitle, TranslatedTitle, Other

However when one provides an incorrect resource_type in either the type or subtype field no errors field is returned:

e.g. 'resource_type': {'type': 'imagess', 'subtype': 'photo'} only returns {"status": 400, "message": "Validation error."}
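
The expected behaviour could look like the sketch below, which produces a per-field message in the same style as the title example. The vocabulary shown is made up for illustration, and `resource_type_errors` is a hypothetical function, not the existing validator.

```python
# Hypothetical vocabulary, for illustration only.
RESOURCE_TYPES = {"image": ["photo", "figure"], "publication": ["article", "book"]}


def resource_type_errors(value, vocabulary=RESOURCE_TYPES):
    """Return a field -> message dict describing what is wrong with `value`."""
    type_, subtype = value.get("type"), value.get("subtype")
    if type_ not in vocabulary:
        options = ", ".join(sorted(vocabulary))
        return {"type": "Invalid resource type. %s not one of %s" % (type_, options)}
    if subtype not in vocabulary[type_]:
        options = ", ".join(vocabulary[type_])
        return {"subtype": "Invalid resource subtype. %s not one of %s" % (subtype, options)}
    return {}
```

Returning the errors keyed by subfield is what would allow them to surface in the `errors` field of the 400 response.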

ui: creator/contributor icon

In contributors.html:

<span class="text-muted" {% if creator.affiliations and creator.affiliations[0] %}data-toggle="tooltip" title="{{creator.affiliations[0].name}}"{% endif %}>{{creator.name}}</span>{% if not loop.last %}; {% endif %}

Could eventually get away with no ; and always have an icon (even if generic) next to a creator

global: implement export formats

Implement Creators and Contributors equivalent metadata fields

This issue tracks implementation of the "creators" / "contributors" metadata fields.

An accompanying RFC would be the place to define and document to others those fields (I think).

Priming content for RFC (should be moved to it):

Schema discussion starting point

    "<creators/contributors>": {
      "description": "Contributors in order of importance.",
      "minItems": 1,
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "properties": {
          "ids": {
            "description": "List of IDs related with the person.",
            "type": "array",
            "uniqueItems": true,
            "items": {
              "additionalProperties": false,
              "type": "object",
              "properties": {
                "source": {
                  "type": "string"
                },
                "value": {
                  "type": "string"
                }
              }
            }
          },
          "name": {
            "description": "Full name of person or organisation. Personal name format: family, given.",
            "type": "string"
          },
          "affiliations": {
            "description": "Affiliation(s) for the purpose of this specific record.",
            "type": "array",
            "uniqueItems": true,
            "items": {
              "type": "string"
            }
          },
          "email": {
            "type": "string",
            "description": "Contact email for the purpose of this specific record.",
            "format": "email"
          },
          "role": {
            "description": "",
            "type": "string",
            "enum": [
              "ContactPerson",
              "Researcher",
              "Other"
            ]
          }
        },
        "required": [
          "name"
        ]
      }
    }

For unknown authors, use DataCite's unknown.

See W3C recommendations on names.

Organization as author is something we will eventually want. Perhaps they are an alternative with their own fields. The organization use case may also be an opportunity to solve the "too many authors" problem. Organization ids: Research Organization Registry

Why this field, these properties and this implementation?

For citation purposes
For DOI minting
To respect the W3C standard
To disambiguate authors
To allow auto-complete from a source

version: new schema

Switch to the new versioning schema:

Change setup.py to not be alpha
Update version to 0.0.1
Requires invenio-records-permissions released under the new schema first

dates: support for ranges and dates metadata

Currently dates are specified as two fields (start + end). This comes from ES2 and makes certain advanced range queries (e.g. intersections) difficult.

ES6.x onwards has the range type, which would allow this type of queries.

However, we should also take into account actual metadata of type date. For example, from @fenekku:

for actual date metadata (as opposed to created and updated which are meta-metadata). We have "present day" books about "olden days" medical practices for instance.

Open discussion and possible solutions:

Keep the same structure and do not support advanced queries. In this case both use cases (ranges and content metadata) could be addressed with a structure similar to:

"dates": {
      "description": "Related dates and intervals.",
      "items": {
        "additionalProperties": false,
        "properties": {
          "description": {
            "description": "Description of the date or interval.",
            "type": "string"
          },
          "value": {
            "description": "Date value in ISO-8601 format. If interval, this is the start.",
            "type": "string"
          },
          "value_end": {
            "description": "End date value of interval.",
            "type": "string"
          },
          "type": {
            "description": "Type of date/interval: 'created', 'updated', 'content'..."
          }
        },
        "required": [
          "type", "value"
        ],
        "type": "object"
      },
      "type": "array"
    },

Have two fields. One for ranges, using the ES6.x+ range type, and one for dates metadata. This might require the creation of some sort of query parsing.
One of the two should be added as custom field.
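
For the range-based option, a `dates` entry in the `value` / `value_end` shape shown earlier could be turned into an Elasticsearch range value along these lines. This is a sketch only: the `range` field name and `to_date_range` are hypothetical, while `date_range` is the Elasticsearch range datatype for dates.

```python
def to_date_range(entry):
    """Build a {"gte": ..., "lte": ...} body for an Elasticsearch range field."""
    start = entry["value"]
    # A single date is treated as an interval that starts and ends on the same value.
    end = entry.get("value_end", start)
    return {"gte": start, "lte": end}


# Mapping fragment for the indexed form of one dates entry.
DATES_MAPPING = {
    "properties": {
        "type": {"type": "keyword"},
        "range": {"type": "date_range"},
    }
}
```

Indexing both single dates and intervals as ranges would let intersection queries treat them uniformly.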

detail page will not render if no communities

Similar to #69 and #70:

side_bar.html template fails to render if the record in question is not assigned to a community (which is not required)

templates: bug rendering record detail template due to edtf dates

The following code in record_landing_page.html does not support EDTF dates:

{{ record.publication_date|to_date|dateformat(format='long') }}

Example traceback:

Traceback (most recent call last):
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/flask/app.py", line 2463, in __call__
    return self.wsgi_app(environ, start_response)
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/werkzeug/middleware/proxy_fix.py", line 232, in __call__
    return self.app(environ, start_response)
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/werkzeug/middleware/dispatcher.py", line 66, in __call__
    return app(environ, start_response)
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/flask/app.py", line 2449, in wsgi_app
    response = self.handle_exception(e)
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/flask/app.py", line 1866, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/flask/app.py", line 2446, in wsgi_app
    response = self.full_dispatch_request()
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/flask/app.py", line 1951, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/flask/app.py", line 1820, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/invenio_records_ui/views.py", line 205, in record_view
    return view_method(pid, record, template=template, **kwargs)
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/invenio_records_ui/views.py", line 227, in default_view_method
    record=record,
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/flask/templating.py", line 140, in render_template
    ctx.app,
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/flask/templating.py", line 120, in _render
    rv = template.render(context)
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/jinja2/environment.py", line 1090, in render
    self.environment.handle_exception()
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/jinja2/environment.py", line 832, in handle_exception
    reraise(*rewrite_traceback_stack(source=source))
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/jinja2/_compat.py", line 28, in reraise
    raise value.with_traceback(tb)
  File "/Users/lnielsen/src/invenio-rdm-records/invenio_rdm_records/theme/templates/invenio_rdm_records/record_landing_page.html", line 9, in top-level template code
    {%- extends config.BASE_TEMPLATE %}
  File "/Users/lnielsen/src/invenio-app-rdm/invenio_app_rdm/theme/templates/invenio_app_rdm/page.html", line 7, in top-level template code
    #}
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/invenio_theme/templates/invenio_theme/page.html", line 28, in top-level template code
    {%- endblock head_title %}
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/invenio_theme/templates/invenio_theme/page.html", line 31, in block "body"
    {%- if keywords %}<link rel="canonical" href="{{ canonical_url }}"/>{% endif %}
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/invenio_theme/templates/invenio_theme/page.html", line 32, in block "body_inner"
    {%- block head_links_langs %}
  File "/Users/lnielsen/src/invenio-rdm-records/invenio_rdm_records/theme/templates/invenio_rdm_records/record_landing_page.html", line 12, in block "page_body"
    {{ webpack['invenio-app-rdm-theme.css'] }}
  File "/Users/lnielsen/src/invenio-rdm-records/invenio_rdm_records/theme/templates/invenio_rdm_records/record_landing_page.html", line 13, in block "record_body"
    {{ webpack['invenio-rdm-records-theme.css'] }}
  File "/Users/lnielsen/src/invenio-rdm-records/invenio_rdm_records/theme/views.py", line 52, in to_date
    return arrow.get(date_string).date()
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/arrow/api.py", line 21, in get
    return _factory.get(*args, **kwargs)
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/arrow/factory.py", line 196, in get
    dt = parser.DateTimeParser(locale).parse_iso(arg)
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/arrow/parser.py", line 211, in parse_iso
    return self._parse_multiformat(datetime_string, formats)
  File "/Users/lnielsen/envs/cli/lib/python3.6/site-packages/arrow/parser.py", line 494, in _parse_multiformat
    string, ", ".join(formats)
arrow.parser.ParserError: Could not match input '1970-06-16/2003-12-05' to any of the following formats: YYYY-MM-DD, YYYY-M-DD, YYYY-M-D, YYYY/MM/DD, YYYY/M/DD, YYYY/M/D, YYYY.MM.DD, YYYY.M.DD, YYYY.M.D, YYYYMMDD, YYYY-DDDD, YYYYDDDD, YYYY-MM, YYYY/MM, YYYY.MM, YYYY
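
One possible direction for a fix is to format each side of an interval separately and respect the precision that was given. The sketch below uses only the standard library instead of arrow, hard-codes English month names, and assumes positive four-digit years without time components; `format_edtf` and `format_edtf_part` are hypothetical names, not the existing `to_date` filter.

```python
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def format_edtf_part(part):
    """Format "YYYY", "YYYY-MM" or "YYYY-MM-DD" at the precision it was given."""
    pieces = [int(p) for p in part.split("-")]
    if len(pieces) == 1:
        return str(pieces[0])
    if len(pieces) == 2:
        return "%s %d" % (MONTHS[pieces[1] - 1], pieces[0])
    return "%s %d, %d" % (MONTHS[pieces[1] - 1], pieces[2], pieces[0])


def format_edtf(value):
    """Format a date or a "start/end" interval; each side is formatted separately."""
    return " to ".join(format_edtf_part(p) for p in value.split("/"))
```

With the value from the traceback, `format_edtf("1970-06-16/2003-12-05")` gives `June 16, 1970 to December 5, 2003`.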

communities: schema fields

Currently the schema for the communities field is:

"community": {
      "type": "object",
      "properties": {
        "primary": {"type": "string"},
        "secondary": {
          "type": "array",
          "minItems": 0,
          "items":{"type": "string"}
        }
      }
    }

Is this enough or should we add something like (also applicable to secondary):

{
  "primary": {
    "type": "object",
    "properties": {
      "name": { "type": "string"},
      "identifier": {"type": "string"}
    }
  }
}

Serialize FilesSchema (dump_only)

marshmallow: upgrade to version 3

Current Marshmallow schemas are written for version 2. Until Invenio v3.2 is released, compatibility with Marshmallow 3 is not widely available across Invenio.

For now, it is pinned as 'marshmallow<3' in setup.py. Fix and unpin once Invenio v3.2 is out.

Ref

Reuse IdentifierScheme

From @ppanero

IdentifierScheme could be reused in many other schemas. However, there is no easy and clean way to flatten its attributes. I tested Pluck and Method. The latter worked, but it makes the code harder to understand, which in my view makes it a poor choice.

ui: stats collapsed message

Currently it reads "See more details" even when expanded. In that state it should read "Show less details".

I tried to implement it with the same logic as the files section. However, the behavior was inconsistent:

1- First time showing "See less details" when collapsed
2- After the first click it keeps showing "See less details"
3- In next clicks, it changes between labels as expected :O

Access level metadata

Implement access levels metadata per record.

Terms datamodel (index)

The "terms" index contains:

source : keyword
id : keyword
value : text or keyword?
definition : text
deprecated : bool
suggest : list of stopword-removed words making up the value that will be used by the suggester.

As a depositor, when I search for a term through the auto-complete feature, I want to be able to select one-word terms. I don't want them to be occluded by other, longer keywords that contain the same word.

To solve the above issue, a scoring algorithm on the suggest endpoint can be used to give a higher score to shorter suggest list (I think).
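
The scoring idea can be illustrated outside Elasticsearch with plain Python. This sketch ranks in-memory term documents rather than configuring the suggester; the document shape follows the field list above, and `rank_suggestions` is a hypothetical name.

```python
def rank_suggestions(terms, prefix):
    """Return matching term values, shortest `suggest` list first."""
    prefix = prefix.lower()
    matches = [
        term for term in terms
        if any(word.lower().startswith(prefix) for word in term["suggest"])
    ]
    # list.sort() is stable, so equally long terms keep their incoming order.
    matches.sort(key=lambda term: len(term["suggest"]))
    return [term["value"] for term in matches]
```

A one-word term such as "Heart" then ranks ahead of "Heart Diseases" for the prefix "hea".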

Combine / disambiguate _access and access_right

2 parts to this task: discuss the differences / combination and implement.

access_right can currently be:

'open'
'embargoed'
'restricted'
'closed'

'_access' can be:

    'metadata_restricted': <true|false>    
    'files_restricted': <true|false>

This leads to strange combinations: an open access record with files restricted or a record with metadata restricted but files not restricted or an open record with metadata restricted... The first in particular is something participants in our usability sessions have complained about: "An open access record should have its files available" to paraphrase.

What also makes things hard to keep straight is that we have a rights field (the license) and an access_levels field.

Because of this, I suggest we merge them together. This way, only combinations that make sense are possible and the combinations are not across different fields but within 1 field. We can always have the UI reflect what we want from this 1 field. Enforcing a strict semantic we can cover most cases (I think?) with something like the following for example:

| New "access_right" field | metadata available? | files available? |
| --- | --- | --- |
| `open` | ✔️ | ✔️ |
| `embargoed` | ✔️ | ❌ until embargo date (unless user has permissions) |
| `scheduled` (or `embargoed` with added metadata) | ❌ until embargo date (unless user has permissions) | ❌ until embargo date (unless user has permissions) |
| `restricted` | ✔️ | ❌ (unless user has permissions) |
| ~~closed~~ `private` | ❌ (unless user has permissions) | ❌ (unless user has permissions) |
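
The table can be read as a single function from `access_right` to what an anonymous visitor sees. The sketch below is only an interpretation of the proposal: `public_access` is a hypothetical name, user permissions are left out, and treating the embargo date itself as the first available day is an assumption.

```python
from datetime import date


def public_access(access_right, embargo_date=None, today=None):
    """Return (metadata_public, files_public) for an anonymous visitor."""
    today = today or date.today()
    # Assumption: content becomes available on the embargo date itself.
    embargo_over = embargo_date is not None and today >= embargo_date
    if access_right == "open":
        return True, True
    if access_right == "embargoed":
        return True, embargo_over
    if access_right == "scheduled":
        return embargo_over, embargo_over
    if access_right == "restricted":
        return True, False
    if access_right == "private":
        return False, False
    raise ValueError("unknown access_right: %r" % access_right)
```

Because both flags come from one field, combinations such as "open with restricted files" cannot be expressed at all.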

I am completely fine with having access_right be bibliographic AND "interpreted" metadata.

What are your thoughts @lnielsen @ppanero ?

[UPDATED 2019-12-20]

DOI template fails if there is no DOI

Similar to #69 - the DOI template assumes a DOI is present and fails to render if the identifier(s) are of a separate scheme.

EDIT from discussions below
A record may not have a DOI. This task is about fixing the template to account for that without crashing.

Validate marshmallow identifiers via idutils

We could use idutils to validate identifier scheme in the marshmallow schema.

Make sure that references to identifiers are made to their lowercase form.

There was talk about distinguishing identifiers for different uses / targets (people / orgs / objects)
Should we (invenio-rdm-records) check for those appropriately, or is that idutils' job?

[Parent Issue] datamodel: still todo

TODO as outcome of #49:

Implement versioning (conceptrecid minting)
OAI serializing/schema, add to tests and fixtures.
Customization of enums
Resource types are loaded from a JSON file. We need to converge on how to load them, for all enums.
IdentifierScheme could be reused in many other schemas. However, there is no easy and clean way to flatten its attributes. Pluck and Method were tested. The latter worked, but it makes the code harder to understand, which makes it a poor choice.
Tests for access_condition since the access part is not fully defined yet. It is not added to the fixtures either.
Tests for FilesSchema, since it is dump_only (see Zenodo). How should this be tested?
Access: #37

marshmallow: evaluate fields.Str vs SanitizedUnicode

Several fields, such as type, are strings whose value must be one of an enumeration. At the moment there is therefore no need to sanitize them, and a plain Str() would do. However, if we open these to customization (e.g. introducing your own CVs), this might cause problems (perhaps with languages other than English).

Results of some performance tests comparing marshmallow.fields.Str with invenio_records_rest.schemas.fields.SanitizedUnicode:

1 load/dump:

Sanitized: 0.0006556510925292969 seconds
Marshmallow Str: 0.0002028942108154297 seconds

1000 load/dumps:

Sanitized:  0.18001985549926758 seconds
Marshmallow Str: 0.09990668296813965 seconds

100000 load/dumps:

Sanitized: 18.859431743621826 seconds
Marshmallow Str: 10.20322585105896 seconds
