ivoa-std / dap

Dataset Access Protocol (name TBD)

License: Creative Commons Attribution Share Alike 4.0 International

dap's Introduction

DAP

The Dataset Access Protocol (DAP, name still TBD) provides capabilities for the discovery, description, access, and retrieval of multi-dimensional image datasets, including 2-D images as well as datacubes of three or more dimensions.

Status

This draft version is based on SIA-2.0. The last stable version is REC-2.0.

SIA 2.0 ported to github

  • SIA 2.0 became an IVOA Recommendation on 23 December 2015.
  • The original source was an OpenOffice file.
  • This repository contains the result of porting that specification to GitHub.
  • The port introduced some changes to the layout of the document:
    • The "Status of the document" section is generated automatically by the ivoatex process.
    • Tables are rendered differently than in the original version.
    • References use the "\citep" flavor. Each is generally cited only once, with a few exceptions where the reference would otherwise be ambiguous. The original version used the "[n]" notation at each occurrence of a cited publication in the text.
  • The port was made so that authors can propose PRs for identified errata and propose changes towards version 2.1.

Want to contribute?

  1. Raise a GitHub Issue on this repository

  2. Fork this repository (and optionally clone it on your machine)

  3. Create a branch in your forked repository; this branch should be named after the issue(s) to fix (for instance: issue-7-add-license)

  4. Commit suggested changes inside this branch

  5. Create a Pull Request on the official repository (note: a git push is needed first, if you are working on a clone)

  6. Wait for someone to review your Pull Request and accept it

This process was described and demonstrated during the IVOA Interoperability Meeting of Oct. 2019 in Groningen (see slides).

License

The IVOA Architecture document is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.

dap's People

Contributors

bonnarel, jd-au, molinaro-m, pdowler

dap's Issues

Review of OPeNDAP interface/capabilities

Has anyone among the authors looked into OPeNDAP? (The documentation is here.)

Of course this is not IVOA, not FITS. It is Earth observation and NetCDF (mostly). But it contains a lot of capabilities and features that I would be very happy to see implemented.

Beware that this ecosystem also has an interface called DAP (Data Access Protocol).

Remove STC-S reference

The current draft still contains this sentence in the POS section:

This syntax for circles and polygons is in the same style as STC-S, but with no reference positions, coordinate systems, units, or geometric operators (union, intersection, not).

This is a very large "but", to the point where the reference to STC-S seems misleading. Probably this should point to DALI's shape instead? It seems most likely that DALI 1.2 will get done before DAP, so can we just change this now in the draft?
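For context, the circle and polygon syntax under discussion is small enough to parse in a few lines. The Python sketch below assumes the SIA-2.0 forms `CIRCLE lon lat radius` and `POLYGON lon1 lat1 lon2 lat2 ...` with at least three vertices, and assumes uppercase shape keywords. The names `Circle`, `Polygon`, and `parse_pos` are hypothetical and not taken from any IVOA library.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Circle:
    lon: float
    lat: float
    radius: float


@dataclass(frozen=True)
class Polygon:
    vertices: Tuple[Tuple[float, float], ...]


def parse_pos(value: str) -> Union[Circle, Polygon]:
    """Parse 'CIRCLE lon lat radius' or 'POLYGON lon1 lat1 lon2 lat2 ...'.

    Assumption: shape keywords are uppercase; anything else is rejected.
    """
    tokens = value.split()
    if not tokens:
        raise ValueError("empty POS value")
    shape = tokens[0]
    # float() raises ValueError on non-numeric tokens, which is what we want.
    numbers = [float(t) for t in tokens[1:]]
    if shape == "CIRCLE":
        if len(numbers) != 3:
            raise ValueError("CIRCLE needs exactly 3 numbers: lon lat radius")
        return Circle(*numbers)
    if shape == "POLYGON":
        if len(numbers) < 6 or len(numbers) % 2 != 0:
            raise ValueError("POLYGON needs at least 3 lon/lat pairs")
        # Pair up alternating longitude and latitude values.
        return Polygon(tuple(zip(numbers[0::2], numbers[1::2])))
    raise ValueError(f"unsupported shape: {shape}")
```

Note what the sketch does not need: no reference position, coordinate system, units, or geometric operators appear anywhere, which illustrates how little of STC-S is actually involved.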

More generic DAP such that specific parameters define ObsDAP analogous to ObsTAP?

The draft says:

These choices are not suitable for all domains; the values are chosen to enable the query resource to be used to search for most standard observational astronomy data. If they are not suitable for a specific domain of interest (e.g. planetary science) then it is feasible to write a very short standard that re-uses the DAP query capability but redefines the hard-coded systems and units. This new standard would have a new standardID to distinguish services that implement it from those that implement the capability defined here.

As TAP defines a generic interface and ObsTAP is how we refer to a TAP service that serves a database in the ObsCore data model, should this also be more generic? The doc says that you'd need a separate standard to define the equivalent for planetary. But in the TAP case, we already have a document describing each data model, and a document describing the TAP protocol, so there's no need for an additional document describing ObsTAP. Should we set this up similarly such that only one new standard is needed, clarifying how any generic data model is used in the simple, parametrized, generic Data Access Protocol? Otherwise this needs to be renamed to indicate its dependency on ObsCore, because it is not generic.

(Though since "ObsDAP" sounds a lot like "ObsTAP" when spoken, that would argue for calling this generic protocol something else, maybe with a "Simple" in the name or a "Parametrized".)

Remove NaN from POS description

SIAv2 contains the following throwaway reference implying that a NaN might be acceptable in a POS query parameter:

Valid coordinate values are in [0,360] for longitude and [-90,90] for latitude (or NaN).

@timj pointed this out in the course of implementing an SIAv2 interface associated with the Rubin Observatory "Data Butler".

There is no other reference to NaN in the standard, and it's in conflict with language a few sentences above that suggests the use of infinities as special values (language which has already been suppressed by SIAv2 Erratum 1).

It is very likely an editing oversight, a vestige of the 20150730 PR of SIAv2, in which NaN was used in RANGE where the final standard uses infinities.

Can we a) remove this vestige from DAP and b) issue an erratum on SIAv2?
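To make the proposed behavior concrete, here is a minimal Python sketch of coordinate validation for POS that applies the quoted ranges and rejects NaN explicitly. It assumes the proposal above is adopted; the function name `validate_coordinate` is hypothetical.

```python
import math


def validate_coordinate(lon: float, lat: float) -> None:
    """Raise ValueError unless lon is in [0, 360] and lat is in [-90, 90].

    Assumption: NaN is NOT an acceptable coordinate value, as proposed
    in this issue.
    """
    for name, value in (("longitude", lon), ("latitude", lat)):
        # NaN fails every comparison, so test for it first to give a
        # clear error message instead of a generic range error.
        if math.isnan(value):
            raise ValueError(f"{name} must not be NaN")
    if not 0.0 <= lon <= 360.0:
        raise ValueError(f"longitude {lon} is outside [0, 360]")
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude {lat} is outside [-90, 90]")
```

With this check, `validate_coordinate(180.0, -45.0)` passes silently, while `validate_coordinate(float("nan"), 0.0)` raises a `ValueError`.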

Question about FORMAT specifier's purpose and usefulness in a DataLink-dominated world

Colloquially, the FORMAT parameter is intended to apply a constraint to the persistence format of the data product associated with a row in a conceptual underlying ObsCore table. (Obviously there doesn't have to be such a table in order to implement SIAv2/DAP, but the standard is written with reference to a table model.)

From a user perspective, this is clearly meant to enable, for instance, limiting a query to data in FITS format. In the DAP era where tabular datasets may be available from a service, a user might say "I'm only interested in Parquet".

The way the standard is written, though, it's clear that FORMAT is meant to be evaluated against the value of access_format in the query response. In many archives (CADC, Rubin, probably at least parts of IRSA in the future) we have adopted the "DataLink model" for providing ObsTAP/SIAv2 dataset access, though, where the actual access_format value is always the DataLink MIME type.

The standard even acknowledges this:

This column describes the format of the response from the access_url (see 3.1.3) so the values could be data file types (e.g. application/fits) or they could be the DataLink MIME type.

It seems like we've accidentally shot ourselves in the foot here. No non-IVOA-aware science user would be expected to know about the "DataLink model" -- they go to Firefly, say, do a query, and, if possible, Firefly will show them the #this target from the DataLink links service, rather than making them navigate the indirection on their own. Unless they deliberately click on the part of the UI that lets them see the links response and any associated additional datasets, they won't be aware of DataLink at all. That's a good thing.

So in this situation if someone does FORMAT=fits they are likely to be very surprised by the results.

I realize the difficulties involved in potentially prying FORMAT off its mandatory link to access_format, but I think it would be worth our having a conversation about whether an interpretation "if access_format is the DataLink MIME type, then evaluate the restriction against the content_type for the #this entry in the resulting links table" would be sustainable, at least as an option.
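As a rough illustration of that interpretation, the Python sketch below evaluates a FORMAT constraint against access_format, except when access_format is the DataLink MIME type, in which case it falls back to the content_type of the #this link. Several things here are assumptions: that the #this content_type is already available to the query layer (for example from an extra column), that matching is a case-insensitive exact comparison of MIME strings, and all function names, which are hypothetical.

```python
from typing import Optional

# The DataLink MIME type as used for access_format in the "DataLink model".
DATALINK_MIME = "application/x-votable+xml;content=datalink"


def _normalize(mime: str) -> str:
    """Lowercase a MIME string and drop all whitespace before comparing."""
    return "".join(mime.split()).lower()


def effective_format(access_format: str,
                     this_content_type: Optional[str] = None) -> str:
    """Return the format that a FORMAT constraint should be tested against.

    If access_format is the DataLink MIME type and the content_type of the
    #this link is known, use that; otherwise use access_format itself.
    """
    if _normalize(access_format) == DATALINK_MIME and this_content_type:
        return this_content_type
    return access_format


def matches_format(requested: str,
                   access_format: str,
                   this_content_type: Optional[str] = None) -> bool:
    """Assumed matching rule: case-insensitive exact MIME comparison."""
    target = effective_format(access_format, this_content_type)
    return _normalize(requested) == _normalize(target)
```

Under these assumptions, `matches_format("application/fits", DATALINK_MIME, "application/fits")` is true, whereas the same call without a known #this content_type is false, which is the surprising behavior described above.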

I recognize that this might require data publishers to add a column to the underlying table to make sure that the "real" data type is efficiently queryable.

FORMAT seems sufficiently useful that it's a shame to, in effect, be forced to lose its usefulness in exchange for all the other big advantages of the "DataLink model". For Rubin (IMO) it's still a worthwhile tradeoff if we can't fix this, but... let's try to think this through and fix it.

My guess is that this must have been discussed before, but I haven't found the trail yet.
