opencadc / caom2

Common Archive Observation Model

License: GNU Affero General Public License v3.0

Java 95.60% Roff 0.06% HTML 0.44% XSLT 3.54% Shell 0.01% Makefile 0.19% Python 0.16%

caom2's People

Contributors

yeunga javierduranarenas hjeeves andamian brianmajor esdc-esac-esa-int jburke-cadc bsipocz at88mph kgillies pdowler

caom2's Issues

allow hierarchy of composite observations

The current UML diagram indicates that the members of a CompositeObservation are always SimpleObservation(s).

Allowing other CompositeObservation(s) to be members would more clearly describe the intended aggregation of observations.

SUBARU resolver URL encodes the necessary query parameter

The frameinfo=<YYYY-MM-DD>/<FRAMEID> format is being URL encoded by the SubaruResolver class. When a user requests SUBARU data from the browser, the browser encodes it again, which results in a double-encoded value that is unreadable.
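
A minimal sketch of the double-encoding effect, using only the JDK. The frame ID is a hypothetical value and the class name is invented for illustration; this is not the SubaruResolver code itself.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class DoubleEncodingDemo {
    /** Percent-encodes a query value once. */
    static String encodeOnce(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical frameinfo value in the <YYYY-MM-DD>/<FRAMEID> form.
        String frameinfo = "2010-01-01/FRAME123";

        String once = encodeOnce(frameinfo); // what a resolver might emit
        String twice = encodeOnce(once);     // what results if a client encodes again

        System.out.println(once);  // 2010-01-01%2FFRAME123
        System.out.println(twice); // 2010-01-01%252FFRAME123

        // A server that decodes only once still sees an encoded slash.
        System.out.println(URLDecoder.decode(twice, StandardCharsets.UTF_8));
    }
}
```

The second pass turns the `%` of `%2F` into `%25`, so a single decode on the receiving side no longer recovers the slash.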

fits2caom2 help is inaccurate.

The -help content suggests that -test does not persist to the database, but really -test turns off writing to a file.

Are there cases where fits2caom2 writes to a database anymore?

option A: remove -test but that might break code that expects -test

option B: Change the behaviour so -test still writes the .xml (also might break current usage)

option C: Change the help line to indicate that no file will be written to -out; impact likely minimal.

python module validation of observation_id field - too stringent?

Several of our collections at IRSA have observationid values that include spaces or slashes, and we're running into the following error trying to export them with the python caom2 library:

invalid observation_id: may not contain space ( ), slash (/), escape (\), or percent (%)

I reached out to @pdowler, who said this is to ensure that observation_id does not contain any characters that aren't URI-safe, and suggested I open an issue here for discussion.

We don't use that field for URI population and haven't been able to find any documentation indicating that the observation_id string has any restrictions.

Is the caom2 module possibly being too stringent in how it validates this column?

(FYI we use CAOM 2.3, and the python module caom2 2.3.8.4)
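
For discussion, the rule stated in the error message can be sketched as follows. This is an illustration written in Java under the assumption that the check is a simple character blacklist; it is not the python caom2 module's code, and the sample IDs are hypothetical.

```java
final class ObservationIdCheck {
    // Characters named in the error message: space, slash, escape (backslash), percent.
    private static final String FORBIDDEN = " /\\%";

    /** Returns true when the ID contains none of the forbidden characters. */
    static boolean isAccepted(String observationID) {
        for (char c : observationID.toCharArray()) {
            if (FORBIDDEN.indexOf(c) >= 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAccepted("obs-0001"));      // true
        System.out.println(isAccepted("run 12/frame3")); // false: space and slash
    }
}
```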

Requiring pixel values for some WCS

While validating CAOM v2.3 XML files, I’ve run into an error which appears to be driven by the schema design: in the SpectralWCS, PolarizationWCS, and TemporalWCS (energy, polarization, and time in Chunk, respectively) there is an axis element that contains a bounds element of type CoordsBound1D. This element consists of samples of start/end points defined by RefCoord, which currently requires both pix and val to be provided. While the physical value (val) should always be present, it’s not clear that a pixel value (pix) always will be, and it has been missing for some of the data I’ve been working with.
A simple fix may be to set these as minOccurs="0" in the schema.

Add NAIF ID to Target?

Most (but not all) moving object targets have a NAIF ID. Target already has a keywords field, so we can add it there. It might be better for there to be a separate field for it. I am not sure, though.

Print URL in message field on failed artifact downloads

In caom2-artifact-sync, print the URL in the message field of the END JSON message when there is a failure.

add support for position.bounds = circle

add to xml serialisation;

also add support in other repos: caom2tools.git (py) and caom2db.git

resolve smoka artifact URIs.

The artifact URIs for the smoka observation records need to be resolved.

To get data make a POST request as shown in this example:
data requests

To get preview PNG files make a get request on the service as described here:
preview requests

fields with multiple values in an observation

this comes from an archive partners slack discussion started by David Rodrigues

proposal.id
telescope.name
instrument.name
plane.energy.bandpassName

... maybe more

Schema description should include units.

The schema descriptions that appear in the TAP services should tell the user what UNITs quantities in the table are expressed in. e.g. Plane.energy_bounds_lower is described as:
'lower bound on energy axis (barycentric wavelength)' which is helpful, but without knowing what units those are (nm?) the user is left to guess.

One might think that just declaring the entire table as following some convention (cgs? mks?) would be sufficient, but reminding the user with good column descriptions would be more friendly.

PublisherID constructor URI arg produces odd resource ID

The PublisherID constructor in caom2 with the URI argument constructs a resourceID from the path of the argument. However, the URI.getPath() contains the leading slash, which produces the following:

final PublisherID publisherID = new PublisherID(URI.create(PublisherID.SCHEME + "://com.myauth/MYCOLLECTION?OBSID/PRODID"));
System.out.println(publisherID.getResourceID()); // << Outputs ivo://com.myauth//MYCOLLECTION

The double slash MAY not be a problem for some parsers, but breaks tests.
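
The leading slash can be reproduced with java.net.URI alone. The sketch below assumes the resource ID is assembled by concatenating scheme, authority, and path; the helper names are hypothetical and this is not the PublisherID implementation.

```java
import java.net.URI;

final class ResourceIdSketch {
    /** Naive assembly: getPath() already starts with "/", so a second slash appears. */
    static String naiveResourceID(URI uri) {
        return uri.getScheme() + "://" + uri.getAuthority() + "/" + uri.getPath();
    }

    /** Strips leading slashes from the path before joining. */
    static String normalisedResourceID(URI uri) {
        String path = uri.getPath();
        while (path.startsWith("/")) {
            path = path.substring(1);
        }
        return uri.getScheme() + "://" + uri.getAuthority() + "/" + path;
    }

    public static void main(String[] args) {
        URI uri = URI.create("ivo://com.myauth/MYCOLLECTION?OBSID/PRODID");
        System.out.println(uri.getPath());             // /MYCOLLECTION
        System.out.println(naiveResourceID(uri));      // ivo://com.myauth//MYCOLLECTION
        System.out.println(normalisedResourceID(uri)); // ivo://com.myauth/MYCOLLECTION
    }
}
```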

Provenance from multiple versions

This sort of relates to Issue #66 and the question of cardinality.

Planes are often produced from an ensemble of software, not a single application. In the case of ALMA MS data, in particular, we have MS data that is calibrated using CASA XXX and then split using CASA YYY. The provenance of CASA YYY would tell you that you should use YYY to open these files (MS is not a standard format) but the CASA XXX part is needed to tell you what the calibration system was. In particular CASA XXX is what tells you about calibration trust while CASA YYY part is more about data form. How (if at all) should this be expressed in the provenance?

enhance position bounds to include coverage up to all-sky

The current Polygon definition is limited to less than all-sky. A MultiPolygon with two hemispheres could be constructed, but you cannot create an outer simple polygon that contains it.

Missing slash in MastResolver

The URLs produced by the MastResolver are missing the slash between the base URL and the scheme specific part of the URI.

The tests for MastResolver should be corrected so that they fail with this implementation. Then, the implementation should be corrected so that the tests pass.
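
One possible shape for the fix is a join helper that always places exactly one slash between the two parts. The base URL and artifact URI below are placeholders, not the real MAST endpoint, and the helper is a hypothetical sketch rather than the MastResolver code.

```java
import java.net.URI;

final class UrlJoinSketch {
    /** Joins a base URL and the scheme-specific part of a URI with exactly one slash. */
    static String join(String baseURL, URI artifactURI) {
        String base = baseURL;
        while (base.endsWith("/")) {
            base = base.substring(0, base.length() - 1);
        }
        String ssp = artifactURI.getSchemeSpecificPart();
        while (ssp.startsWith("/")) {
            ssp = ssp.substring(1);
        }
        return base + "/" + ssp;
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        URI artifact = URI.create("mast:HST/product/example.fits");
        System.out.println(join("https://example.org/files", artifact));
        // https://example.org/files/HST/product/example.fits
    }
}
```

A test that asserts on the full expected URL string, rather than on a prefix or suffix, would fail against the current behaviour as the issue requests.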

Add data release date to Artifacts

JWST has file based release dates rather than observation based. Would like to be able to optionally add dataRelease to Artifact - we could then roll up to higher levels as needed.

support alternative representations of plane metadata

The current plane metadata is roughly:
position: ICRS (deg)
energy: (barycentric) wavelength (m)
time: MJD UTC (d)
polarization: list of states

These coordinate/reference systems and units are now listed in the "interoperable profile".
However, the CADC TAP service also provides some energy columns with frequency values because the typical query cannot be re-written in a way that can be (easily) indexed...

In principle, one could specify places to put values with alternate representations. This is most obvious for wavelength/frequency/energy/velocity, less so for position (icrs/other epochs/galactic?), and then maybe time frames....

support for caom2 metadata provenance in model

It would be nice to track which software and version was used to generate metadata for an observation. This is especially true when sharing metadata between sites.

add reference URL to proposal

to refer to the details of the proposal (abstract, document, etc)
some telescopes provide the endpoint

RFE: add checksum to Artifact

Proposal from the HST Archive Coordination Meeting: add a checksum (probably MD5) to the artifact so that metadata sharing of CAOM observation metadata provides sufficient information to enable partners to figure out which data files they need to download. In the case of new artifacts, the partner won't have the file (denoted by the Artifact.uri) at all. For changed files, they will detect this via the checksum. For changed artifact metadata, the partner would examine the artifact due to timestamp change but can determine from the checksum that they do not need to download the data again.
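
The download decision described above could look roughly like this on the partner side. MD5 is assumed because the proposal mentions it as the likely choice; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ArtifactChecksumSketch {
    /** Computes the MD5 digest of the content as lowercase hex. */
    static String md5Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** A file is fetched when it is missing locally (null) or its checksum differs. */
    static boolean needsDownload(String localChecksum, String remoteChecksum) {
        return localChecksum == null || !localChecksum.equals(remoteChecksum);
    }

    public static void main(String[] args) {
        String remote = md5Hex("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(remote);                        // 5d41402abc4b2a76b9719d911017c592
        System.out.println(needsDownload(null, remote));   // true: new artifact
        System.out.println(needsDownload(remote, remote)); // false: only metadata changed
    }
}
```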

expand Instrument model

From HST Archive coordination meeting: add a detector name.

support for IVOA polygon data type is missing

The Plane.position.bounds Polygon allows for disjoint pieces and holes while IVOA (DALI) polygon must be a simple outer hull.

energy.bounds and time.bounds Interval values are consistent with IVOA (DALI) interval.

allow EnergyBand to describe observations that span two regimes

Some filters span multiple predefined energy regimes (eg HST filters span optical and UV).

Add s/n (signal to noise) to Plane.Metrics

We have a new spectroscopic initiative for JWST work which would require having the observation signal to noise ratio (float value) available in the model. The obvious place is to add it to the plane.metrics class. Would it be possible to get this in v2.4 since it is a pretty simple addition?

Addition of *_calib_status "optional" ObsCore attributes to the model?

The ObsCore data model contains "optional" elements of the form *_calib_status, providing more information about the level of calibration of various axes, spatial, temporal, spectral, and observable. We are likely to include these in the Rubin Observatory's ObsCore tables, basically because the overall calib_level = 1 vs. =2 distinction doesn't adequately capture the way we create the planned data products, and in particular how the observable (flux-like) axis is calibrated.

In order to maintain the connection with CAOM2, we would like to suggest that the Position, Time, Energy, and Observable objects in the CAOM2 data model each be supplemented with a string-valued calib_status attribute with multiplicity [0..1] (i.e., optional).

Looking at the language in the ObsCore standard (quoted verbatim below) it doesn't seem like the enumeration values for these attributes are sufficiently standardized to be able to force them to be explicit enumerations in CAOM.

Attribute	Short title	Principal	Utype	Suggested values	Description
`s_calib_status`	Type of calibration along the spatial axis	1	Char.SpatialAxis .calibrationStatus	uncalibrated, raw, calibrated	A string to encode the calibration status along the spatial axis (astrometry). Possible values could be {uncalibrated, raw, calibrated} and correspond to the Utype Char.SpatialAxis.calibrationStatus. For some observations, only the pointing position is provided (s_calib_status =”uncalibrated”). Some other may have a raw linear relationship between the pixel coordinates and the world coordinates (s_calib_status = ”raw”).
`t_calib_status`	Type of time coordinate calibration	0	Char.TimeAxis .calibrationStatus	uncalibrated, calibrated, raw, relative	This parameter gives the status of time axis calibration. This is especially useful for time series. Possible values are principally {uncalibrated, calibrated, raw, relative}. This may be extended for specific time domain collections.
`em_calib_status`	Type of spectral coord calibration	0	Char.SpectralAxis .calibrationStatus	uncalibrated, calibrated, relative, absolute	This attribute of the spectral axis indicates the status of the data in terms of spectral calibration. Possible values are defined in the Characterisation Data Model and belong to {uncalibrated, calibrated, relative, absolute}.
`o_calib_status`	Type of calibration for the observable coordinate	1	Char.ObservableAxis .calibrationStatus	absolute, relative, normalized, any	This describes the calibration applied on the Flux observed (or other observable quantity). It is a string to be selected in {absolute, relative, normalized, any} as defined in the SSA specification (Tody, Dolensky and al. 2012) in section 4.1.2.10. This list can be extended or updated for instance using an extension mechanism similar to the definition of new UCDs in the IVOA process, following the feedback from implementations of ObsTAP services.

The following characteristics are in common for all four attributes:

Attribute	Value
Datatype	adql:VARCHAR / Enum string
Units	NULL
UCD	`meta.code.qual`
Mandatory	0
Index	0 (except for `o_calib_status`, where it's "TBD")
Std	1

RFE: add a flag to indicate that a simple observation is a member

Proposal from HST Archive Coordination Meeting: make it easy to exclude members (SimpleObservation) from search results.

If members could be flagged then the query would not need to include a subquery or join to determine this. In addition, such an on-the-fly approach would mean that composites created by someone else and included in the system (eg in the aggregate database at CADC) would cause simple observations to be hidden. A specific flag would mean the provider that curates the simple and composite observations would control this explicitly.

The form of the flag is TBD.

Add support for circle in plane.position

It currently only supports polygons.

fits2caom2 unit tests have system-dependent Date assertions

e.g. FitsMapperTest.testPopulateObservation, line 461 verifies a java.util.Date value by comparing a hard coded string (in PST) to {Date variable}.toString(). The latter relies on the locally configured timezone so fails if one is not in the pacific timezone.

It's also totally the wrong way to compare Date values, which have a perfectly good equals method.
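
A small sketch of the difference, independent of the fits2caom2 test code: the same instant renders as different strings in different time zones, while Date.equals compares the underlying instant. The class name and the chosen zones are illustrative.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class DateComparisonSketch {
    /** Formats a Date in an explicit time zone, so the result never depends on the host. */
    static String format(Date date, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(date);
    }

    public static void main(String[] args) {
        Date expected = new Date(0L);
        Date actual = new Date(0L);

        // Same instant, different text depending on the zone used for rendering.
        System.out.println(format(actual, "UTC"));               // 1970-01-01 00:00:00
        System.out.println(format(actual, "America/Vancouver"));

        // Comparing the values directly is independent of any time zone.
        System.out.println(expected.equals(actual));             // true
    }
}
```

If a string comparison is still wanted in a test, formatting both sides with an explicit zone as in `format` avoids the dependence on the local configuration.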

change TargetType from enum to a kind of VocabularyTerm

RFE: add more values to DataProductType

Proposal from HST Archive Coordination Meeting to support additional values/details.

Currently allows ObsCore values plus catalog. We should consider moving to a more loosely coupled vocabulary to allow for extensions.

caom2-compute: WCS validator should check for restfrq|restwav if spectral axis is velocity

restfrq and restwav are technically optional, but if the spectral ctype is velocity then one of them is needed or the validation fails in an obscure way.

The validator should check that one is provided when the axis is velocity and throw an exception with a good error message like "one of restfrq or restwav is required for axis with ctype={the ctype value}"
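
A sketch of such a check is given here. The list of velocity CTYPE prefixes (VRAD, VOPT, VELO) is an assumption for illustration, and the class is hypothetical rather than part of caom2-compute.

```java
import java.util.List;

final class SpectralRestValueCheck {
    // Assumed set of velocity-like CTYPE prefixes; the real validator may use a different list.
    private static final List<String> VELOCITY_PREFIXES = List.of("VRAD", "VOPT", "VELO");

    static boolean isVelocity(String ctype) {
        for (String prefix : VELOCITY_PREFIXES) {
            if (ctype.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Throws when a velocity axis has neither restfrq nor restwav. */
    static void validate(String ctype, Double restfrq, Double restwav) {
        if (isVelocity(ctype) && restfrq == null && restwav == null) {
            throw new IllegalArgumentException(
                "one of restfrq or restwav is required for axis with ctype=" + ctype);
        }
    }

    public static void main(String[] args) {
        validate("FREQ", null, null);   // not a velocity axis: passes
        validate("VRAD", 1.42e9, null); // restfrq present: passes
        try {
            validate("VRAD", null, null);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```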

Artifact.uri referring to a table in a TAP service

This is technically possible, but we should define a common practice for the URI structure so that other users can, in principle, understand the URI and do something useful.

NOAO target URL needs updating

The new URL should look like:
http://archive1.dm.noao.edu:7003/?fileRef=$image

instead of:
http://nsaserver.sdm.noao.edu:7003/?fileRef=$image

RFE: add concept of logical plane identifier

This is an identifier that is generated by the content originator and kept intact at mirror sites. The CAOM publisherID (publisher dataset identifier) is specific to each data centre.

Move target name to the plane level

For ALMA, product planes could refer to separate targets and the target name at the observation level is not sufficient.

restrict Algorithm.name values for simple observations

There are some de facto standard algorithm names in use (exposure) and a few others that could be restricted for use with simple observations only (simulation was the one that prompted allowing other names in the first place).

specsys is mandatory in VO-DML but XSD has minOccurs="0"

invalid combination of naxis values in Chunk passes WCS validation

Chunk.naxis=3
positionAxis1=null
positionAxis2=null
energyAxis=null
timeAxis=3
time =

This should fail because no axes are assigned to indices 1 and 2.
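
The missing check could be sketched as follows: every index from 1 to naxis must be claimed by exactly one axis. Only the four axis fields listed above are considered, and the class is a hypothetical illustration rather than the caom2-compute validator.

```java
final class ChunkAxisCheck {
    /**
     * Returns true when the non-null axis indices cover 1..naxis exactly once each.
     * A null argument means the axis is not part of the data array.
     */
    static boolean axesAreConsistent(int naxis, Integer positionAxis1, Integer positionAxis2,
                                     Integer energyAxis, Integer timeAxis) {
        Integer[] assigned = {positionAxis1, positionAxis2, energyAxis, timeAxis};
        boolean[] seen = new boolean[naxis + 1];
        for (Integer index : assigned) {
            if (index == null) {
                continue;
            }
            if (index < 1 || index > naxis || seen[index]) {
                return false; // out of range or used twice
            }
            seen[index] = true;
        }
        for (int i = 1; i <= naxis; i++) {
            if (!seen[i]) {
                return false; // a gap: no axis claims index i
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The case from this issue: naxis=3 but only timeAxis=3 is assigned.
        System.out.println(axesAreConsistent(3, null, null, null, 3)); // false
        System.out.println(axesAreConsistent(3, 1, 2, 3, null));       // true
    }
}
```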

caom2-compute doesn't recognise CTYPEs: FDEP or RM

FDEP is Faraday Depth
RM is Rotation Measure

Radio cubes from the GMIMS project at DRAO have CTYPE3=RM (rotation measure) and FDEP is another usage that we need to support.

change Status and Quality to vocabularies

generalise CompositeObservation concept to DerivedObservation

DerivedObservation would include composites (eg stacks), but also:

observations that are extracted subsets of other observations
virtual observations that define groups but don't have products

caom2-compute: chunk validation failure for naxis = null

A Chunk with null naxis fails validation, but the model allows naxis to be null.

Correct behaviour: null naxis is allowed; the presence of other metadata just describes the "blob"

add Range subclass of Shape to conform to DAL standards

CutoutUtil.getPositionBounds(SpatialWCS, Shape) throws an UnsupportedOperationException

ultimately needed for complete IVOA SODA support

possible solution: add the class to the java library to support SODA but don't add it to the model since it isn't needed to describe data bounds?

Extend Observation Intent Type

STScI has been ingesting observations encapsulating images created by the Office of Public Outreach into our CAOM database. We've needed to use intent_type=science for these, but ideally we would prefer an outreach intent for them.
Partially implemented via PR opencadc/caom2tools#171 but opening an issue here for further discussion.

RFE: add metadata checksum

From thinking about the plan that emerged from the HST Archive Coordination Meeting:

Usage scenario: if site B has a mirror of the caom metadata from site A, they can harvest new/changed observation documents by looking for recent maxLastModified but they cannot feasibly perform a validation that the content they have is correct in detail.

Proposal: add a checksum of the metadata that could be transmitted within the observation document and found via a query that lists observation identifiers, maxLastModified, and the checksum.

add structure to keywords to allow serialising special characters

For example:

keyword is a phrase with spaces
keyword has special characters like single-tick, quote, etc...

support range of values for plane*resolution

Plane.position.resolution [0..1] Interval
Plane.energy.resolvingPower [0..1] Interval
Plane.time.resolution [0..1] Interval

SUBARU artifact resolution has invalid hostname

The URL resolution in the caom2-artifact-resolvers/src/main/java/ca/nrc/cadc/caom2/artifact/resolvers/SubaruResolver.java class contains a base URL for the CADC site, but the logic is contained on the CANFAR site.

The base URL should be http://www.canfar.net rather than http://www.cadc-ccda.hia-iha.nrc-cnrc.gc.ca.

Add lunar keywords

A number of our observatories, including space missions like WISE, record characteristics about the moon. IRTF has the most complete information:

LUN_FLI: Fraction Lunar Illumination (FLI) is the percent of the Moon's visible disk illuminated by the sun. Range is 0.0 to 100.0.
LUN_LIGHT = The lunar light level based on the lunar elevation (EL) and Fraction Lunar Illumination (FLI) values from JPL Horizons. Values are:
- dark = 0% <= FLI <25.0%, or Moon Elevation < 0 degrees.
- gray = 25% <= FLI < 75.0% with Moon Elevation > 0 degrees.
- bright = 75.0 <= FLI, and Moon Elevation > 0 degrees.
LUN_SEP = The lunar separation in degrees of RA,DEC – moon.
LUN_EL = The lunar position's Elevation in degrees, +90.0 to -90.0
LUN_AZ = The lunar position's Azimuth in degrees. 0-360. 0=North, 90=east.
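
The LUN_LIGHT rules above can be written as a small function. The thresholds come from the list; how to treat an elevation of exactly 0 degrees is not stated there, so this sketch assumes it counts as above the horizon. The class name is hypothetical.

```java
final class LunarLightSketch {
    /**
     * Classifies the lunar light level from FLI (percent, 0.0 to 100.0)
     * and moon elevation (degrees, -90.0 to +90.0).
     */
    static String classify(double fliPercent, double elevationDeg) {
        if (elevationDeg < 0.0 || fliPercent < 25.0) {
            return "dark";
        }
        if (fliPercent < 75.0) {
            return "gray";
        }
        return "bright";
    }

    public static void main(String[] args) {
        System.out.println(classify(10.0, 45.0)); // dark: low illumination
        System.out.println(classify(90.0, -5.0)); // dark: moon below the horizon
        System.out.println(classify(50.0, 30.0)); // gray
        System.out.println(classify(75.0, 30.0)); // bright
    }
}
```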

This seems common enough, especially for ground based observatories, such that at least some of these keywords should be supported.

CutoutUtil.initCutout in caom2-compute does not implement correct axis order

The comment in the method is correct but the appending of codes to create the template is only done in typical axis order and not by using values of axis indices in the chunk.

For input with Chunk.naxis=3, energyAxis=1, positionAxis1=2, positionAxis2=3 (legit) the current initCutout would create px,py,ee instead of ee,px,py
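
One way to honour the axis indices is to key the codes by index and emit them in sorted order. The codes px, py, and ee are taken from the example above; the class and method are a hypothetical sketch, not the CutoutUtil code, and cover only position and energy axes.

```java
import java.util.TreeMap;

final class CutoutAxisOrderSketch {
    /** Builds the template by ordering axis codes on their index in the chunk. */
    static String template(Integer positionAxis1, Integer positionAxis2, Integer energyAxis) {
        // TreeMap iterates its keys in ascending order, i.e. in axis-index order.
        TreeMap<Integer, String> byIndex = new TreeMap<>();
        if (positionAxis1 != null) {
            byIndex.put(positionAxis1, "px");
        }
        if (positionAxis2 != null) {
            byIndex.put(positionAxis2, "py");
        }
        if (energyAxis != null) {
            byIndex.put(energyAxis, "ee");
        }
        return String.join(",", byIndex.values());
    }

    public static void main(String[] args) {
        System.out.println(template(2, 3, 1)); // ee,px,py
        System.out.println(template(1, 2, 3)); // px,py,ee
    }
}
```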
