semiceu / core-location-vocabulary

A vocabulary that describes the basic elements of location information, such as geometries and addresses.

HTML 97.77% CSS 0.95% JavaScript 1.28%

locn location-vocabulary geometry postal-addresses data-specification

core-location-vocabulary's People

albeaufays aleksandralavreneva touchlesscode adminesary emielpwc

core-location-vocabulary's Issues

address property and registeredAddress used in other EU core vocs

If you define the address property here, you may also want to define the property registeredAddress, which is used in CPV, CPOV and the Core Business Vocabulary, as a subproperty of the generic address. That would make sense; however, as address is currently modelled in CLV, you cannot, because the domain of address is Location, not owl:Thing (and a Person, Public Organisation or Legal Entity is not a location but an agent).
What I am saying is that if a resource or a thing can have an address and a geometry, you should also be able to say that a thing has a specific type of address, the registeredAddress, and use it in the other vocabularies (with prefix clv).
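
The constraint described above can be sketched in Turtle. This is only an illustration: the `ex:` namespace and `ex:registeredAddress` are hypothetical stand-ins for the proposed `clv` term, and the domain statement reflects this issue's description of the current model (assuming Location is `dct:Location`), not a checked copy of the published vocabulary.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix locn: <http://www.w3.org/ns/locn#> .
@prefix ex:   <http://example.org/clv#> .   # hypothetical namespace

# Assumption taken from the issue text: address is restricted to locations.
locn:address rdfs:domain dct:Location .

# Proposed specialisation (hypothetical term).
ex:registeredAddress a rdf:Property ;
  rdfs:subPropertyOf locn:address ;
  rdfs:range locn:Address .

# An agent with a registered address.
ex:org1 ex:registeredAddress ex:addr1 .
```

Under RDFS entailment, `ex:org1` would also be inferred to be a `dct:Location`, because the subproperty inherits the domain of `locn:address`. That is the conflict the issue points out for agents such as persons and organisations.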

Properly referencing the vocabulary

First of all, thank you for the great job.

I want to cite the vocabulary as a whole in a publication, and I can't find the right way to do it. Do you have a recommended way? Should I just list the authors of the release I am using? Or should I cite the European Union, as stated in the license?

Is every release archived somewhere and assigned a DOI? If so, is the DOI consistent with the IRIs?

I think it would be best to have a CITATION.cff file in the repository, or a section in the specification explaining how to cite the vocabulary properly.

I hope you can help me, have a great holiday season.

Terminology and definitions in the core vocabularies

see SEMICeu/CPOV#12

General comment : modelling approach

I think the Core Location Vocabulary now has more limited usage than was possible in the past. Previously, address and geometry were two very general properties that could be attached to owl:Thing (or better, rdfs:Resource). Currently, in order to specify an address, you have to go through a location. Why?
I think an analogous discussion took place in the context of the SDGSandbox work.

BTW: AdminLevel1 and AdminLevel2 are already named places that are components of the address (according to the INSPIRE data model). It is therefore difficult to understand why it is modelled this way, unless by named place, i.e. Location, you mean something very broad that also covers a site of an organisation, a car park, a bus stop, or anything with a geographical dimension.

I think this part should be really clarified.

politicalGeocodingURI needed not only adminUnitL2

AdminUnitL2 is of data type Text. The examples given use NUTS codes.

In Germany we have several NUTS-like statistical geocoding keys, covering:
a) the list of streets (Straßenschlüssel, Level 5)
b) the list of districts (Landkreisschlüssel, Level 4)
c) the list of municipalities (Allgemeiner Gemeindeschlüssel, AGS, Level 4)
d) the list of regions (Regionalschlüssel, Level 3)
e) the list of states (Level 2)

After aligning with the German KoSIT (#6), what Germany needs here is:

the possibility to express these geopolitical encodings as zero-to-many keys that are not called "secondLine" but have a level-agnostic name. Perhaps we could use something like "politicalGeocodingURI".

In DCAT-AP.de we extend the Core Location Vocabulary as follows to meet these German needs:

the level of the geopolitical key: dcatde:politicalGeocodingLevelURI https://www.dcat-ap.de/def/politicalGeocoding/Level/
the key itself: dcatde:politicalGeocodingURI, which then links to, e.g., the municipalityKey https://www.dcat-ap.de/def/politicalGeocoding/municipalityKey/
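
As a rough sketch of how the two DCAT-AP.de properties might be used together: the `dcatde:` prefix IRI, the concrete level IRI and the municipality key value below are assumptions made for illustration and should be checked against the DCAT-AP.de code lists; `ex:dataset1` is hypothetical.

```turtle
@prefix dcat:   <http://www.w3.org/ns/dcat#> .
@prefix dcatde: <http://dcat-ap.de/def/dcatde/> .   # assumed prefix IRI
@prefix ex:     <http://example.org/> .             # hypothetical namespace

ex:dataset1 a dcat:Dataset ;
  # Which kind of key is used (assumed level IRI).
  dcatde:politicalGeocodingLevelURI
    <http://dcat-ap.de/def/politicalGeocoding/Level/municipality> ;
  # The key itself (assumed example value of a municipality key).
  dcatde:politicalGeocodingURI
    <http://dcat-ap.de/def/politicalGeocoding/municipalityKey/02000000> .
```

Both properties take IRIs rather than text, which is what makes the approach level-agnostic: the level is stated explicitly instead of being implied by the property name.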

AdminUnit1 and AdminUnit2

In the definition you say that AdminUnit1 is a country/region/state. I am not sure whether AdminUnit2 can be a country again. Isn't that already modelled with AdminUnit1?

Why does AdminUnit1 have Code as its type while AdminUnit2 has Text? If it is recommended to use NUTS for AdminUnit2, its type should be Code, right?

In the table of definitions, the type for AdminUnit1 is missing.

Typo in diagram class locn:Address - use namespace locn for postCode

The diagram under https://www.w3.org/ns/locn shows lcon:postCode instead of locn:postCode

Inconsistency between data types in XSD (text type) and specification (code)

The specification says that CoreLocation.Address.AdminUnitL1 is a Code and that a valid URI from the Publications Office Gender codelist has to be used.
The XSD file in V1.0 (I have not yet found the XSDs for 2.2) uses the data type "text" for AdminUnitL1. The same seems to be true for Core Person Gender (SEMICeu/Core-Person-Vocabulary#18).

The recommendation is to use the data type xs:anyURI, or a dedicated data type for codelists, but not text.

Property adminUnitL2

The current range of the property is Text, meaning that it is a datatype property holding the name of the region/county/state. However, the usage note recommends using controlled vocabularies, which would make the range of the property Code (an object property, no longer a datatype property).

I think this point should be better specified. If the ideal is to use a controlled vocabulary, the range of the property should be changed to Code.

Finally, since you recommend the use of NUTS, among others: if I want to use NUTS level 3, which property of the class Address should I use? adminUnitL2 seems to be the last level foreseen for expressing the territorial hierarchy, at least judging by the property names (although there is also postName, the city, which is another layer of the hierarchy). Can I use NUTS 3 anyway? Should I use adminUnitL2 for NUTS 3 (which does not seem appropriate to me from a semantic perspective)?
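
The difference between the two ranges can be illustrated with a minimal Turtle sketch. The `ex:` resources are hypothetical; the NUTS IRI follows the pattern used by the EU Publications Office, and the name literal is only an example value.

```turtle
@prefix locn: <http://www.w3.org/ns/locn#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

# Range Text: the name of the unit as a language-tagged literal.
ex:addrA a locn:Address ;
  locn:adminUnitL2 "Praha"@cs .

# Range Code: a reference to an entry in a controlled vocabulary,
# here a NUTS code IRI.
ex:addrB a locn:Address ;
  locn:adminUnitL2 <http://data.europa.eu/nuts/code/CZ01> .
```

The two forms cannot both satisfy a single range of Text or Code, which is why the usage note and the declared range appear to contradict each other.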

Remove round brackets in JSON-LD context

Round brackets do not seem to be allowed in the JSON-LD playground, which makes the context difficult for people to use.

Location of RDFS definition of m8g:* properties and classes?

@EmidioStani where is the RDFS definition of m8g:* properties and classes located? I can't find it in the SEMICeu GitHub organization and the IRI, e.g. http://data.europa.eu/m8g/registeredAddress dereferences only to the generic Joinup page.

Originally posted by @jakubklimek in #26 (comment)

Expected Range "String" in the class Address should be replaced by "Literal"

"String" (xsd:string) is now in (and only in) the class Address (locn:Address) used as range for the properties which have "text without lang-tag".

According to the Core Location Vocabulary, except for locn:locatorId, where no range is specified, the other properties should have rdfs:Literal as range.

[Disclaimer: I haven't checked thoroughly, so I may be wrong about some of them]

Readme file is largely about the Core Person Vocabulary

The README file has not been sufficiently updated since it was cloned from the Core Person Vocabulary. In particular, it does not link to the Core Location Vocabulary that this repository is about.

Core Vocabularies 2.0.1 package contains outdated versions of XML schemas

See SEMICeu/Core-Business-Vocabulary#5

Registered address

During the Core Vocs webinar dd. 2021-04-23, a suggestion was made to include the registered address property within the Core Location Vocabulary, as it is currently used by both the Core Person Vocabulary and the Core Business Vocabulary.

Note that the address property is also used within the Core Public Organisation Vocabulary.

Range of locn:geometry

In the specification, the range of the property locn:geometry is set to the class locn:Geometry. Nevertheless, the usage note mentions that literals and URIs are also accepted as values (see also the examples). We therefore propose to make the range an owl:unionOf of those three.
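
For illustration, the three kinds of value mentioned in the usage note might look as follows in Turtle. The `ex:` resources and the coordinates are made up for this sketch; `gsp:` is the GeoSPARQL namespace.

```turtle
@prefix locn: <http://www.w3.org/ns/locn#> .
@prefix gsp:  <http://www.opengis.net/ont/geosparql#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

# 1. A literal, here a WKT typed literal.
ex:placeA locn:geometry "POINT(4.8897 52.3724)"^^gsp:wktLiteral .

# 2. A URI, here a geo URI.
ex:placeB locn:geometry <geo:52.3724,4.8897> .

# 3. An instance of locn:Geometry carrying the encoding itself.
ex:placeC locn:geometry [
  a locn:Geometry ;
  gsp:asWKT "POINT(4.8897 52.3724)"^^gsp:wktLiteral
] .
```

Only the third form matches a range of locn:Geometry as declared, which is the mismatch this issue proposes to resolve.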

Is an Address a Spatial Object?

In https://github.com/inspire-eu-rdf/inspire-rdf-vocabularies/blob/master/ad/ad.ttl, which is the INSPIRE address data model in RDF, the class Address is a subclass of gsp:Feature (GeoSPARQL ontology) and is defined as follows: "An identification of the fixed location of property by means of a structured composition of geographic names and identifiers".
The class is also a subclass of locn:Address.

From these definitions it seems to me that an Address is rather a spatial object, because gsp:Feature is a subclass of gsp:SpatialObject (in the GeoSPARQL ontology, geometry is also a subclass of SpatialObject and is disjoint from Feature, and a Feature can have a geometry).

Not sure if this clarifies or not :)

Datatypes of SEMIC-Address:Address.adminUnit and INSPIRE-Address:AddressRepresentation.adminUnit differ

If SEMIC-Address:Address maps to INSPIRE-Address:AddressRepresentation, why do Address.adminUnit and AddressRepresentation.adminUnit have different datatypes? The former actually points to an AdministrativeUnit, while the latter is a GeographicalName (which more or less corresponds to a geographical LanguageString), meant to represent the name of the AdministrativeUnit.

AddressRepresentations are meant to be much more flexible than a structured Address, allowing one to just put down the name of a municipality or other locality rather than having to reference an AdministrativeUnit object. To make things even more difficult, this object only has a code and a level attribute. Even in the INSPIRE structured Address, the Address does not connect directly to the AdministrativeUnit; it is associated with an address component of type MunicipalityName (which instead is supposed to connect with AdministrativeUnit).

The SEMIC Address in its current version goes one step too far in structuring something that is no more than an AddressRepresentation, a human-readable representation of an Address for use as a label on a letter or on a map. In addition, it no longer maps 1:1 to the INSPIRE AddressRepresentation. And if adminUnit is now a reference rather than a name, why not do the same for postName, postCode or even thoroughfare?

Geometry of an address

In the previous version of CLV (I am referring to https://www.w3.org/ns/locn) it was possible to specify a geometry for an address. Indeed, an address may have a geometry.

How can I do that now? A location, i.e. a "named place", has a geometry. Is an address a location? (Probably yes, BTW.)

simplify structure of releases

https://joinup.ec.europa.eu/solution/core-location-vocabulary/releases
and https://joinup.ec.europa.eu/release/core-location-vocabulary/100
show 9 zips for 1.00.

It should be possible to download all the information at once (one consolidated zip).
It should also be possible to access the conceptual/descriptive information on the web, without having to download and open zips, e.g.:
- Core_Vocabularies-Business_Location_Person_v1.00_Conceptual_Model_0.zip: Core_Vocabularies-Business_Location_Person_v1.00_Conceptual_Model.png
- https://joinup.ec.europa.eu/site/core_location/rdfs.html is not rendered as a web page but shown as raw HTML.

Requirements identified after the first release of LOCN

In view of possible revisions to the Core Location Vocabulary (LOCN), I include below a summary of the requirements identified after the release of the first version of LOCN and based on implementation experiences, other working groups and related specifications.

Based on their scope, such requirements can be grouped into three main classes:

Representation of spatial / temporal coordinates
Representation of addresses
Mapping LOCN with other relevant vocabularies - especially for the representation of addresses

Representation of spatial / temporal coordinates

This set of requirements comes from three main working groups (listed in chronological order):

The W3C Locations and Addresses Community Group (LOCADD), which is responsible for the maintenance of LOCN in W3C space.
The ISA GeoDCAT-AP Working Group (GeoDCAT-AP), which developed the geospatial extension to DCAT-AP by defining mapping rules for INSPIRE / ISO 19115 metadata.
The W3C/OGC Spatial Data on the Web Working Group (SDW), which provided guidelines on the representation of spatial data on the Web, based on existing best practices, and identified a set of gaps yet to be addressed.

Requirement	LOCADD	GeoDCAT-AP	SDW
Ability to specify bounding boxes	✓	✓	✓
Ability to specify centroids	✓		✓
Ability to specify spatial / temporal resolution	✓	✓	✓
Ability to specify spatial / temporal reference systems	✓	✓	✓
Availability of an XML / RDF datatype for GeoJSON		✓	✓
Ability to specify start / end date(time) for temporal coverage	✓	✓	✓

For some of these requirements, solutions have been proposed in GeoDCAT-AP and in the W3C Data Quality Vocabulary (DQV), which have been documented by the SDW Working Group in their best practices.

Ability to specify bounding boxes & centroids

LOCN has a general property, namely locn:geometry, to associate a geometry with a resource. However, in some contexts, it is necessary to clarify whether the specified geometry is not the actual geometry, but rather the point corresponding to its (geographic) centre (centroid) or a rectangle representing its extent (bounding box).

None of the standard / most popular spatial vocabularies (GeoSPARQL included) provides properties and/or classes to model this information. The only exception is schema.org, which defines a schema:box property, but it supports only a specific encoding for the coordinates of a bounding box, whereas locn:geometry supports any standard geometry encoding. The support for a representation of centroids and bounding boxes more flexible than schema:box, and compatible with locn:geometry is also a requirement for GeoDCAT-AP.

In order to address this issue, LOCADD discussed defining subproperties of locn:geometry for centroids and bounding boxes.
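
A minimal sketch of what such subproperties could look like is given below. The `ex:` namespace and the property names `ex:centroid` and `ex:boundingBox` are hypothetical and are not terms agreed by LOCADD; the coordinates are illustrative.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix locn: <http://www.w3.org/ns/locn#> .
@prefix gsp:  <http://www.opengis.net/ont/geosparql#> .
@prefix ex:   <http://example.org/locn-geo#> .   # hypothetical extension namespace

ex:centroid a rdf:Property ;
  rdfs:subPropertyOf locn:geometry ;
  rdfs:label "centroid"@en .

ex:boundingBox a rdf:Property ;
  rdfs:subPropertyOf locn:geometry ;
  rdfs:label "bounding box"@en .

# Usage: any geometry encoding accepted by locn:geometry can be used.
ex:city
  ex:centroid "POINT(4.9 52.4)"^^gsp:wktLiteral ;
  ex:boundingBox "POLYGON((4.7 52.3, 5.1 52.3, 5.1 52.5, 4.7 52.5, 4.7 52.3))"^^gsp:wktLiteral .
```

Because both are subproperties of locn:geometry, consumers that only understand locn:geometry would still see a geometry, while more specific clients could tell a centroid from a bounding box.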

Ability to specify spatial / temporal reference systems

GeoDCAT-AP provides a mechanism to associate a (spatial / temporal) reference system with a dataset by using dct:conformsTo, which can also be used for geometries - or any other resource.

Since dct:conformsTo is a very general property, the fact that the object is a spatial / temporal reference system is currently addressed by using dct:type with the relevant code list values from the INSPIRE Glossary, as shown in the following example:

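@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix a:    <http://example.org/a#> .   # placeholder namespace, not from the original

a:Dataset a dcat:Dataset ;
  dct:conformsTo a:SpatialReferenceSystem .

a:SpatialReferenceSystem a dct:Standard ;
  dct:type <http://inspire.ec.europa.eu/glossary/SpatialReferenceSystem> .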

Moreover, an experimental RDF representation of reference systems from the OGC CRS Register has been developed, mapping additional information (such as the "name" of the CRS).

⚠️ The Git repository containing the mapping proposal referred to in the mail above has been moved to GitHub: https://github.com/SEMICeu/epsg-to-rdf

This approach could be adopted "as is" in LOCN, although it might be desirable to have a more specific property than dct:conformsTo and/or use a stronger typing rather than using dct:type with a term from the INSPIRE Glossary.

In such a case, an initial proposal for the definition of specific classes / properties is documented here:

https://joinup.ec.europa.eu/mailman/archives/dcat_application_profile-geo/2015-July/000157.html

Moreover, the new version of the W3C Time Ontology includes a class time:TRS that could be used to type temporal reference systems.
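
As a sketch of the stronger typing this would allow, a temporal reference system could be typed with time:TRS instead of relying on dct:type. The dataset `ex:dataset1` is hypothetical; the Gregorian calendar IRI is the one used by the W3C Time Ontology itself.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix time: <http://www.w3.org/2006/time#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

ex:dataset1 a dcat:Dataset ;
  dct:conformsTo <http://www.opengis.net/def/uom/ISO-8601/0/Gregorian> .

# The class, rather than a dct:type value, states that this is a
# temporal reference system.
<http://www.opengis.net/def/uom/ISO-8601/0/Gregorian> a time:TRS .
```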

Ability to specify spatial / temporal resolution

GeoDCAT-AP currently models spatial / temporal resolution as free text (with rdfs:comment), recognising that, at the time when the GeoDCAT-AP specification was released, no existing vocabularies provided a means to model this information.

However, this requirement has been brought to the attention of the W3C Data on Web Working Group, and a solution has been documented in the W3C Data Quality Vocabulary (DQV), as reported here:

https://joinup.ec.europa.eu/mailman/archives/dcat_application_profile-geo/2016-May/000367.html

Basically, DQV models this information as observations / measurements of a given quality metric (which corresponds to a given type of resolution).

This solution was also included by the SDW Working Group in their best practices, and it could be readily adopted in LOCN.

This would however require the definition of two groups of individuals:

Those corresponding to the different types of resolution (denoting a quality metric).
Those corresponding to each of the different levels of resolution (denoting the measurement of a specific quality metric).

As far as the first group is concerned (i.e., the different types of resolution), these individuals can be defined in DQV as follows:

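@prefix dqv:  <http://www.w3.org/ns/dqv#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix :     <http://example.org/resolution#> .   # placeholder namespace, not from the original

:SpatialResolutionAsEquivalentScale a dqv:Metric;
  skos:definition "Spatial resolution of a dataset expressed as equivalent scale, by using a representative fraction (e.g., 1:1,000, 1:1,000,000)."@en ;
  dqv:expectedDataType xsd:decimal ;
  dqv:inDimension dqv:precision .

:SpatialResolutionAsDistance a dqv:Metric;
  skos:definition "Spatial resolution of a dataset expressed as distance"@en ;
  dqv:expectedDataType xsd:decimal ;
  dqv:inDimension dqv:precision .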

This initial list can be further extended. E.g.:

:SpatialResolutionAsHorizontalGroundDistance a dqv:Metric;
  skos:definition "Spatial resolution of a dataset expressed as horizontal ground distance"@en ;
  dqv:expectedDataType xsd:decimal ;
  dqv:inDimension dqv:precision .
    
:SpatialResolutionAsVerticalDistance a dqv:Metric;
  skos:definition "Spatial resolution of a dataset expressed as vertical distance"@en ;
  dqv:expectedDataType xsd:decimal ;
  dqv:inDimension dqv:precision .
    
:SpatialResolutionAsAngularDistance a dqv:Metric;
  skos:definition "Spatial resolution of a dataset expressed as angular distance"@en ;
  dqv:expectedDataType xsd:decimal ;
  dqv:inDimension dqv:precision .

The question is in which space such individuals should be defined (inside LOCN? in a separate code list - as the ones maintained by the EU Publications Office?).

The definition of individuals in the second group is however more problematic, since the level of resolution and unit of measurement are arbitrary (1:1000, 1:100, 1m, 1km, 100m, 10 decimal degrees, etc.).

Possible options include the following ones:

Define only the individuals corresponding to the types of spatial / temporal resolution, whereas the individuals expressing the actual resolution will be defined at the data level. This solution is not optimal, since it will result in multiple definitions of the same individuals.
Define individuals only for some levels of resolution and units of measurements - e.g., the most common ones. This solution may address the majority of (but not all) the cases.
Set up a URI space supporting arbitrary levels of resolution and units of measurements. This register will dynamically generate the corresponding individuals based on information included in their URI.

An example of the last option, including also a proposal for how these individuals could be defined, is available at:

http://geodcat-ap.semic.eu/id/resolution/
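
For comparison, the first option listed above (defining the actual resolution at the data level) might be sketched as follows, using the DQV measurement pattern. The `:` namespace, `:dataset1` and `:dataset1Resolution` are hypothetical, and the unit IRI is only one possible choice of unit vocabulary.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dqv:  <http://www.w3.org/ns/dqv#> .
@prefix sdmx-attribute: <http://purl.org/linked-data/sdmx/2009/attribute#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix :     <http://example.org/resolution#> .   # hypothetical namespace

:dataset1 a dcat:Dataset ;
  dqv:hasQualityMeasurement :dataset1Resolution .

# The measurement is defined next to the data it describes.
:dataset1Resolution a dqv:QualityMeasurement ;
  dqv:isMeasurementOf :SpatialResolutionAsDistance ;
  dqv:value "1000"^^xsd:decimal ;
  sdmx-attribute:unitMeasure <http://qudt.org/vocab/unit/M> .   # metres (assumed unit IRI)
```

Every publisher describing a 1000 m resolution would mint a similar individual of their own, which is the duplication the first option is criticised for.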

XML / RDF datatype for GeoJSON

Property locn:geometry can be used to specify geometries also by directly using syntax encoding schemes. In such a case, it is useful that the used geometry encoding is specified by using a typed literal, and this is actually what is done in GeoDCAT-AP.

Although RDF datatypes exist for WKT and GML (they are defined in GeoSPARQL), an XML / RDF datatype for GeoJSON is missing.

To address this issue, GeoDCAT-AP uses the URL of the corresponding IANA media type (namely http://www.iana.org/assignments/media-types/application/geo+json), but this solution is not optimal, and it would be preferable to define a specific datatype.
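
The workaround described above can be sketched as a typed literal whose datatype is the media type URL. The resource `ex:placeA` and the coordinates are illustrative only.

```turtle
@prefix locn: <http://www.w3.org/ns/locn#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

# Single quotes are used for the literal so that the double quotes
# inside the GeoJSON do not need escaping.
ex:placeA locn:geometry
  '{"type":"Point","coordinates":[4.8897,52.3724]}'^^<http://www.iana.org/assignments/media-types/application/geo+json> .
```

The media type URL identifies a file format rather than an RDF datatype with a defined lexical and value space, which is one way to read why the text calls this solution not optimal.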

This can be done in LOCN, but other options might be considered, e.g. a reference register for syntax encoding schemes maintained by an authority such as the EU Publications Office.

Ability to specify start / end date(time) for temporal coverage

Currently, this information is specified in DCAT-AP by using schema:startDate and schema:endDate, respectively, following ADMS. GeoDCAT-AP follows the same approach.

This issue has been brought to the attention of the W3C Dataset Exchange Working Group (see UC27), so a possible solution might be contributed in that context.
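
The current DCAT-AP practice described above looks roughly like this; `ex:dataset1` and the dates are made up for the example.

```turtle
@prefix dcat:   <http://www.w3.org/ns/dcat#> .
@prefix dct:    <http://purl.org/dc/terms/> .
@prefix schema: <http://schema.org/> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:     <http://example.org/> .   # hypothetical namespace

ex:dataset1 a dcat:Dataset ;
  dct:temporal [
    a dct:PeriodOfTime ;
    schema:startDate "2015-01-01"^^xsd:date ;
    schema:endDate   "2015-12-31"^^xsd:date
  ] .
```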

Representation of addresses

After the release of LOCN, examples of RDF representations of INSPIRE datasets concerning addresses have been released.

Two notable examples are:

The Dutch Base Registry of Addresses and Buildings (BAG).
The Flanders address register, based on the OSLO2 framework.

The detailed requirements are yet to be collected. However, in general, they concern two main issues:

Adding properties covering all the information included in the INSPIRE Address data schema.
Allowing the specification of non-literal values for address components.

Mapping LOCN with other relevant vocabularies

After the release of LOCN, a number of use cases have been reported that call for mapping LOCN-encoded data to other popular vocabularies, in particular vCard and schema.org, especially for the representation of addresses.

A mapping proposal has been developed by JRC, and illustrated here:

https://joinup.ec.europa.eu/mailman/archives/dcat_application_profile-geo/2016-August/000373.html

⚠️ The Git repository including the documentation of the mapping proposal referred to from the mail above has been moved to GitHub: https://github.com/SEMICeu/locn-mapping

Conclusions

Among the revisions listed above, the definition of subproperties for centroids and bounding boxes is the least problematic, and it can be readily carried out. The same applies to the missing GeoJSON datatype.

Addressing the issues concerning reference systems and spatial resolution requires additional discussion on what needs to be defined (e.g., which types of resolution, which types of reference systems). A starting point can be the requirements coming from other ISA specifications, such as GeoDCAT-AP, where the types of reference systems and spatial resolution used are those included in ISO 19115.

The revisions concerning addresses are currently the least consolidated, and require a detailed requirement analysis - as already said in the relevant section.

In all these cases, the question is in which space these terms should be defined. A possible option (that was also discussed in LOCADD) is to define them in separate LOCN extensions - e.g., we could have one for geometries (locn-geo) and one for addresses (locn-ad).

Finally, the mapping of LOCN with vCard and Schema.org can be considered rather stable, since most of the mappings are pretty straightforward and only a very limited number of issues remain. However, it would be desirable to have it reviewed and tested.

Attribute for street name and number

In the XML notices for the standard forms there is an address field that contains street name plus street number (see example below):

The definition of locn:thoroughfare does not provide for a number, only the street name. The definition of locn:fullAddress provides for a complete address (including zip code, city name, etc.) written as a string.

How do you recommend handling this use case?

Controlled vocabulary / Codelist on regional, municipality or even Street level needed

In Germany and other Member States, specific keys and codes exist at regional and municipality level. A concept going one level deeper than adminUnitL1 (country) and adminUnitL2 (county) might be needed.

The current Core Location “Address” component has a code data type only for adminUnitL1; adminUnitL2 is a literal.

In order to express national code values as URIs / Codes in address components, an additional field adminUnitL2 (or adminUnitL3) with data type “URI” should be added. Another option would be an optional, repeatable field AddressCode, so that we can, in our use case, assign the code for the street as well as the code for the municipality.

Example values of a German municipality key, the “Allgemeiner Gemeindeschlüssel” (AGS), that could be used in a Core Location address element can be found in the German DCAT-AP.de. DCAT-AP.de extends Core Location at several points and adds its own national URIs for geocoding: https://www.dcat-ap.de/def/politicalGeocoding/municipalityKey/

Other Member States have URI-encoded geopolitical encodings too, and there is the EU NUTS code, which does not fit properly in adminUnitL2 as "Text".

A structural way to represent address supplements.

The SEMIC team received the following request through email:
Is there a structural way to represent address supplements such as c/o, Rear house 2nd floor etc. in http://www.w3.org/ns/locn#Address ?
For address supplements like c/o (in care of) would it be possible to add an "addressSupplement" property?

How to use m8g:AdminUnit with NUTS?

I am unclear on how exactly to use m8g:AdminUnit with NUTS published by the EU Publications Office.

Let's say I have an address and I want to map it to NUTS http://data.europa.eu/nuts/code/CZ01.
I have:

<#Address> a locn:Address ;
   m8g:adminUnit <IRI1> .
<IRI1> a m8g:AdminUnit ;
   m8g:code <IRI2> ;
   m8g:level <IRI3> .

What are IRI1, IRI2 and IRI3 expected to be?

Naturally, I would expect that I can do:

<#Address> a locn:Address ;
   m8g:adminUnit <http://data.europa.eu/nuts/code/CZ01> .

and be done with it. But then, what about code and level?
The NUTS codes have:

<http://data.europa.eu/nuts/code/CZ01> <http://data.europa.eu/nuts/level> 2 .

where the level is a literal. The range of m8g:level is skos:Concept though, so it should be an IRI. Is there a NAL for levels of adminUnits to be used?

And what about the code? Sure, I can do:

<#Address> a locn:Address ;
   m8g:adminUnit <http://data.europa.eu/nuts/code/CZ01> .
<http://data.europa.eu/nuts/code/CZ01> a m8g:AdminUnit ;
   m8g:code <http://data.europa.eu/nuts/code/CZ01>.

But that seems weird. Is the instance of m8g:AdminUnit supposed to be something else than http://data.europa.eu/nuts/code/CZ01?

Consolidated diagrams Core Vocabularies

As introduced during the fifth webinar, two consolidated diagrams have been produced combining all core vocabularies: Core Person Vocabulary, Core Location Vocabulary, Core Business Vocabulary and Core Public Organization Vocabulary. These diagrams intend to give an overview of the classes and properties of the different vocabularies.

The Consolidated diagram presents them in an exhaustive manner, while the Simplified version focuses on the main concepts of each vocabulary and their connections.

With this issue, we would like to invite you to provide feedback on these diagrams.

Representing the vocabulary in UML/HTML

see SEMICeu/CPOV#11

URI of RegisteredAddress

As reported by @giorgialodi, the URI of RegisteredAddress does not sit in the legal namespace but in the m8g (Core Vocabularies).

Datatype for the property locn:locatorDesignator should be text (rdf:langString)?

In the class Address, the example in the Usage for the property "locator designator" (locn:locatorDesignator) says: For an address such as "Flat 3, 17 Bridge Street", the locator is "flat 3, 17".

This property may therefore have values that are language-dependent, e.g. the word "flat" in the example above. The datatype for this property should therefore be Text (rdf:langString) rather than String (xsd:string).
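
The difference between the two datatypes can be shown with the example value from the usage note. The `ex:` resources are hypothetical.

```turtle
@prefix locn: <http://www.w3.org/ns/locn#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace

# String (xsd:string): a plain literal, which cannot carry a language tag.
ex:addrA locn:locatorDesignator "Flat 3, 17" .

# Text (rdf:langString): the same value with its language stated.
ex:addrB locn:locatorDesignator "Flat 3, 17"@en .
```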

Resource.location and Resource.address have the wrong associated uri

Resource.location and Resource.address are associated with m8g instead of http://www.w3.org/ns/locn#location.

Administrative Unit: need for clarification

During the Core Vocs webinar of 2021-05-20, there was some misunderstanding about what exactly is meant by an Administrative Unit and what it covers. It was unclear whether administrative units refer to addresses and locations that should be seen as separate from jurisdictional rights, notwithstanding the fact that there could be a relationship between the two.