Add reasonForNonRelease to schema (project-open-data.github.io, closed)

project-open-data commented on July 19, 2024

Add reasonForNonRelease to schema

from project-open-data.github.io.

Comments (10)

waldoj commented on July 19, 2024

Marina, what do you envision going in this field? Free-form descriptive text? I don't know anything about the processes within government that will track this documentation—is the most logical way to relate a dataset to its private-only rationale to do so within a field like this?

BernHyland commented on July 19, 2024

Hi Marina,
Apparently I replied to the wrong address. Resending.

Begin forwarded message:

From: Bernadette Hyland [email protected]
Subject: Re: [project-open-data.github.io] Add operatingUnit Field (#89)
Date: July 29, 2013 4:31:50 PM EDT
To: "project-open-data/project-open-data.github.io" [email protected]
Cc: Fadi Maali [email protected], John Erickson [email protected]

Hi Marina,
A lot is known about the problem you describe & some really smart people have already cracked this nut. Agencies collect, curate and publish data in all sorts of ways with all sorts of contact information, often office emails & telephone numbers. Contact info comes in a wide variety of formats containing no name, first name, last name, salutation, title, agency, organization, and addresses -- the bane of many data administrators' existence. There are policy & technology issues at play here.

IMO, the Open Data Project could do a great service by getting behind, and providing input to, a standard vocabulary for describing government published datasets. One such effort that has benefited from some really smart people dedicated to Web standards & government transparency is the RDF vocabulary called DCAT.[1] I'm sure there are others too, but I'm familiar with this project.

DCAT is nearing publication as an open Web standard and has been produced in a transparent, peer-reviewed manner. I encourage you to post your questions & feedback to [email protected] so we can work cooperatively to advance open government publication efforts. If you're facing some things that haven't been contemplated by DCAT, now would be a great time to address this.

Cheers,

Bernadette Hyland
CEO, 3 Round Stones, Inc
co-chair W3C Government Linked Data WG

[1] http://www.w3.org/TR/vocab-dcat/

On Jul 22, 2013, at 8:51 PM, MarinaMartin [email protected] wrote:

While datasets are ultimately owned by an agency, they are really collected and maintained on an operating unit basis. While contact names and emails may change, a dataset's associated operating unit probably will not. Making this a new, required field makes it clearer where to go with questions for the public consuming the data, the agency officials responsible for updating the metadata, and other agencies looking to access the data. It can also help agencies assess internal compliance with publishing data, and is likely to be part of an agency internal data management system for workflow purposes.

Different agencies call their sub-units different things: departments, POCs, bureaus, etc. In asking around, "operating unit" was most generic, but I'm open to an even more generic term.

What do you all think?

seanherron commented on July 19, 2024

Ideally, there would be something like a FOIA-type system, where if data doesn't meet one of a number of criteria for nonrelease it would be required to be released, and thus this field would need to be one of the predefined criteria. Logistically, however, this may be too ambitious. It may be good for us to create a set of "acceptable" criteria that we could give to agencies as suggested guidance for why a dataset may not be releasable (and the reverse).

What about NonReleaseJustification or RestrictionJustification?

MarinaNitze commented on July 19, 2024

@waldoj Yes, I envision it as being a free-text field. Going forward, agencies already have to collect this information for each new dataset created/collected that's not going to be released. So isn't it logical to store this reason in the enterprise data inventory (which, remember, is private -- not the public inventory)? They're storing it anyway -- but without a field they will, if I were to guess, store it separately and in a harder-to-find-internally spot.

@seanherron I think the list of options here is way too broad and will be defined by agencies' general counsels. I would suggest leaving this as a free text field and not providing criteria.
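
To make the free-text proposal concrete, here is a minimal sketch of how an entry in a private enterprise data inventory might carry the field, plus a check an agency could run internally. Only accessLevel and reasonForNonRelease come from this thread. The "title" key, every value (including the accessLevel strings) and the helper name missing_reason are assumptions made up for illustration, not part of the schema.

```python
import json

# Hypothetical inventory entry. Only the key names "accessLevel" and
# "reasonForNonRelease" come from the discussion; "title" and all values
# (including the accessLevel strings) are illustrative assumptions.
entry = {
    "title": "Example internal dataset",
    "accessLevel": "private",
    "reasonForNonRelease": "Contains personally identifiable information.",
}


def missing_reason(record):
    """Return True if a non-public record has no free-text rationale."""
    if record.get("accessLevel") == "public":
        # Public datasets need no rationale for non-release.
        return False
    reason = record.get("reasonForNonRelease", "")
    # Treat a missing, non-string, or whitespace-only value as absent.
    return not (isinstance(reason, str) and reason.strip())


if __name__ == "__main__":
    print(json.dumps(entry, indent=2))
    print("missing rationale:", missing_reason(entry))
```

Because the field stays free text, the check only confirms that some rationale was written down. It does not judge whether the rationale matches any predefined criteria.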

MarinaNitze commented on July 19, 2024

P.S. I have no problem with changing the name of this suggested field.

MarinaNitze commented on July 19, 2024

@BernHyland We made great efforts to match DCAT in this schema wherever possible -- the only two existing fields that do not match DCAT are accessLevel and systemOfRecord. This issue is specifically about giving agencies a place to document the reason for NOT releasing a particular dataset, in their internal-only enterprise data asset inventories. I'm not so sure that is widely applicable enough to warrant inclusion in a standard like DCAT, but I appreciate the reminder to stay involved in those conversations!

gbinal commented on July 19, 2024

Does the benefit of encouraging better behavior outweigh the complexity that adding this brings? I don't think it's an overly large addition to the agency workload but it is an added lift. In general, I always worry about the Christmas tree effect when it comes to adding further to what each agency is required to do.

seanherron commented on July 19, 2024

@gbinal I think if some sort of rationale isn't included people will either a) assume that the intent is nefarious and that we are hiding the data for no good reason, b) email the POC and ask for clarification/release, or c) forget about it entirely. For high-volume and frequently desired datasets (maybe some of the HHS data that has potential for PII, etc) putting a reasonable statement out there as to why it's private is good for transparency and will reduce the number of queries to the POC and angry tweets if they assume it's private for a questionable reason.

My concern would be that agencies wouldn't provide this information for legal reasons or would provide obtuse legalese that is difficult to parse and understand. If it's not going to be used, then there's not a lot of value in adding it.

konklone commented on July 19, 2024

I understand the Christmas tree argument, but in this case it seems merited. It'll only add work for non-released datasets (which are already saving a lot of work by not being released!), and agencies should have an on-the-record reason for not releasing a dataset anyway.

I also support keeping this a free text field, rather than selecting a preset exemption, to encourage a descriptive rationale. The field won't mean much if it doesn't communicate more than a category.

MarinaNitze commented on July 19, 2024

The discussion has moved towards combining the intent of this proposed field with the accessDetails field in #90. I'm closing this discussion -- please chat over at #90.
