smd-open-science-guidelines's Issues

Publications guidance expansion

Expand publications guidance to include:

  • guidance on how to share materials other than journal articles (e.g., technical reports, conference materials, and books)
  • specific guidance for mission publications
  • sharing preprints; may want to meet with arXiv to discuss how potential added functionality could address SMD requirements
  • guidance on what constitutes the author’s accepted manuscript; see DOE PAGES example; address formatting requirements (e.g., a question about files generated with a LaTeX template)
  • guidance for researchers on their rights to share an author’s accepted manuscript (for civil servants and non-CS). What do authors need to know about copyright, licensing, and agreements with journals? What if a publisher tells them that they cannot share their accepted manuscript without a 12-month embargo? See some related Q&A (https://www.osti.gov/pages/faqs#what-about-copyright-transfer-and-government-rights and https://publicaccess.nih.gov/faq.htm#779) from OSTI.

How do we specify internal system identifiers as RelatedIdentifiers

Per https://github.com/nasa/smd-open-science-guidelines/blob/main/request_for_comment/draft/rfc_002_how_to_make_data_fair.md#first-steps-2, it states:

Ensure an identifier is included in the DOI metadata directing to detailed product metadata. Use the RelatedIdentifier field with relationType IsDescribedBy.

When we say identifier here, what are we talking about? Is this an internal system PID? If so, what should our relatedIdentifierType be? For instance, in the Planetary Data System (PDS), we have an internal PID for all products. Which of these types should we use?
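
To make the question concrete, here is a minimal sketch of what such an entry might look like in DataCite-style JSON. It assumes the internal PID can be expressed as a URN, so it uses "URN" as the relatedIdentifierType; if the PID is only reachable through a landing page, "URL" may be the better fit. The identifier value and the helper name `described_by` are hypothetical, and the RFC does not confirm either choice.

```python
import json

# Hypothetical internal PID; the URN form is an assumption for illustration only.
INTERNAL_PID = "urn:example:archive:bundle:collection:product::1.0"


def described_by(identifier: str, identifier_type: str = "URN") -> dict:
    """Build one DataCite-style relatedIdentifiers entry (hypothetical helper)."""
    return {
        "relatedIdentifier": identifier,
        # Assumption: "URN" is chosen here; the RFC does not say which type to use.
        "relatedIdentifierType": identifier_type,
        # The relationType named in the RFC text quoted above.
        "relationType": "IsDescribedBy",
    }


# Fragment of DOI metadata carrying the pointer to detailed product metadata.
doi_metadata_fragment = {"relatedIdentifiers": [described_by(INTERNAL_PID)]}

print(json.dumps(doi_metadata_fragment, indent=2))
```

Passing `identifier_type="URL"` together with a landing-page address would produce the alternative encoding, which is the choice this issue asks the guidelines to settle.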

cataloging research data

SPD-41a: "SMD-funded data collections shall be indexed as part of the NASA catalog of data."

Technically, this catalog is data.NASA.gov. However, data hosted at a NASA repository or indexed by ADS will eventually be listed there automatically through the SDE. If proposers follow the guidance on selecting an appropriate repository for their research data, then they do not need to take any additional action to meet this requirement.

Data guidance on training data for AI/ML

Develop guidance on sharing training data, including:

  • build on existing SPD-41a FAQ on this topic
  • Training data is in scope of SPD-41a, especially if needed to validate the results of a scientific finding (e.g., would need training data used to reproduce findings resulting from AI/ML models).
  • In general, you should provide all training data. However, there are considerations (list examples) for why it would not be appropriate to share complete training data. In that case, what can you share? Work with Manil on this.
  • Recommendations on how/where to share training data (repository selection), and considerations based on size of training dataset
  • commercial data used for training and implications for sharing
  • Examples of how training data are being shared openly - ESDS ACCESS projects - work with Cerese and Manil on this.

Update links to division OSDMP templates

Following web modernization, update links to the division OSDMP templates to ensure that any future updates will be captured. The best approach may be to link to template pages on science.nasa.gov: example

Open hardware guidance

Not included in SPD-41a, but consider adding some basic guidance on considerations for open hardware
See CERN Open Science policy and/or notes from CERN open source sessions

Seeking Clarification on Section F3

Section F3 states that:

It is a reminder to curators that the globally unique, persistent identifier *must* be propagated to the metadata for the target data itself, not merely cataloged in an institutional database.

  • Does that imply the DOI needs to be included in the data files themselves (e.g., in the header of a FITS file)?

  • What happens to a file that is associated with multiple DOIs? For example, an image file from a telescope observation can have a DOI that points to it as a data product, but that image file is also part of an observation that has its own DOI.

FAQ additions

How does sharing technology openly intersect with an investigator's ability to patent their work?

For proposers who work with tribal nations and have concerns about data sovereignty and SMD's data sharing requirements, what guidance is available?

Software guidance expansion

  • List examples of journals for open source software (e.g. Software Heritage, ACL); similar to how we give suggestions for selecting external data repositories
  • Examples of a software code of conduct 1,2 / contributing guide; guidance on what should be included in these
  • add link to this reference
  • review NIH best practices for sharing software for any areas not already addressed by the OSS Guidance

Standards for Codes-of-Conduct documents

Issue:

It is not currently clear what is an appropriate Code of Conduct document for a codebase on NASA's GitHub organization. Can a standard — or at least, guidance on a — Code of Conduct document for repositories in NASA's GitHub be established? I believe it could be beneficial to streamline this for repository contributors, and it would be helpful for other folks visiting and interacting with repositories on NASA's GitHub.

Context:

Preparing code repositories (including on NASA's GitHub org.) to follow Open Science guidelines typically includes adding a Code of Conduct — as a file named CODE_OF_CONDUCT.md. In general, Codes of Conduct set expectations for how people interact with the project and with each other, and they are expected to be present for certain applications. For example, pyOpenSci explicitly asks for a CODE-of-CONDUCT.md to be included in a submitted repository.

Clarifying guidance for Lab experiments

Lab experiments are different from the observational and theoretical research done in many of our divisions. It would be helpful to provide more guidance on which data or software would be useful to make available and how to make it available.

Examples:

Flow charts for compliant publications, data, software

Develop flow charts to guide researchers/proposers/reviewers through a set of questions to ask when considering whether an OSDMP is compliant with the policy at the proposal stage, and/or whether actions taken to manage and share scientific information are compliant with the policy.
PSD created a nice set of these that have been shared internally within the OSSI. A public version of this information may be a helpful addition to this guidance.

update guideline when final policy is released

The RFC you are commenting on: Registration of DOIs for data citation

Type of comment:

  • Editorial - grammar, spelling, word choice, etc.
  • Minor - would not affect the overall direction of the guideline but could provide clarity, a small correction, a minor change to process, etc.
  • Substantive - objection to the guideline or change to its intent; recommendation to use another standard or service than what the guideline suggests; a major change to a process or use of a particular standard; etc.
  • Question - Any question for clarity, scope, purpose, etc.

Section(s) of document referenced: C.1.

Comment/Question: Reference to SPD-41a should be updated when the policy is approved.

Remove old guidance on publication sharing

The original guidance on publications
smd-open-science-guidelines/guidance/research_publications.md

has now been replaced with
smd-open-science-guidelines/OSS_Guidance/Publications.md

We should remove the old guidance or somehow indicate that it has been replaced since it is less comprehensive than the new version. For review by @nasacrawford

Add a link to best practices for data in Astronomy literature

This is a really useful document by Tracy Chen and collaborators on best practices for data in astronomical literature. It would be useful to link to in either the data pages or resources pages to provide further information.

To link to the best practices, use either the ADS link to the article (https://ui.adsabs.harvard.edu/abs/2022ApJS..260....5C/abstract) or the link to the NED BP page (http://ned.ipac.caltech.edu/Documents/Guides/BestPractices), which includes a convenient checklist.

Link to TOPS material

This repository can link to the NASA Transform to Open Science (TOPS) initiative information so that these guidelines are provided within a broader context.

Guidance on restricted data and software

Expand upon existing guidance on restricted information given in data and software sections.
Add more background on the restrictions that generate exceptions to information sharing (e.g. ITAR, EAR, HIPAA).

Provide more concrete direction for linking to DOI access documentation

Per https://github.com/nasa/smd-open-science-guidelines/blob/main/request_for_comment/draft/rfc_002_how_to_make_data_fair.md#first-steps-4, are there any ideas for more concrete direction for how we can provide an integrated look at how DOIs can be accessed across SMD? Maybe a page on SciX for how to access all the information about DOIs across the various systems?

e.g.

System      Web Search Access   API Access
System X    link TBD            link TBD
System Y    link TBD            link TBD

update guidance to encourage data citation

As requested in https://mastodon.social/@crawfordsm/109676705818114890, I'm creating this issue to suggest that while the current guidelines for data generally require making data CC0 licensed, and encourage metadata and identifiers that make the data citable, the guidance does not actually require or encourage citation of the data when it is used. One could argue that existing scholarly norms already do this, but it would be useful if NASA at least encouraged this as well.
