nceas / arctic-data-training

Training activities for the Arctic Data Center

Home Page: http://training.arcticdata.io

JavaScript 3.06% CSS 0.95% HTML 95.97% R 0.02% Shell 0.01%

arctic-data-training's People

Contributors

aebudden, amoeba, dependabot[bot], jeanetteclark, kameyer, mbjones, vlraymond

arctic-data-training's Issues

package requirements for Linux users

Feedback from Bryan Brasher:

I noted there were a few times that I fell behind when installing packages, due to issues I had running R on a Debian Linux distro.

I found that, for the packages we needed for class, I needed to have the following Linux packages installed:
libssl-dev
libcurl4-openssl-dev
libxml2-dev
libnetcdf-dev
libgeos-dev

For the next training, when we send the preparation email, we should ask Linux users to contact us directly so we can get ahead of these Linux package installation issues.
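
A possible pre-training check for Debian-based systems is sketched below. It only reports which of the packages listed above are missing and prints a suggested install command; it does not install anything. The use of `dpkg-query` and `apt-get` assumes a Debian or Ubuntu system, and other distributions will use different package names.

```shell
#!/bin/sh
# System libraries reported as needed on Debian (taken from the feedback above).
pkgs="libssl-dev libcurl4-openssl-dev libxml2-dev libnetcdf-dev libgeos-dev"

missing=""
for pkg in $pkgs; do
  # A package counts as installed only if dpkg reports "install ok installed".
  # If dpkg-query is unavailable, every package is reported as missing.
  if ! dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null | grep -q "install ok installed"; then
    missing="$missing $pkg"
  fi
done

if [ -n "$missing" ]; then
  echo "Missing packages:$missing"
  echo "To install, run: sudo apt-get update && sudo apt-get install -y$missing"
else
  echo "All listed packages are already installed."
fi
```

A script like this could be attached to the preparation email so Linux users can run it before the first session.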

reorganization of the repo

Question:

How might the Arctic Data Center outreach and training team structure and organize its training materials for maximum value to both the training team and its users?

Overview:

This document outlines proposed changes to the arctic-data-training repository in order to establish (1) a central location for all training components that (2) enables modular use of existing training materials, (3) provides an exact record of training events, and (4) is easy to use for both the ADC team providing training and the ADC users participating in it.
This approach is modeled after the Mapbox workshops GitHub repo (https://github.com/mapbox/workshops), which serves as both a living record and a step-by-step training guide for Mapbox developers giving trainings. I was introduced to their approach at the 2017 OpenStreetMap State of the Map US conference. See the Mapbox SoTM workshop training page as an example of a direction I think ADC training could go: https://github.com/mapbox/workshops/tree/sotm2017-osmcha/SoTM-2017

Proposed organization:

Below are proposed changes to the /materials section and a proposal to create an /events folder. NB: This plan hinges on the ADC team finding a GitHub page in Markdown an acceptable format for serving training content. Currently the training repo is geared towards internal users only. The full team should discuss this to make sure all are open to the idea; otherwise, the proposed changes below should be adjusted to the team’s consensus on appropriate content-serving platforms.

NCEAS/arctic-data-training/materials

This section should be organized by training “module”. All training materials should be stored in a folder pertaining to the appropriate training module. This means that each materials folder corresponds to a training topic, which includes the following training materials:

  • Presentation slide deck
  • Sample data sets
  • Any instruction / how-to guide
  • Any other content used in the training

Goal:

For ADC team:

  • Easy, modular use & reuse of existing training materials
  • Training materials are linkable in training agenda
  • Serves as a record and a central location for all training materials

For users:

  • Easy to peruse content during or after the training
  • Ability to perform some informal self-guided training

Example of Materials Folder organization:

  • R training: would house arctic-data-center-training.rproj, Bulk-data-upload, Data packaging, Hierarchical packaging, Query and download
  • NSF data policies
  • ADC: would house Overview, How to connect with ADC, ADC Submission Best practices
  • Data management plan training: would house ArcticDataCenter_DMP.pdf
  • Metadata: would house What is metadata, Preparing complete metadata
  • Data management best practices: including Versioning your dataset, Open Source formats, Identifiers
  • Data Discovery and Re-use
  • Data provenance

NCEAS/arctic-data-training/events

The README for this section should list all training events, each linking to a corresponding training/workshop folder. Each workshop folder README should, in turn, provide an overview of the workshop and the workshop agenda, and include links to the appropriate content stored in arctic-data-training/materials.

Additionally, each workshop folder should include a license detailing appropriate terms for re-use.
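
To make the proposal concrete, here is a rough sketch of the materials/ and events/ layout, built in a temporary directory. The module folder names are adapted from the examples above, and the event name `example-workshop` is purely hypothetical; none of these names are agreed conventions.

```shell
#!/bin/sh
set -eu
# Build the proposed layout in a throwaway directory so nothing real is touched.
root="$(mktemp -d)/arctic-data-training"

# One folder per training module under materials/ (folder names are illustrative).
for module in r-training nsf-data-policies adc data-management-plan metadata \
              data-management-best-practices data-discovery-and-reuse data-provenance; do
  mkdir -p "$root/materials/$module"
done

# One folder per training event under events/, each with its own README and LICENSE.
event="$root/events/example-workshop"   # hypothetical event name
mkdir -p "$event"
printf '%s\n' "# Example workshop" "Agenda and links into ../../materials/ go here." > "$event/README.md"
: > "$event/LICENSE"   # empty placeholder; the real file would state the re-use terms

# The events README lists every event and links to its folder.
printf '%s\n' "# Training events" "- [Example workshop](example-workshop/)" > "$root/events/README.md"

# Print the resulting tree.
find "$root" | sort
```

Keeping slides, sample data, and guides under materials/ and only agendas and links under events/ is what would allow the same module to be reused across several workshops.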

Topics

The training is very focused on data, DMPs, and the repo. Is there room within the offering to mention or explore how this effort aligns with reproducible and open science for the Arctic research community specifically? It would be really nice to see that in there, even briefly.

Rename the master branch to main

From an inclusivity perspective, I suggest we rename the master branch to main and generally use more welcoming terms as we describe our software architectures. The use of the terms master and slave in computing is not welcoming. For context, see the Inclusive Naming project and this repo.
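
For reference, the local part of the rename can be sketched as follows. The example runs in a throwaway repository so it is safe to try; the remote name `origin` in the comments is an assumption, and the default-branch change on the hosting service is a separate manual step not shown here.

```shell
#!/bin/sh
set -eu
# Work in a throwaway repository so the sketch is safe to run anywhere.
repo="$(mktemp -d)"
cd "$repo"
git init -q
# Start on a branch named master regardless of the local init.defaultBranch setting.
git symbolic-ref HEAD refs/heads/master
git -c user.name="Example" -c user.email="example@example.org" \
    commit -q --allow-empty -m "Initial commit"

# Rename the branch locally.
git branch -m master main
current_branch="$(git symbolic-ref --short HEAD)"
echo "Current branch: $current_branch"

# With a real remote (assumed here to be called origin), the rename would then
# be published with:
#   git push -u origin main
# followed by switching the default branch in the hosting service's settings
# before deleting the old remote branch.
```

Open pull requests and any branch-specific links in the training materials would need to be re-pointed after the rename.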

Add to Programming Metadata section

I was helping someone who has taken the course before and now wants to publish their own data. I am filing this issue in the hope of addressing some of the gaps they ran into when moving from the example used in the training to their own datasets.

(1) There was some confusion about what to do when you have multiple csv files and how to document them in the metadata
(2) How to add custom units

A small supplemental section would help: users who have taken the course could refer to it, and we could easily point to it as a resource when responding to emails.

(3) The relationship between the metadata and data ids in this section of code, and how all these objects relate to one another

# Add our data file to the package
sourceObj <- new("DataObject",
                 id = data_id,
                 format = "text/csv",
                 filename = "files/my-data.csv")

dp <- addMember(dp, sourceObj, mo = metadataObj)
dp

Something graphical to help connect the concepts might help, so that it feels a little less abstract.

create programming metadata section

This section should cover:

  • how to create metadata using EML
  • how to publish a data package using datapack

A draft of the chapter (not numbered) is in the 2019-10-training branch, awaiting review from @mbjones.

Solve-a-problem exercise

On day 2, can the time spent on tracking data provenance be reduced a bit to explore the solve-a-problem model with ADC sample datasets? This could go either in (deposit) or out (download). The out model is: download a dataset, see whether it is easy and viable to parse the metadata, then do one analysis or data visualization with someone else's data. It is a great exercise and good practice using the repo.

Alternatively, as a way to add this to the agenda: block 1 hr to begin the process at 4pm on Monday, then revisit it at 4pm on Tuesday to provide gestation and exploration time.

Missing a slash in the online distribution URL

The notes in Chapter 17 (Programming metadata) are missing the slash before object. For example, it looks like this in the online distribution URL on the test site:

doc$dataset$dataTable$physical$distribution$online$url <- paste0(mn@endpoint,
                                                                 "/object/",
                                                                 data_id)
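
The effect of the slash can be shown with plain string concatenation. This is a language-neutral sketch in shell rather than the chapter's R code; the endpoint and identifier are placeholders, and it assumes the endpoint string does not already end in a slash.

```shell
#!/bin/sh
# Placeholder values: a real endpoint comes from the member node and a real
# identifier from the uploaded data object.
endpoint="https://example.org/mn/v2"
data_id="urn:uuid:0000"

# Without the leading slash, "object" runs into the last path segment.
without_slash="${endpoint}object/${data_id}"
# With the leading slash, the path segments stay separate.
with_slash="${endpoint}/object/${data_id}"

echo "without slash: $without_slash"
echo "with slash:    $with_slash"
```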
