ncar / xdev Goto Github PK


The NCAR Experimental Development Team

Home Page: https://ncar.github.io/xdev

xdev's People

Contributors

Stargazers

Watchers

Forkers

matt-long bonnland jukent kmpaul

xdev's Issues

Xdev Team Process Document

We need a document that describes our process.

I recommend a markdown document (e.g., process.md) in the Xdev repo, or a wiki page. We need to describe how we work, how we use GitHub (e.g., forks vs branches), and other things.

Zulip Server Running on vRealize?

Using vRealize VMs made available by the EITO group in CISL, we can get a VM running CentOS 7 and install the Zulip server.

Open questions that need to be addressed are:

Can we get the Zulip production server running on a VM?
Can we run the Zulip development server on the same VM?

And issues related to these. Answers to follow.

Project Board User Guide

https://help.github.com/en/github/managing-your-work-on-github/filtering-cards-on-a-project-board

JupyterHub on HPC invited talk: Pangeo Tools plenary session

Slides available here: https://andersonbanihirwe.dev/talks/jupyterhub-on-hpc-pangeo-2019.html

Team Programming Session - March 18

Opening this issue to solicit ideas for the next TPS.

Approve xdev-bot collaborator invitations to repos

@andersy005 You need to approve adding the xdev-bot as a collaborator to the following repos:

NCAR/ACOM-Python-Tutorial
pangeo-data/benchmarking

We are Xdev! Blog Post

Need to finish... get input from Matt Long.

Incorporate project status dashboard in the main site

JupyterHub Planning

Let's list the things we need to discuss in the meeting tomorrow with Jared et al.

@NCAR/xdev

Create post on discourse on how to log in and do work on Cheyenne

We need the kernel updated on Cheyenne before doing this.

GitHub App for Project Management

I think we need a GitHub App to help us track and manage our work.

I think we need to put together a list of requirements for what this looks like.

For example, do we use Webhooks or poll the GitHub API or both (e.g., webhooks in repos we own, poll repos we do not own)?

What events do we need to look for? What do we need to poll? How do we want the GitHub Project Board(s) to look/work?
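One way to frame the "both" option is as a per-repository routing decision. The sketch below is only illustrative: `OWNED_ORGS`, `Repo`, `choose_strategy`, and `plan` are hypothetical names, and it assumes "repos we own" means repositories under an organization where we can install webhooks.

```python
from dataclasses import dataclass

# Hypothetical: organizations where the team has admin rights and can
# therefore install webhooks.
OWNED_ORGS = {"NCAR"}


@dataclass(frozen=True)
class Repo:
    owner: str
    name: str


def choose_strategy(repo: Repo, owned_orgs=OWNED_ORGS) -> str:
    """Return "webhook" for repos we own and "poll" for everything else."""
    return "webhook" if repo.owner in owned_orgs else "poll"


def plan(repos):
    """Group repository full names by the strategy chosen for each one."""
    grouped = {"webhook": [], "poll": []}
    for repo in repos:
        grouped[choose_strategy(repo)].append(f"{repo.owner}/{repo.name}")
    return grouped
```

With this split, the list of events to watch applies only to the "webhook" group, and the polling questions apply only to the "poll" group.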

Zarr/NetCDF update and plans (7min)

Contact Ward Fisher (and Ryan May) about the update.

Overcoming deficiencies in our current workflow

I'd like to discuss how better to organize ourselves and deal with deficiencies in our workflow. To begin, I would like to explicitly declare what deficiencies we currently have in our workflow.

Personally, I believe that GitHub should be our primary organizational utility, with GitHub notifications being our primary means of communicating with each other and other collaborators. However, GitHub notifications are good for some things, and bad for others.

GitHub notifications are good for general developer communication, namely:

informing you when someone else has an issue,
informing you when someone else needs your input (@-mentions), and
informing you when changes are made to software you are working on.

GitHub notifications are bad for:

letting you know what the rest of the team is working on (i.e., transparency),
letting you know what you should work on next (i.e., prioritization), and
keeping you focused on the task currently at hand (i.e., focus).

I believe that the Xdev Project Board is very good for making our work transparent, but I still believe that we have a problem with prioritization and focus.

I would like to open this issue to solicit two things:

What other deficiencies do we currently have with our workflow (i.e., other than focus and prioritization)?
What solutions might exist to help with these deficiencies (e.g., tools or services)?

Getting up to speed with CuPy

Create a twitter list with people of interest to Xdev

From #84 (comment)

One additional possible use of twitter would be to create a publicly-available list of folks whose work is of interest to us -- such as Xdev members themselves (if they have work-centric twitter accounts) and that large fraction of the Pangeo community you mention. I'm imagining something like Katharine Hayhoe's scientists who do climate list but geared towards Xdev's mission.

Assigning @kmpaul since he's the one with the keys to the account, but maybe we could all suggest accounts to add to the list in this ticket?

Write a blog post on time

Part 1 in a series 'Growing Pains of a scientist learning to think like a software engineer: documenting my mistakes so you don't repeat them' working title.

Discussing time and the tendency to rewrite my own functions to deal with date time before understanding the functionality already available, or benefits of storing things in specific data types.
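As a small illustration of the kind of mistake this post could cover (the example is an assumption, not taken from any draft): a hand-written day counter that forgets leap years, next to the standard library's `datetime.date`, which already handles them. The function names are hypothetical.

```python
from datetime import date


def naive_days_between(y1, m1, d1, y2, m2, d2):
    """Hand-rolled day count that assumes every year has 365 days.

    Shown only as an example of what not to do: it ignores leap years.
    """
    month_lengths = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

    def day_number(y, m, d):
        # Days in earlier years + days in earlier months + day of month.
        return y * 365 + sum(month_lengths[: m - 1]) + d

    return day_number(y2, m2, d2) - day_number(y1, m1, d1)


def days_between(start: date, end: date) -> int:
    """Same calculation using date arithmetic from the standard library."""
    return (end - start).days


# 2020 is a leap year, so the two approaches disagree by one day.
naive = naive_days_between(2020, 1, 1, 2020, 12, 31)
correct = days_between(date(2020, 1, 1), date(2020, 12, 31))
```

Storing the values as `date` objects rather than separate integers is what makes the built-in subtraction available in the first place.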

Dask Benchmarks

The goal of this project is to build a suite of benchmarks that stresses the scheduler, run profiling tools on it, and analyze the results. This would certainly be a useful exercise.

Xdev Proposal Plan

We need a plan for how to move forward into the future as a team that assists scientists in developing analysis workflows:

Develop collaborations with "nearby" scientists to develop technology that assists and makes possible curiosity-driven data analysis workflows.
Develop a process for starting and ending short-term prototyping projects with these scientists, perhaps internally for the first quarter of 2020.
Reach out to scientists working on new diagnostic tools for the model.
Need to help scientists take over ownership of our prototypes when the project ends.
Need to invest in Xarray.
Need to watch our level of support vs development activities.
Move forward as originally planned, with these caveats.
We should model good behavior in our tutorial.

Jupyterhub Checkpoint: 2020-01-29

Use this issue to create the agenda for the next Jupyterhub Checkpoint meeting. Add items to discuss in comments below.

Add Z5 support to IOR

For the benchmarking effort, we need to do the following:

Write a C wrapper for the z5 C++ library
Convert wrapper to "abstract IOR interface" (aiori)
Add aiori file to IOR
Submit pull request to IOR

We are working in our own fork now. Haiying is taking the lead on this.

Contact Bill Skamarock for MPAS Diagnostics

Bill Skamarock (https://staff.ucar.edu/users/skamaroc) is the chief scientist for MPAS. I ran into him on the shuttle recently, and he is very interested in moving some of their code to Python.

Someone should reach out to him and let him know that we're trying to build a diagnostics framework that is usable by scientists using different simulation models at NCAR.

Try out alternate discussion package: utterances

Mentioned here:

#42 (comment)

CI Proxy Service

As discussed with Mick Cody and Sidd Ghosh, there is a way forward here that involves creating a proxy service, running on Cheyenne, that can query a publicly running CI service. A general sketch looks like:

CI (e.g., CircleCI) is set up for a repo
A CI job is triggered on the repo, which launches the CI jobs/steps on the CI VM
One of the steps in the CI job is to send CI job information to a "proxy service" (running where? heroku?)
A cron-job, running on Cheyenne, periodically queries the "proxy service"
- If the proxy service has a job waiting for it, then it sends the job back to the cron-job as its response
- If the proxy service has no jobs in its queue, then it sends back an "empty" response
If the cron-job gets a non-empty response, it runs the job.
When the job is complete, the result is sent back to the proxy service
The proxy service sends the result back to the CI job
The CI job completes with the appropriate success/failure based on the Cheyenne test results

Can we mock up something like this fairly quickly? A heroku bot could act as the proxy. Can we do something fairly simple and generic? Perhaps more importantly, has this been done before?
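As a starting point for a mock-up, the message flow sketched in this issue can be written as an in-memory Python class. This is a sketch under stated assumptions, not a design: `ProxyService`, `cron_tick`, and all method names are hypothetical, and a real proxy would sit behind HTTP endpoints with authentication rather than direct method calls.

```python
from collections import deque


class ProxyService:
    """In-memory stand-in for the proxy that sits between CI and Cheyenne."""

    def __init__(self):
        self._queue = deque()   # jobs waiting to be picked up by the cron-job
        self._results = {}      # job_id -> pass/fail reported by the cron-job

    def submit(self, job_id, command):
        """Called from the CI step: enqueue a job."""
        self._queue.append({"id": job_id, "command": command})

    def poll(self):
        """Called by the cron-job: return the next job, or None if empty."""
        return self._queue.popleft() if self._queue else None

    def report(self, job_id, passed):
        """Called by the cron-job when the job has finished."""
        self._results[job_id] = passed

    def result(self, job_id):
        """Called from the CI step: None means 'not finished yet'."""
        return self._results.get(job_id)


def cron_tick(proxy, run_job):
    """One cron invocation: fetch at most one job, run it, report back."""
    job = proxy.poll()
    if job is None:          # the "empty" response
        return False
    proxy.report(job["id"], run_job(job["command"]))
    return True
```

In this sketch the CI step would call `submit`, then check `result` repeatedly until it is no longer `None`, and finish with success or failure accordingly. Note that every connection is opened either by the CI job or by the cron-job on Cheyenne; nothing connects in to Cheyenne.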

Write a series of short blog post(s) on tips for profiling Python code

I would like to write a series of blog posts as answers to NCAR/python-toolbox-faq#5

Xarray Documentation Plan

Is there a template from Xarray functions with good functionality? Find a model so these can be consistent, talk to Deepak.

List of functions used in tutorials (to start with -- please add other functions that are commonly used and check which of these are adequately documented):
xarray.open_dataset
xarray.DataArray.isel
xarray.DataArray.sel
xarray.DataArray.where
xarray.Dataset.groupby
xarray.Dataset.mean
xarray.apply_ufunc

Each XDev member with interest in learning Xarray can pick one of these functions, then we have a 10 minute presentation on it during meetings. (One person a meeting?)

Xarray Learning Lesson Blog Series?

After a little exploration to solve an Xarray dilemma with @jukent, I’m wondering if it would be nice to take these lessons and put each one in a new blog post as a “Learning Lesson”.

What do the rest of you think, @NCAR/xdev?

Sprint Ideas for Pangeo Meeting

CC @jukent @andersy005

Generalize the Bias Correction code
- How to deal with different calendars?
- Different calendar matching options?
Open discussion on Xarray
intake-esm ideas
- Integration with Globus?
- HPSS testing/usage?

CMIP6 Hackathon Debrief

What worked? What didn't? Why?

Test

NSF Branding on blog

References #13.

Need to talk to Eric about what NSF branding we need on the blog site.

Project Automation

Currently, we have the xdev-bot to feed items into the Project Board "Backlog" column. However, I am noticing that issues that are in the xdev repo itself do not show up in the Project Board. So, I'm wondering if we can set up that automation.

GitHub can automate the Project Board for us for issues in the same repo as the Project Board itself. That is the case here. However, I think there are some changes that need to happen to make this possible:

The "Backlog" column needs to be changed to be the "To Do" column...I think...and automation needs to be enabled in the Project Board.
We need to make sure the Bot does not do anything to xdev issues, because I think that will interfere with the builtin GitHub automation. I suspect that this just requires a check in all of the "Move Card" actions to make sure that the card is not "owned" by the Bot.

Blog post on python tutorial

Why Jupyter? Blog Post

I’ve been thinking about writing a blog post to answer the question that we always get from scientists: “But what is Jupyter?”

The idea would be to explain how Jupyter is the modern successor to ssh with X11 tunneling, and to explain in high-level terms how Jupyter does this.

But I only want to write it if people think it would be useful (i.e., someone hasn’t written this up before).

@NCAR/xdev Thoughts?

Add xdevbot-testing repo to the Watch List

Adding xdevbot-testing repo to the watch list:

/add-repo campaign:core repo:NCAR/xdevbot-testing

Deploy to GitHub Pages stage fails but CircleCI reports that everything is okay!

From CircleCI's logs, I am noticing this message:

ERROR: The key you are authenticating with has been marked as read only.
fatal: Could not read from remote repository.

This issue causes the deployment stage to fail. However, CircleCI doesn't halt the build or report that anything is wrong.

https://circleci.com/gh/NCAR/xdev/24
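The root cause is not established in this issue. One common way for a failed step to go unnoticed is a deploy script that ignores the exit status of the commands it runs, so the CI step itself still exits with 0. The sketch below assumes a Python deploy wrapper, which is hypothetical and not taken from the actual CircleCI config; `run_step` and `deploy` are made-up names, and the two commands are stand-ins for the real deploy commands.

```python
import subprocess
import sys


def run_step(cmd):
    """Run one deploy command and return its exit code."""
    completed = subprocess.run(cmd)
    return completed.returncode


def deploy(steps):
    """Run steps in order; stop and return the first non-zero exit code."""
    for cmd in steps:
        code = run_step(cmd)
        if code != 0:
            return code
    return 0


# Stand-ins for real commands: `failing` exits non-zero, like a rejected push.
ok = [sys.executable, "-c", "pass"]
failing = [sys.executable, "-c", "import sys; sys.exit(128)"]
```

Ending the wrapper with `sys.exit(deploy([...]))` hands the failing status to the CI service, which can then mark the build as failed.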

Cc @kmpaul

Tutorial/Hackathon responses

Cc @bonnland

Data TWG plenary plan (30min)

Items to consider:
- Overview
- Recent progress (Pangeo Datastore, AWS LENS upload, ...)
- Sprint Ideas

Improve blog site theme

Need to improve the xdev blog theme. We like the carpet theme, but that only works with Nikola version 7. We need something that we can use with Nikola version 8.

Fix and publish Xdev blog site

Need to clean up the Xdev blog site and publish the first blog post, which is massively overdue.

ESM Diagnostics Challenge

Background:

We are thinking about scoping out what the future of an ESM diagnostics package should look like. In general, I believe that the package itself should be agnostic with respect to the actual model used (e.g., CESM). So, thinking about diagnostics from a general sense, I would say that such a package should have the following features:

provides data pipelines from ESM output to specific diagnostic products (e.g., images), with the option of saving intermediate states (e.g., climatologies)
orchestrate multiple pipelines at once (i.e., workflow orchestration)
scale efficiently with the size of the input data
leverage the same technology that can be used for interactive analysis (i.e., share the computation routines)

The Xdev Team (@NCAR/xdev) should expect this to be a significant effort that we contribute to over the next year or more.

Issues:

What packages already exist to provide the features described above?
What features have I missed from the above list?
Of existing packages that claim to do some of the necessary steps for ESM diagnostics, which ones can best integrate?

intake-esm - Anderson. Slides: https://andersonbanihirwe.dev/talks/intake-esm-pangeo-2019
Bias Correction example - Julia. Slides: https://docs.google.com/presentation/d/1GB57yuV5BM903Ktbh_eDNYNzntVhLIejs7me_BMekC8/edit?usp=sharing

I have one idea:

https://kubernetes.io/docs/tutorials/

Do you have others?

Team Programming Session - March 25

@andersy005 @jukent

Submit ideas for the agenda here
