xdev's People

Contributors

andersy005, hackmd-deploy, jukent, maboualidev, matt-long, xdev-bot

xdev's Issues

Xdev Team Process Document

We need a document that describes our process.

I recommend a markdown document (e.g., process.md) in the Xdev repo, or a wiki page. We need to describe how we work, how we use GitHub (e.g., forks vs branches), and other things.

Zulip Server Running on vRealize?

Using vRealize VMs made available by the EITO group in CISL, we can get a VM running CentOS 7 and install the Zulip server.

Open questions that need to be addressed are:

  • Can we get the Zulip production server running on a VM?
  • Can we run the Zulip development server on the same VM?

There are also issues related to these. Answers to follow.

JupyterHub Planning

Let's list the things we need to discuss in the meeting tomorrow with Jared et al.

@NCAR/xdev

GitHub App for Project Management

I think we need a GitHub App to help us track and manage our work.

I think we need to put together a list of requirements for what this looks like.

For example, do we use Webhooks or poll the GitHub API or both (e.g., webhooks in repos we own, poll repos we do not own)?

What events do we need to look for? What do we need to poll? How do we want the GitHub Project Board(s) to look/work?
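As a starting point for the requirements list, the "webhooks in repos we own, poll repos we do not own" idea could be sketched roughly as below. This is only an illustration: `OWNED_ORGS`, `TrackingPlan`, and `plan_tracking` are hypothetical names, and the repo list is made up for the example.

```python
from dataclasses import dataclass

# Hypothetical set of GitHub owners whose repos we administer (an assumption
# for this sketch; adjust to whatever orgs the team actually controls).
OWNED_ORGS = {"NCAR"}


@dataclass(frozen=True)
class TrackingPlan:
    webhook_repos: list
    polled_repos: list


def plan_tracking(repos, owned_orgs=OWNED_ORGS):
    """Split 'owner/name' repo strings into webhook and polling groups."""
    webhook, polled = [], []
    for full_name in repos:
        owner, _, name = full_name.partition("/")
        if not owner or not name:
            raise ValueError(f"expected 'owner/name', got {full_name!r}")
        # Webhooks need admin rights on the repo; everything else is polled.
        (webhook if owner in owned_orgs else polled).append(full_name)
    return TrackingPlan(webhook, polled)


plan = plan_tracking(["NCAR/xdev", "pydata/xarray", "dask/dask"])
print(plan.webhook_repos)
print(plan.polled_repos)
```

Keeping the split in one place would make it easy to revisit once we know which events we actually need from each repo.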

Overcoming deficiencies in our current workflow

I'd like to discuss how better to organize ourselves and deal with deficiencies in our workflow. To begin, I would like to explicitly declare what deficiencies we currently have in our workflow.

Personally, I believe that GitHub should be our primary organizational utility, with GitHub notifications being our primary means of communicating with each other and other collaborators. However, GitHub notifications are good for some things, and bad for others.

GitHub notifications are good for general developer communication, namely:

  • informing you when someone else has an issue,
  • informing you when someone else needs your input (@-mentions), and
  • informing you when changes are made to software you are working on.

GitHub notifications are bad for:

  • letting you know what the rest of the team is working on (i.e., transparency),
  • letting you know what you should work on next (i.e., prioritization), and
  • keeping you focused on the task currently at hand (i.e., focus).

I believe that the Xdev Project Board is very good for making our work transparent, but I still believe that we have a problem with prioritization and focus.

I would like to open this issue to solicit two things:

  1. What other deficiencies do we currently have with our workflow (i.e., other than focus and prioritization)?
  2. What solutions might exist to help with these deficiencies (e.g., tools or services)?

Create a twitter list with people of interest to Xdev

From #84 (comment)

One additional possible use of Twitter would be to create a publicly available list of folks whose work is of interest to us, such as Xdev members themselves (if they have work-centric Twitter accounts) and that large fraction of the Pangeo community you mention. I'm imagining something like Katharine Hayhoe's "scientists who do climate" list, but geared towards Xdev's mission.

Assigning @kmpaul since he's the one with the keys to the account, but maybe we could all suggest accounts to add to the list in this ticket?

Write a blog post on time

Part 1 in a series with the working title 'Growing Pains of a scientist learning to think like a software engineer: documenting my mistakes so you don't repeat them'.

The post would discuss time, and my tendency to write my own functions for handling dates and times before understanding the functionality already available or the benefits of storing things in specific data types.
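A small example of the kind of thing the post could show, using only Python's standard `datetime` module (the dates are made up for illustration):

```python
from datetime import date, datetime, timedelta

# Parse a timestamp string instead of slicing it apart by hand.
stamp = datetime.strptime("2020-02-28 18:30", "%Y-%m-%d %H:%M")

# Adding a timedelta handles month ends and leap years for us.
two_days_later = stamp + timedelta(days=2)
print(two_days_later.isoformat())

# Subtracting two dates gives a timedelta, so there is no manual day counting.
elapsed = date(2020, 3, 1) - date(2020, 2, 1)
print(elapsed.days)
```

Because 2020 is a leap year, the hand-rolled version of either calculation is exactly where an off-by-one-day bug tends to hide.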

Dask Benchmarks

The goal of this project is to build a suite of benchmarks that stresses the scheduler, then to run profiling tools on it and analyze the results. This would certainly be a useful exercise.

Xdev Proposal Plan

We need a plan for how to move forward into the future as a team that assists scientists in developing analysis workflows:

  • Develop collaborations with "nearby" scientists to develop technology that assists and makes possible curiosity-driven data analysis workflows.
  • Develop a process for starting and ending short-term prototyping projects with these scientists, perhaps internally for the first quarter of 2020.
  • Reach out to scientists working on new diagnostic tools for the model.
  • Need to help scientists take over ownership of our prototypes when the project ends.
  • Need to invest in Xarray.
  • Need to watch our level of support vs development activities.
  • Move forward as originally planned, with these caveats.
  • We should model good behavior in our tutorial.

Add Z5 support to IOR

For the benchmarking effort, we need to do the following:

  • Write a C wrapper for the z5 C++ library
  • Convert wrapper to "abstract IOR interface" (aiori)
  • Add aiori file to IOR
  • Submit pull request to IOR

We are working in our own fork now. Haiying is taking the lead on this.

Contact Bill Skamarock for MPAS Diagnostics

Bill Skamarock (https://staff.ucar.edu/users/skamaroc) is the chief scientist for MPAS. I ran into him on the shuttle recently, and he is very interested in moving some of their code to Python.

Someone should reach out to him and let him know that we're trying to build a diagnostics framework that is usable by scientists using different simulation models at NCAR.

CI Proxy Service

As discussed with Mick Cody and Sidd Ghosh, there is a way forward here that involves creating a proxy service, running on Cheyenne, that can query a publicly running CI service. A general sketch looks like:

  1. CI is set up for a repo (e.g., with CircleCI).
  2. A CI job is triggered on the repo, which launches the CI jobs/steps on the CI VM.
  3. One of the steps in the CI job sends the CI job information to a "proxy service" (running where? Heroku?).
  4. A cron job, running on Cheyenne, periodically queries the "proxy service":
    • If the proxy service has a job waiting, it sends the job back to the cron job as its response.
    • If the proxy service has no jobs in its queue, it sends back an "empty" response.
  5. If the cron job gets a non-empty response, it runs the job.
  6. When the job is complete, the result is sent back to the proxy service.
  7. The proxy service sends the result back to the CI job.
  8. The CI job completes with the appropriate success/failure based on the Cheyenne test results.

Can we mock up something like this fairly quickly? A Heroku bot could act as the proxy. Can we do something fairly simple and generic? Perhaps more importantly, has this been done before?
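To get a feel for how small a mock-up could be, here is a rough in-memory sketch of the proxy's queue logic and one cron invocation. It is not a design: `ProxyService`, `cron_tick`, and the job fields are hypothetical names, there is no HTTP layer or authentication, and the "runner" is a fake function standing in for whatever would actually run on Cheyenne.

```python
from collections import deque


class ProxyService:
    """In-memory stand-in for the proxy; a real one would sit behind HTTP."""

    def __init__(self):
        self._queue = deque()  # jobs waiting to be picked up by the cron job
        self._results = {}     # job_id -> pass/fail reported back

    def submit(self, job_id, command):
        """Called by the CI job to register work for the HPC side."""
        self._queue.append({"id": job_id, "command": command})

    def poll(self):
        """Called by the cron job; None plays the role of the 'empty' response."""
        return self._queue.popleft() if self._queue else None

    def report(self, job_id, passed):
        """Called by the cron job when the work has finished."""
        self._results[job_id] = passed

    def result(self, job_id):
        """Called by the CI job; None means the result is not in yet."""
        return self._results.get(job_id)


def cron_tick(proxy, run):
    """One cron invocation: poll, run the job if there is one, report back."""
    job = proxy.poll()
    if job is None:
        return False
    proxy.report(job["id"], run(job["command"]))
    return True


proxy = ProxyService()
proxy.submit("build-1", "pytest")
ran = cron_tick(proxy, run=lambda command: command == "pytest")  # fake runner
print(ran, proxy.result("build-1"))
print(cron_tick(proxy, run=lambda command: True))  # queue is empty now
```

The CI job would then wait on `result(...)` until it is no longer `None` and exit with the matching status.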

Xarray Documentation Plan

Is there a template among the Xarray functions that are already well documented? Find a model so these can be consistent; talk to Deepak.

List of functions used in tutorials (to start with -- please add other functions that are commonly used and check which of these are adequately documented):
  • xarray.open_dataset
  • xarray.DataArray.isel
  • xarray.DataArray.sel
  • xarray.DataArray.where
  • xarray.Dataset.groupby
  • xarray.Dataset.mean
  • xarray.apply_ufunc

Each Xdev member interested in learning Xarray can pick one of these functions and give a 10-minute presentation on it during a meeting. (One person per meeting?)

Xarray Learning Lesson Blog Series?

After a little exploration to solve an Xarray dilemma with @jukent, I’m wondering if it would be nice to take these lessons and put each one in a new blog post as a “Learning Lesson”.

What do the rest of you think, @NCAR/xdev?

Sprint Ideas for Pangeo Meeting

CC @jukent @andersy005

  • Generalize the Bias Correction code
    • How to deal with different calendars?
    • Different calendar matching options?
  • Open discussion on Xarray
  • intake-esm ideas
    • Integration with Globus?
    • HPSS testing/usage?

Project Automation

Currently, we have the xdev-bot to feed items into the Project Board "Backlog" column. However, I am noticing that issues in the xdev repo itself do not show up in the Project Board, so I'm wondering if we can set up that automation.

GitHub can automate the Project Board for us for issues in the same repo as the Project Board itself. That is the case here. However, I think there are some changes that need to happen to make this possible:

  1. The "Backlog" column needs to be changed to be the "To Do" column...I think...and automation needs to be enabled in the Project Board.

  2. We need to make sure the Bot does not do anything to xdev issues, because I think that will interfere with the built-in GitHub automation. I suspect that this just requires a check in all of the "Move Card" actions to make sure that the card is not "owned" by the Bot.

Why Jupyter? Blog Post

I’ve been thinking about writing a blog post to answer the question that we always get from scientists: “But what is Jupyter?”

The idea would be to explain how Jupyter is the modern successor to ssh with X11 tunneling, and to explain in high-level terms how Jupyter does this.

But I only want to write it if people think it would be useful (i.e., someone hasn’t written this up before).

@NCAR/xdev Thoughts?

Improve blog site theme

Need to improve the xdev blog theme. We like the carpet theme, but that only works with Nikola version 7. We need something that we can use with Nikola version 8.

ESM Diagnostics Challenge

Background:

We are thinking about scoping out what the future of an ESM diagnostics package should look like. In general, I believe that the package itself should be agnostic with respect to the actual model used (e.g., CESM). So, thinking about diagnostics in a general sense, I would say that such a package should be able to:

  • provide data pipelines from ESM output to specific diagnostic products (e.g., images), with the option of saving intermediate states (e.g., climatologies)
  • orchestrate multiple pipelines at once (i.e., workflow orchestration)
  • scale efficiently with the size of the input data
  • leverage the same technology that can be used for interactive analysis (i.e., share the computation routines)
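
To make the first item in the list above a bit more concrete, here is a toy sketch of a pipeline that can optionally save its intermediate states. Everything in it is an assumption made for illustration: `run_pipeline`, `monthly_climatology`, and `annual_range` are hypothetical names, the data is made up, and JSON files stand in for whatever format a real package would use.

```python
import json
import tempfile
from pathlib import Path


def run_pipeline(data, stages, cache_dir=None):
    """Apply (name, function) stages in order, optionally saving each result."""
    for name, func in stages:
        data = func(data)
        if cache_dir is not None:
            # Intermediate state is written as JSON purely for illustration.
            Path(cache_dir, f"{name}.json").write_text(json.dumps(data))
    return data


def monthly_climatology(series):
    """Average the values that share a month index (0-11) across years."""
    return [sum(series[m::12]) / len(series[m::12]) for m in range(12)]


def annual_range(climatology):
    """A stand-in 'diagnostic product': max minus min of the climatology."""
    return max(climatology) - min(climatology)


# Two years of made-up monthly values.
toy_output = [float(m) for m in range(12)] + [float(m + 2) for m in range(12)]
stages = [("climatology", monthly_climatology), ("annual_range", annual_range)]

with tempfile.TemporaryDirectory() as tmp:
    product = run_pipeline(toy_output, stages, cache_dir=tmp)
    saved = json.loads(Path(tmp, "climatology.json").read_text())

print(product)
print(saved[:3])
```

The same stage functions could be called directly in a notebook, which is the "share the computation routines" point from the last item.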

The Xdev Team (@NCAR/xdev) should expect this to be a significant effort that we contribute to over the next year or more.

Issues:

  • What packages already exist to provide the features described above?
  • What features have I missed from the above list?
  • Of existing packages that claim to do some of the necessary steps for ESM diagnostics, which ones can best integrate?
