swcarpentry / hpc-novice

Novice introduction to high performance computing

License: Other

Languages: Makefile 2.85%, HTML 31.51%, CSS 7.50%, JavaScript 0.91%, R 4.18%, Python 52.65%, Shell 0.19%, Ruby 0.21%
Topics: hpc, introduction, cluster, cloud, lesson, on-hold, english

hpc-novice's Introduction

hpc-novice (deprecated)

Development is now happening in the hpc-carpentry GitHub organization. There is an introductory lesson at hpc-intro and other lessons with additional topics.

For announcements and updates about Carpentry-style material for HPC training, please join the mailing list.

hpc-novice's People

Contributors

aaren, abbycabs, abought, amyrhoda, bkatiemills, brandoncurtis, christinalk, ctb, dwinston, evanwill, fmichonneau, ianlee1521, iglpdc, jdblischak, jduckles, jsta, k8hertweck, mawds, maxim-belkin, naught101, naupaka, neon-ninja, pbanaszkiewicz, petebachant, pipitone, rgaiacs, synesthesiam, tbekolay, timtomch, wking


hpc-novice's Issues

Cluster manager specific examples

For the future, but long term, it would be great to see examples for all the common cluster managers/schedulers, e.g. Slurm, HTCondor, Moab, and PBS.
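For illustration, submitting the same four-core, one-hour script might look like this under each scheduler. The flags below are a sketch, not a reference; exact options vary by site and version, so local documentation should be checked:

```
# Slurm
sbatch --ntasks=4 --time=01:00:00 analysis.sh

# PBS/Torque
qsub -l nodes=1:ppn=4,walltime=01:00:00 analysis.sh

# Moab
msub -l nodes=1:ppn=4,walltime=01:00:00 analysis.sh

# HTCondor: resources go in a submit description file instead
condor_submit analysis.sub
```
{: .bash}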

milestone added and how to move forward?

dear all, especially @jduckles and @shwina,

I am opening this issue to collect some feedback. As you might have spotted, I added a milestone, as I've been asked to run a workshop of this kind at the end of March in Germany. Given that

  1. I am trying to be a good community member and hence do refrain from merging my own PRs
  2. people might be less interested in contributing while I am driving the material forward

I was wondering how to move forward. I personally see 3 possible directions (feel free to add more):

a) I develop the material for the above-mentioned milestone in my fork of hpc-novice and, after completion, create one huge PR
b) I continue pushing material into a branch of this repo dedicated to the milestone
c) I get your permission to move ahead, push everything I do to gh-pages, and we review the material later on

I'd appreciate a rather swift response.
Thanks -

target audience description needs a tad more motivation for parallel run(s)

In reading the target audience description, I think it needs a tad more to motivate Alice's need to move her analysis from her laptop to a larger, parallel-capable system. Is her data too big? Will all the runs take too long? Maybe something like...

Alice needs to complete all her runs before her next research group meeting and doing them on her laptop will take days. She needs a more capable compute resource that can finish all her jobs (or maybe her one really big job) in a few hours.

reviving this repo?

I sent a message to swcarpentry Discuss to ask whether the community is still interested in supporting this repo and pushing its contents forward.

Possibly cover qbatch

@pipitone and I have been developing a cluster-type-independent tool for serial farming.

https://github.com/pipitone/qbatch

Plugs right into the ideas of simple pipelining from shell-novice

Supports SGE and PBS so far, LSF in testing.

PRs also welcome of course for other cluster systems (I have plans for XCode, Slurm)
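As a sketch of how this plugs into shell-novice-style pipelining: build a file of commands with a loop, then hand it to qbatch. The dummy file names and `analyze.py` are placeholders, and the `qbatch` call at the end is commented out (see the qbatch README for its actual flags):

```shell
# Create a couple of dummy input files for the sketch.
mkdir -p data && touch data/a.csv data/b.csv

# Build a file with one shell command per line -- plain shell-novice
# pipelining, nothing scheduler-specific yet.
for f in data/*.csv; do
    echo "python analyze.py $f"
done > joblist.txt

cat joblist.txt
# qbatch joblist.txt   # would chunk the list and submit it as cluster jobs
```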

Current draft of HPC novice

hi!
I'm teaching a 1-day workshop on HPC at the end of May, I'd like to use HPC Carpentry material instead of preparing my own and provide some feedback and improvements.
Looking around, I think the closest material to hpc-novice is https://github.com/psteinb/hpc-in-a-day from @psteinb.
Should I start from there, or does anybody have a better suggestion?
Thanks!

who is the maintainer of this repo?

@jduckles @gvwilson who is the maintainer of this repo? There is some constructive activity here at the moment, and I am a bit worried that, so far, no one with contributor permissions has replied to any PR or issue. Can you please help out?

thanks -
P

merging new lesson template in

hi to all, I was wondering if someone could merge the most recent SWC lesson template in, or at least give me a hand in doing so? If people have layout templates at hand that they would like to share, feel free to submit PRs.

sprint to finalize this repository?

Folks,
I'm leading the organization of this conference and as part of it we would like to have some smallish "getting-things-done" 1-day sessions.
Would people be interested in participating (in person) in such a session?
Thanks,
Davide

longer introduction to ssh.

Hi,

I'm going through the lesson material, and the step from 11-hpc-intro.md to 12-cluster.md feels a bit rough.

Go ahead and log in to the cluster.
```
[user@laptop]$ ssh remote
```
{: .bash}


Very often, many users are tempted to think of a high-performance computing installation as one
giant, magical machine. Sometimes, people will assume that the computer they've logged onto is the
entire computing cluster. So what's really happening? What computer have we logged on to? The name
of the current computer we are logged onto can be checked with the `hostname` command. (Clever users
will notice that the current hostname is also part of our prompt!)

```
[remote]$ hostname
```

Now we can assume that users know about the shell and will guess that ssh is a command and remote is an argument. But, from experience, assuming that they understand that the [remote]$ prompt is the prompt of the remote machine is a big leap of faith.

I see a number of steps missing here that, IMHO, need to be explained:

  • ssh allows you to connect to a remote machine, as if you had plugged a screen/keyboard/(mouse?) into a remote computer and opened a terminal.
  • the remote argument is not literally the word remote, but the actual address of the cluster (given to you by your admin). You also (likely) want to prefix it with <username>@.
  • you will (likely) need to type your password, and it won't show up on the screen while you type.
  • if the password is correct, your terminal should show a welcome message from the cluster, and everything you run in this terminal is now executed on the remote machine (details may vary between installations).

I believe that would be the basics of what needs to be covered, but I feel that understanding local machine vs. remote machine, and knowing which one you are on at any moment, is critical. Alternatively, this could be a separate lesson, but I don't see it in shell-novice.
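The missing steps above could be sketched as an annotated first-login transcript. The hostname `cluster.example.org`, the username, and the welcome banner are placeholders, not real output:

```
[user@laptop]$ ssh alice@cluster.example.org   # address supplied by your admin
alice@cluster.example.org's password:          # typed characters are not echoed
Welcome to Example Cluster!
[remote]$ hostname                             # we are now on the remote machine
remote.example.org
[remote]$ logout                               # back on the laptop
[user@laptop]$
```
{: .bash}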

SC16 Workshop Call

Moving Discussion from this [Discuss] thread to GitHub

Here is the call from Paul Wilson:
sc16-tutorials-call.pdf

It seems there is some momentum to put a tutorial together and do some lesson-development sprinting over the summer. From the [Discuss] thread, it looks like Ashwin Trikuta, Dana Brunson, and Kate Hertweck have thought about HPC and done some work in the context of SWC pedagogy and workshop methods. There are lots of others with material for particular systems, so the first thing to decide is probably what should be included and what should be left out within the length of time allotted for SC tutorials.

If we're going to pull off an hpc-carpentry workshop for SC16, I suggest we use this thread to form the team, then start opening issues in this repository and get hacking.

suggestions for 00-intro

Suggestions of key points to cover:

  • clusters aren't just "super fast" computers
  • importance is parallelization -- running things simultaneously (somehow)
  • talk about sample problems that parallelize well (and touch on the fact that not all problems will parallelize well)
  • requirements to use distributed system: parallel code OR capability to submit parallel jobs OR just one big problem you want off of your computer

Conclusion: this workshop will show you not only how to use a cluster/distributed system but when/why you would want to use one and how to use it WELL.
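The "running things simultaneously (somehow)" point can be demonstrated with nothing but the shell itself, before any cluster is involved. This is a toy sketch, not cluster code:

```shell
# Launch three independent tasks in the background; '&' returns
# immediately, so all three run at the same time.
rm -f results.txt
for i in 1 2 3; do
    ( sleep 0.1; echo "task $i done" >> results.txt ) &
done
wait            # block until every background task has finished
cat results.txt
```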

Explain resources

Explain the resources before "30-batch-system.md".

  1. What is a node?
  2. What is a core, and what do you get when you ask for "ntasks" or "cpu"?
  3. What happens if the number of nodes is not specified (i.e. across how many nodes will your job be distributed)?
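To make these requests concrete, a minimal Slurm batch script could accompany the explanation. All values here are arbitrary examples, not recommendations:

```
#!/bin/bash
#SBATCH --nodes=1           # place all tasks on a single node
#SBATCH --ntasks=4          # 4 tasks; the scheduler reserves one core each
#SBATCH --cpus-per-task=1   # cores available to each task
#SBATCH --time=00:10:00

# If --nodes is omitted, the scheduler is free to spread the 4 tasks
# across several nodes, wherever they fit first.
srun hostname               # runs once per task, printing the node it landed on
```
{: .bash}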

feedback needed on explaining Amdahl's law

I am closing in on reaching version 1.0 for hpc-in-a-day. After this is done, I'll start sending PRs to this repo.

After some debate, I decided to include Amdahl's law in my material. I know that hpc-novice is likely not to adapt this part of my course about parallelisation. But in any case, I'd love to get some feedback on whether I got the level right:
https://psteinb.github.io/hpc-in-a-day/03-02-parallel-estimate/

I'll close this issue once the first comments come in on hpc-in-a-day or when my next workshop is over on Jan 24th.
Thanks in advance - P
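For readers skimming this issue, Amdahl's law itself fits in one line: with a fraction p of the runtime parallelisable over n cores, speedup = 1 / ((1 - p) + p/n). A quick sanity check with awk (the numbers are arbitrary examples):

```shell
# Amdahl's law: even with 90% of the work parallelised,
# 16 cores give far less than a 16x speedup.
awk 'BEGIN { p = 0.9; n = 16; printf "%.2f\n", 1 / ((1 - p) + p / n) }'
# prints 6.40
```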

Ideas for the Introduction lesson

Couple comments on the "00-why-hpc" introduction lesson.

  • <trivial, but> We are very used to calling our machines "clusters", but this jargon term needs to be introduced before first use ("An HPC system, often informally called a "cluster" because it is made of many individual computers", etc.).
  • Both Key Points of the lesson concentrate very much on the distributed/parallel character of clusters ("...your computations require more than one computer" and "Because a cluster is distributed, it is only useful for certain types of computational problems"). This feels like too general a statement, and it effectively contradicts one of the declared target audience groups (those who "already write and run domain-specific software on 'smaller' computers, and now need to scale up/out"). I would argue that even the most serial code can benefit from being run on a cluster node vs. Alice's laptop, simply because it would not swap as much, thanks to the sheer amount of the node's RAM :)

Additionally, I think that there needs to be a "01-what-is-a-cluster" (a.k.a "General anatomy of an HPC system") section as part of the introduction lesson. Nothing major, but a visual diagram and a narrative along the lines of "The cluster is made of login nodes, compute nodes, fast interconnect fabric, scheduling system and attached storage. We will cover each of them in details later, but here's an overview how it all plays together".

Having this general overview before delving into specifics of each individual subsystem would create a good reference point, especially for people with less technical backgrounds.

choice of scheduler

@zonca wondered which scheduler to aim for. I'd vote for starting with Slurm, as the majority of systems I am able to use run it. I personally would volunteer to convert the Slurm commands to other batch systems; I have access to a PBS and an LSF system.

Technically, it'd be nice to have some (JavaScript?) magic available, as the Apache Spark docs have (check out how one can switch between the Scala and Python versions of the example code).

target audience?

what is the target audience for this lesson?

for me it's the same target audience as any other swcarpentry-novice courses, namely those learners that need the content. I'd assume nothing about them. prior knowledge about using the terminal is beneficial but not required.

of course this question is tightly bound to what this course should teach.
