earthlab / abc-classroom
Tools to automate GitHub Classroom and autograding workflows
Home Page: https://abc-classroom.readthedocs.io/
License: BSD 3-Clause "New" or "Revised" License
Please see this repo for my submitted assignment!
https://github.com/earth-analytics-edu/ea-spring-2019-lwasser/commits/master
This appears to be a kernel error. See the CircleCI feedback here:
jupyter_client.kernelspec.NoSuchKernel: No such kernel named conda-env-earth-analytics-python-py
Exited with code 1
But I also wonder why check() isn't working locally, and whether this is all related.
Hey @betatim, I submitted a partially complete test assignment over the weekend here. See the last commit: 3de3e428081703ba5e1b546191c8c21f0bc4c2ce
CircleCI failed to run it. It LOOKS like it was hung up in the ok module.
Here is my assignment
https://github.com/lwasser/autograded-course-starter/blob/master/master/week-05/time-series-homework-wk-5.ipynb
I think i may need to clean it up more but am not sure where we landed with the points and grading. Can you kindly have a look so i better understand what's going on with circle ci?
The create template repo should do the following
This is a clever feature. I wonder if we want to stop a user from doing this by mistake. Maybe if a user runs nbauthor --date with a date that is before now, we can ask them if that is what they intended?
Or better yet, what if it never removes lessons, but we add a separate remove-assignment option if we want one? That way removing things is more purposeful. I will look more into this tomorrow, but I was surprised by all of my student/* dirs disappearing when I chose 2018-01-01 as a date by mistake.
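The "ask before acting on a past date" idea above could be sketched roughly like this. The function name and the injected `today` argument are hypothetical and only there for illustration; nothing here is taken from the nbauthor code.

```python
from datetime import date


def needs_confirmation(selected, today=None):
    """Return True when the selected date lies before today.

    `today` can be passed in explicitly so the check is easy to test;
    it defaults to the current date.
    """
    today = today or date.today()
    return selected < today


# Hypothetical use inside a command-line tool:
# if needs_confirmation(chosen_date):
#     answer = input("That date is in the past. Continue? [y/N] ")
```

A tool using this would only prompt when the check returns True, so the normal case stays non-interactive.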
When asking the user for a commit message we check the EDITOR and VISUAL environment variables to try and figure out what (command-line?) editor the user prefers. If we can't find anything, vi is used. There is no particularly good reason to use vi as fallback.
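A minimal sketch of the lookup described above, with the fallback made configurable. The function name, the order (VISUAL before EDITOR) and the "nano" default are assumptions for illustration, not what the package currently does.

```python
import os


def pick_editor(environ=None, fallback="nano"):
    """Return the user's preferred editor command.

    Checks VISUAL first and then EDITOR (a common convention, assumed
    here), and uses `fallback` when neither is set to a non-empty value.
    """
    environ = os.environ if environ is None else environ
    for name in ("VISUAL", "EDITOR"):
        value = environ.get(name, "").strip()
        if value:
            return value
    return fallback
```

Passing the environment in as a plain dict keeps the function easy to test without touching the real environment.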
There are two things to do:
- Have nbdistribute --template fail gracefully when the repo already exists.
- When running nbinit, check localconfig.yaml for an existing token and reuse it. If there is no token, use something unique in the note when creating the token, as we can't create two tokens with the same name.
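The token handling described above might look something like the sketch below. It assumes the config has already been loaded into a dict with the token under config['github']['token'] (the key used in the tracebacks on this page); both helper names and the note prefix are hypothetical.

```python
import uuid


def existing_token(config):
    """Return the stored token, or None when the config has none."""
    return (config.get("github") or {}).get("token") or None


def unique_note(prefix="grading workflow helper"):
    """Build a token note that will not collide with an earlier one."""
    # A short random suffix is enough to make the note unique.
    return f"{prefix} {uuid.uuid4().hex[:8]}"
```

The init command would call existing_token first and only request a new token, using unique_note for the note, when that returns None.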
In the hackmd document, @lwasser notes that we could write a little bash script to install all of the nb extensions. This file is easy to make and needs to be added to the workflow.
Previously there was an issue with nbdistribute that I thought I'd fixed with a PR, but now I'm getting another error: it returns the appropriate message but still tries to clone their repo.
Fetching work for jlpalomino...
Student jlpalomino does not have a repository for this course, maybe they have not accepted the invitation yet? Skipping them for now.
Traceback (most recent call last):
File "/Users/leah-su/anaconda3/envs/earth-analytics-python/bin/nbdistribute", line 11, in <module>
load_entry_point('grading', 'console_scripts', 'nbdistribute')()
File "/Users/leah-su/Documents/github/2-autograding/grading-workflow-experiments/grading/__main__.py", line 253, in distribute
token=config['github']['token'])
File "/Users/leah-su/Documents/github/2-autograding/grading-workflow-experiments/grading/github.py", line 68, in fetch_student
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
File "/Users/leah-su/anaconda3/envs/earth-analytics-python/lib/python3.6/subprocess.py", line 418, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['git', 'clone', 'https://[email protected]/earth-analytics-edu/ea-spring-2019-jlpalomino.git']' returned non-zero exit status 128.
We will create a clean, new, simple example repository (the course-starter repo) that people can use to start a new course. It will contain:
- a config.yml that will be filled out by readers when they follow the quick start
Create a way to have a "dummy-student" directory locally that you can "distribute" to and then "grade". It would allow the instructor to pretend to be a student, attempt the assignment, and see what the assigned grade is.
There will be a quick start guide (hackmd to get going) to walk people through the process of creating a new class based on the quick start repository.
course-starter repo and end with them having a fully working course
This quick start guide will be part of the documentation (RTD) that is in this repository.
There will be additional documentation about the components in this repository as well.
Need to modify the repo so that autodoc works. The goal is to be able to run `make -B docs` on the repo to build all the docs.
This issue actually addresses the larger workflow:
#58
Note that some of the functions we may not use for now:
https://github.com/earthlab/abc-classroom/blob/master/abcclassroom/github.py#L76
Currently each time I merge a PR, the branches in that repo build up (please see image below). I think we need to clean up the branch created each time by the PR, otherwise it could get difficult to manage. I am not sure what the best approach would be. One option could be to clean up the previous branch each time a new PR is created, so that there is always just one branch unless the student deletes it when they merge the PR. Or could the PR come from the template repo rather than from a branch within the student's repo? That would be the cleanest option. I'm just unclear as to what is happening with this PR process.
pygithub is more active and has many more users. github3 hasn't been updated in over a year.
Let's consider moving to pygithub
Below is some quick code that creates a repo in the earth analytics edu org!
The one thing that I couldn't find examples for was authenticating and creating a token, which abc-init already does. It seems like this should be doable via subprocess if need be. I did see an authenticator in pygithub, just no good examples of calling it.
from github import Github

# Option 1: authenticate with a username and password
# (gh_username and password must be defined beforehand)
g = Github(gh_username, password)
# Option 2: authenticate with a personal access token instead
g = Github("token-string-here")

organization = g.get_organization("earth-analytics-edu")

# Get all repos and print them out for the earth analytics edu org
for repo in organization.get_repos():
    print(repo.name)

# Create a new private repository in the organization
organization.create_repo(
    name="test-template",
    allow_rebase_merge=True,
    auto_init=False,
    description="yaas done from the API",
    has_issues=True,
    has_projects=False,
    has_wiki=False,
    private=True,
)
This can then be used to
We can potentially use other parts of jed's scripts OR abc-classroom to create and initialize the local repo. there is an init function here in abc-classroom.
Create an rst file that overviews:
What does this command line call do?
what does it require
-- git needs to be setup locally
-- will ask for username and password
Output:
You can use the github.rst file to flesh out these docs!
again this should be straight forward! we want to follow the exact structure of matplotcheck and earthpy in this setup.
The code below provides a solution for a student plot. At the bottom it has this
### DO NOT REMOVE LINE BELOW ###
ax1 = nb.convert_axes(plt)
I believe that we need to grab the axis object and not call plt.show() but i'm not 100% sure. I am curious why that code disappeared for the first PLOT assignment cell of homework 5 but not for the second.
# For PLOT 1&2: use the data/colorado-flood/precipitation/805325-precip-dailysum-2003-2013.csv file
# PLOT 1: a plot of precipitation from 2003 to 2013
# DO NOT USE plt.show() anywhere in this cell
# BEGIN SOLUTION
f = "data/colorado-flood/discharge/06730200-discharge-daily-1986-2013.txt"
discharge = pd.read_csv(f,
                        skiprows=23,
                        header=[1, 2],
                        sep='\t',
                        parse_dates=[2])
discharge.columns = discharge.columns.droplevel(1)
discharge = discharge.set_index(["datetime"])
monthly_max_all = discharge.resample("M").max()
monthly_max = monthly_max_all['1990':'2014']
# Create scatter plot
fig, ax = plt.subplots(figsize=(10, 8))
ax.scatter(x=monthly_max.index,
           y=monthly_max["17663_00060_00003"],
           color="purple")
ax.set(xlabel='Date', ylabel='Daily Precipitation (inches)',
       title='Homework plot 1: Precipitation-Boulder, CO\n 2013-2014')
# END SOLUTION
### DO NOT REMOVE LINE BELOW ###
ax1 = nb.convert_axes(plt)
When I run the cell with check("time-series-homework-wk-5/q-a34ccf2.py") it doesn't know how to find that function. Where is that function, and are there standard imports that belong at the top of all notebooks?
here is the cell for plot 2. it seems to retain that call to grab the plot.
I found the code in nbclean but am unsure if there is the ability to parse each cell. The alternative would be to allow for one comment to persist at the top of the cell.
Below is an issue I might file with github3.py if I can't work out why this is happening. Tim is seeing a whole bunch of info level log messages from github3.py which shouldn't appear.
There are two places (here and here) in the library where log messages are emitted that come "out of nowhere" for a user. When using github3 as a library to automate some tasks on GitHub I was surprised to see the messages. What I don't understand is why I see these.
My library (from which I call github3.py) doesn't import logging anywhere, the code in github3 uses a package specific logger, and the default log level is warning, so the info messages should not appear.
I searched for logging, logger and log to see if I could find relevant discussion but couldn't.
Python 3.6.7 |Anaconda, Inc.| (default, Oct 23 2018, 14:01:38)
I can't reproduce it with a minimal example. This is part of the mystery.
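One way to narrow down where unexpected log messages come from is to print how a logger is currently configured. This is only a debugging sketch using the standard logging module; the helper name is made up, and the assumption that the library's logger is called "github3" would need checking.

```python
import logging


def describe_logger(name):
    """Summarise how a logger is configured right now."""
    logger = logging.getLogger(name)
    return {
        # Level set directly on this logger (NOTSET means "inherit").
        "level": logging.getLevelName(logger.level),
        # Level that actually applies after walking up to the parents.
        "effective_level": logging.getLevelName(logger.getEffectiveLevel()),
        "handlers": [type(handler).__name__ for handler in logger.handlers],
        "propagate": logger.propagate,
    }


# Hypothetical use, assuming the package logger is named "github3":
# print(describe_logger("github3"))
# print(describe_logger(""))  # an empty name returns the root logger
```

If the effective level turns out to be INFO or lower, something (possibly another imported package) has changed the root logger or attached a handler, which would explain messages that the defaults should hide.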
This package needs a better name, especially as it grows up and starts being useful instead of a collection of experiments.
My current proposal: "teaching and learning with notebooks" shortened to tln (or tlnb). Commands would become tln-distribute and tln-author instead of nbdistribute and nbauthor.
It is descriptive, can be shortened to a TLA (three letter acronym) and for vanity contains the initials of Tim and Leah.
Scrub all print outs/tracebacks so that they do not contain access tokens.
This probably means we need to wrap all calls in try: except: blocks where we customise the printouts. Currently some of the git clone calls will reveal the user's personal access token in the log. They might then copy&paste that to a help forum or similar.
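As a rough sketch of the scrubbing step, the helper below masks anything that looks like credentials embedded in an http(s) URL before a message is printed. The function name and the "***" placeholder are hypothetical; this is one possible approach, not the package's implementation.

```python
import re

# Matches the "user-or-token@" part that follows the URL scheme.
_CREDENTIALS = re.compile(r"(https?://)[^/@\s]+@")


def scrub(text):
    """Replace credentials embedded in URLs with a placeholder."""
    return _CREDENTIALS.sub(r"\1***@", text)


# Hypothetical use around a failing git call:
# try:
#     subprocess.run(command, check=True, stdout=subprocess.PIPE,
#                    stderr=subprocess.PIPE)
# except subprocess.CalledProcessError as error:
#     print(scrub(str(error)))
```

URLs without embedded credentials pass through unchanged, so the helper can be applied to every message without special cases.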
Update the readme and the contributing file with installation instructions:
https://hackmd.io/_4GDiG9wSwe6lBq18bDM7w#Step-2-Setup-Your-Local-Python-Environment
@betatim if it does, i can go ahead and add this today as well.
I pretended that i was a student. i downloaded week 5. but i don't think i made any purposeful changes. Regardless, when i just pushed week 6, i ended up with a potential conflict where git asked me to change my edits to week 5.
Because this is inherent to GitHub, I think we should manage nbauthor and nbdistribute so that you can push a single assignment at a time. And once that is pushed, you can't update it unless there is some clever way to update something a student is already working on without a conflict. I am currently not sure how to accept the new assignment without stashing, deleting, or checking out my work on week 5.
Right now abc-author adds all assignments to the student directory. Similarly abc-distribute will generate a PR that modifies everything.
Instead we want to have abc-author assignment1, which only adds assignment1 related files to the already existing student/ directory. abc-author -d assignment1 would remove the assignment from the student/ directory again.
abc-distribute (w/o arguments) would generate a PR based on everything in student/ (copy over the whole student/ directory). abc-distribute assignment1 would only create a PR with changes for that assignment (copy over only the assignment specific sub-directory from student/).
One thing that will be tricky to figure out is the .circleci/config.yaml, which we will need to edit as we do this.
Below are items that we can work on now associated with populating the template repo. I'd like abc-create-template-repo (not a good function call name, but just a start) to automatically create the readme and .gitignore like Jed does. Let's add these as helpers in a create-repo or a git module? I like Jed's approach but I'd like to customize it to use the config file.
Jed's approach: Jed's approach is interesting because
we could populate that with
so it would function like jeds. but it could also be added to if need be.
I think we could similarly use jed's add_readme function but again make it a bit more flexible to grab information from the config file based upon the assignment name.
Similar to the .gitignore above, we'd want it to create the readme but provide optional text that a user could supply via a yaml element associated with the assignment. If they don't supply text, something default would be populated, like what Jed has.
just in case, i think we should have the option of turning "off" the auto add readme file with a yaml element that is boolean (optional and defaults to TRUE).
The big difference with this approach vs abc classroom is abc classroom has a list of files and it looks for those files. this approach creates the files for the user and allows them to customize them (or not). This seems like a nice middle ground.
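To make the README options above concrete, here is a small sketch. The config layout (an "assignments" mapping with per-assignment "add_readme" and "readme_text" keys) and the default wording are hypothetical and only illustrate the "optional text, on by default, can be switched off" behaviour.

```python
def readme_text(assignment, config):
    """Return the README contents for an assignment, or None if disabled."""
    settings = config.get("assignments", {}).get(assignment, {})
    # The README is created unless the (hypothetical) flag is set to False.
    if not settings.get("add_readme", True):
        return None
    default = (
        f"# {assignment}\n\n"
        f"Starter repository for the {assignment} assignment.\n"
    )
    return settings.get("readme_text", default)
```

With an empty config this returns the default text, so instructors only need to add yaml entries when they want something different.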
Ideas for grading in containers https://github.com/data-8/materials-x18/tree/master/grading
Create two steps per notebook in the CircleCI config, one for papermill and one for nbconvert. Otherwise we won't get an artifact for failed notebooks, as Circle stops executing a step once a command fails.
There are a few ways to handle rosters in our current grading workflow. Here are some of the issues
@jlpalomino can you kindly add some clarification around the pieces needed in a roster? i think we just need to streamline our scripts to ensure all of the pieces are there. Will a first and last name ever be available in the github classroom roster? ie if you add me as a student will my first name and last name be there?
When running nbdistribute, check there are no open PRs from the instructor in a student's repository.
If there is an open one, close it before creating a new one.
@jlpalomino and I spent some time working through the workflow for github classroom.
Essentially, we'd like to take the scripts here and turn them into components for abc classroom that can be run at the CLI.
The high level workflow is outlined below. From this we will have several issues and questions to address before we can do further work! The one SIGNIFICANT change from those scripts will be using a config.yml file like we use here in abc-classroom. More on that will be specified below.
Below we begin to propose the structure of a config file. This config file would map closely to abc-classroom's existing functionality; however, it would have some additional components as well.
config.yml assumptions:
- org_name:
- course_name:
- assignments:
# List all assignments as they are created here.
- assignment-name-one
- deadline:
- ???
# Additional files that will always be added to the student assignment repository
# In this example the file called `student_README.md` will be copied to the
# student's repository as `README.md`
- assignment_repo_files:
- README.md: student_README.md
- environment.yml: environment.yml
- .gitignore: .gitignore
The above is a start to the config file use. Note that the original config also included student GH usernames. there will need to be an independent discussion surrounding how the student roster is populated.
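The assignment_repo_files idea (target name in the student repo mapped to a source file kept by the instructor) could be implemented along these lines. This is a sketch under assumptions: the function name is made up, the mapping is passed in as a plain dict, and the demo uses a throwaway directory with invented file contents.

```python
import shutil
import tempfile
from pathlib import Path


def copy_repo_files(mapping, source_dir, repo_dir):
    """Copy each source file into repo_dir under its target name."""
    source_dir, repo_dir = Path(source_dir), Path(repo_dir)
    copied = []
    for target_name, source_name in mapping.items():
        source = source_dir / source_name
        if not source.is_file():
            raise FileNotFoundError(
                f"{source} is listed in the config but is missing"
            )
        target = repo_dir / target_name
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, target)
        copied.append(target.name)
    return copied


# Demo in a temporary directory (paths and contents are made up).
with tempfile.TemporaryDirectory() as tmp:
    course, repo = Path(tmp, "course"), Path(tmp, "student-repo")
    course.mkdir()
    (course / "student_README.md").write_text("Welcome to the assignment\n")
    copied = copy_repo_files({"README.md": "student_README.md"}, course, repo)
    readme = (repo / "README.md").read_text()
```

Failing early on a missing source file gives the instructor a clear message before anything is pushed to a student repository.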
UPDATE: I DON'T BELIEVE THAT WE CAN AUTOMATE CREATING A GITHUB CLASS. This is ok as it's a one time step.
Here the instructor needs to create an assignment that will be distributed to students. It involves several steps.
These tasks can all be implemented using nbgrader. It would be good for us to document this in our docs just to have it here. We have some of this documented in our hackmd document. Earth Lab can develop these docs.
The steps here are contingent upon another question - I think a wrapper is created that both creates the assignment via nbgrader and produces the github repo all in one step. But you could also do these things independently to make this more versatile. we have all of the information needed at this step to implement both.
Here is the github module with the GH functions
AND here is the function that we want to call - YAY!! Essentially this should make the template and push it to our org!
This should be possible. as a baby step we have a abc-create-assignment
I suggest that for now we refactor our scripts to implement the steps below using the config file which will remove many of the arguments that are used over and over again. i think each function will just need to call the assignment name and the rest should work... More soon on the roster issue which is a whole different can of worms.
If there is an environment.yml, teach CircleCI to set up the environment and run stuff in it.
Set the kernel name to just "Python" when running a notebook in papermill
Tag a cell with a name bob1 and a tag for number of points p=15. When creating the student/teacher version of the notebook we create a JSON file for the teacher in which we look up how much a question named bob1 is worth.
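A rough sketch of how the tags could be turned into the teacher's JSON file. It assumes the notebook has been loaded as a dict, that tags live in each cell's metadata under "tags", and that the first tag not starting with "p=" is the question name; the function name and these conventions are assumptions for illustration.

```python
import json


def points_by_question(notebook):
    """Map question names to points based on cell tags."""
    points = {}
    for cell in notebook.get("cells", []):
        tags = cell.get("metadata", {}).get("tags", [])
        values = [tag for tag in tags if tag.startswith("p=")]
        names = [tag for tag in tags if not tag.startswith("p=")]
        # Only cells carrying both a name and a points tag are graded.
        if values and names:
            points[names[0]] = int(values[0][2:])
    return points


# Tiny in-memory notebook using the tags from the example above.
notebook = {
    "cells": [
        {"cell_type": "code", "metadata": {"tags": ["bob1", "p=15"]}, "source": []},
        {"cell_type": "markdown", "metadata": {}, "source": []},
    ]
}
teacher_json = json.dumps(points_by_question(notebook), indent=2)
```

The resulting JSON string can be written next to the teacher version of the notebook and read back when grading.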
I think i'd prefer that we tag cells to be graded with something unique that then gets mapped when grading locally as one option? perhaps there are other options too!
Also - I'm not sure how to access my solution when grading, i.e. how do you build private tests? Right now it just cleans the solution from the notebook, but where does that solution go and how do we use it to grade a notebook?
Finally private & public tests do not seem to run in my student notebook. yet they work in my master notebook. Oddly i don't even get an error i just get nothing when i run that cell.
As an example i just completed and submitted a homework assignment. And it just doesn't seem to run any tests locally public or private. please see here:
Below is a list of things that we'd like to check for in autograding.
Pep8 code format: one way to do this may be to write the notebook out as a .py file and run flake8? This won't check for capitalized comments, however, like this: # Capitalize comments
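For the comment capitalization part, a small check can be built on the standard tokenize module and run on the same .py export. This is a sketch only; the function name is made up and the rule (first alphabetic character of a comment must be upper case) is an assumption about what "capital comments" should mean.

```python
import io
import tokenize


def uncapitalized_comments(source):
    """Return (line number, comment) pairs for comments starting lower case."""
    problems = []
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for token in tokens:
        if token.type != tokenize.COMMENT:
            continue
        text = token.string.lstrip("#").strip()
        # Comments that start with a symbol or digit are left alone.
        if text and text[0].isalpha() and not text[0].isupper():
            problems.append((token.start[0], token.string))
    return problems
```

Because it works on tokens rather than raw lines, a "#" inside a string literal is not mistaken for a comment.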
Implement a way to assign grades using a late grade policy implemented via the config. This could be built as follows:
- lwasser
due: 2018-10-15
NOTE: I could see an issue where a student submits something (maybe even something they don't want graded) after the deadline. Do we need a way to flag whether or not to grade a student using the commits that are AFTER the deadline?
via email we discussed
Can we modify the deadline for a student if need be via a yaml file?
Could we apply penalties to late assignments, like
- 10% off if submitted within 0-24 hrs of the deadline
- 20% off for 24-48 hrs, etc.?
Penalties like this should be straightforward.
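A sketch of the stepped penalty discussed above: 10% for each started 24 hour period after the deadline, capped at 100%. The function name, the cap and the treatment of a submission exactly 24 hours late (counted in the first period) are assumptions; the step and period are parameters so they could come from the config.

```python
import math
from datetime import datetime, timedelta


def late_penalty(submitted, deadline, step=0.10, period=timedelta(hours=24)):
    """Return the fraction of points to deduct for a late submission."""
    if submitted <= deadline:
        return 0.0
    # Number of started periods after the deadline (1 for 0-24 hrs, ...).
    periods_late = math.ceil((submitted - deadline) / period)
    return min(1.0, periods_late * step)


# Example values only: a deadline at the end of 2018-10-15.
deadline = datetime(2018, 10, 15, 23, 59)
penalty = late_penalty(deadline + timedelta(hours=30), deadline)
```

A per-student exception can then be handled by passing a different deadline for that student rather than by changing the penalty rule.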
For per student things I'd go with handling it via command-line
arguments. So you'd run `nbgrade --student betatim --penalty 0
-assignment week1` to grade betatim's work for the week1 assignment
without any penalty. My impression is that handling individual cases
like this will be the exception and that each case will require some
different combo of penalty, assignment, student etc.
This is how I see the trade-off: Being able to specify things on the
command-line gives you full flexibility, but also means you have to
type it out each time. Configuring via the config file means easier to
repeat but more limited because we have to design what kinds of
modifications are possible up front. What do you think?
I could imagine this part happening at the command line and would involve a yaml file with individual student deadlines (only for exceptions, otherwise all would be the same) I might have 1-3 exceptions for any given assignment on average. Often less.
When trying to create a template repository that already exists print a useful error message.
Create a tool that an instructor can run to get an overview of which students have or have not accepted the invitation to the course, by checking if there is a repository for them.
Started in #16 (comment)
Add a command to take https://github.com/betatim/autograded-course-starter and create a new, fresh course from it
We don't have any tests yet. Step one is figuring out how to do this nicely, step two is to write some tests.
abc-init to work. If it does work for you, document how it works in the hackmd document (just add a new section).
Straightforward task. Just need to start it.
Following this meta issue can we please have a way to both build notebooks (done) and then submit / grade them locally so we don't always have to wait for circle ci and push to github?
Following the notes here you can create an .rst file that describes the nbgrader class setup.
Create section on assignment setup here too.
Create-an assignment might work well as a separate page.
following earthpy and matplotcheck... this should be easy.
When fetching student repositories fail with a useful error message when the student repo does not exist.
Hey @betatim just wanted to document some of the issues i'm having with the CLI stuff.
I totally see how this is coming together. it's just not working for me quite yet so i'm hoping we can talk through this together today!!
nbinit
MacBook-Pro-4:autograded-course-starter leah-su$ nbinit
GitHub username: lwasser
Password for lwasser:
Traceback (most recent call last):
File "/Users/leah-su/anaconda3/bin/nbinit", line 11, in <module>
sys.exit(init())
File "/Users/leah-su/anaconda3/lib/python3.6/site-packages/grading/__main__.py", line 56, in init
two_factor_callback=two_factor)
File "/Users/leah-su/anaconda3/lib/python3.6/site-packages/github3/api.py", line 26, in deprecation_wrapper
return func(*args, **kwargs)
File "/Users/leah-su/anaconda3/lib/python3.6/site-packages/github3/api.py", line 59, in authorize
client_secret)
File "/Users/leah-su/anaconda3/lib/python3.6/site-packages/github3/github.py", line 462, in authorize
json = self._json(self._post(url, data=data), 201)
File "/Users/leah-su/anaconda3/lib/python3.6/site-packages/github3/models.py", line 156, in _json
raise exceptions.error_for(response)
github3.exceptions.UnprocessableEntity: 422 Validation Failed
nbdistribute --template
note: this course i think exists but if it exists can it fail gracefully please?
MacBook-Pro-4:autograded-course-starter leah-su$ nbdistribute --template
Using /Users/leah-su/Documents/github/2-autograding/autograded-course-starter/student to create the student template.
Loading configuration from config.yml
Creating template repository.
Traceback (most recent call last):
File "/Users/leah-su/anaconda3/bin/nbdistribute", line 11, in <module>
sys.exit(distribute())
File "/Users/leah-su/anaconda3/lib/python3.6/site-packages/grading/__main__.py", line 131, in distribute
config['github']['token'])
File "/Users/leah-su/anaconda3/lib/python3.6/site-packages/ruamel/yaml/comments.py", line 747, in __getitem__
return ordereddict.__getitem__(self, key)
KeyError: 'github'
nbauthor --date 2018-01-01
this works well so far! haven't tested working on a notebook yet.
This is still failing and i'm not sure why.
MacBook-Pro-4:autograded-course-starter leah-su$ nbdistribute
Using /Users/leah-su/Documents/github/2-autograding/autograded-course-starter/student to create the student template.
Loading configuration from config.yml
Fetching work for betatim...
Traceback (most recent call last):
File "/Users/leah-su/anaconda3/bin/nbdistribute", line 11, in <module>
sys.exit(distribute())
File "/Users/leah-su/anaconda3/lib/python3.6/site-packages/grading/__main__.py", line 144, in distribute
token=config['github']['token'])
File "/Users/leah-su/anaconda3/lib/python3.6/site-packages/ruamel/yaml/comments.py", line 747, in __getitem__
return ordereddict.__getitem__(self, key)
KeyError: 'github'
I am missing something here. i just made new assignments but i only see one week1 folder in my student dir.
MacBook-Pro-4:autograded-course-starter leah-su$ nbauthor --date 2018-12-01
Processing /Users/leah-su/Documents/github/2-autograding/autograded-course-starter/master/week1/homework.ipynb
Processing /Users/leah-su/Documents/github/2-autograding/autograded-course-starter/master/week-03/time-series-homework.ipynb
Processing /Users/leah-su/Documents/github/2-autograding/autograded-course-starter/master/week-04/time-series-homework-2.ipynb
Inspect `/Users/leah-su/Documents/github/2-autograding/autograded-course-starter/student/` to check it looks as you expect.
MacBook-Pro-4:autograded-course-starter leah-su$