microsoft / pybryt

Python library for pedagogical auto-assessment

Home Page: https://microsoft.github.io/pybryt

License: MIT License

Python 99.59% Makefile 0.41%
auto-assessment pybryt-library python-library educators

pybryt's Introduction


PyBryt - Python Library


PyBryt is an auto-assessment Python library for teaching and learning.

  • The PyBryt Library is a free, open-source Python library that provides auto-assessment of submissions for grading. Our goal is to empower students and educators to learn about technology through fun, guided, hands-on content aimed at specific learning goals.
  • The library is focused on the auto-assessment and validation of Python code.
  • PyBryt has been developed as open source to help learning and training institutions auto-assess the work completed by learners.
  • PyBryt works with existing auto-grading solutions such as Otter-Grader, OkPy, or Autolab.

Features

Educators and institutions can leverage the PyBryt Library to integrate auto-assessment and reference models into hands-on labs and assessments.

  • Educators do not have to enforce the structure of the solution;
  • Learners practice the design process, code design, and implementation;
  • Meaningful, pedagogical feedback is given to learners;
  • The complexity of a learner's solution can be analyzed;
  • Plagiarism detection and support for reference solutions;
  • Easy integration into existing organizational or institutional grading infrastructure.

Getting Started

See the Getting Started page in the PyBryt documentation for steps to install and use PyBryt for the first time. You can also check the Microsoft Learn interactive modules Introduction to PyBryt and Advanced PyBryt to learn more about how to use the library to auto-assess your learners' activities.
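As a quick orientation, below is a minimal sketch of the core workflow (annotate intermediate values in a reference implementation, collect them into a pybryt.ReferenceImplementation, and check student code with pybryt.check); treat it as illustrative rather than authoritative and see the documentation for details:

import pybryt

# Reference side: annotate the intermediate values a correct solution produces.
annots = []
def reference_sum(n):
    total = 0
    for i in range(n):
        total += i
        annots.append(pybryt.Value(total))  # record the running total
    return total

reference_sum(10)
ref = pybryt.ReferenceImplementation("sum-demo", annots)

# Student side: run the submission inside a check context to get feedback.
with pybryt.check(ref):
    s = 0
    for i in range(10):
        s += i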

Testing

All demos are located in the demo folder.

First install PyBryt with pip:

pip install pybryt

Then launch the index.ipynb notebook in each of the directories under demo from Jupyter Notebook; each notebook demonstrates the process of using PyBryt to assess student submissions.

Technical Report

We continuously interact with computerized systems to achieve goals and perform tasks in our personal and professional lives. Therefore, the ability to program such systems is a skill needed by everyone. Consequently, computational thinking skills are essential for everyone, which creates a challenge for the educational system to teach these skills at scale and allow students to practice them. To address this challenge, we present a novel approach to providing formative feedback to students on programming assignments. Our approach uses dynamic evaluation to trace intermediate results generated by students' code and compares them to a reference implementation provided by their teachers. We have implemented this method as a Python library and demonstrate its use to give students relevant feedback on their work while allowing teachers to challenge their students' computational thinking skills. The paper is available at PyBryt: auto-assessment and auto-grading for computational thinking.

Citing Technical Report

@misc{pyles2021pybryt,
      title={PyBryt: auto-assessment and auto-grading for computational thinking}, 
      author={Christopher Pyles and Francois van Schalkwyk and Gerard J. Gorman and Marijan Beg and Lee Stott and Nir Levy and Ran Gilad-Bachrach},
      year={2021},
      eprint={2112.02144},
      archivePrefix={arXiv},
      primaryClass={cs.HC}
}

Citing the Codebase

Please use the "Cite this repository" option in the repo menu or the citation.cff file in the root of this repo.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.

pybryt's People

Contributors

chrispyles, github-actions[bot], leestott, marijanbeg, markpatterson27, microsoft-github-operations[bot], microsoftopensource, nir-levy, ranigb, tonybaloney


pybryt's Issues

pybryt execute default output file name

When running pybryt execute abc.ipynb without specifying an output path, the documentation says:

If OUTPUT is unspecified, this defaults to
"./{SUBM.stem}.pkl" (e.g. for SUBM "submissions/subm01.ipynb", this is
"./subm01.pkl").

Therefore, for the command pybryt execute abc.ipynb, I expect the file abc.pkl to be created. However, the file student.pkl is created instead.

Test formatting

The tests directory is missing many docstrings and type hints; these should be added to make the tests easier to maintain. Also consider breaking up some of the longer methods that test whole objects into smaller, more targeted ones that test specific behaviors.

Add an annotation asserting that a value is a return value

Add an annotation that is similar to a value annotation but which specifically asserts that a value was returned by a function (that is, it was the value of arg when the trace function was called for the return event).

Consider matrix exponentiation A^p when p == 1: the matrix A always shows up in the memory footprint as long as it's used inside the function body, but the return value of the function is not necessarily A. This may also help design references for problems that suffer from the issues described in #62.
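A hypothetical sketch of what such an annotation could look like; the name ReturnValue and its signature are assumptions for illustration, not an existing API:

import numpy as np
import pybryt

def reference(A, p):
    result = np.linalg.matrix_power(A, p)
    # Hypothetical annotation: satisfied only if `result` was the value of a
    # return event seen by the trace function, not merely present in memory.
    pybryt.ReturnValue(result)
    return result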

Checking variable value within the check context

Describe the bug
We want the student to perform some computations in a notebook cell and store the answer in a variable. In the next cell, we want to check the value of that variable using PyBryt. We write:

with pybryt.check(...):
    # variable or operations on a variable

We expect that everything stored in memory within the context will be available to PyBryt for validation. However, we are not able to access the variable's value unless we write a workaround function.

To Reproduce
We demonstrate the error in the 02-code-outside-functions example in https://github.com/marijanbeg/pybryt-examples.

Expected behavior
We expect behaviour similar to:

with pybryt.check(...):
    student_variable # or
    # tmpvar = student_variable

to expose the value of the variable to PyBryt.
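A minimal sketch of the kind of workaround function mentioned above, assuming (as the tracing design suggests) that values passed through traced function calls land in the memory footprint:

import pybryt

def expose(value):
    # the call is traced, so `value` appears in the memory footprint
    return value

with pybryt.check(ref):  # `ref` is the reference implementation being checked
    expose(student_variable)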

Multiple exercises/assignments in a single notebook

Very often, tutorial notebooks contain exercises allowing students to self-assess their understanding of the material. It would be great if PyBryt were able to cover this use case. More precisely:

  • Allow having multiple exercises/assignments in a single notebook
  • Test each exercise solution separately
  • Give feedback for each solution separately

It is not easy for me to see what the right user interface would be so that:

  • The additional code students have to write (if any) to instruct PyBryt what code belongs to an exercise and what to execute is minimal
  • Flexibility in terms of the number of cells used for a solution is allowed (solution is not bound to a particular cell number)
  • Solution code is isolated from the rest of the notebook (students are expected to write each solution from scratch and do not reuse variables from the previous parts of the notebook)

Possibly, a solution similar to the tracing_on() and tracing_off() functions would be the most elegant. This way, a teacher could insert two extra cells (tracing_on('exercise-1.1') and tracing_off('exercise-1.1')) before and after the cell(s) where the solution is expected to be, as sketched below.
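A sketch of how this proposed (hypothetical, not yet implemented) interface might look across notebook cells:

# Cell inserted by the teacher before the exercise:
pybryt.tracing_on('exercise-1.1')

# Cell(s) containing the student's solution:
def solution(data):
    return sorted(data)

# Cell inserted by the teacher after the exercise:
pybryt.tracing_off('exercise-1.1')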

Exposed equivalence operator

Is your feature request related to a problem? Please describe.
At the moment, we annotate a value in the reference solution, and PyBryt checks whether that value is present in the student's implementation. The way equivalence is determined between two values (reference and student) depends strongly on what is implemented in PyBryt (e.g. tolerances, invariants, etc.). Exposing the equivalence function to the user would allow any equivalence check, invariant, tolerance, etc. to be defined by the user in the reference solution. This way everything is exposed to the user, and the user does not rely on whether that functionality was implemented in PyBryt or not.

Describe the solution you'd like
Let us say we want to check if the student has a list in their memory footprint and we want all values to be within some absolute and/or relative tolerance.

import numpy as np

atol = 1e-8  # example absolute tolerance

def my_eq1(ref, value):
    return np.allclose(ref, value, atol=atol, rtol=1e-5)  # here value can be any iterable

def my_eq2(ref, value):
    return isinstance(value, list) and np.allclose(ref, value, atol=atol, rtol=1e-5)  # here value must be a list

def my_eq3(ref, value):
    return isinstance(value, (list, tuple)) and np.allclose(ref, value, atol=atol, rtol=1e-5)  # here value can be a list or a tuple


pybryt.Value(solution, eq=my_eq1)  # or my_eq2/my_eq3; because equivalence is exposed, full freedom is given to the user

Simplified user interface for temporal annotations

Is your feature request related to a problem? Please describe.
We find the current interface for defining temporal annotations a bit too complicated and error-prone. Have you considered defining a "container" to which individual value annotations are added? The order of annotations in the container would define "before" and "after" relationships. This could also simplify the way success_message and failure_message are defined for temporal annotations.

An example of this code can be found as an example 04-temporal-annotations in https://github.com/marijanbeg/pybryt-examples repository.

Describe the solution you'd like
The current reference implementation for the Lucas series using temporal annotations is:

import pybryt
import numpy as np


def lucas(n):
    lucas_series = np.zeros(n, dtype=int)

    lucas_series[0] = 2
    curr_value = pybryt.Value(lucas_series,
                              name='first_element',
                              success_message='SUCCESS 1: Your first element is correct.',
                              failure_message='ERROR 1: Please check your first element in the series.')
    if n == 1:
        return lucas_series

    lucas_series[1] = 1
    new_value = pybryt.Value(lucas_series,
                             name='second_element',
                             success_message='SUCCESS 2: Your second element is correct.',
                             failure_message='ERROR 2: Please check your second element in the series.')
    curr_value.before(new_value)
    curr_value = new_value
    if n == 2:
        return lucas_series

    for i in range(2, n):
        lucas_series[i] = lucas_series[i-1] + lucas_series[i-2]
        new_value = pybryt.Value(lucas_series,
                                 name='other_elements',
                                 success_message='SUCCESS 3: You are generating n>2 elements right.',
                                 failure_message='ERROR 3: Hmmm... Are you summing the previous two elements right?')
        curr_value.before(new_value)
        curr_value = new_value

    return lucas_series


pybryt.Value(lucas(20),
             name='final',
             success_message='SUCCESS 4: Amazing! Your final solution is correct.',
             failure_message='ERROR 4: The Lucas series you computed is wrong.')

A suggestion for a possible simplification could be:

# This code is not working. It is a suggestion for defining temporal annotations.
import pybryt
import numpy as np


def lucas(n):
    container = pybryt.Container(success_message='SUCCESS 0: ...',
                                 failure_message='ERROR 0: ...')  # Container
    lucas_series = np.zeros(n, dtype=int)

    lucas_series[0] = 2
    container.add(pybryt.Value(lucas_series,
                               name='first_element',
                               success_message='SUCCESS 1: Your first element is correct.',
                               failure_message='ERROR 1: Please check your first element in the series.'))
    if n == 1:
        return lucas_series

    lucas_series[1] = 1
    container.add(pybryt.Value(lucas_series,
                               name='second_element',
                               success_message='SUCCESS 2: Your second element is correct.',
                               failure_message='ERROR 2: Please check your second element in the series.'))
    
    if n == 2:
        return lucas_series

    for i in range(2, n):
        lucas_series[i] = lucas_series[i-1] + lucas_series[i-2]
        container.add(pybryt.Value(lucas_series,
                                   name='other_elements',
                                   success_message='SUCCESS 3: You are generating n>2 elements right.',
                                   failure_message='ERROR 3: Hmmm... Are you summing the previous two elements right?'))

    return lucas_series


pybryt.Value(lucas(20),
             name='final',
             success_message='SUCCESS 4: Amazing! Your final solution is correct.',
             failure_message='ERROR 4: The Lucas series you computed is wrong.')

Annotation for requiring/forbidding the use of specific functions

Instructors should be able to write annotations for asserting the use or non-use of specific functions.

This change is somewhat complicated, as it will involve editing the trace function to track information that it does not currently track (namely, which function is being traced and to which package it belongs).

Tolerances not implemented for iterables

Describe the bug
When comparing values, tolerances allow for a difference between the student's and the reference values. Tolerances are currently implemented only for numeric (numbers.Real) types. For iterables, the comparison should be performed element-wise, similar to np.allclose behaviour, so that two iterables are considered equal if all of their elements are within tolerance.

Will be addressed in PR.
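A sketch of the element-wise behaviour being requested; this is an illustration, not PyBryt's actual implementation:

import numpy as np

def values_close(ref, other, atol=1e-8, rtol=1e-5):
    # numbers and numeric iterables: element-wise comparison within tolerance
    try:
        return bool(np.allclose(ref, other, atol=atol, rtol=rtol))
    except (TypeError, ValueError):
        # non-numeric values or mismatched shapes: fall back to exact equality
        return ref == other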

Caching mechanism for checks

Implement a mechanism for caching the memory footprints (and results?) generated from checks.

Thinking of pickling the footprints/results and storing the files in a hidden .pybryt_cache directory within the working directory. Use a filename structure like {ref.name}_footprint.pkl and {ref.name}_results.pkl.
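A sketch of the proposed cache layout; the directory and file names come from the description above, while the helper functions themselves are hypothetical:

import os
import pickle

CACHE_DIR = ".pybryt_cache"

def cache_footprint(ref_name, footprint):
    # store the pickled memory footprint as .pybryt_cache/{ref.name}_footprint.pkl
    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(os.path.join(CACHE_DIR, f"{ref_name}_footprint.pkl"), "wb") as f:
        pickle.dump(footprint, f)

def load_cached_footprint(ref_name):
    # return the cached footprint if present, else None
    path = os.path.join(CACHE_DIR, f"{ref_name}_footprint.pkl")
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)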

Scaffold for custom complexity classes

Create a scaffold for custom complexity classes by subclassing pybryt.complexities.complexity. A field needs to be added to the annotation so that these custom classes are considered when the annotation is created.
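A hedged sketch of what such a subclass might look like; the transform_n hook is an assumption modeled on how the built-in complexity classes fit step counts against transformed input sizes, not a confirmed interface:

import numpy as np
import pybryt.complexities as cplx

class cubic(cplx.complexity):
    """Hypothetical custom complexity class for O(n^3) growth."""
    name = "cubic"

    def transform_n(self, n: np.ndarray) -> np.ndarray:
        # transform input sizes so a least-squares fit against observed step
        # counts measures how well the data matches n**3 growth
        return n ** 3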

Add a public API for checking values against Value annotations

As titled. For example, to check whether a value annotation is satisfied by some object:

v = pybryt.Value(obj)
v.check_against(other_obj)

This would allow instructors to test the robustness of their annotations and to unit-test assignments. It is also useful for demonstration purposes.

Any method implementing this feature should just wrap the object in the tuple abstraction for a memory footprint observed value and use existing methods to validate the pseudo-footprint.
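A rough sketch of that wrapping; the (value, timestep) tuple structure and the internal method reused here are assumptions about PyBryt's internals for illustration only:

# Hypothetical method on the Value annotation class:
def check_against(self, obj):
    # wrap the object as a single observed-value tuple of a pseudo-footprint
    pseudo_footprint = [(obj, 0)]
    # reuse the existing footprint-checking machinery to validate it
    return self.check(pseudo_footprint).satisfied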

np.allclose causing issues in check_values_equal

I think np.allclose is causing overflow-related issues when comparing very large integers. In a recent project, I was trying to compare Fibonacci numbers and started experiencing issues around fib(94), which throws a TypeError when compared with numpy and thus results in check_values_equal returning False.

>>> np.isclose(19740274219868223167, 19740274219868223167)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<__array_function__ internals>", line 5, in isclose
  File "C:\Users\v-chpyles\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\numpy\core\numeric.py", line 2355, in isclose
    xfin = isfinite(x)
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''


Protected main branch causes docs auto-build to fail

The action that builds the documentation off of the main branch is currently broken because committing directly to main without an approved PR is disallowed. A workaround needs to be found to auto-build the documentation.

Unify CLI argument language

There are some small disconnects in the naming of CLI arguments, e.g. the interchangeability of dest and output. These should be fixed. This would also allow the inclusion of the output argument for student implementations in pybryt execute.

Implement a method of turning tracing on and off within student code

There should be a way of telling PyBryt when to stop tracing and when to start tracing again within student code, e.g. for pedagogical code that the instructor doesn't need to grade.

An example usage might be:

def pow(x, a):
    return x ** a

x = 2  # example input
x2 = pow(x, 2)

pybryt.tracing_off()
x3 = pow(x, 3)
pybryt.tracing_on()

The code above would capture x2 but not x3 or any values in the call to pow that was used to define x3.

Additional invariants

Currently, the invariant structure is set up, but there is only one proof-of-concept invariant, string_capitalization. More invariants need to be added; a sketch of one possibility follows the list below.

An (incomplete) list:

  • numeric type
  • matrix transpose
  • list permutation
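A hedged sketch of a possible list-permutation invariant, assuming (by analogy with string_capitalization) that an invariant maps candidate values to a canonical form before comparison; the exact base-class contract is an assumption:

import pybryt.invariants

class list_permutation(pybryt.invariants.invariant):
    """Hypothetical invariant: consider lists equal up to reordering."""

    @staticmethod
    def run(values):
        # canonicalize each list by sorting so any permutation compares equal
        return [sorted(v) if isinstance(v, list) else v for v in values]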

Named annotation message filtering does not work correctly when no failure message is supplied

If no failure message is supplied but a success message is, and one of the annotations in a group of named annotations is not satisfied, the success message is shown even though the reference registers as not satisfied. In this case, no message should be shown.

Example:

import pybryt

max_ref = []
def maximum(l, track=False):
    m = max(l)
    if track:
        max_ref.append(pybryt.Value(m, name="list-maximum", success_message="Found the max!"))
    return m

test_lists = [[1, 2, 3], [-1, 0, 1], [10, -4, 2, 0], [1]]
for test_list in test_lists:
    maximum(test_list, track=True)

max_ref = pybryt.ReferenceImplementation("maximum", max_ref)

def maximum(l):
    if len(l) % 2 == 0:
        m = min(l)
    else:
        m = max(l)
    return m

with pybryt.check(max_ref):
    for test_list in test_lists:
        maximum(test_list)

The problem of easy problems

Is your feature request related to a problem? Please describe.
We demonstrate the problem in the 01-easy-problems example in https://github.com/marijanbeg/pybryt-examples.

Summary

A student has an exercise to write a function with signature maximum(a), which finds and returns the largest element in a list a. The solution we expect in a beginner-level Python course is:

def maximum(a):
    res = a[0]
    for i in a:
        if i > res:
            res = i

    return res

The reference solution would be:

def maximum(a):
    res = a[0]
    pybryt.Value(res,
                 name='initial_value',
                 success_message='SUCCESS: Great! You declare the first element to be the largest before the loop.',
                 failure_message='ERROR: Hmmm... Did you declare the first element to be largest before the loop?')
    for i in a:
        if i > res:
            res = i
            pybryt.Value(res,
                         name='larger',
                         success_message='SUCCESS: Very nice! You are finding larger elements.',
                         failure_message='ERROR: Hmmm... Are you finding elements larger than the declared one?')

    pybryt.Value(res,
                 name='res',
                 success_message='SUCCESS: Wow! You found the largest element.',
                 failure_message='ERROR: Hmmm... Something is wrong in your function.')
    return res


pybryt.Value(maximum([-3, 1, 0, 5, 19]), name='solution')

However, whatever the student's solution is, we are not able to invalidate it: all elements of the input list end up in the footprint anyway because of the for i in a loop, which satisfies every PyBryt annotation from the reference solution. Exercises like this are very common in beginner-level coding, where feedback on the student's implementation (PyBryt's main power) is essential.

Equality check for empty iterable types

Describe the bug
Equality checking for empty iterable types should be done as value == other_value and not using np.allclose(value, other_value, atol=atol, rtol=rtol). This bug prevents PyBryt from finding empty iterables in some cases.

This issue will be addressed in PR.

PyBryt GitHub Codespaces or MyBinder Environment

Ability to run the demos implementation from GitHub Codespaces or MyBinder

Describe the solution you'd like

  • Development of a self-contained GitHub Codespaces or MyBinder environment
  • A dev container build with the necessary libraries preinstalled
  • The ability to click Codespaces, launch the environment, and use the demos

Allow users to use time complexity analysis tools without annotations

Basically, add a way for people to use PyBryt's time complexity analysis without needing to tie it to an annotation. This could take the form of having pybryt.check_time_complexity automatically print a report, or of adding another context manager that performs the checks using an iterator of inputs of increasing size.
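For context, a sketch of how the check is tied to an annotation today, based on the documented pybryt.check_time_complexity context manager (the block name and inputs here are illustrative):

import pybryt

# Today, a TimeComplexity annotation in a reference is paired with
# check blocks in the submission like this:
for n in [10, 100, 1000]:
    with pybryt.check_time_complexity("sum_block", n):
        total = sum(range(n))

# The proposal: let blocks like the one above print a complexity report even
# when no matching annotation exists in any reference.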

Unexpected behaviour when checking student's wrong solution

Describe the bug
We investigated our reference solution by introducing different mistakes an imaginary student could make in their implementation. The student is expected to write a function square(a), which takes a list a as input, squares all elements individually, and returns a list of the squared elements. Our reference solution is:

def square(a):
    res = []
    pybryt.Value(res,
                 name='empty_list',
                 success_message='SUCCESS 1: Great! You start with an empty list.',
                 failure_message='ERROR 1: Hmmm... Did you define an empty list before the loop?')
    for i in a:
        i_squared = i**2
        pybryt.Value(i_squared,
                     name='i_squared',
                     success_message='SUCCESS 2: Amazing! You are computing the squares of individual elements.',
                     failure_message='ERROR 2: Please check if you compute the squares of individual elements?')

        res.append(i_squared)
        pybryt.Value(res,
                     name='appending',
                     success_message='SUCCESS 3: Wow! You are appending the squared elements.',
                     failure_message='ERROR 3: Oops... Please check if you are appending the individual elements?')

    return res


pybryt.Value(square([-555, 13, 57, 0, 1, 2, -44]),
             name='final',
             success_message='SUCCESS 4: Your final solution is correct.',
             failure_message='ERROR 4: The final solution is wrong.')

The student's wrong solution is:

def square(a):
    res = []
    for i in a:
        if i < 0:
            i_squared = -i  # mistake introduced (-i instead of i**2 for negative elements)
        else:
            i_squared = i**2

        res.append(i_squared)
   
    return res

By checking the student's implementation, PyBryt gives feedback to the student that the solution is wrong, as we expected. However, the feedback messages are puzzling. More precisely:

with pybryt.check(reference(2)):
    square([-555, 13, 57, 0, 1, 2, -44])

REFERENCE: reference-2
SATISFIED: False
MESSAGES:
  - SUCCESS 1: Great! You start with an empty list.
  - SUCCESS 2: Amazing! You are computing the squares of individual elements.
  - ERROR 3: Oops... Please check if you are appending the individual elements?
  - ERROR 4: The final solution is wrong.

Although we moved the testing list away from "the origin" by using -555 to increase the signal-to-noise ratio, the student still gets the SUCCESS 2 message. Is this a bug, or are we missing something? We do not expect 555**2 to be in the student's implementation footprint.

To Reproduce
The issue we encountered can be reproduced in the 02-code-outside-functions example in the https://github.com/marijanbeg/pybryt-examples repository.

Expected behavior
We expect the student to receive ERROR 2 instead of SUCCESS 2 message.

Limiting the number of annotations in a reference only works when compiling a notebook, not when constructing the object manually

Consider the example:

import numpy as np
import pybryt

max_ref = []
def maximum(l, track=False):
    m = max(l)
    if track:
        max_ref.append(pybryt.Value(
            m,
            name="list-maximum",
            limit=5,
            success_message="Found the max!", 
            failure_message="Did not find the max",
        ))
    return m

for _ in range(1000):
    test_list = np.random.normal(size=100)
    maximum(test_list, track=True)

max_ref = pybryt.ReferenceImplementation("maximum", max_ref)
max_ref.annotations

max_ref.annotations here has length 1000 even though the limit is set to 5 for all annotations in the reference. The filtering behavior should be moved into the reference implementation constructor instead of living in Annotation._track.

Change the use of `tracing_off` + `tracing_on` to be a context manager

Basically, the idea is to make calling tracing_off and tracing_on directly an anti-pattern and to use context managers to control tracing instead. Currently, the time complexity check (#33) and individual question checks (#38) use context managers, and a unified approach is probably the best idea.

For example, to stop code from being traced during grading, something like

with pybryt.no_tracing():
    # some code

Enforcing order in collections

Exercise:

The built-in Python function sum takes a list as an argument and computes the sum of the elements in the list:

>>> sum([1, 3, 5, -5])
4

Implement your own version of the sum function and name it my_sum.

Reference implementation:

import pybryt

# PyBryt tolerance
rtol = 1e-5


def my_sum(x):
    s_col = pybryt.Collection(enforce_order=True,
                              success_message='SUCCESS: Great! In each iteration, you are adding a list element to the sum variable.',
                              failure_message='ERROR: Hmmm... you are not adding a list element to the sum variable.')

    # Why can't we enforce order for i_col?
    i_col = pybryt.Collection(enforce_order=True,
                              success_message='SUCCESS: You are iterating over the list elements instead of over indices.',
                              failure_message='ERROR: You are not iterating over the list elements.')
    s = 0
    pybryt.Value(s,
                 success_message='SUCCESS: Before the loop, you set the sum to be zero. Well done!',
                 failure_message='ERROR: Think about setting the sum to zero before the loop.')
    for i in x:
        i_col.add(pybryt.Value(i, rtol=rtol))

        s += i
        s_col.add(pybryt.Value(s, rtol=rtol))

    return s


pybryt.Value(my_sum([2.1, 98, -451, 273, 1111, 23.98]),
             name='final',
             success_message='SUCCESS: Your function returns the correct solution.',
             failure_message='ERROR: Your function returns a wrong solution.')

Student's implementation:

def my_sum(x):
    s = 0
    for i in x:
        s += i

    return s

Feedback:

with pybryt.check(pybryt_reference(1, 15)):
    my_sum([2.1, 98, -451, 273, 1111, 23.98])

REFERENCE: exercise-1_15
SATISFIED: False
MESSAGES:
  - SUCCESS: Great! In each iteration, you are adding a list element to the sum variable.
  - ERROR: You are not iterating over the list elements.
  - SUCCESS: Before the loop, you set the sum to be zero. Well done!
  - SUCCESS: Your function returns the correct solution.

Issue:

Enforcing order for i_col results in an error in the report. On the other hand, setting enforce_order=False results in a success message.

  • Is this the consequence of #62?
  • Should we update the timestamp of a value even if there is a duplicate?

PyBryt CLI Feature

PyBryt CLI

Goal: the ability to use a command-line interface to create a reference implementation and grade a student's work.

Describe the solution you'd like
The goal here is to allow the easy onboarding of educators to PyBryt.

The tasks would be:

  1. Educator creates a reference implementation
  2. Educator uses the PyBryt CLI to create a solution with reference implementations

Describe alternatives you've considered
CLI feature-based MVP:

  • CLI to run student code against a reference implementation (single submission)

Future features:

  • Pickle a reference implementation
  • Use a pickled reference to undertake assessment
  • Or a CLI to pickle and compare

Additional context
To be discussed.

PyBryt and ipykernel>=6.0 issue

PyBryt does not work with ipykernel>=6.0. As a temporary fix, I pinned ipykernel to 5.5.5 in requirements.txt before one of the releases a few months ago. I had hoped this problem would resolve itself as ipykernel matured, but it has not.

exception when no steps

When executing student code, if no values are tracked, an exception is thrown when computing the number of steps:

File "pybryt/execution/__init__.py", line 94, in execute_notebook
    n_steps = max([t[1] for t in observed])
ValueError: max() arg is an empty sequence
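One possible fix (a suggestion based on the traceback above, not necessarily the change that was eventually merged) is to give max() a default so that an empty footprint yields zero steps instead of raising:

n_steps = max([t[1] for t in observed], default=0)  # default=0 avoids ValueError on empty footprints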

Setting up a method of citing PyBryt

Just found this out recently in the context of someone wanting to cite Otter-Grader: GitHub has added support for CITATION.cff files. Should we add something like this for PyBryt?

Import annotation

Create an annotation that asserts that a library was imported. This is needed because Value annotations can't serialize modules when they're saved.

e.g.

import numpy as np
pybryt.Import(np)

Add debug mode for assignment development

Per the discussion in #115, add a debug mode that disables some of PyBryt's assumptions around the user's intentions.

For example, allow exceptions raised by custom equivalence functions to propagate instead of returning False, or raise an error when atol/rtol are specified with an equivalence function.

Comparison with reference implementation from within a notebook

For PyBryt tests (comparison with a reference implementation) to be executed, the student currently has to commit and push their solution to a repository. Most (beginner) programming and data science courses do not expect students to be familiar with version control. It would be convenient for students to be able to run tests and get feedback right after they implement their solution. For instance:

  • The teacher could write a "testing cell" right after the cell where a student writes their solution (similar to the usual assert solution == 5 cell) to compare the student's solution to the reference one.
  • This way, the student would get quick feedback in the same notebook without having to push their solution and wait for CI to run and provide feedback.
  • For multiple attempts, this would substantially reduce the time a student spends working on an exercise.

Finish testing

Additional test coverage is needed for the following features/modules:

  • invariants
  • integrations
  • plagiarism
  • preprocessors
  • utils

Feedback from multiple references

Let us say we have N reference implementations for a particular problem. The student works on the problem and submits their solution. We, using PyBryt, compare the student's implementation against N different reference implementations. This results in N feedback reports (what annotations are (not) satisfied in each reference implementation). The question is: What feedback do we give back to the student?

  • Giving all N feedback reports to the student can be very confusing, and the student would not know which feedback to follow.
  • Could the solution be to derive a metric specifying "how close" the student is to a particular reference implementation? This way, we would provide the feedback report of the reference solution the student is most likely implementing.
  • Should there be more sophisticated logic behind the scenes? For instance, if the student imported NumPy (or created an array of zeros), it is most likely they are following a particular reference.

This is a summary of some of the open questions we started brainstorming in one of the previous tech meetings, shared to encourage discussion. All ideas are welcome :)

Forbidding certain types in memory

Is your feature request related to a problem? Please describe.
It is very common to ask students in exercises to do something "using lists", "using tuples", "not using numpy", etc. This makes them practice solving problems without some of the operations they normally use, or without the solutions they can find on the internet.

We believe that the ability to forbid a certain variable type in memory, without specifying an exact value, would simplify writing reference solutions.

Describe the solution you'd like
For instance, pybryt.ForbidType(numpy.ndarray) would give an error to the student if a variable of numpy.ndarray type is found in the memory footprint.

Relative tolerance

Tolerances allow the student's values to deviate from the reference ones. Having the ability to define only an absolute tolerance poses several difficulties:

  • The teacher has to inspect each output value separately and decide what an acceptable tolerance is.
  • Values such as lists, tuples, and arrays can contain elements of different orders of magnitude, for which a single absolute tolerance value is not sufficient.

Having the ability to define a relative tolerance would help resolve these difficulties and bring the comparison of values closer to numpy.allclose (https://numpy.org/doc/stable/reference/generated/numpy.allclose.html).
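A small illustration of why a single absolute tolerance breaks down for mixed magnitudes (the values are arbitrary examples):

import numpy as np

ref = np.array([1e-6, 1e6])
student = np.array([5e-6, 1e6 + 1])  # small element is 5x off; large element is fine

# To accept the large element, a pure absolute tolerance must be >= 1,
# which then silently accepts the badly wrong small element too:
np.allclose(ref, student, rtol=0, atol=2)     # True (far too permissive)

# A relative tolerance scales with magnitude and rejects the bad small element:
np.allclose(ref, student, rtol=1e-5, atol=0)  # False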
