jaumeamoresds / nbmodular

Convert notebooks to modular code

Home Page: https://jaumeamoresds.github.io/nbmodular/

License: MIT License

Python 33.33% Jupyter Notebook 66.52% CSS 0.13% Shell 0.02%

data-science ipython-magic modularization nbdev notebooks software-engineering

nbmodular's Issues

clean code generated for pipeline

Docstring is no longer valid if the pipeline is not the default one
test argument doesn't seem to be used - we might just remove it?

check what happens when given function name is `test_my_function` and function is defined

example:

%%function --test
def test_my_function ():
   pass

transform cell code into list of lines

This avoids the following hack, currently used in update_cell_code:

if True: 
(...)

in place changing the code in a previous function cell

Currently, this creates another code cell at the end of the list, keeping the previous ones. If --merge is not added, the previous ones are just set to not valid. Let's see this with an example:

%%function 
def add1 (x):
	y = x + 1
	return y
	
%%function --merge
def add1 (x, x2):
	y2 = x2 + 1
	return y, y2
	
%%function
def add1 (x, x2):
	y = x + 1
	y2 = x2 + 1
	print (f'{x} + 1 = {y}')
	print (f'{x2} + 1 = {y2}')
	return y, y2

The result is a list with three code cells: the first two with valid=False, and the last one with valid=True.
Suppose instead that the third cell replaced the first one, i.e., the code of the third definition is written into the first cell. We could add a flag --replace-cell for this, but we would also need an index indicating which cell to replace.

We could also add two magic lines:

- %list_cells function_name: lists the code cells existing for a particular function_name, along with their indexes, to know which one needs to be replaced.
- %delete function_cell function_name idx: deletes the function cell in position idx of the list of function cells self.code_cells[function_name].
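
The replace / list / delete operations proposed above can be sketched in plain Python. This is only an illustration, not nbmodular's code: `CodeCell`, `CellStore` and their method names are hypothetical, and the rule that a replaced cell becomes the only valid one is an assumption.

```python
from dataclasses import dataclass


@dataclass
class CodeCell:
    # Hypothetical stand-in for a stored function cell.
    code: str
    valid: bool = True


class CellStore:
    """Toy model of self.code_cells: function name -> list of cells."""

    def __init__(self):
        self.code_cells = {}

    def add_cell(self, function_name, code):
        # Current behaviour described above: append and invalidate older cells.
        cells = self.code_cells.setdefault(function_name, [])
        for cell in cells:
            cell.valid = False
        cells.append(CodeCell(code))

    def replace_cell(self, function_name, idx, code):
        # What --replace-cell with an index could do (assumed semantics):
        # overwrite the cell at idx and mark it as the only valid one.
        cells = self.code_cells[function_name]
        cells[idx] = CodeCell(code)
        for i, cell in enumerate(cells):
            cell.valid = i == idx

    def list_cells(self, function_name):
        # What a %list_cells magic could report: (index, valid, code).
        cells = self.code_cells.get(function_name, [])
        return [(i, cell.valid, cell.code) for i, cell in enumerate(cells)]

    def delete_cell(self, function_name, idx):
        # What a delete magic could do: drop the cell at position idx.
        del self.code_cells[function_name][idx]
```

With this model, listing the cells first gives the index needed by the replace and delete operations.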

Explain advantage about using logger

We just initialize it before the function, use it, and it will automatically be added to function arguments
With visitor we might not even need to initialize before, provided we don't run the function

short term issues (pool of issues)

Exclude function from pipeline
Explain --data separately later: it just allows having test=False as an argument; see examples where this can be useful.
(?) Magic that updates pipeline function (evaluates it)
(?) Remove names that are calls - no need to evaluate those objects. See defined function myfunc (don't remember what this is about)

I think these are already covered:

Set logging level with magic line
Indicate name of input/output to be used in pipeline, in case it is not the same as names in arguments/return_values
Indicate name of python file and path to it (relative to lib_path) as magic line.

Do linear scan for previous_variables

loaded names that have first been stored cannot be previous variables
this requires an ordered scan using a tree visitor. See ast_examples.ipynb in nbs/dev_tutorials folder
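
A minimal sketch of such an ordered scan, using `ast.NodeVisitor` from the standard library. The class and function names are hypothetical, and the scan is deliberately approximate: it ignores scoping (function arguments, comprehensions) and only skips builtins.

```python
import ast
import builtins


class PreviousVariableScanner(ast.NodeVisitor):
    """Collect names that are loaded before being stored, in source order."""

    def __init__(self):
        self.stored = set()
        self.previous = []

    def visit_Assign(self, node):
        # Visit the right-hand side first, as Python evaluates it first.
        self.visit(node.value)
        for target in node.targets:
            self.visit(target)

    def visit_AugAssign(self, node):
        # `x += 1` reads x before writing it.
        if isinstance(node.target, ast.Name):
            self._load(node.target.id)
        self.visit(node.value)
        self.visit(node.target)

    def visit_Name(self, node):
        if isinstance(node.ctx, ast.Load):
            self._load(node.id)
        elif isinstance(node.ctx, ast.Store):
            self.stored.add(node.id)

    def _load(self, name):
        if hasattr(builtins, name):
            return  # e.g. print, len: not a variable from a previous cell
        if name not in self.stored and name not in self.previous:
            self.previous.append(name)


def find_previous_variables(code):
    scanner = PreviousVariableScanner()
    scanner.visit(ast.parse(code))
    return scanner.previous
```

For example, `find_previous_variables("y = x + 1\nz = y * 2\nx = 0\n")` reports only `x`: `y` is stored before it is loaded, and the later store to `x` does not cancel its earlier load.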

Explain that nbmodular parsing is currently approximate

debug function

Add magic debug_function which doesn't run function but instead debugs it. It could use the variables in memory or passed as arguments:

%%debug_function --input-values a=1 b=2
def add_and_print(a, b):
    c = a + b
    print (c)

indicate pipeline name

set_pipe (function_name, pipeline_name)

if pipeline_name != 'None' and pipeline_name not in self.pipelines:
   self.pipelines.append (pipeline_name)

for pipeline in self.pipelines:
   self.create_pipeline (pipeline)

def create_pipeline (self, pipeline=None):
    if pipeline is None:
        pipeline = self.file_name_without_extension
    function_list = self.get_function_list (pipeline=pipeline)

def get_function_list (self, ..., pipeline=None):
   if pipeline is not None:
        function_list = (... and f.pipeline==pipeline)
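
The fragments above could fit together roughly as follows. This is a self-contained sketch, not the actual CellProcessor: `FunctionInfo`, `PipelineRegistry` and `add_function` are hypothetical, and it assumes that functions belong to the default pipeline (named after the file) until `set_pipe` is called.

```python
from dataclasses import dataclass


@dataclass
class FunctionInfo:
    # Hypothetical record: a function name and the pipeline it belongs to.
    name: str
    pipeline: str


class PipelineRegistry:
    def __init__(self, file_name_without_extension):
        self.file_name_without_extension = file_name_without_extension
        self.functions = []
        self.pipelines = []

    def add_function(self, name):
        # Assumption: new functions start in the default pipeline.
        self.functions.append(FunctionInfo(name, self.file_name_without_extension))

    def set_pipe(self, function_name, pipeline_name):
        for f in self.functions:
            if f.name == function_name:
                f.pipeline = pipeline_name
        if pipeline_name != 'None' and pipeline_name not in self.pipelines:
            self.pipelines.append(pipeline_name)

    def get_function_list(self, pipeline=None):
        if pipeline is None:
            return list(self.functions)
        return [f for f in self.functions if f.pipeline == pipeline]

    def create_pipeline(self, pipeline=None):
        if pipeline is None:
            pipeline = self.file_name_without_extension
        # Return the function names only; generating code is out of scope here.
        return [f.name for f in self.get_function_list(pipeline=pipeline)]
```

Calling `create_pipeline` once per name in `self.pipelines` then yields one function list per named pipeline, while the default pipeline keeps the functions that were never reassigned.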

add global boolean flag in CellProcessor that makes the magic `function` just include the cell text from now on

long term issues (Pool of issues)

Hierarchical objects with current values as attributes
Copy previous values and restore the variables to have the previous values after running the function
optional: warning message when return doesn't include variable name but function call
allow to exclude / include local variables to be stored in object, to avoid issues
- Do this in two ways:
  - Delete the variable (del), with the disadvantage that we won't be able to inspect it later on.
  - Delete the variable only when a new cell magic is executed, so that we can still inspect the variables created in the last cell, and then move on to execute the next cell, at which point we remove previous variables that were memory-consuming.
  - We might as well, more in the long-term future, delete variables based on how much memory they consume, using some threshold parameter.
allow to store previous_values in _info object as follows:
1. store the values in locals(), using the same code that is used now for storing current_values in info_object. This code is run before the code from the cell is run,
  so that the locals() are those from before the cell, i.e., the previous values
2. use the same trick as the one used in keep_variables_in_memory: introduce a boolean flag created_previous_values in dict => although it should work here.
3. if the flag does not exist, run second code that, instead of storing values in locals, stores them in disk => although this should not happen here.
Write ipython script where magic functions are written using https://stackoverflow.com/questions/10361206/how-to-run-ipython-magic-from-a-script
Using the AST, see if the first time a variable is stored, in the same statement it is also loaded. If that's the case, or if there was a load for the same variable prior to the current statement, the variable should be in the previous_variable list. Otherwise, it shouldn't. Check whether ast.walk preserves the ordering of statements when traversing the tree: its documentation says nodes are yielded in no specified order, so an ast.NodeVisitor may be needed instead.
Have function_info object be attached to function object, <my_function>.info = current function info. Have the current values attached directly to the function object.
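
On the ast.walk question in the list above: the standard library documents `ast.walk` as yielding nodes "in no specified order", so statement order should not be relied on. The small check below compares the order in which `ast.walk` yields names with their order in the source, recovered from `lineno` and `col_offset`. The sample code string is just an illustration.

```python
import ast

code = "a = f(b)\nc = d\n"
tree = ast.parse(code)

# Names in the order ast.walk happens to yield them.
walk_order = [node.id for node in ast.walk(tree) if isinstance(node, ast.Name)]

# Names in source order, sorted by their position in the text.
name_nodes = [node for node in ast.walk(tree) if isinstance(node, ast.Name)]
source_order = [
    node.id for node in sorted(name_nodes, key=lambda n: (n.lineno, n.col_offset))
]

print(walk_order)
print(source_order)
```

In CPython the two lists differ for this input, because `ast.walk` visits the tree level by level and reaches the names of the second statement before the arguments nested in the call of the first one. Sorting by position only recovers textual order, not evaluation order, so an ordered visitor remains the safer option for the store-before-load check.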

add options for not writing text and code cells every time a function magic is called

Allow different values for arguments that use defaults

When using keyword arguments, do not initialize variables that already exist in memory
Allow indicating the values of arguments when calling the cell magic:

%%function --input-values a=1 b=2
def add_and_print (a, b=10):
   c = a + b
   print (c)

function with wrong result values

x, y, x, y = analyze ()

TO CHECK

keyword arguments
avoid calling previous code when merging! This can be very inefficient and wrong if code has side-effects.
- option 1) join the AST trees from different cells, but only run the last cell
- option 2) use current_values: if previous_values is in current_values of previous cells, it cannot be a previous value
- experiment with cases: x loaded in cell 1 (prev value), y created in cell 1, z loaded in cell 2 (new prev value), y loaded in cell 2 (not prev value, since it is in created_values of prev cell)
magic line %set allows setting any parameter.
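
Option 2 above can be tried out without running any cell code, given the names each cell loads and creates. The function below is a hypothetical sketch: it assumes that each cell has already been analysed into a pair (names loaded before being stored in that cell, names created in that cell).

```python
def merge_previous_values(cells):
    """Compute the previous values of a merged function.

    cells: list of (loaded_names, created_names) pairs, one per cell,
    in the order the cells were written.
    """
    created_so_far = set()
    previous = []
    for loaded, created in cells:
        for name in loaded:
            # A name created by an earlier cell is not a previous value.
            if name not in created_so_far and name not in previous:
                previous.append(name)
        created_so_far.update(created)
    return previous


# The experiment described above: x loaded and y created in cell 1,
# z and y loaded in cell 2.
cells = [
    (["x"], ["y"]),
    (["z", "y"], []),
]
print(merge_previous_values(cells))
```

Here `x` and `z` end up as previous values, while `y` is excluded because cell 1 creates it.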
