laike9m / cyberbrain


Python debugging, redefined.

Home Page: http://bit.ly/cyberbrain-features

License: MIT License

Python 70.87% Makefile 0.81% TypeScript 6.59% JavaScript 21.73%

debugging python

cyberbrain's Issues

Add examples from "Tiny Python Projects"

https://github.com/kyclark/tiny_python_projects

Add a social media image

https://www.freecodecamp.org/news/how-to-add-a-social-media-image-to-your-github-project/

Maybe use a screenshot of a trace graph as the background of the image.

Integrate with Github kanban

https://docs.github.com/cn/github/managing-your-work-on-github/creating-a-project-board

Support Cyberbrain in more editors and IDEs

There are countless editors and IDEs out there. For convenience, I'll call them environments. I'd like to see Cyberbrain integrated with all of them, but this simply is not possible given the limited time I have. Considering the technologies Cyberbrain is using, here's what I'm gonna do.

Cyberbrain will officially support major vscode-compatible environments
Support for non-vscode-compatible environments will rely on the community
I'm committed to providing as much help as I can, including but not limited to:
- Answering questions
- Audio/video 1:1
- Making necessary code changes
- Pair programming

The reason is simple. The only environment we now support is VS Code (local), thus it's much easier to support vscode-compatible environments than others.

Based on this strategy, the environments we will officially support include:

Please let me know if there's more.

The environments that we will rely on the community to support include:

All non-web IDEs (PyCharm/Eclipse/Visual Studio/etc)
Vim/Emacs/Sublime/Atom/etc
Jupyter notebook
Command line

I will create a formal specification of the internal API to help people build third-party tools.

There is no preset timeline for when each environment will be supported, or in which version. I want to keep it flexible, and most likely, the environments that more people request will be supported first. Once a new environment is supported, we'll release a new minor version.

For requesting support for another vscode-compatible environment, please open a separate issue.

If you want to migrate Cyberbrain to a non-vscode-compatible environment, please contact me directly on Twitter or Discord. I'm happy to discuss it anytime.

Mutation node should have a predecessor node linking to the original value

Right now there will only be nodes from arguments, like

a = []
a.extend(b)

There should be two links from a (line 1) and b to a (line 2), but right now there's one link from b.

Reduce the space between nodes on different lines

This is unacceptable.

Filed a feature request for vis-network: visjs/vis-network#1064

Create a dedicated repo for hosting online examples

So that we can detach it from the main repo.

Error: Failed to connect before the deadline

We need a larger timeout before the client gives up connecting to the server.

Put `InitialValue` nodes at the top of trace graph

What's left to do:

Put a text on the top line, like "Initial values" or "Arguments"

Rename "cb-experimental" to "Cyberbrain"

Refactor visualization code

We should refactor visualize.js to:

Create a separate class TraceData to:
- Manage raw events and loops, including visible events
- Rename getInitialState to initialize
- Add an updateVisibleEvents method (modified from Loop.generateNodeUpdate), which returns the current visible events
- Remove the visible events calculation logic from initialize.
Modify the TraceGraph class to:
- Call TraceData.initialize upon receiving data from the backend
- Call TraceData.updateVisibleEvents after a loop counter is set

The most notable change is that we don't replace nodes anymore, but render all nodes again if a loop is updated.
Reasons:

Each method should have only one responsibility (previously getInitialState did multiple things)
More robust, because calculating nodesToHide and nodesToShow is tricky
Helps split the TraceData and TraceGraph classes
No performance sacrifice, because the number of visible nodes is small.

Handles `RETURN_VALUE`

Improve tooltip

Tooltip text should be truncated.
Tooltip should not overlap with nodes.
We probably need to show the tooltip at the same height as its node, starting from the right or left edge depending on the position of the node. To do that, we at least need to know the width of a node.

Make Cyberbrain more intuitive to use

Background

In #49 and cool-RR's email, an issue was mentioned:
It is not intuitive how to run a different program and open the trace graph with new data.

Quote:

I tried out the demo on Gitpod and followed the instructions. After viewing the first example I ran a different example but didn't know how to view the graph. Eventually I figured out that killing the first process with Ctrl+C made it possible to reinitialize Cyberbrain with a new example.

I changed the value passed to the function and reran, expecting to see the graph updated with the new values when I hovered with the mouse. It wasn't. I also tried the command "Initialize CyberBrain" again and yet the value wasn't updated.

Clearly we need to do better at this.

How It Works Now?

Currently, the Cyberbrain Python lib (abbr. cb-py) launches a server. When users run "Initialize Cyberbrain" in VS Code, the Cyberbrain VS Code extension (abbr. cb-vsc) talks to the server and fetches data, then visualizes it. The server listens on a fixed port, thus there can't be multiple running servers.

Proposed Solution

Note: the solution below takes into consideration a feature which has not been implemented yet: multi-frame tracing. The original design of this feature is described here, which may differ from the solution below. But the core idea remains unchanged: let users pick the frame to visualize.

Overview

cb-vsc automatically starts a long-running server, let's call it coordination server (abbr. cs). The workflow looks like this:

When cb-py finishes tracing a Python program
- 2.1. If there's only one frame
  cb-py sends this frame to cs/cb-vsc, cb-vsc generates a new trace graph.
- 2.2. If there are multiple frames
  3. cb-py sends the locations of these frames (aka FrameLocaterList) to cs/cb-vsc
  4. The user picks one frame (details TBD), cs/cb-vsc sends the location of this frame (FrameLocater) back to cb-py
  5. cb-py sends the selected frame to cs/cb-vsc, cb-vsc generates a new trace graph.

Note that in case 2.2, steps 3-5 could repeat multiple times to allow visualizing different frames in the same execution. This requires server-side streaming RPC, so cs/cb-vsc can send multiple FrameLocaters to cb-py. Also, cb-vsc should persist the locations of all available frames.
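The branching between case 2.1 and case 2.2 can be sketched as below. This is only an illustration under stated assumptions: the fields of FrameLocater, the select_frame function, and the pick_frame callback (standing in for the user's choice in cb-vsc) are all hypothetical, and the RPC transport is left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class FrameLocater:
    """Identifies one traced frame. The fields are assumptions for this sketch."""
    filename: str
    function: str
    lineno: int


def select_frame(
    frames: Dict[FrameLocater, dict],
    pick_frame: Optional[Callable[[List[FrameLocater]], FrameLocater]],
) -> dict:
    """Return the frame data that cb-py would send for visualization."""
    locaters = list(frames)
    if not locaters:
        raise ValueError("no frames were traced")
    if len(locaters) == 1:
        # Case 2.1: only one frame, send it right away.
        return frames[locaters[0]]
    # Case 2.2: send the locations (step 3), let the user pick one (step 4),
    # then send the selected frame (step 5).
    chosen = pick_frame(locaters)
    return frames[chosen]
```

Repeating steps 3-5 would amount to calling the pick-and-send part again with the same persisted list of locaters.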

The proposed solution has a few benefits compared to the existing implementation:

Users only need to run the Python program, no need to run "Initialize Cyberbrain".
The experience stays the same when running multiple programs and/or running a program multiple times.

The Coordination Server

Requirements:

cs should remain active as long as VS Code is open, presumably with a periodic status check.
The listening port should be configurable. Potentially, we can use a config file ~/.cyberbrain_config and let cb-py and cb-vsc read it.
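A minimal sketch of how both cb-py and cb-vsc could resolve the port from a shared config file. The JSON format and the "port" key are assumptions, since the issue does not specify the file layout; the default of 50051 is the port that appears in the log output elsewhere on this page.

```python
import json
from pathlib import Path

DEFAULT_PORT = 50051  # port seen in the "Starting grpc server" log line


def parse_port(text: str) -> int:
    """Extract the port from config text, falling back to the default."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError:
        return DEFAULT_PORT
    if not isinstance(config, dict):
        return DEFAULT_PORT
    port = config.get("port", DEFAULT_PORT)  # "port" is a hypothetical key
    is_valid = isinstance(port, int) and not isinstance(port, bool) and 0 < port < 65536
    return port if is_valid else DEFAULT_PORT


def read_port(config_path: Path) -> int:
    """Read the port from a config file such as ~/.cyberbrain_config."""
    try:
        return parse_port(config_path.read_text())
    except OSError:
        # Missing or unreadable file: use the default.
        return DEFAULT_PORT
```

A caller would use something like read_port(Path.home() / ".cyberbrain_config").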

Open questions:

Will vsc launch multiple coordination servers when there are multiple open windows? If yes, we need to handle it gracefully.

Things to Take into Consideration

Stateless

The infrastructure should be as stateless as possible, otherwise it would be very complicated to maintain and extend.

Future Proof

The design should work well with features added in the future, including (but not limited to) multi-frame tracing, though this is hard since things may change or new features may be planned.

Needs to work well with codelens #34

The report sent from cb-py could potentially carry information to tell cb-vsc where to show codelens, so that users are aware of the click-to-enable-trace-graph feature. This (and support for multi-frame tracing) also means that the Python program needs to stay alive until it is manually terminated.

Remote Debugging Friendly

The design should be able to support remote debugging.

AI: Learn how remote debugging works.

Third-party Friendly

We should not rely on VS Code specific things, and if users want to build their own cs and frontend, they should be able to do so.

The trace graph should show frame information

filename
function/method name
lineno
callsite info

Currently we only have relative lineno, we need to get the absolute line number of function start.
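As a rough sketch of where this information could come from: CPython exposes the first line of a function on its code object as co_firstlineno, so an absolute line number can be derived from a relative one. The helper names below are hypothetical, and "relative lineno" is assumed to mean the offset from the function's first line. Note that for decorated functions co_firstlineno may point at the decorator line rather than the def line, depending on the Python version.

```python
import sys


def frame_info(frame):
    """Collect the frame details listed above from a live frame object."""
    code = frame.f_code
    caller = frame.f_back
    return {
        "filename": code.co_filename,
        "function": code.co_name,
        "first_lineno": code.co_firstlineno,  # absolute line where the function starts
        "lineno": frame.f_lineno,             # absolute line being executed
        "relative_lineno": frame.f_lineno - code.co_firstlineno,
        "callsite_lineno": caller.f_lineno if caller is not None else None,
    }


def absolute_lineno(first_lineno, relative_lineno):
    """Convert a relative line number back to an absolute one."""
    return first_lineno + relative_lineno


def sample():
    x = 1
    return frame_info(sys._getframe())
```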

Lines connecting nodes should have arrows

Solve edge overlap issue

Sometimes edges seriously overlap with each other, like:

Related:
visjs/vis-network#84

One possible approach:

Inspect all edges, if there's any vertical edge (from.x = to.x), adjust from.x = from.x + 10.

If two ends are on adjacent levels, do nothing.
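The approach above could look roughly like this. It is a sketch written in Python for illustration only (the real graph lives in vis-network on the JavaScript side); the Node class, the level field, and the nudge_vertical_edges function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Node:
    x: float
    level: int  # row of the graph, one level per source line


def nudge_vertical_edges(nodes, edges, offset=10):
    """Shift the source node of every long vertical edge to the right.

    nodes maps a node id to a Node; edges is a list of (from_id, to_id).
    Returns the ids of the nodes that were moved.
    """
    moved = []
    for from_id, to_id in edges:
        src, dst = nodes[from_id], nodes[to_id]
        if abs(src.level - dst.level) <= 1:
            continue  # ends are on adjacent levels: do nothing
        if src.x == dst.x:
            src.x += offset  # from.x = from.x + 10
            moved.append(from_id)
    return moved
```

One limitation of this sketch: a node that is the source of several vertical edges is shifted once per edge, which a real implementation would probably want to cap.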

This may require #21 to be implemented first, as the redraw of graph may lead to overlap of event nodes and lineno nodes.

Store JSON-serialized objects directly and get rid of diffs

Right now we store diffs for mutations and restore the value at each snapshot only when returning the tracing result to the frontend. A problem is that we need to deepcopy the values (which relies on the pickle protocol internally), but many objects are unpicklable.

This problem can be avoided, because eventually we'll pass JSON to the frontend, so there's no need to store the original values: we can serialize them to JSON early and only store the JSON.

The problem of extra memory usage still exists. But since we're storing JSON, it can be dumped to disk and loaded back easily. That optimization is out of the scope of this issue, but it will be a lot simpler and more robust than using deepdiff.
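To illustrate the difference, here is a small sketch using only the standard library. It uses json.dumps with a repr fallback as a stand-in for whatever serializer Cyberbrain ends up using; to_json_snapshot is a hypothetical helper name.

```python
import copy
import json
import threading


def to_json_snapshot(value):
    """Serialize a value right away; unsupported objects fall back to repr()."""
    return json.dumps(value, default=repr)


state = {"items": [1, 2], "lock": threading.Lock()}

# deepcopy fails because a lock cannot be pickled.
try:
    copy.deepcopy(state)
    deepcopy_ok = True
except TypeError:
    deepcopy_ok = False

# An early JSON snapshot is unaffected by later mutations of the object.
snapshot = to_json_snapshot(state)
state["items"].append(3)
restored = json.loads(snapshot)
```

The snapshot is a plain string, so writing it to disk and reading it back needs no extra machinery.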

Gitpod support

#24

Publish the extension to openvsx.
Claiming namespace: EclipseFdn/open-vsx.org#170
Reinstall online example's extension from openvsx, and verify it can work.

Refs:

Most things already work except Devtools.

Issues:

Devtools log can work in Gitpod too, but we need to find a way to open it automatically. What is the canonical way to let an extension know that it’s running in Gitpod?

It seems to be impossible to auto open devtools.

Only allow triggering tracing once

For now, if users call the start() method multiple times, for the invocations after the first call, the method should do nothing.

To achieve this, we should record whether the method has been called.
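A minimal sketch of the idea, not the actual Tracer implementation. The _started flag and the start_count attribute are hypothetical; start_count exists only to make the behavior observable.

```python
class Tracer:
    def __init__(self):
        self._started = False
        self.start_count = 0  # for demonstration only

    def start(self):
        if self._started:
            return  # every call after the first one does nothing
        self._started = True
        self.start_count += 1
        # ... the real tracer would install its trace function here ...
```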

Allow dumping tracing result

If the program gets stuck and the user presses Ctrl+C to interrupt the execution, we should be able to dump everything from memory to a file.
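One way this could work is to catch KeyboardInterrupt around the traced program and write out whatever has been collected. The sketch below assumes the events are already JSON-serializable; run_with_dump and the event fields are hypothetical, and an in-memory buffer stands in for the dump file.

```python
import io
import json


def run_with_dump(program, events, out):
    """Run program(); if interrupted, dump the collected events, then re-raise."""
    try:
        program()
    except KeyboardInterrupt:
        json.dump(events, out)
        raise


events = []


def stuck_program():
    events.append({"lineno": 1, "target": "a", "value": "[]"})
    raise KeyboardInterrupt  # stands in for the user pressing Ctrl+C


buffer = io.StringIO()  # a real implementation would open a file instead
try:
    run_with_dump(stuck_program, events, buffer)
except KeyboardInterrupt:
    pass

dumped = json.loads(buffer.getvalue())
```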

Show links for replaced nodes in loops

When nodes in a loop are replaced, the lines involving replaced nodes disappear. I believe even if these edges are not accurate after modifying loop counters, they still provide useful information thus should be kept.

We can use dotted lines for this, but we need to add a caption or tooltip to show what it means.

Lineno nodes should not overlap with event nodes

Automate release process

Actually, the process is fine.

Binding events should only be logged if old and new value are different

Measure extra memory usage

Improve Cyberbrain API

Provide a decorator API

Sometimes a decorator is more convenient than .start() and .end(), especially when the function can return from different places.

Also since at this point we don't support multi-frame tracing, a decorator API should be more user-friendly.
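A sketch of what such a decorator could look like. The Tracer class here is a stand-in that only records calls, and its start/end method names follow the wording above rather than a confirmed API. The try/finally makes sure tracing ends no matter which return statement is hit.

```python
import functools


class Tracer:
    """Stand-in tracer that only records when it was started and ended."""

    def __init__(self):
        self.calls = []

    def start(self):
        self.calls.append("start")

    def end(self):
        self.calls.append("end")


tracer = Tracer()


def trace(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        tracer.start()
        try:
            return func(*args, **kwargs)
        finally:
            tracer.end()  # runs for every return path and for exceptions
    return wrapper


@trace
def sign(n):
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"
```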

Provide a default tracer object

Provide a way to disable tracing

This would allow users to control whether to enable tracing via a flag, so that they don't need to modify their code.

Specifics TBD.
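Since the specifics are still open, the following is only one possible shape: an environment variable that turns the decorator into a no-op. The variable name CYBERBRAIN_DISABLE and both helper functions are hypothetical.

```python
import os

DISABLE_FLAG = "CYBERBRAIN_DISABLE"  # hypothetical environment variable name


def tracing_enabled(environ=None):
    """Tracing is on unless the flag is set to a truthy value."""
    environ = os.environ if environ is None else environ
    return environ.get(DISABLE_FLAG, "").strip().lower() not in {"1", "true", "yes"}


def conditional_trace(real_trace, environ=None):
    """Return real_trace when tracing is on, otherwise a no-op decorator."""
    if tracing_enabled(environ):
        return real_trace
    return lambda func: func
```

With this shape, user code keeps its decorator in place and tracing is switched off from the shell.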

Support generator functions

Running the code below raises AttributeError

from cyberbrain import trace


def main():
    @trace
    def fib_gen(count):
        a, b = 1, 1
        while count := count - 1:
            yield a
            a, b = b, a + b

    for fib_num in fib_gen(10):
        print(fib_num)


if __name__ == "__main__":
    main()

stdout/err

Starting grpc server on 50051...
fib_gen <cyberbrain.frame.Frame object at 0x7f7908a26d30>
jumped: False
1
fib_gen <cyberbrain.frame.Frame object at 0x7f7902c9c850>
jumped: False
Traceback (most recent call last):
  File "/workspace/.pip-modules/lib/python3.8/site-packages/cyberbrain/value_stack.py", line 130, in emit_event_and_update_stack
    handler = getattr(self, f"_{instr.opname}_handler")
AttributeError: 'Py38ValueStack' object has no attribute '_YIELD_VALUE_handler'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/vscode-extensions/[email protected]/extension/pythonFiles/ptvsd_launcher.py", line 43, in <module>
    main(ptvsdArgs)
  File "/tmp/vscode-extensions/[email protected]/extension/pythonFiles/lib/python/old_ptvsd/ptvsd/__main__.py", line 432, in main
    run()
  File "/tmp/vscode-extensions/[email protected]/extension/pythonFiles/lib/python/old_ptvsd/ptvsd/__main__.py", line 316, in run_file
    runpy.run_path(target, run_name='__main__')
  File "/home/gitpod/.pyenv/versions/3.8.6/lib/python3.8/runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/gitpod/.pyenv/versions/3.8.6/lib/python3.8/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/gitpod/.pyenv/versions/3.8.6/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/workspace/Cyberbrain/examples/gen.py", line 17, in <module>
    main()
  File "/workspace/Cyberbrain/examples/gen.py", line 12, in main
    for fib_num in fib_gen(10):
  File "/workspace/Cyberbrain/examples/gen.py", line 9, in fib_gen
    yield a
  File "/workspace/Cyberbrain/examples/gen.py", line 9, in fib_gen
    yield a
  File "/workspace/.pip-modules/lib/python3.8/site-packages/cyberbrain/tracer.py", line 235, in _local_tracer
    self.frame_logger.update(raw_frame)
  File "/workspace/.pip-modules/lib/python3.8/site-packages/cyberbrain/logger.py", line 131, in update
    self.frame.log_events(frame, instr, jumped)
  File "/workspace/.pip-modules/lib/python3.8/site-packages/cyberbrain/frame.py", line 133, in log_events
    event_info = self.value_stack.emit_event_and_update_stack(
  File "/workspace/.pip-modules/lib/python3.8/site-packages/cyberbrain/value_stack.py", line 132, in emit_event_and_update_stack
    raise AttributeError(
AttributeError: Please add
def _YIELD_VALUE_handler(self, instr):
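The traceback shows that handlers are looked up by opcode name via getattr(self, f"_{instr.opname}_handler"). The sketch below uses the standard dis module to confirm that a generator function compiles to a YIELD_VALUE instruction, and mimics the lookup with a hypothetical MiniValueStack class that, like the failing one, has no handler for it.

```python
import dis


def fib_gen(count):
    a, b = 1, 1
    while count := count - 1:
        yield a
        a, b = b, a + b


# Opcode names that appear in the generator's bytecode.
opnames = {instr.opname for instr in dis.get_instructions(fib_gen)}


class MiniValueStack:
    """Toy dispatcher that looks up handlers by opcode name."""

    def _NOP_handler(self, instr):
        return "handled NOP"

    def dispatch(self, opname):
        handler = getattr(self, f"_{opname}_handler", None)
        if handler is None:
            return f"no handler for {opname}"
        return handler(None)
```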

Implement the rest of instructions planned to be supported in V1

docs.google.com/spreadsheets/d/12jHOV9TFrdPySdKVWacAFcL20U-VRPXHiCUzcJZRO2M

What's left:

with-related instructions
Closure related: LOAD_CLOSURE, LOAD_DEREF, LOAD_CLASSDEREF, STORE_DEREF, DELETE_DEREF
Call related: CALL_FUNCTION, CALL_FUNCTION_KW, CALL_FUNCTION_EX, CALL_METHOD, BUILD_TUPLE_UNPACK_WITH_CALL, BUILD_MAP_UNPACK_WITH_CALL
Others: LOAD_BUILD_CLASS, ROT_FOUR

Postponed:

SET_ADD, LIST_APPEND, MAP_ADD (Used for list/set/dict comprehension, which we are not able to test due to call tracing not enabled)

Handles all cases of PREDICT

Handles dark themes

Trace graphs should also look good with dark themes

https://github.com/hediet/vscode-debug-visualizer
https://code.visualstudio.com/api/references/theme-color

Orange seems to be a good choice for the base color.

When dragging a node, all related nodes are all moved as whole

"Related nodes" means all nodes that show their value when hovering over the dragged node.

As a result, it is nearly impossible to arrange the order of nodes in loops, since all the values are bound together.

Example code:

from cyberbrain import trace


@trace
def fib(n):
    a = 0
    b = 1
    for _ in range(n):
        t = b
        b = a + b
        a = t
    return b


if __name__ == '__main__':
    fib(10)

Python 3.9 support

LOAD_ASSERTION_ERROR
bpo-39156: IS_OP, CONTAINS_OP, JUMP_IF_NOT_EXC_MATCH
bpo-39320: LIST_TO_TUPLE, LIST_EXTEND, SET_UPDATE, DICT_UPDATE, DICT_MERGE
bpo-33387: RERAISE, WITH_EXCEPT_START
~~[ ] RuntimeError: cannot schedule new futures after interpreter shutdown, might be related to https://bugs.python.org/issue39812~~
Dependent on #55
Blocked by grpc/grpc#24344 (Windows not working)
Blocked by actions/runner-images#1740

CPython 3.9 bytecode changes

Handling dependent loop counters

Loop counter nodes need to be hidden or displayed as other loop counters change. And a loop counter's max value can be affected by the current value of other loop counters (see password.py).

e.g.

for i in range(10):
    if i == 2:
        for j in range(2):  # The visibility of this loop counter node should depend on i
            print(j)

Received message larger than max

Enable compression
But note that max message size settings are unrelated to any compression applied. Source
Set a higher payload size limit

gRPC has a 4MB payload size limit on the client side.

RESOURCE_EXHAUSTED: Received message larger than max (7904348 vs. 4194304)
	at Object.callErrorFromStatus (/Users/laike9m/Dev/Python/Cyberbrain/cyberbrain-vsc/node_modules/@grpc/grpc-js/build/src/call.js:31)
	at Object.onReceiveStatus (/Users/laike9m/Dev/Python/Cyberbrain/cyberbrain-vsc/node_modules/@grpc/grpc-js/build/src/client.js:176)
	at Object.onReceiveStatus (/Users/laike9m/Dev/Python/Cyberbrain/cyberbrain-vsc/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:342)
	at Object.onReceiveStatus (/Users/laike9m/Dev/Python/Cyberbrain/cyberbrain-vsc/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:305)
	at /Users/laike9m/Dev/Python/Cyberbrain/cyberbrain-vsc/node_modules/@grpc/grpc-js/build/src/call-stream.js:124
	at processTicksAndRejections (/Applications/Visual%20Studio%20Code.app/Contents/Resources/app/out/vs/code/electron-browser/workbench/internal/process/task_queues.js:76)
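The numbers in the error line up with the 4 MB default: 4194304 is 4 * 1024 * 1024, and the 7904348-byte response exceeds it. Raising the limit is done through gRPC channel arguments. The sketch below only builds the option list, so it runs without gRPC installed; the 16 MiB value is an arbitrary assumption, and the extension itself would need the equivalent setting in its own gRPC client.

```python
DEFAULT_LIMIT = 4 * 1024 * 1024   # the "max" reported in the error
OBSERVED_SIZE = 7904348           # the message size reported in the error
NEW_LIMIT = 16 * 1024 * 1024      # assumed new limit, chosen for illustration


def channel_options(max_bytes):
    """Channel arguments that raise the message size limits."""
    return [
        ("grpc.max_send_message_length", max_bytes),
        ("grpc.max_receive_message_length", max_bytes),
    ]


options = channel_options(NEW_LIMIT)

# With grpcio installed, a Python client would pass them like this:
#   channel = grpc.insecure_channel("localhost:50051", options=options)
```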

Add CI for vsc tests

Reference:
https://github.com/microsoft/vscode-test/blob/master/sample/azure-pipelines.yml

https://code.visualstudio.com/api/working-with-extensions/continuous-integration

When hovering on a node, show values from more nodes

On the trace graph, show values on the trace path
In devtools, show all local variables at this point, using the form

console.log({identifier_1: value1, identifier_2: value2});

Link to sponsors

pypi, see https://pypi.org/project/Pillow/

https://docs.github.com/en/free-pro-team@latest/github/administering-a-repository/displaying-a-sponsor-button-in-your-repository

Also need to optimize the display of image on mobile.

How to view traces after the first one?

I tried out the demo on Gitpod and followed the instructions. After viewing the first example I ran a different example but didn't know how to view the graph. Eventually I figured out that killing the first process with Ctrl+C made it possible to reinitialize Cyberbrain with a new example. Is this the best way? If so, consider adding it to the instructions.

Does it not support recursive functions?

When I run the following code to calculate the Fibonacci sequence recursively, I got an AssertionError:

from cyberbrain import trace

@trace
def fibo(n):
    if n <= 1:
        return n
    else:
        return fibo(n - 1) + fibo(n - 2)

print(fibo(3))

I have also tested some other recursive functions, and they all have this problem. This may be a bug that needs to be fixed

Improve object inspection

Context and Solutions

Right now we use

jsonpickle.encode(python_object, unpicklable=False)

to convert a Python object to JSON. A lot of information is lost in the conversion. If we use unpicklable=True, theoretically it's lossless, but we need to handle the extra information (like methods, scopes, etc).

See:

Falling back to repr is not a bad choice.

On the Js side, we could use eval to generate a more user-friendly output. So instead of a plain Js object, we could attach class information by

We can also hide the __proto__ attribute when logging:
https://stackoverflow.com/questions/11818091/hiding-the-proto-property-in-chromes-console

Create a formal specification of the internal API

As mentioned in #24 (comment)

Can't exit matplotlib window if Cyberbrain is enabled

Can reproduce with epsilon_greedy.py

Update:
Tried sending the request in another thread, didn't work. Since this use case is rare I made it P2.

Find a way to test visualization

Seems the only way is to compare screenshots.

See how vis-network does it
visjs/vis-network#33
https://github.com/visjs/vis-network/blob/master/cypress/integration/visual/label-rendering.spec.ts
https://www.cypress.io/

Create an online demo

Ask about gitpod snapshot expire time
https://community.gitpod.io/t/do-snapshots-expire/1920
Snapshots don't expire.

The pinned workspace is kept forever:

Allow mutual interaction between source code and the trace graph

Some ideas:

When clicking/hovering on a trace graph node, highlight the corresponding text in the source code panel
Vice versa

The difficult part is that if an identifier appears multiple times in the same line, we may not be able to tie each occurrence to the correct node accurately. But at least we could just highlight the whole line or all nodes in the same line, which is already useful.
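Finding every occurrence of an identifier in a line is the easy half of the problem, and the standard tokenize module can do it. The sketch below (identifier_columns is a hypothetical helper) returns the column ranges that an editor could highlight; deciding which occurrence belongs to which node is not addressed here.

```python
import io
import tokenize


def identifier_columns(line, name):
    """Return (start_col, end_col) for each occurrence of name in one source line."""
    tokens = tokenize.generate_tokens(io.StringIO(line).readline)
    return [
        (tok.start[1], tok.end[1])
        for tok in tokens
        if tok.type == tokenize.NAME and tok.string == name
    ]
```

For the line "a, b = b, a+b", the name a is found at columns 0-1 and 10-11. A line that is not a complete statement on its own, such as the first line of a multi-line tuple, would need to be tokenized together with the lines that complete it.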

from cyberbrain import trace


@trace
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a+b
    return b


if __name__ == '__main__':
    fib(3)

Result:

Maybe different names within one line can be separated, as in this workaround:

from cyberbrain import trace


@trace
def fib(n):
    (a,
     b) = 0, 1
    for _ in range(n):
        (a,
         b) = b, a+b
    return b


if __name__ == '__main__':
    fib(3)