laike9m / cyberbrain

Python debugging, redefined.

Home Page: http://bit.ly/cyberbrain-features

License: MIT License

Python 70.87% Makefile 0.81% TypeScript 6.59% JavaScript 21.73%
python debugging

cyberbrain's Introduction

Cyberbrain: Python debugging, redefined.

Cyberbrain[1] (电子脑) aims to free programmers from debugging. It lets you:

  • Backtrace variable changes.

  • See every state of program execution, including variables' values.

  • Debug loops with confidence.

Never spend hours stepping through a program; let Cyberbrain tell you what happened.

Read more about existing features, and roadmaps for features to come.

I gave a talk about Cyberbrain at PyCascades 2021; watch it here.

Install

Cyberbrain consists of a Python library and various editor/IDE integrations. Currently it supports VS Code and Gitpod. See our plan on expanding the support.

To install Cyberbrain:

pip install cyberbrain
code --install-extension laike9m.cyberbrain

You can also install from PyPI, the VS Code marketplace, or Open VSX.

Or, you can try Cyberbrain online: Open in Gitpod

How to Use

To trace a function foo, decorate it with @trace:

from cyberbrain import trace

# As of now, you can only have one @trace decorator in the whole program.
# We may change this in version 2.0, see https://github.com/laike9m/Cyberbrain/discussions/73

@trace  # Disable tracing with `@trace(disabled=True)`
def foo():
    ...

Cyberbrain keeps your workflow unchanged. Run your program from VS Code or the command line (both work), and a new panel opens to visualize how it executed.

The following gif demonstrates the workflow (click to view the full size image):

usage

Read our documentation to learn more about Cyberbrain's features and limitations.

❗Note on use❗

  • Cyberbrain may conflict with other debuggers. If you set breakpoints and use VS Code's debugger, Cyberbrain may not function normally. Generally speaking, prefer "Run Without Debugging" (as shown in the gif).
  • If you have multiple VS Code windows open, the trace graph will always be created in the first one. #72 is tracking this issue.
  • When using multiple decorators, put @trace as the innermost one.
    @app.route("/")
    @trace
    def hello_world():
        x = [1, 2, 3]
        return "Hello, World!"
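
To make "innermost" concrete: Python applies stacked decorators from the bottom up, so the decorator written closest to the def receives the original function, and each decorator above it receives whatever the one below returned. The sketch below uses two hypothetical stand-ins, record (in place of @trace) and wrap (in place of a framework decorator such as @app.route). Neither is part of Cyberbrain, and the sketch does not claim to show how Cyberbrain itself inspects the function.

```python
seen = []


def record(func):
    """Hypothetical stand-in for @trace: remembers the function it receives."""
    seen.append(func.__name__)
    return func


def wrap(func):
    """Hypothetical stand-in for a framework decorator that wraps the function."""
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@wrap
@record  # innermost: receives the original hello_world
def hello_world():
    return "Hello, World!"


@record  # outermost: receives the wrapper produced by wrap
@wrap
def goodbye_world():
    return "Goodbye, World!"
```

After both definitions, seen is ["hello_world", "wrapper"]: only in the first arrangement does the innermost decorator get the original function.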

Roadmaps

Updated 2020.11

Cyberbrain is new and under active development, so bugs are expected. If you encounter any, please create an issue. At this point, you should NOT use Cyberbrain in production. We'll release 1.0 when it's ready for production.

Major features planned for future versions are listed below. The list may change over time.

Version  Features
1.0      Code & trace interaction (#7), API specification
2.0      Multi-frame tracing (👉 I need your feedback for this feature)
3.0      async support, remote debugging
4.0      Fine-grained symbol tracing
5.0      Multi-threading support

Visit the project's kanban to learn more about the current development schedule.

How does it compare to other tools?

PySnooper: PySnooper and Cyberbrain share the same goal of reducing programmers' work while debugging, with a fundamental difference: Cyberbrain traces and shows the sources of each variable change, while PySnooper only logs them. The differences should be pretty obvious after you try both.
Debug Visualizer: Debug Visualizer and Cyberbrain have different goals. Debug Visualizer visualizes data structures, while Cyberbrain visualizes program execution (but also lets you inspect values).
Python Tutor: Python Tutor is for educational purposes; you can't use it to debug your own programs. It's a brilliant tool for its purpose and I do like it very much.
Static analysis: Cyberbrain is *NOT* static analysis. It's runtime tracing. Static analysis can't provide enough information for debugging.

Community

Interested in Contributing?

See the development guide. This project follows the all-contributors specification. Contributions of ANY kind welcome!

All Contributors

Thanks goes to these wonderful contributors ✨


Alex Hall

🤔

Frost Ming

🐛 📖

Funloading

💻

Ikko Ashimine

💻

Kaustubh Gupta

📝

Ram Rachum

🤔

Siyuan Xu

🐛

Victor Sun

💻 🤔

dingge2016

💵 💻

foo bar

💵

inkuang

🐛

laixintao

📖

yihong

💵 🤔

林玮 (Jade Lin)

🐛 🤔

Support

Cyberbrain is a huge and complicated project that will last for years, but once finished, it will reshape how people think and do debugging. Your support can help sustain it. Let's make it the best Python debugging tool 🤟!

❤️ Sponsor on GitHub

1: The name of this project originates from Ghost in the Shell, quote:

Cyberization is the process whereby a normal brain is physically integrated with electronic components to produce an augmented organ referred to as a cyberbrain.

cyberbrain's Issues

Solve edge overlap issue

Sometimes edges seriously overlap each other, for example:

image

Related:
visjs/vis-network#84

One possible approach:
image
Inspect all edges. For any vertical edge (from.x = to.x), set from.x = from.x + 10.

If two ends are on adjacent levels, do nothing.
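
The approach described above can be sketched in a few lines. This is only an illustration under assumptions: the Node class, its x and level fields, and the representation of edges as (from, to) pairs are hypothetical, and the real graph lives in the JavaScript frontend built on vis-network.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical graph node: horizontal position and hierarchy level."""
    x: float
    level: int


def nudge_vertical_edges(edges, offset=10):
    """Shift the start node of long vertical edges sideways by `offset`."""
    for start, end in edges:
        if abs(start.level - end.level) <= 1:
            continue  # ends are on adjacent levels: do nothing
        if start.x == end.x:
            start.x += offset  # vertical edge: move its start node


a, b = Node(x=0, level=0), Node(x=0, level=2)  # vertical, spans two levels
c, d = Node(x=5, level=0), Node(x=5, level=1)  # vertical, adjacent levels
nudge_vertical_edges([(a, b), (c, d)])
```

Note that moving a node also moves every other edge attached to it, so a real implementation would probably need to re-check the graph after each adjustment.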

This may require #21 to be implemented first, because redrawing the graph may cause event nodes and lineno nodes to overlap.

Store JSON-serialized objects directly and get rid of diffs

Right now we store diffs for mutations and restore the value at each snapshot only when returning the tracing result to the frontend. A problem is that we need to deepcopy the values (which pickles them internally), but many objects are unpicklable.

This problem can be avoided: because we eventually pass JSON to the frontend, there's no need to store the original values. We can serialize them to JSON early and store only the JSON.

The problem of extra memory usage still exists. But since we're storing JSON, the data can be dumped to disk and loaded back easily. That optimization is out of the scope of this issue, but it will be a lot simpler and more robust than using deepdiff.
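
A minimal sketch of the "serialize early" idea, assuming the standard-library json module with a repr() fallback rather than the serializer Cyberbrain actually uses. The helper name to_json_snapshot and the sample data are hypothetical.

```python
import json
import threading


def to_json_snapshot(value):
    """Serialize a value to JSON immediately.

    Objects that json cannot encode are replaced by their repr(), so no
    copy of the original object has to be kept.
    """
    return json.dumps(value, default=repr, sort_keys=True)


snapshots = []
# A lock is one example of an object that cannot be pickled.
state = {"lock": threading.Lock(), "items": [1, 2]}

snapshots.append(to_json_snapshot(state))  # snapshot before the mutation
state["items"].append(3)
snapshots.append(to_json_snapshot(state))  # snapshot after the mutation
```

Each snapshot is an independent string, so later mutations of state do not affect earlier snapshots, and the list can be written to disk as-is.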

Received message larger than max

By default, gRPC limits received messages to 4 MB on the client side.

RESOURCE_EXHAUSTED: Received message larger than max (7904348 vs. 4194304)
	at Object.callErrorFromStatus (/Users/laike9m/Dev/Python/Cyberbrain/cyberbrain-vsc/node_modules/@grpc/grpc-js/build/src/call.js:31)
	at Object.onReceiveStatus (/Users/laike9m/Dev/Python/Cyberbrain/cyberbrain-vsc/node_modules/@grpc/grpc-js/build/src/client.js:176)
	at Object.onReceiveStatus (/Users/laike9m/Dev/Python/Cyberbrain/cyberbrain-vsc/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:342)
	at Object.onReceiveStatus (/Users/laike9m/Dev/Python/Cyberbrain/cyberbrain-vsc/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:305)
	at /Users/laike9m/Dev/Python/Cyberbrain/cyberbrain-vsc/node_modules/@grpc/grpc-js/build/src/call-stream.js:124
	at processTicksAndRejections (/Applications/Visual%20Studio%20Code.app/Contents/Resources/app/out/vs/code/electron-browser/workbench/internal/process/task_queues.js:76)
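
The 4194304 in the error is that default cap in bytes. One common way to lift it is the grpc.max_receive_message_length channel option. The fragment below only builds such an option list in Python syntax; the 16 MB value is an arbitrary assumption, and the commented-out line shows where the list would go with the grpcio package. The VS Code extension uses @grpc/grpc-js (see the stack trace), where the same option key would be passed in the client's channel options instead.

```python
# Assumed limit for illustration only; pick a value that fits the trace size.
MAX_MESSAGE_BYTES = 16 * 1024 * 1024

CHANNEL_OPTIONS = [
    ("grpc.max_receive_message_length", MAX_MESSAGE_BYTES),
    ("grpc.max_send_message_length", MAX_MESSAGE_BYTES),
]

# With the grpcio package installed, the options would be passed like this:
# channel = grpc.insecure_channel("localhost:50051", options=CHANNEL_OPTIONS)
```

Raising the limit only postpones the problem for very large traces; splitting the payload or streaming it would avoid a fixed cap altogether.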

How to view traces after the first one?

I tried out the demo on Gitpod and followed the instructions. After viewing the first example I ran a different example but didn't know how to view the graph. Eventually I figured out that killing the first process with Ctrl+C made it possible to reinitialize Cyberbrain with a new example. Is this the best way? If so, consider adding it to the instructions.

Confusing graph when there are multiple variables in one line

Example code:

from cyberbrain import trace


@trace
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a+b
    return b


if __name__ == '__main__':
    fib(3)

Result:
image

Maybe different names within one line could be separated, as in this workaround:

from cyberbrain import trace


@trace
def fib(n):
    (a,
     b) = 0, 1
    for _ in range(n):
        (a,
         b) = b, a+b
    return b


if __name__ == '__main__':
    fib(3)

Result:
image

Allow mutual interaction between source code and the trace graph

Some ideas:

  • When clicking/hovering on a trace graph node, highlight the corresponding text in the source code panel
  • Vice versa

The difficult part is that if an identifier appears multiple times in the same line, we may not be able to tie each occurrence to the correct node accurately. But at least we could highlight the whole line or all nodes in the same line, which is already useful.

Does it not support recursive functions?

When I run the following code to calculate the Fibonacci sequence recursively, I get an AssertionError:

from cyberbrain import trace

@trace
def fibo(n):
    if n <= 1:
        return n
    else:
        return fibo(n - 1) + fibo(n - 2)

print(fibo(3))

I have also tested some other recursive functions, and they all have this problem. This may be a bug that needs to be fixed.

Refactor visualization code

We should refactor visualize.js to:

  • Create a separate class TraceData to:
    • Manage raw events and loops, including visible events
    • Rename getInitialState to initialize
    • Add an updateVisibleEvents method (modified from Loop.generateNodeUpdate), which returns the current visible events
    • Remove the visible events calculation logic from initialize
  • Modify the TraceGraph class to:
    • Call TraceData.initialize upon receiving data from the backend
    • Call TraceData.updateVisibleEvents after a loop counter is set

The most notable change is that we don't replace nodes anymore, but render all nodes again if a loop is updated.
Reasons:

  • Each method should have only one responsibility (previously getInitialState did multiple things)
  • More robust, because calculating nodesToHide and nodesToShow is tricky
  • Helps split the TraceData and TraceGraph classes
  • No performance sacrifice, because the number of visible nodes is small

Allow dumping tracing result

If the program gets stuck and the user presses Ctrl+C to interrupt execution, we should be able to dump everything from memory to a file.
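
One way this could look, as a rough sketch: catch KeyboardInterrupt around the traced code, write whatever has been collected so far, then re-raise so the program still exits. The function name run_with_dump, the list-of-dicts event format, and the JSON file format are all assumptions, not Cyberbrain's API.

```python
import json


def run_with_dump(func, events, dump_path):
    """Run func(); on Ctrl+C, write the collected events to dump_path.

    `events` is assumed to be a JSON-serializable list that the tracer
    appends to while func is running.
    """
    try:
        return func()
    except KeyboardInterrupt:
        with open(dump_path, "w", encoding="utf-8") as f:
            json.dump(events, f)
        raise  # keep the normal Ctrl+C behavior after dumping
```

Because the events are written only on interruption, a normal run leaves no file behind.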

Improve object inspection

  • Show absolute line number
  • Show class name for instances
  • Differentiate between tuple, list, set (needs to record the original type)
  • Deal with '{"py/type": "test_cellvar.test_closure.<locals>.Foo"}', which represents a Python class.
  • Numpy objects
  • Pandas objects
  • Show string with quotes
  • Fix: re.Match object is null in Js.
    (As it turns out, jsonpickle only serializes an object's __dict__. A re.Match object has no __dict__, only attributes defined by descriptors, so it was serialized to null.)
  • Pass truncated repr to FE to use as the tooltip text.
  • Repr truncated on Linux (alexmojaki/cheap_repr#13, alexmojaki/cheap_repr#15)
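
For the tuple/list/set item, "recording the original type" could be done by tagging containers before encoding. The sketch below uses the standard-library json module, not jsonpickle, and the tag_containers helper and its {"type": ..., "items": ...} wrapper format are hypothetical.

```python
import json


def tag_containers(value):
    """Recursively wrap tuples and sets so their type survives JSON encoding."""
    if isinstance(value, tuple):
        return {"type": "tuple", "items": [tag_containers(v) for v in value]}
    if isinstance(value, (set, frozenset)):
        # Sets are unordered; sort by repr so the output is deterministic.
        items = sorted((tag_containers(v) for v in value), key=repr)
        return {"type": "set", "items": items}
    if isinstance(value, list):
        return [tag_containers(v) for v in value]
    if isinstance(value, dict):
        return {str(k): tag_containers(v) for k, v in value.items()}
    return value


encoded = json.dumps(tag_containers({"point": (1, 2), "seen": {3}, "rows": [4, 5]}))
```

A real implementation would need a wrapper format that cannot collide with a user's own dict that happens to have "type" and "items" keys.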

Context and Solutions

Right now we use

jsonpickle.encode(python_object, unpicklable=False)

to convert a Python object to JSON. A lot of information is lost in the conversion. If we use unpicklable=True instead, it's theoretically lossless, but we need to handle the extra information (like methods, scopes, etc.).

See:

Falling back to repr is not a bad choice.

On the Js side, we could use eval to generate a more user-friendly output. So instead of a plain Js object, we could attach class information by
image

We can also hide the __proto__ attribute when logging:
https://stackoverflow.com/questions/11818091/hiding-the-proto-property-in-chromes-console

When dragging a node, all related nodes are moved as a whole

"Related nodes" means all nodes that show a value when hovering over the dragged node.

As a result, it is nearly impossible to arrange the order of nodes in loops, since all the values are bound together.

Example code:

from cyberbrain import trace


@trace
def fib(n):
    a = 0
    b = 1
    for _ in range(n):
        t = b
        b = a + b
        a = t
    return b


if __name__ == '__main__':
    fib(10)

Implement the rest of instructions planned to be supported in V1

docs.google.com/spreadsheets/d/12jHOV9TFrdPySdKVWacAFcL20U-VRPXHiCUzcJZRO2M

What's left:

  • with-related instructions
  • Closure related: LOAD_CLOSURE, LOAD_DEREF, LOAD_CLASSDEREF, STORE_DEREF, DELETE_DEREF
  • Call related: CALL_FUNCTION, CALL_FUNCTION_KW, CALL_FUNCTION_EX, CALL_METHOD, BUILD_TUPLE_UNPACK_WITH_CALL, BUILD_MAP_UNPACK_WITH_CALL
  • Others: LOAD_BUILD_CLASS, ROT_FOUR

Postponed:

SET_ADD, LIST_APPEND, MAP_ADD (Used for list/set/dict comprehension, which we are not able to test due to call tracing not enabled)
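
To see where the postponed instructions come from, the standard dis module can list the opcodes a comprehension compiles to. The helper below is only an illustration and not part of Cyberbrain; it also walks nested code objects, because older Python versions compile a comprehension into its own code object. The exact set of opcodes differs between Python versions.

```python
import dis
import types


def opnames(code):
    """Collect opcode names from a code object and any nested code objects."""
    names = {instr.opname for instr in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames(const)
    return names


def squares(n):
    return [i * i for i in range(n)]


names = opnames(squares.__code__)
print(sorted(names))  # includes LIST_APPEND for the list comprehension
```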

Handling dependent loop counters

Loop counter nodes need to be hidden or displayed as other loop counters change. Also, a loop counter's max value can be affected by the current values of other loop counters (see password.py).

e.g.

for i in range(10):
    if i == 2:
        for j in range(2):  # The visibility of this loop counter node should depend on i.
            print(j)

Gitpod support

#24

  • Publish the extension to openvsx.
  • Claiming namespace: EclipseFdn/open-vsx.org#170
  • Reinstall online example's extension from openvsx, and verify it can work.

Refs:

Most things already work except Devtools.

Issues:

Only allow triggering tracing once

For now, if users call the start() method multiple times, every invocation after the first should do nothing.

To achieve this, we should record whether the method has been called.
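
A minimal sketch of that idea, using a hypothetical OneShotTracer class rather than Cyberbrain's real tracer; the start_count attribute exists only to make the behavior observable.

```python
class OneShotTracer:
    """Hypothetical tracer whose start() only takes effect on the first call."""

    def __init__(self):
        self._started = False
        self.start_count = 0  # how many times tracing was actually started

    def start(self):
        if self._started:
            return  # later calls are no-ops
        self._started = True
        self.start_count += 1


tracer = OneShotTracer()
tracer.start()
tracer.start()  # ignored
```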

Improve tooltip

  • Tooltip text should be truncated.
  • Tooltip should not overlap with nodes.
    We probably need to show the tooltip at the same height as each node, starting from the right or left edge depending on the position of the node. To do that, we at least need to know the width of a node.

Support generator functions

Running the code below raises AttributeError:

from cyberbrain import trace


def main():
    @trace
    def fib_gen(count):
        a, b = 1, 1
        while count := count - 1:
            yield a
            a, b = b, a + b

    for fib_num in fib_gen(10):
        print(fib_num)


if __name__ == "__main__":
    main()
stdout/err

Starting grpc server on 50051...
fib_gen <cyberbrain.frame.Frame object at 0x7f7908a26d30>
jumped: False
1
fib_gen <cyberbrain.frame.Frame object at 0x7f7902c9c850>
jumped: False
Traceback (most recent call last):
  File "/workspace/.pip-modules/lib/python3.8/site-packages/cyberbrain/value_stack.py", line 130, in emit_event_and_update_stack
    handler = getattr(self, f"_{instr.opname}_handler")
AttributeError: 'Py38ValueStack' object has no attribute '_YIELD_VALUE_handler'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/vscode-extensions/[email protected]/extension/pythonFiles/ptvsd_launcher.py", line 43, in <module>
    main(ptvsdArgs)
  File "/tmp/vscode-extensions/[email protected]/extension/pythonFiles/lib/python/old_ptvsd/ptvsd/__main__.py", line 432, in main
    run()
  File "/tmp/vscode-extensions/[email protected]/extension/pythonFiles/lib/python/old_ptvsd/ptvsd/__main__.py", line 316, in run_file
    runpy.run_path(target, run_name='__main__')
  File "/home/gitpod/.pyenv/versions/3.8.6/lib/python3.8/runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/gitpod/.pyenv/versions/3.8.6/lib/python3.8/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/gitpod/.pyenv/versions/3.8.6/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/workspace/Cyberbrain/examples/gen.py", line 17, in <module>
    main()
  File "/workspace/Cyberbrain/examples/gen.py", line 12, in main
    for fib_num in fib_gen(10):
  File "/workspace/Cyberbrain/examples/gen.py", line 9, in fib_gen
    yield a
  File "/workspace/Cyberbrain/examples/gen.py", line 9, in fib_gen
    yield a
  File "/workspace/.pip-modules/lib/python3.8/site-packages/cyberbrain/tracer.py", line 235, in _local_tracer
    self.frame_logger.update(raw_frame)
  File "/workspace/.pip-modules/lib/python3.8/site-packages/cyberbrain/logger.py", line 131, in update
    self.frame.log_events(frame, instr, jumped)
  File "/workspace/.pip-modules/lib/python3.8/site-packages/cyberbrain/frame.py", line 133, in log_events
    event_info = self.value_stack.emit_event_and_update_stack(
  File "/workspace/.pip-modules/lib/python3.8/site-packages/cyberbrain/value_stack.py", line 132, in emit_event_and_update_stack
    raise AttributeError(
AttributeError: Please add
def _YIELD_VALUE_handler(self, instr):

Make Cyberbrain more intuitive to use

Background

In #49 and cool-RR's email, an issue was mentioned:
It is not intuitive how to run a different program and open the trace graph with new data.

Quote:

I tried out the demo on Gitpod and followed the instructions. After viewing the first example I ran a different example but didn't know how to view the graph. Eventually I figured out that killing the first process with Ctrl+C made it possible to reinitialize Cyberbrain with a new example.

I changed the value passed to the function and reran, expecting to see the graph updated with the new values when I hovered with the mouse. It wasn't. I also tried the command "Initialize CyberBrain" again and yet the value wasn't updated.

Clearly we need to do better at this.

How It Works Now?

Currently, the Cyberbrain Python lib (abbr. cb-py) launches a server. When users run "Initialize Cyberbrain" in VS Code, the Cyberbrain VS Code extension (abbr. cb-vsc) talks to the server, fetches data, and visualizes it. The server listens on a fixed port, so there can't be multiple running servers.

Proposed Solution

Note: the solution below takes into consideration a feature that has not been implemented yet: multi-frame tracing. The original design of this feature is described here and may differ from the solution below. But the core idea remains unchanged: let users pick the frame to visualize.

Overview

cb-vsc automatically starts a long-running server, let's call it coordination server (abbr. cs). The workflow looks like this:

  1. cb-py finishes tracing a Python program.
  2. What happens next depends on the number of frames:
    • 2.1. If there's only one frame, cb-py sends this frame to cs/cb-vsc, and cb-vsc generates a new trace graph.
    • 2.2. If there are multiple frames:
      3. cb-py sends the locations of these frames (aka FrameLocaterList) to cs/cb-vsc.
      4. The user picks one frame (details TBD), and cs/cb-vsc sends the location of this frame (FrameLocater) back to cb-py.
      5. cb-py sends the selected frame to cs/cb-vsc, and cb-vsc generates a new trace graph.

Note that in case 2.2, steps 3-5 could repeat multiple times to allow visualizing different frames in the same execution. This requires server-side streaming RPC, so cs/cb-vsc can send multiple FrameLocaters to cb-py. Also, cb-vsc should persist the locations of all available frames.
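
The branching in this workflow can be sketched without any networking. Everything here is hypothetical: the FrameLocater fields, the select_frame function, and the pick callback that stands in for the user's choice. The real design would exchange these messages over gRPC.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class FrameLocater:
    """Hypothetical locater: identifies a traced frame by file and function."""
    filename: str
    function: str


def select_frame(
    frames: Dict[FrameLocater, dict],
    pick: Callable[[List[FrameLocater]], FrameLocater],
) -> dict:
    """Return the frame data that cb-vsc should visualize."""
    locaters = list(frames)
    if len(locaters) == 1:
        return frames[locaters[0]]  # case 2.1: send the only frame directly
    chosen = pick(locaters)         # steps 3-4: send locations, user picks one
    return frames[chosen]           # step 5: send the selected frame


frames = {
    FrameLocater("app.py", "foo"): {"events": ["a = 1"]},
    FrameLocater("app.py", "bar"): {"events": ["b = 2"]},
}
selected = select_frame(frames, pick=lambda locaters: locaters[1])
```

Calling select_frame again with a different pick corresponds to repeating steps 3-5 for another frame of the same execution.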

The proposed solution has a few benefits compared to the existing implementation:

  • Users only need to run the Python program, no need to run "Initialize Cyberbrain".
  • The experience stays the same when running multiple programs and/or running a program multiple times.

The Coordination Server

Requirements:

  • cs should remain active as long as VS Code is open, presumably with a periodic status check.

  • The listening port should be configurable. Potentially, we can use a config file ~/.cyberbrain_config and let cb-py and cb-vsc read it.
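
As a sketch of the shared config file idea: both sides could read the port from the same file and fall back to a default when the file is missing or invalid. The JSON format and the "port" key are assumptions, since the issue does not specify a format; 50051 is the port shown in the server's startup log elsewhere on this page.

```python
import json
from pathlib import Path

DEFAULT_PORT = 50051


def read_port(config_path):
    """Read the listening port from a JSON config file, or use the default."""
    try:
        config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return DEFAULT_PORT
    if not isinstance(config, dict):
        return DEFAULT_PORT
    return int(config.get("port", DEFAULT_PORT))


# Example: read_port(Path.home() / ".cyberbrain_config")
```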

Open questions:

  • Will vsc launch multiple coordination servers when there are multiple open windows? If yes, we need to handle it gracefully.

Things to Take into Consideration

Stateless

The infrastructure should be as stateless as possible, otherwise it would be very complicated to maintain and extend.

Future Proof

The design should work well with features added in the future, including (but not limited to) multi-frame tracing, though this is hard since things may change or new features may be planned.

Needs to work well with codelens #34

The report sent from cb-py could potentially carry information to tell cb-vsc where to show codelens, so that users are aware of the click-to-enable-trace-graph feature. This (and support for multi-frame tracing) also means that the Python program needs to stay alive until it is manually terminated.

Remote Debugging Friendly

The design should be able to support remote debugging.

Action item: learn how remote debugging works.

Third-party Friendly

We should not rely on VS Code specific things, and if users want to build their own cs and frontend, they should be able to do so.

Show links for replaced nodes in loops

image

When nodes in a loop are replaced, the lines involving replaced nodes disappear. I believe even if these edges are not accurate after modifying loop counters, they still provide useful information thus should be kept.

We can use dotted lines for this, but we need to add a caption or tooltip to show what it means.

Support Cyberbrain in more editors and IDEs

There are countless editors and IDEs out there. For convenience, I'll call them environments. I'd like to see Cyberbrain integrated with all of them, but this simply is not possible given the limited time I have. Considering the technologies Cyberbrain is using, here's what I'm gonna do.

  • Cyberbrain will officially support major vscode-compatible environments
  • Support for non-vscode-compatible environments will rely on the community
    I'm committed to providing as much help as I can, including but not limited to:
    • Answering questions
    • Audio/video 1:1
    • Making necessary code changes
    • Pair programming

The reason is simple: the only environment we support now is VS Code (local), so it's much easier to support vscode-compatible environments than others.

Based on this strategy, the environments we will officially support include:

Please let me know if there's more.

The environments that we will rely on the community to support include:

  • All non-web IDEs (PyCharm/Eclipse/Visual Studio/etc)
  • Vim/Emacs/Sublime/Atom/etc
  • Jupyter notebook
  • Command line

I will create a formal specification of the internal API to help people build third-party tools.

There is no preset timeline for when each environment will be supported, or in which version. I want to keep it flexible, and most likely, the environments that more people request will be supported first. Once a new environment is supported, we'll release a new minor version.

For requesting support for another vscode-compatible environment, please open a separate issue.

If you want to migrate Cyberbrain to a non-vscode-compatible environment, please contact me directly on Twitter or Discord. I'm happy to discuss it anytime.

Improve Cyberbrain API

Provide a decorator API

Sometimes a decorator is more convenient than .start() and .end(), especially when the function can return from different places.

Also since at this point we don't support multi-frame tracing, a decorator API should be more user-friendly.
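
The point about multiple return paths can be illustrated with a small sketch. RecordingTracer and trace_with are hypothetical names used only here, and the start()/end() methods simply mirror the names mentioned above. The decorator wraps the call in try/finally, so end() runs no matter which return statement is taken.

```python
import functools


class RecordingTracer:
    """Hypothetical tracer that just records start/end calls."""

    def __init__(self):
        self.calls = []

    def start(self):
        self.calls.append("start")

    def end(self):
        self.calls.append("end")


def trace_with(tracer):
    """Build a decorator that brackets the function with start() and end()."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            tracer.start()
            try:
                return func(*args, **kwargs)
            finally:
                tracer.end()  # runs on every return path and on exceptions
        return wrapper
    return decorator


tracer = RecordingTracer()


@trace_with(tracer)
def sign(n):
    if n < 0:
        return -1
    if n == 0:
        return 0
    return 1
```

With explicit start() and end() calls, the same function would need an end() before each of its three return statements.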

Provide a default tracer object

Provide a way to disable tracing

This would allow users to control whether to enable tracing via a flag, so that they don't need to modify their code.

Specifics TBD.
