Code restructuring and collaboration about codebraid HOT 4 OPEN

gpoore commented on June 4, 2024

Code restructuring and collaboration

from codebraid.

Comments (4)

gpoore commented on June 4, 2024

Thanks for the feedback!

1. Yes, some of the code could be improved. With _run(), part of the issue is that I've rewritten it a half-dozen times or so at this point and still have a lot of features to add, so it's nowhere near final form. The other area that's particularly complex is the Pandoc AST handling. In that case, further simplification/refactoring may be limited by performance considerations.
2. Yes, there are many ways things could potentially be reorganized.
3. There are some type hints in function definitions. There will be more type hints once Python 3.5 support is dropped.
4. Pandoc has some disadvantages. It affects performance since it must be run several times and there is a lot of serialization/deserialization. However, I don't believe there are any other Markdown variants that have comparable built-in features. If you want Pandoc's power, it's best to use Pandoc from the beginning. At a minimum, Codebraid requires built-in support for a standardized attribute syntax for both code blocks and inline code, which I believe is still lacking in CommonMark after all these years. Another advantage is that if everything is based off the Pandoc AST, then in principle with a few changes Codebraid can add support for LaTeX and other formats supported by Pandoc.
5. Codebraid's Markdown support is fundamentally built on Pandoc. Without Pandoc, Codebraid would have to parse Markdown to locate code. That would require a Markdown parser, which then goes back to point 4. Many programs have done things similar to Codebraid by using regex to locate code. But that will fail at times unless at least a partial Markdown parser has been implemented, which again goes back to point 4. Codebraid could just use Pandoc to convert to Markdown, but if Pandoc is being used anyway it is more efficient to go straight to the final output format.
6. Jupyter kernels are great when used in the notebook, because you can get results back quickly without rerunning previous code or reloading data (though this can lead to problems when previous code should be re-executed but is not). Codebraid always starts a kernel, runs all code for a session, and then stops the kernel. At least for a lot of what I do, the built-in system can be finished before a Jupyter kernel can start. I have thought about adding a different mode that would keep kernels running in the background.

There are other limitations of Jupyter kernels. When you run code with a typical kernel, you are getting the result of the language plus the kernel. You aren't getting the result of sending the code through the language's standard compiler or interpreter. When I'm writing about code and giving example output, I sometimes want the default result without any Jupyter modifications or add-ons.

Also, my impression is that creating a Jupyter kernel for some languages, especially compiled languages, can be very complex. The built-in system just needs a few lines of template code to add a new language.
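To make the regex concern from point 5 concrete, here is a minimal sketch of how a purely regex-based search can misfire. The pattern and the function name `find_inline_code_naive` are made up for illustration and are not taken from Codebraid or any other tool. The sample text contains one real inline code span and one that only appears literally inside a fenced block, which a Markdown parser would never treat as inline code.

```python
import re

# Naive, hypothetical pattern: a backtick span directly followed by {...}.
INLINE_CODE = re.compile(r"`([^`]+)`\{([^}]*)\}")


def find_inline_code_naive(text):
    """Return the code of every match, with no awareness of Markdown structure."""
    return [m.group(1) for m in INLINE_CODE.finditer(text)]


doc = """\
Real inline code: `print(1 + 2)`{.python .cb.run}

~~~
Literal text in a fenced block: `print(3)`{.python .cb.run}
~~~
"""

found = find_inline_code_naive(doc)
print(found)  # ['print(1 + 2)', 'print(3)']
```

The second hit is a false positive: the regex cannot know it is inside a fenced block. Ruling it out means tracking fences, indentation, escapes, and so on, which is the start of a partial Markdown parser.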

teucer commented on June 4, 2024

Some further thoughts:

1. Don't know if it would help to think about architecture now or wait for the "final" solution before refactoring. There are pros and cons with both approaches. Maybe there is a middle ground (?)
2. I was wondering if one could internally use decorators to achieve that.
3. Ok.
4. This is how, e.g., marked parses the file. For example,

`print(1 + 2)`{.python .cb.run example=true}

is parsed as

{type:"space", raw:"\n\n"}
{type:"paragraph", raw:"`print(1 + 2)`{.python .cb.run example=true}", text:"`print(1 + 2)`{.python .cb.run example=true}", tokens:[
  {type:"codespan", raw:"`print(1 + 2)`", text:"print(1 + 2)"}
  {type:"text", raw:"{.python .cb.run example=true}", text:"{.python .cb.run example=true}"}
]}

cmark has a similar AST. I believe this is much easier to work with than pandoc's AST. Besides, with cmark (it has a Python wrapper) it is all in memory, so there is no need to call pandoc, save the file, and reload it with json.

5. This creates an internal dependency on pandoc. E.g. I have seen that you manually need to add pandoc's parameters. Having a single-purpose tool is better imho.
6. Ok.
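In the token stream from point 4, the attribute string arrives as a separate `text` token after the `codespan`. A consumer would therefore have to pair the two itself. The sketch below shows roughly what that pairing could look like on the tokens quoted above, written in Python for illustration. The helper names `parse_attrs` and `attach_attrs` are hypothetical, and the attribute parsing is deliberately simplified (no quoted values, no `#id`).

```python
import re

# Assumed attribute syntax: "{.class .other key=value}" at the start of a text token.
ATTR_RE = re.compile(r"^\{([^{}]*)\}")


def parse_attrs(body):
    """Split '.python .cb.run example=true' into classes and key-value pairs."""
    classes, keyvals = [], {}
    for part in body.split():
        if part.startswith("."):
            classes.append(part[1:])
        elif "=" in part:
            key, value = part.split("=", 1)
            keyvals[key] = value
    return classes, keyvals


def attach_attrs(tokens):
    """Merge each codespan with an attribute block that directly follows it."""
    merged = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        match = None
        if tok["type"] == "codespan" and nxt is not None and nxt["type"] == "text":
            match = ATTR_RE.match(nxt["text"])
        if match:
            classes, keyvals = parse_attrs(match.group(1))
            merged.append({"type": "codespan", "text": tok["text"],
                           "classes": classes, "keyvals": keyvals})
            rest = nxt["text"][match.end():]
            if rest:  # keep any ordinary text that followed the attributes
                merged.append({"type": "text", "text": rest})
            i += 2
        else:
            merged.append(tok)
            i += 1
    return merged


# The inline tokens from the example above, as Python dicts.
tokens = [
    {"type": "codespan", "raw": "`print(1 + 2)`", "text": "print(1 + 2)"},
    {"type": "text", "raw": "{.python .cb.run example=true}",
     "text": "{.python .cb.run example=true}"},
]
result = attach_attrs(tokens)
print(result)
```

For the sample tokens this yields a single `codespan` entry carrying the classes `python` and `cb.run` and the pair `example=true`.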

gpoore commented on June 4, 2024

For 1 and 2: I agree that some of the code could be organized better, but I am hesitant to do any refactoring simply on that basis. I prefer to refactor when limitations of the current, working code become apparent, or when refactoring has objective, concrete, well-defined goals. At least in my experience, refactoring as part of adding functionality results in code that mirrors functionality. I've wasted a lot of time in the past by refactoring to make things look nicer, which ultimately produced an architecture that was incompatible with adding future functionality and thus eventually resulted in even more refactoring.

For 4: The ASTs produced by marked and cmark are simpler because they don't do nearly as much. Neither parses code block attributes and neither supports inline code attributes, so Codebraid would then have to perform additional AST modifications to get its own, custom AST that contains all the necessary information. Pandoc's AST is more complex, but that's because it does more and as a result is ready to use immediately. Also, everything already works with Pandoc's AST, and being compatible with Pandoc brings a lot of possibilities for the future.
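As a rough illustration of "ready to use immediately": in Pandoc's JSON AST, inline code is a `Code` node and a code block is a `CodeBlock` node, each carrying an `[id, classes, key-value pairs]` attribute triple next to the code text. The sketch below walks such a structure. The helper `iter_code_nodes` is hypothetical (it is not Codebraid's implementation), and the sample JSON is hand-written to follow that layout rather than captured from a real Pandoc run, so the API version number in it is only a placeholder.

```python
import json


def iter_code_nodes(node):
    """Yield (kind, classes, keyvals, text) for each Code or CodeBlock node."""
    if isinstance(node, list):
        for item in node:
            yield from iter_code_nodes(item)
    elif isinstance(node, dict):
        if node.get("t") in ("Code", "CodeBlock"):
            # "c" is [[id, classes, key-value pairs], code text].
            (_ident, classes, keyvals), text = node["c"]
            yield node["t"], classes, dict(keyvals), text
        else:
            for value in node.values():
                yield from iter_code_nodes(value)


# Hand-written sample in the shape of Pandoc's JSON output (assumption).
sample = json.loads("""
{
  "pandoc-api-version": [1, 22],
  "meta": {},
  "blocks": [
    {"t": "Para", "c": [
      {"t": "Code",
       "c": [["", ["python", "cb.run"], [["example", "true"]]], "print(1 + 2)"]}
    ]}
  ]
}
""")

code_nodes = list(iter_code_nodes(sample))
print(code_nodes)
# [('Code', ['python', 'cb.run'], {'example': 'true'}, 'print(1 + 2)')]
```

No attribute parsing is needed here: the classes and key-value pairs are already separated in the node itself.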

For 5: I think we simply have philosophical differences here. I'm primarily concerned with Codebraid being easy to use. Letting people run code in their Pandoc documents by adding codebraid in front of the command they would normally use is about as simple as it gets. Having users pipe codebraid output through pandoc makes things more complicated for users, and also adds a lot of extra overhead when speed is already a concern. Also, this is already the way things work, so changing things would mean that everyone currently using Codebraid would suddenly have to change their workflows.

mfhepp commented on June 4, 2024

+1 for keeping Pandoc as a core dependency.
