entangled / filters

Set of pandoc filters for literate programming

License: Apache License 2.0

Python 89.29% Dhall 4.09% C++ 0.37% Makefile 5.12% HTML 0.55% Dockerfile 0.58%

filters's Introduction

title: Entangled
author: Johan Hidding

Literate programming [/ˈlɪtəɹət ˈpɹəʊɡɹæmɪŋ/]{.phonetic} (computing) Literate programming is a programming paradigm introduced by Donald Knuth in which a program is given as an explanation of the program logic in a natural language, such as English, interspersed with snippets of macros and traditional source code, from which a compilable source code can be generated. (Wikipedia)

In short: you write Markdown containing code fragments. These code fragments are combined into working code in a process called tangling.

Entangled makes writing literate programs easier by keeping code blocks in markdown up-to-date with generated source files. By monitoring the tangled source files, any change in the master document or source files is reflected in the other. In practice this means:

Write well documented code using Markdown.
Use any programming language you like (or are forced to use).
Keep debugging and using other IDE features without change.
Generate a report in PDF or HTML from the same source (see examples at Entangled homepage).

Status

Entangled is approaching its 1.0 release! It has been tested on Linux, Windows and macOS. Still, it is highly recommended to use version control and commit often. If you encounter unexpected behaviour, please post an issue and describe the steps to reproduce it.

Features:

live bi-directional updates
(reasonably) robust against wrongly edited source files
configurable with Dhall
hackable through SQLite
create PDF or HTML pages from literate source
line directives to point compilers to markdown source

Building

Entangled is written in Haskell, and uses the cabal build system. You can build an executable by running

# (requires cabal >= 3.x)
cabal build

Install the executable in your ~/.local/bin

cabal install

Run unit tests

cabal test

Using

Entangled should be run from the command line, from the root folder of the project that you're working on. This folder should contain an entangled.dhall file that holds the configuration. You can get an example config file by running

entangled config

This config assumes you have the markdown files in a folder named ./lit, and stores information in a SQLite3 database located at ./.entangled/db. To run the daemon,

entangled daemon [files ...]

where [files ...] is a sequence of additional files that you want monitored.

Syntax (markdown side)

The markdown syntax Entangled uses is compatible with Pandoc's. This relies on the use of fenced code attributes. To tangle a code block to a file:

``` {.bash file=src/count.sh}
   ...
```

Composing a file using multiple code blocks is done through noweb syntax. You can reference a named code block in another code block by putting something like <<named-code-block>> on a single line. This reference may be indented. Such an indentation is then prefixed to each line in the final result.

A named code block should have an identifier given:

``` {.python #named-code-block}
   ...
```

If a name appears multiple times in the source, the code blocks are concatenated during tangling. When weaving, the first code block with a certain name will appear as <<name>>=, while consecutive code blocks with the same name will appear as <<name>>+=.
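The tangling rules described above (noweb references, indentation prefixing, and concatenation of blocks sharing a name) can be illustrated with a minimal Python sketch. This is not Entangled's implementation; the function names `collect_blocks` and `tangle`, the reference regex, and the input format (a list of `(name, code)` pairs) are assumptions made for illustration.

```python
import re
from collections import defaultdict

# A line that holds nothing but a noweb reference, possibly indented.
# (Assumed pattern; the real parser may accept a different set of names.)
REFERENCE = re.compile(r"^(?P<indent>\s*)<<(?P<name>[^<>\s]+)>>\s*$")


def collect_blocks(blocks):
    """Concatenate the bodies of code blocks that share a name, in order."""
    named = defaultdict(list)
    for name, body in blocks:
        named[name].append(body)
    return {name: "\n".join(bodies) for name, bodies in named.items()}


def tangle(name, named, seen=()):
    """Expand block `name`, replacing each reference by the referenced code."""
    if name in seen:
        raise ValueError(f"cyclic reference: {name}")
    lines = []
    for line in named[name].splitlines():
        match = REFERENCE.match(line)
        if match is None:
            lines.append(line)
            continue
        # The indentation of the reference is prefixed to every expanded line.
        expanded = tangle(match["name"], named, seen + (name,))
        lines.extend(match["indent"] + sub for sub in expanded.splitlines())
    return "\n".join(lines)


blocks = [
    ("main", "def main():\n    <<body>>"),
    ("body", "print('Hello')"),
    ("body", "print('World')"),  # same name: appended to the first "body"
]
result = tangle("main", collect_blocks(blocks))
print(result)
```

Here the two `body` blocks are joined first, and the four-space indentation of `<<body>>` is carried over to both expanded lines.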

Please see the Hello World and other examples!

Syntax (source side)

In the source code we know exactly where the code came from, so there would be no strict need for extra syntax there. However, once we start to edit the source file it may not be clear where the extra code needs to end up. To make our life a little easier, named code blocks that were tangled into the file are marked with a comment at the beginning and end.

// ~|~ begin <<lit/story.md|main-body>>[0]
std::cout << "Hello, World!" << std::endl;
// ~|~ end

These comments should not be tampered with!
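To show why the markers matter, here is a rough Python sketch of how a tool could read them back to find which part of a source file belongs to which code block. It is only an illustration based on the example above: the regular expressions, the assumption that the comment starter is a single token such as `//` or `#`, and the function name `extract_blocks` are all hypothetical, and nested regions are simply kept apart from their parent.

```python
import re

# Assumed marker shapes, modelled on the example in this section.
BEGIN = re.compile(
    r"^\s*\S+ ~\|~ begin <<(?P<file>[^|>]+)\|(?P<name>[^>]+)>>\[(?P<index>\d+)\]\s*$"
)
END = re.compile(r"^\s*\S+ ~\|~ end\s*$")


def extract_blocks(source):
    """Return {(file, name, index): code} for every marked region."""
    blocks = {}
    stack = []  # currently open regions, innermost last
    for line in source.splitlines():
        begin = BEGIN.match(line)
        if begin:
            key = (begin["file"], begin["name"], int(begin["index"]))
            stack.append((key, []))
        elif END.match(line):
            if not stack:
                raise ValueError("end marker without matching begin")
            key, body = stack.pop()
            blocks[key] = "\n".join(body)
        elif stack:
            # Ordinary code line: it belongs to the innermost open region.
            stack[-1][1].append(line)
    return blocks


source = "\n".join([
    "// ~|~ begin <<lit/story.md|main-body>>[0]",
    'std::cout << "Hello, World!" << std::endl;',
    "// ~|~ end",
])
blocks = extract_blocks(source)
print(blocks)
```

If a marker line is edited or deleted, the begin/end pairs no longer match and the region cannot be mapped back to its Markdown block, which is why the comments should be left alone.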

Running `entangled`

Assuming you have created a Markdown file, say program.md, you can start entangled by running

entangled daemon ./program.md

in the shell. You may run entangled --help to get help on options, or check out the user manual.

Running `entangled` with Docker

Entangled is available as a Docker image.

Assuming you have created a Markdown file, say program.md, you can start entangled by running

docker run --rm --user $(id -u):$(id -g) --volume $PWD:/data nlesc/entangled daemon ./program.md

This command starts a Docker container with the current working directory mounted as /data and running with your user/group id so files are written with the correct ownership.

Distribution

If you've written literate code using Entangled and would like to distribute it, one way is to include the tangled source code in the tarball. You may also wish to use the pandoc filters included in entangled/filters.

Development

Credits

The following persons have made contributions to Entangled:

Michał J. Gajda (gh:mgajda), first implemented the line-directive feature
Danny Wilson (gh:vizanto), first implemented the project annotation

Generating manpage

pandoc lit/a2-manpage.md -s -t man | /usr/bin/man -l -

License

Entangled is distributed under the Apache v2 license.

filters's Issues

Have a .noshow class so individual code blocks can be hidden.

In pandoc-annotate-codeblocks (or perhaps a new filter), it'd be cool if a noshow class can be added. It'd do two things:

It'll remove noshow code blocks from the document output
When references to noshow blocks are found in code, they'll be removed from the document output as well

This would be useful for code blocks that the author doesn't want to comment on or really show, since the details are too gratuitous, or otherwise out of scope of the document, but are still required for the generated file to function.
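A possible shape for this behaviour, sketched in plain Python rather than as a real pandoc filter: the representation of code blocks as dictionaries with `name`, `classes` and `code` keys, and the function name `hide_noshow`, are assumptions made for illustration only.

```python
import re

# A line that holds nothing but a noweb reference (assumed pattern).
REFERENCE = re.compile(r"^\s*<<(?P<name>[^<>\s]+)>>\s*$")


def references_hidden(line, hidden):
    """True if `line` is a reference to a block marked noshow."""
    match = REFERENCE.match(line)
    return match is not None and match["name"] in hidden


def hide_noshow(blocks):
    """Drop noshow blocks and the reference lines that point at them."""
    hidden = {b["name"] for b in blocks if "noshow" in b["classes"]}
    shown = []
    for block in blocks:
        if "noshow" in block["classes"]:
            continue  # rule 1: the block itself is not rendered
        kept = [
            line for line in block["code"].splitlines()
            if not references_hidden(line, hidden)  # rule 2: drop references
        ]
        shown.append({**block, "code": "\n".join(kept)})
    return shown


blocks = [
    {"name": "main", "classes": ["python"],
     "code": "<<imports>>\nprint('hi')"},
    {"name": "imports", "classes": ["python", "noshow"],
     "code": "import sys"},
]
visible = hide_noshow(blocks)
print(visible)
```

Only the woven document would be affected by such a filter; tangling would still use the full set of blocks.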

Create a module system for .eval

At the moment, I think .eval is just treated as a synonym for .doctest, which is a jupyter thing. However, the idea of evaluating a codeblock and rendering the result is actually pretty general, and there could be many different ways of implementing that for different languages, and even different ways of doing that for the same language.

If we had a module system that somehow exposed .eval blocks in a nice clean way, users could write their own filters for their use-cases, and contribute them back with a MR. This seems to me to be the fastest way to give this functionality a boost, and get it in the hands of the users, without requiring some universal REPL/documentation technology like jupyter.

Cache results from code evaluation

Code evaluation can take some time.
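One way such a cache could work is to key results on a hash of the language and the source text, so that an unchanged code block is not evaluated again. The sketch below is an assumption about a possible design, not existing code in this repository; `EvalCache` and the stand-in evaluator are hypothetical names, and the cache lives in memory only.

```python
import hashlib


class EvalCache:
    """Memoise evaluation results, keyed on language and source text."""

    def __init__(self, evaluate):
        self._evaluate = evaluate  # callable(language, source) -> result
        self._results = {}
        self.misses = 0            # number of real evaluations performed

    def run(self, language, source):
        # A NUL separator keeps ("a", "bc") and ("ab", "c") from colliding.
        key = hashlib.sha256(f"{language}\0{source}".encode("utf-8")).hexdigest()
        if key not in self._results:
            self.misses += 1
            self._results[key] = self._evaluate(language, source)
        return self._results[key]


# Stand-in evaluator for the demo; a real one would talk to a kernel.
cache = EvalCache(lambda language, source: source.upper())
first = cache.run("python", "print(1)")
second = cache.run("python", "print(1)")   # served from the cache
third = cache.run("python", "print(2)")    # different source: evaluated
print(cache.misses)
```

A persistent version would need to store the results on disk and decide when cached output has gone stale, for example when an earlier block in the same session changes.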

Add author field in card deck

When a bootstrap card refers to external work it would be nice to have a dedicated field to show the author of that work.

Allow the user to provide a list of jupyter kernels

The user should be able to provide a list of jupyter kernels in the event that they want to be able to evaluate multiple different languages. The correct kernel should be matched to the language of the codeblock being evaluated.
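The matching step could be as simple as a lookup from a code block's classes to a configured kernel name. The following is a minimal sketch under that assumption; `select_kernel`, the shape of the configuration, and the kernel names used in the demo are illustrative only.

```python
def select_kernel(classes, kernels):
    """Return the kernel configured for the first matching class, else None."""
    for cls in classes:
        if cls in kernels:
            return kernels[cls]
    return None


# Hypothetical user configuration: language class -> Jupyter kernel name.
kernels = {"python": "python3", "r": "ir"}

python_kernel = select_kernel(["python", "eval"], kernels)
unknown_kernel = select_kernel(["haskell", "eval"], kernels)
print(python_kernel, unknown_kernel)
```

Returning `None` for an unconfigured language leaves it to the caller to decide whether to skip the block or report an error.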

Pandoc eats last newline

When I have a code block like

foo
bar

I expected the file to contain foo\nbar\n but it contains foo\nbar.

Also

foo
bar

Gives foo\nbar

But

foo
bar

Gives foo\nbar\n

Format string for references and reference definitions

I think it would be interesting to add 2 format strings to the YAML header in pandoc-annotate-codeblocks that could change the way code block references and reference definitions are displayed.

Here's the syntax I think we could use for these format strings:

| Reference Format Syntax | Brief Description | Example (with `<<my-reference>>`) |
|---|---|---|
| `%n` | Name | `"my-reference"` |
| `%N` | Display Name | `"My Reference"` |
| `%d` | Link to Definition | format string `<<[%n](%d)>>` might yield `<<my-reference>>` |
| `\%` | Literal % | only for cases of ambiguity |
| `\\` | Literal \ | only for cases of ambiguity |

| Definition Format Syntax | Brief Description | Example (with `<<my-definition>>`) |
|---|---|---|
| `%n` | Name | `"my-definition"` |
| `%N` | Display Name | `"My Definition"` |
| `%r` | Comma-separated list of references, using the user's specified reference format | |
| `%R` | Comma-separated list of hyperlinked references, using the user's specified reference format | |
| `\%` | Literal % | only for cases of ambiguity |
| `\\` | Literal \ | only for cases of ambiguity |

Ideally, these format strings would be parsed for markdown content. So, if a string contained *ref:* %n, then ref: would be given emphasis. Once the markdown content for a reference or definition is rendered into an AST, we could just render it directly in the file.
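As a rough illustration of how the proposed reference format could be expanded before being parsed as Markdown, here is a small Python sketch. It is not part of pandoc-annotate-codeblocks; the `render` and `display_name` helpers, the rule for deriving a display name, and the `#my-reference` link target are assumptions.

```python
import re

# Either an escaped character (\% or \\) or a one-letter format code (%n, %N, ...).
TOKEN = re.compile(r"\\([%\\])|%([A-Za-z])")


def display_name(name):
    """Assumed rule: 'my-reference' becomes 'My Reference'."""
    return name.replace("-", " ").title()


def render(fmt, fields):
    """Expand format codes in `fmt` using `fields`, honouring escapes."""
    def substitute(match):
        escaped, code = match.groups()
        if escaped is not None:
            return escaped  # \% -> %, \\ -> \
        if code not in fields:
            raise ValueError(f"unknown format code: %{code}")
        return fields[code]
    return TOKEN.sub(substitute, fmt)


name = "my-reference"
fields = {"n": name, "N": display_name(name), "d": "#" + name}

linked = render("<<[%n](%d)>>", fields)
escaped = render(r"100\% %N \\", fields)
print(linked)
print(escaped)
```

The definition format could reuse the same function with `r` and `R` entries in `fields`, each filled with the already-rendered, comma-separated references.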

#4, since %d would deliver the same functionality

Exception: unknown tag: ColWidth

Apologies if I'm reporting in the wrong repository.

When following your tutorial, and after installing everything properly, I stumbled upon this error:

make site
pandoc --template style/template.html --css css/bootstrap.css --css css/mods.css -t html5 -s --mathjax --toc --toc-depth 1 --filter pandoc-bootstrap -f markdown+multiline_tables+simple_tables lit/index.md -o docs/index.html
Traceback (most recent call last):
  File "/home/ngirard/.local/bin/pandoc-bootstrap", line 8, in <module>
    sys.exit(main())
  File "/home/ngirard/.local/lib/python3.8/site-packages/entangled/bootstrap.py", line 160, in main
    run_filters([bootstrap_card_deck, bootstrap_fold_code], prepare=prepare, doc=doc)
  File "/home/ngirard/.local/lib/python3.8/site-packages/panflute/io.py", line 233, in run_filters
    doc = load(input_stream=input_stream)
  File "/home/ngirard/.local/lib/python3.8/site-packages/panflute/io.py", line 56, in load
    doc = json.load(input_stream, object_pairs_hook=from_json)
  File "/usr/lib/python3.8/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/usr/lib/python3.8/json/__init__.py", line 370, in loads
    return cls(**kw).decode(s)
  File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.8/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "/home/ngirard/.local/lib/python3.8/site-packages/panflute/elements.py", line 1478, in from_json
    raise Exception('unknown tag: ' + tag)
Exception: unknown tag: ColWidth
Error running filter pandoc-bootstrap:
Filter returned error status 1
make: *** [Makefile:53 : docs/index.html] Erreur 83

Any ideas?

Cheers

Add option to filter for doc-strings

Currently entangled does not work well with code that also contains doc-strings, specifically to generate Haddock, Doxygen or Sphinx docs. In a literate document these comments are often superfluous. The weave script should have an option to filter out doc-strings.

Cross-references with hyperlinks

Make references in code clickable, linking to their definition.

Unwanted tab to space conversion

I created a Makefile code block in https://github.com/NLESC-JCER/cpp2wasm/tree/0079515ab5da4786a2af299b9f55a23ca7db47ee/INSTALL.md with tabs as indentation.

Running docker run --rm -ti --user $(id -u) -v ${PWD}:/data nlesc/pandoc-tangle README.md INSTALL.md generates a Makefile, but each tab has been converted to 4 spaces. This is invalid for make. I was expecting the tab characters to be retained.

The Docker image is based on commit 362341a.