janpalasek / pretty-jupyter Goto Github PK


284.0 284.0 11.0 14.29 MB

Creates dynamic html report from jupyter notebook.

Home Page: https://pretty-jupyter.readthedocs.io/

License: GNU General Public License v3.0

Python 67.33% Jinja 17.11% CSS 4.07% JavaScript 9.81% PowerShell 1.68%

jupyter jupyter-notebook nbconvert notebook pretty-jupyter


pretty-jupyter's Issues

Move showing of all code to a better place

Tokens in code cells

Intro

Add support to tokens in code cells. This can be e.g. in jinja markdown or possibly other format too.

Analysis

Adding support for tokens in code cells would allow us, e.g., to use only code cells and completely ignore md cells.

Example:

## Header
[//]: <> (-.- tabset)
We could write variable's value {{ variable }} such as this.

There is, however, one overarching problem: what to do with input cells. When a code cell is used as markdown, its input provides no useful additional information. The user probably won't want to look at the code of markdown with variables, since there is nothing interesting in it.

Therefore we want to hide this kind of input cell by default.

Dynamically created tabsets

Tabset

Problematic input:

# Header 1
### Tabset
[]: <> (-.- tabset)

#### Tab 1

#### Tab 2

## Chapter

The algorithm doesn't know that the tabset needs to end before the Chapter section and must not include it.

There might be some problematic counterexamples; this needs to be investigated.
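One possible rule, shown here only as a sketch: a tabset opened under a heading of level n ends at the next heading whose level is less than or equal to n. The function name `tabset_span` and the line-based markdown scan are assumptions made for illustration; the actual processing in Pretty Jupyter happens in JavaScript on the rendered HTML, and this sketch does not handle `#` lines inside fenced code.

```python
import re

# Matches ATX-style markdown headings such as "### Tabset".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def tabset_span(lines, tabset_index):
    """Return (start, end) line indices of the tabset section, end exclusive.

    The section ends at the first later heading whose level is less than
    or equal to the level of the tabset heading itself.
    """
    match = HEADING.match(lines[tabset_index])
    if match is None:
        raise ValueError("line at tabset_index is not a heading")
    level = len(match.group(1))
    for i in range(tabset_index + 1, len(lines)):
        other = HEADING.match(lines[i])
        if other and len(other.group(1)) <= level:
            return tabset_index, i
    return tabset_index, len(lines)


notebook_md = [
    "# Header 1",
    "### Tabset",
    "[//]: <> (-.- tabset)",
    "",
    "#### Tab 1",
    "",
    "#### Tab 2",
    "",
    "## Chapter",
]
start, end = tabset_span(notebook_md, 1)
```

With the problematic input above, the span stops at the `## Chapter` line, so the chapter is not swallowed by the tabset.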

Erroneous table of contents

Linkedin

1 Dataset
2 NER model
- 2.1 Something
- 2.2 Yes

Dataset

NER model

Something

Yes

Tried generating html with this using

jupyter nbconvert --template pj new_report.ipynb --to html

got weird toc like this:

Minimum version for nbconvert

Hi Jan,

It seems that your package depends on the new templating features of nbconvert introduced in nbconvert 6.0: described here

And it's not compatible with nbconvert 5.0 right? If so, I suggest adding in a minimum version to your package requirements.
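As a sketch of what such a minimum-version guard could look like: in `setup.py` the usual way to express it is a requirement specifier such as `nbconvert>=6.0` in `install_requires`. The runtime check below is a hypothetical helper (the names `MIN_NBCONVERT`, `version_tuple` and `is_supported` are not part of the package) that compares only the major and minor components.

```python
import re

# Assumed minimum, taken from the issue text: the templating features
# arrived in nbconvert 6.0.
MIN_NBCONVERT = (6, 0)

# Equivalent packaging requirement, e.g. for install_requires in setup.py.
INSTALL_REQUIRES = ["nbconvert>=6.0"]


def version_tuple(version):
    """Return the (major, minor) numeric components of a version string."""
    numbers = [int(n) for n in re.findall(r"\d+", version)[:2]]
    while len(numbers) < 2:
        numbers.append(0)  # treat "6" as "6.0"
    return tuple(numbers)


def is_supported(version):
    """True if the given nbconvert version meets the assumed minimum."""
    return version_tuple(version) >= MIN_NBCONVERT
```

Calling `is_supported("5.6.1")` returns `False`, while `is_supported("6.0.7")` returns `True`.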

Create templates

Test the visual appearance after generating the report

Selenium?
Something better?

Metadata

Move metadata to a sub-attribute. Overriding will happen only on this sub-attribute.
Print all metadata (except for reserved keywords) in meta tag.
Print author, date and title. Change the page header to look better in this situation.

Yaml metadata specification

Look at other Jupyter projects and decide whether it's a good idea. Would it go along with others such as papermill?
It might help reduce the pain of working with ipynb internal json structure, even though it is the usual way.

Itables

Itables without initialization currently doesn't work.

Code folding

Get inspired from Rmd styles to implement folding for:

inputs with codes (wrap/unwrap)
markdown (or jinja markdown) code sections

Not properly nesting TOC

# Chapter 1
## Ignore 
[//]: # (-.- .toc-ignore)

## Not Ignore

# Chapter 2

jquery.tocify will output this as a one-level list with three entries:

Chapter 1
Not Ignore
Chapter 2

Instead of:

Chapter 1
- Not Ignore
Chapter 2

The problem occurs only if the first subheader in the section is set as ignored.

How to bypass: The first subheader of a section must not have toc-ignore attribute.
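The expected nesting can be sketched independently of jquery.tocify: if ignored headers are dropped first and the remaining ones are nested by heading level rather than by position, the first subheader being ignored no longer matters. This is an illustration of the desired output only, not how the library works; `build_toc` and the tuple format are assumptions.

```python
def build_toc(headings):
    """Build a nested TOC from (level, title, ignored) tuples.

    Ignored headings are skipped; every other heading is attached to the
    closest preceding heading with a smaller level.
    """
    root = {"level": 0, "title": None, "children": []}
    stack = [root]
    for level, title, ignored in headings:
        if ignored:
            continue
        # Pop back to the nearest ancestor with a strictly smaller level.
        while stack[-1]["level"] >= level:
            stack.pop()
        node = {"level": level, "title": title, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


toc = build_toc([
    (1, "Chapter 1", False),
    (2, "Ignore", True),      # carries the toc-ignore token
    (2, "Not Ignore", False),
    (1, "Chapter 2", False),
])
```

Here `toc` has two top-level entries, and "Not Ignore" is nested under "Chapter 1".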

CI Fix

Problem with versions of chrome in selenium tests.

jinja markdown shortcut

Since %%jinja markdown is potentially written in half of the cells, it would be nice to have some kind of shortcut. E.g. jmd.

Frontend Pretty Jupyter tool with web-assembly

The description will be filled in continuously.

Find out whether it is possible to create a simple frontend pretty jupyter tool: Accepts file, transforms using WA and outputs html file for download
- Will pyodide support pretty-jupyter and nbconvert? Nbconvert likely yes, since there is a client-side jupyter JupyterLite.
- Interoperation javascript + nbconvert.
How to implement it? We likely cannot store file etc, we can just access it from javascript.
- Will some changes be needed on pretty-jupyter's part?
If it is possible, implement it.

Customising styles

Design an approach how to override and customise pages styles.

Long tables

Long tables lead to overflow, which is currently hidden because useless scrollbars were appearing. The useless scrollbars appeared because the title is not in the main-container.

Solution: Move the title with header to main container and enable scrollbars for very long tables.

Note: Move the hide/show all button as well.

Sphinx

Creating docs in sphinx and having it hosted on readthedocs would improve structuring, quality and versioning of the docs.

Add themes

Support for multiple themes.
Add more themes to the base package.
Allow to specify custom theme e.g. from url.
Add example for themes and add it to docs.
Link the themes as references (inside the file) and not embedded content. This will help in overriding the styles.
Consider whether to force a different color than Bootstrap's default red for the <code> element.
Add new files to setup.py.
Check performance.

Section numbering

Support unnumbered instead of toc-ignore for numbering. Edit command.

Future work

Link to stylesheet
The execution order for each code cell
Interpret author etc as a markdown. Allow html tags.
Parametrize cell-level metadata.

<Tabsets> and <nbconvert --embed-images> can't be used together when exporting notebooks to html reports

Hello, Pretty Jupyter team,

Thank you for sharing this project and I really love the functionalities.

However, I think I found an issue that:

If I need to use the tabsets functionality, I can't use --embed-images in nbconvert when exporting, because using <--embed-images> will cause issues to the tabsets functionality.

Can you take a look at this issue? Many thanks!

Regards,
Chengfeng Liu

Cell-wise settings

We want to set up for each cell some kind of settings, such as: remove its output, input, stream outputs and stderr.
Decide what is the best way: celltags, or some other way?

Code inside Markdown

This does not get displayed correctly in the output HTML. However, it works without the details and summary tags.

<details>
<summary>...</summary>

```sql
select *
from tab1
where tab1.x = 10;

```

Embedding images

Thought it won't work, but it works. I'll add it to the tests.

any plan of adding 'plotly' support?

Hi,

awesome work, thank you for sharing this.
I wonder if you are planning to add interactive plot packages support? like R markdown knitted file. it supports plotly, highcharter and many interactive plotting packages.

thank you

Class to avoid global JS changes

E.g. some styling is applied to all tables in the container. There should be a way to avoid this, most likely something like a class on the table that allows us to turn all styling off.

Investigate blank space at the end

I am not sure if you call it an issue, but I think the generated html would look better without the blank space at the end.
Thank you again for your time and your package,
Best wishes

Originally posted by @chengfeng-liu in #67 (comment)

Allow to show errors and style it

Update before Release

Update texts before release:

README
Examples on web-page
Docs

Generalize tokens

Currently the tokens ([//]: <> (-.- token1 token2)) serve only for tabsets etc. They could be generalized to add classes or set ids in specified situations.

We can implement it this way: The token will add classes or ids to the previous element. This way, we could style markdown tables, markdown images, refs etc. and e.g. also disable default styling for tables.

Implementation. Also add a different separator for future generalizations. When generating the token, generate token params to the insides of span to have greater power over the content.
Testing. We need to create special selenium tests for this that will validate that even in weird situations it works. This functionality is prone to bugs, so it must be tested as well as possible.
Docs.

Remove not used imports

Goal is to remove js and css imports that are not used by Pretty Jupyter. Also some dependencies will be optional and it will be able to specify them in the notebook metadata.

Hide only some cells

Note: I think Jupyter supports this internally with cell metadata.

Code cells style

Tabsets do not work with $$ math mode

This is caused by jQuery function nextUntil ignoring text nodes, and:

$$a = 5$$ is exported in markdown into text node " $$a = 5$$ ".

How to design my own page header

Add options to TOC

Optional numbering of headers.
Others picked from http://gregfranko.com/jquery.tocify.js/?

Styles

Improve styles for:

Code
Error outputs.

Support for multiple tabset styles

Introduction

Tabset can have two types: tabset-pills and nav-tabs. Currently only pills are supported. The goal of this issue is to support other tokens as well, such as these two.

Analysis

Currently the approach is to transform the tokens from the md comment into html span elements with a class containing the tokens.

Ex: [//]: <> (-.- token1 token2) -> 

To process the tabset, javascript looks into the first paragraph (<p>) element after the header for a span with the appropriate tabset class.

This makes the inherent assumption that markdown wraps all the spans in a paragraph, which might not be true. This depends on the markdown processor.

Safer approach:

Create tokens as spans. We can ensure that the tokens will be unique by their name. We can add a special class that makes the tokens easily searchable.
[//]: <> (-.- token1) -> 
One entire comment will lead to one span. This will be more resistant to the markdown -> html transformation. We need to change the markdown cell preprocessing so that the regex can parse multiline input etc.
[//]: <> (-.- token1 token2) -> 
We need to have the javascript more resistant to the markdown processor. For tabset specifically, instead of searching headers, we can search for the elements and the closest wrapper div (e.g. with class pretty-jupyter-section, which we will add before by javascript), while still recommending in the documentation to have it right after the header.
For example in the HTML above the algorithm would first identify the span with class pretty-jupyter-token and attribute tabset, also find out that the span has attribute tabset-pills and it would find the closest parent with class pretty-jupyter-section.

<div class="pretty-jupyter-section section level3">
  <h3>Some header</h3>
  <p><span class='pretty-jupyter-token tabset tabset-pills'></span></p>
</div>

Final transformation:

Original md:

### Some header
[//]: <> (-.- tabset tabset-pills)
[//]: <> (-.- some-token)

Into

Transformed md:

### Some header
<span class='pretty-jupyter-token tabset tabset-pills' ...></span><span class='pretty-jupyter-token some-token' ...></span>

<div class="pretty-jupyter-section section level3">
  <h3>Some header</h3>
  <p><span class='pretty-jupyter-token tabset tabset-pills' ...></span></p>
  <p><span class='pretty-jupyter-token some-token' ...></span></p>
</div>

Note that the span must be hidden (display none).
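The comment-to-span step described above could be sketched as a regex substitution over the markdown source. This is a minimal illustration under stated assumptions: the function name `tokens_to_spans`, the exact regex, and the inline `display: none` style are not taken from the project, and each comment is turned into its own span on its own line.

```python
import html
import re

# One token comment per line, e.g. "[//]: <> (-.- tabset tabset-pills)".
TOKEN_COMMENT = re.compile(
    r"^\[//\]: <> \(-\.- ([^)]*)\)[ \t]*$", re.MULTILINE
)


def tokens_to_spans(markdown):
    """Replace every token comment with one hidden span carrying its tokens."""

    def replace(match):
        tokens = match.group(1).split()
        classes = " ".join(["pretty-jupyter-token", *tokens])
        # Escape the class list so it is safe inside the attribute.
        return "<span class='{}' style='display: none'></span>".format(
            html.escape(classes, quote=True)
        )

    return TOKEN_COMMENT.sub(replace, markdown)


source_md = (
    "### Some header\n"
    "[//]: <> (-.- tabset tabset-pills)\n"
    "[//]: <> (-.- some-token)"
)
transformed_md = tokens_to_spans(source_md)
```

The header line is left untouched, and each comment becomes a single span with the `pretty-jupyter-token` class followed by its tokens, which the JavaScript side can then search for.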

Export to pdf

The export to pdf is not goal of this project. However if one chooses to do so, it would be good to at least do the following things:

Hide Jinja Markdown input cells by default.
Turn off code folding.
Do not generate tabsets (ignore it).
Don't generate table of contents (or generate a different one, mb future work).

The easiest way could be to create an inherited jinja template for latex and then use pandoc, as nbconvert probably does. Since nbconvert's templates don't support tabsets etc., we wouldn't have to worry about them being set for pdf.

TODO:

Configurations different for latex and html? Should be able to specify both in metadata. Some settings would be good to move to "html" section in metadata, e.g. code_folding, toc etc. They are usually left as a default by users anyway. Title, however, must stay where it is.

2.0.0 Allow override any notebook metadata setting from console

The goal is to enable users to override any notebook metadata from the command-line when generating the output. This will lead to a greater flexibility at no additional cost.

Implementation.
Testing.
Docs.
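One way the override could work, as a sketch: command-line overrides are given as dotted keys and merged into a copy of the notebook metadata. The function `apply_overrides`, the dotted-key convention, and the example keys are hypothetical and only illustrate the idea.

```python
import copy


def apply_overrides(metadata, overrides):
    """Return a copy of metadata with dotted-key overrides applied.

    A key such as "html.toc" sets metadata["html"]["toc"]; missing or
    non-dict intermediate levels are replaced by new dicts.
    """
    result = copy.deepcopy(metadata)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        target = result
        for key in parents:
            if not isinstance(target.get(key), dict):
                target[key] = {}
            target = target[key]
        target[leaf] = value
    return result


# Hypothetical notebook metadata and console overrides.
notebook_metadata = {"title": "Report", "html": {"toc": True, "code_folding": "hide"}}
console_overrides = {"html.toc": False, "author": "Someone"}
merged = apply_overrides(notebook_metadata, console_overrides)
```

The original metadata is left unchanged, so the notebook file itself does not need to be modified when generating the output.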

Add optional dependencies

Add optional dependencies with the following functionality:

matplotlib figure to embedded html, link html, link markdown (good for pdf)
pandas to pageable html
plotly to embedded image (no dynamic)
...

Not-ended HTML tags

Sometimes when I do not end tags correctly, the report breaks:

E.g.: I've written this (notice the tag that is not correctly ended)

<details> <summary>...</summary> </details>

And both TOC and tabset stopped working.

This should be investigated and addressed.
