
jupyter-summarytools's Introduction


DataFrame Summary Tools in Jupyter Notebook

This is a Python version of summarytools, used to generate a standardized and comprehensive summary of a data frame in Jupyter Notebooks.

The idea originated from the summarytools R package (https://github.com/dcomtois/summarytools).

Installation

pip install summarytools

Dependencies

  1. Python 3.6+
  2. pandas >= 1.4.0
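
The minimums listed above can be checked before installing. The following is a minimal sketch rather than part of summarytools: the helper names `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical, and `importlib.metadata` is assumed to be available (Python 3.8 or later).

```python
import sys
from importlib import metadata


def version_tuple(version):
    """Turn '1.5.3' into (1, 5, 3); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '1.4' so tuple comparison behaves as expected.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, required):
    """Return True when the installed version is at least the required one."""
    return version_tuple(installed) >= version_tuple(required)


def check_environment():
    """Report whether Python and pandas satisfy the minimums listed above."""
    python_ok = sys.version_info >= (3, 6)
    try:
        pandas_ok = meets_minimum(metadata.version("pandas"), "1.4.0")
    except metadata.PackageNotFoundError:
        pandas_ok = False  # pandas is not installed at all
    return {"python": python_ok, "pandas": pandas_ok}


print(check_environment())
```

A `False` entry in the printed dictionary points at the dependency to upgrade first.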

Quick Start

A quick-start notebook is available in the repository and can also be opened in Colab.

Out of the box, the dfSummary function generates an HTML-based data frame summary.

import pandas as pd
from summarytools import dfSummary
titanic = pd.read_csv('./data/titanic.csv')
dfSummary(titanic)

Collapsible summary

import pandas as pd
from summarytools import dfSummary
titanic = pd.read_csv('./data/titanic.csv')
dfSummary(titanic, is_collapsible=True)

Tabbed summary

import pandas as pd
from summarytools import dfSummary, tabset
titanic = pd.read_csv('./data/titanic.csv')
vaccine = pd.read_csv('./data/country_vaccinations.csv')
vaccine['date'] = pd.to_datetime(vaccine['date'])

tabset({
    'titanic': dfSummary(titanic).render(),
    'vaccine': dfSummary(vaccine).render()})

Export notebook as HTML

When exporting a Jupyter notebook to HTML, make sure the Export Embedded HTML extension is installed and enabled.

Use the following bash command to retain the data frame summary in the exported HTML.

jupyter nbconvert --to html_embed path/of/your/notebook.ipynb
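
The same export can be scripted from Python. This is a minimal sketch that only wraps the command shown above; `build_export_command` and `export_notebook` are hypothetical helper names, and it assumes `jupyter` is on the PATH with the Export Embedded HTML extension enabled.

```python
import subprocess


def build_export_command(notebook_path):
    """Build the argument list for the nbconvert call shown above."""
    return ["jupyter", "nbconvert", "--to", "html_embed", str(notebook_path)]


def export_notebook(notebook_path):
    """Run the export; raises CalledProcessError if nbconvert fails."""
    subprocess.run(build_export_command(notebook_path), check=True)


# Show the command that would be executed for a given notebook.
print(" ".join(build_export_command("path/of/your/notebook.ipynb")))
```

Passing the arguments as a list avoids shell quoting problems when the notebook path contains spaces.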

jupyter-summarytools's People

Contributors

6chaoran


jupyter-summarytools's Issues

Does not work in Google Colab

Unfortunately, the module does not work in Google Colab:
ImportError: cannot import name 'FilePath' from 'pandas._typing' (/usr/local/lib/python3.8/dist-packages/pandas/_typing.py)
pandas version: 1.5.3
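
One possible explanation, not confirmed in this report, is that the running kernel still had an older pandas loaded than the one pip installed, which can happen when a package is upgraded without restarting the runtime. The sketch below compares the two versions; `loaded_and_installed_versions` is a hypothetical helper, and `importlib.metadata` is assumed to be available (Python 3.8 or later).

```python
import sys
from importlib import metadata


def loaded_and_installed_versions(name):
    """Return (version imported in this process, version installed on disk)."""
    module = sys.modules.get(name)
    loaded = getattr(module, "__version__", None) if module is not None else None
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        installed = None
    return loaded, installed


loaded, installed = loaded_and_installed_versions("pandas")
if loaded and installed and loaded != installed:
    print("Loaded pandas", loaded, "differs from installed", installed)
    print("Restarting the runtime should pick up the installed version.")
else:
    print("loaded:", loaded, "installed:", installed)
```

If the two versions differ, restarting the runtime and importing summarytools again is worth trying before looking further.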

Incorrect summary (column names do not match the contents)

Consider the following code:

import pandas as pd
from summarytools import dfSummary

url = 'https://people.sc.fsu.edu/~jburkardt/data/csv/hw_200.csv'
df2 = pd.read_csv(url)
print(df2.head())

dfSummary(df2)

df2.mean()

Column names and summaries are mixed between several rows.

In this example, column 3 has no summaries:

image

And here it is visible that at least the mean is shown for the wrong variable:

image

Can this be fixed?

Contributions to summarytools

Hi @6chaoran ,
I love the original R package, and would have looked into making my own package if I hadn't found this one online.

How is this repo connected to the PyPI package that @Buckeyes2019 put up? Would you be interested in making this repo a package that gets pushed to PyPI? I couldn't find it in a fork from @Buckeyes2019.

If from summarytools.summarytools import dfSummary is not ideal, we can make some changes to the package structure to get from summarytools import dfSummary. Let me know if you're open to any changes or contributions, or I can make something similar on my own.

Thanks again for the port!

Converted repo to a PyPI package

I really like your work here but wanted to make the summarytools features a little easier to use, so I packaged your repo into a PyPI library. See here: https://pypi.org/project/summarytools/

Users can now pip install it rather than having to clone the repo into individual project directories. The usage changed very slightly (to "from summarytools.summarytools import dfSummary" rather than "from summarytools import dfSummary"), but it seems to work properly.
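
The difference between the two import forms comes down to whether the package's `__init__.py` re-exports the function. The following sketch illustrates the general mechanism with a throwaway package written to a temporary directory; `demo_summarypkg`, `core`, and `df_summary` are hypothetical names and do not reflect the actual layout of summarytools.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package name used only for this demonstration.
PACKAGE = "demo_summarypkg"

root = Path(tempfile.mkdtemp())
pkg_dir = root / PACKAGE
pkg_dir.mkdir()

# Inner module: mirrors the "package.module" layout described above.
(pkg_dir / "core.py").write_text(
    "def df_summary(data):\n"
    "    return 'summary of %d rows' % len(data)\n"
)

# Re-export in __init__.py so the function is importable from the package root.
(pkg_dir / "__init__.py").write_text("from .core import df_summary\n")

# Make the temporary directory importable.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Long form: from demo_summarypkg.core import df_summary
long_form = importlib.import_module(PACKAGE + ".core").df_summary
# Short form: from demo_summarypkg import df_summary
short_form = importlib.import_module(PACKAGE).df_summary

print(long_form is short_form)
print(short_form([1, 2, 3]))
```

With the re-export in place, both import paths resolve to the same function object, so existing code using the longer form keeps working.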
