janosh / pymatviz

A toolkit for visualizations in materials informatics.

Home Page: https://janosh.github.io/pymatviz

License: MIT License

Python 93.93% CSS 1.72% HTML 0.57% Svelte 2.55% TypeScript 0.83% JavaScript 0.40%
machine-learning materials-informatics data-visualization uncertainty uncertainty-calibration plots matplotlib plotly materials-science python

pymatviz's Introduction

pymatviz

A toolkit for visualizations in materials informatics.

This project supports Python 3.9+.

If you use pymatviz in your research, see how to cite.

Installation

pip install pymatviz

API Docs

See the /api page.

Usage

See the Jupyter notebooks under examples/ for how to use pymatviz. PRs with additional examples are welcome! 🙏

matbench_dielectric_eda.ipynb
mp_bimodal_e_form.ipynb
matbench_perovskites_eda.ipynb
mprester_ptable.ipynb

Periodic Table

See pymatviz/ptable.py. Heatmaps of the periodic table can be plotted both with matplotlib and plotly. plotly supports displaying additional data on hover or full interactivity through Dash.

ptable_heatmap(compositions, log=True) ptable_heatmap_ratio(comps_a, comps_b)
ptable-heatmap ptable-heatmap-ratio
ptable_heatmap_plotly(atomic_masses) ptable_heatmap_plotly(compositions, log=True)
ptable-heatmap-plotly-more-hover-data ptable-heatmap-plotly-log
ptable_hists(data, colormap="coolwarm") ptable_plots(data, colormap="coolwarm")
ptable-hists ptable-plots

Phonons

See pymatviz/phonons.py.

plot_phonon_bands(bands_dict) plot_phonon_dos(doses_dict)
phonon-bands phonon-dos
plot_phonon_bands_and_dos(bands_dict, doses_dict) plot_phonon_bands_and_dos(single_bands, single_dos)
phonon-bands-and-dos-mp-2758 phonon-bands-and-dos-mp-23907

Dash app using ptable_heatmap_plotly()

See examples/mprester_ptable.ipynb.

2022-07-28-ptable_heatmap_plotly-dash-example.mp4

Sunburst

See pymatviz/sunburst.py.

spacegroup_sunburst([65, 134, 225, ...]) spacegroup_sunburst(["C2/m", "P-43m", "Fm-3m", ...])
spg-num-sunburst spg-symbol-sunburst

Sankey

See pymatviz/sankey.py.

sankey_from_2_df_cols(df_perovskites) sankey_from_2_df_cols(df_rand_ints)
sankey-spglib-vs-aflow-spacegroups sankey-from-2-df-cols-randints

Structure

See pymatviz/structure_viz.py. Currently structure plotting is only supported with matplotlib in 2d. 3d interactive plots (probably with plotly) are on the road map.

plot_structure_2d(mp_19017) plot_structure_2d(mp_12712)
struct-2d-mp-19017-Li4Mn0.8Fe1.6P4C1.6O16-disordered struct-2d-mp-12712-Hf9Zr9Pd24-disordered

matbench-phonons-structures-2d

Histograms

See pymatviz/histograms.py.

spacegroup_hist([65, 134, 225, ...], backend="matplotlib") spacegroup_hist(["C2/m", "P-43m", "Fm-3m", ...], backend="matplotlib")
spg-num-hist-matplotlib spg-symbol-hist-matplotlib
spacegroup_hist([65, 134, 225, ...], backend="plotly") spacegroup_hist(["C2/m", "P-43m", "Fm-3m", ...], backend="plotly")
spg-num-hist-plotly spg-symbol-hist-plotly
elements_hist(compositions, log=True, bar_values='count')
elements-hist

Parity Plots

See pymatviz/parity.py.

density_scatter(xs, ys, ...) density_scatter_with_hist(xs, ys, ...)
density-scatter density-scatter-with-hist
density_hexbin(xs, ys, ...) density_hexbin_with_hist(xs, ys, ...)
density-hexbin density-hexbin-with-hist
scatter_with_err_bar(xs, ys, yerr, ...) residual_vs_actual(y_true, y_pred, ...)
scatter-with-err-bar residual-vs-actual

Uncertainty

See pymatviz/uncertainty.py.

qq_gaussian(y_true, y_pred, y_std) qq_gaussian(y_true, y_pred, y_std: dict)
normal-prob-plot normal-prob-plot-multiple
error_decay_with_uncert(y_true, y_pred, y_std) error_decay_with_uncert(y_true, y_pred, y_std: dict)
error-decay-with-uncert error-decay-with-uncert-multiple

Cumulative Metrics

See pymatviz/cumulative.py.

cumulative_error(preds, targets) cumulative_residual(preds, targets)
cumulative-error cumulative-residual

Classification

See pymatviz/relevance.py.

roc_curve(targets, proba_pos) precision_recall_curve(targets, proba_pos)
roc-curve precision-recall-curve

Correlation

See pymatviz/correlation.py.

marchenko_pastur(corr_mat, gamma=ncols/nrows) marchenko_pastur(corr_mat_significant_eval, gamma=ncols/nrows)
marchenko-pastur marchenko-pastur-significant-eval

How to cite pymatviz

See citation.cff or cite the Zenodo record using the following BibTeX entry:

@software{riebesell_pymatviz_2022,
  title = {Pymatviz: visualization toolkit for materials informatics},
  author = {Riebesell, Janosh and Goodall, Rhys and Baird, Sterling G.},
  date = {2022-10-01},
  year = {2022},
  doi = {10.5281/zenodo.7486816},
  url = {https://github.com/janosh/pymatviz},
  note = {10.5281/zenodo.7486816 - https://github.com/janosh/pymatviz},
  urldate = {2023-01-01}, % optional, replace with your date of access
  version = {0.8.1}, % replace with the version you use
}

pymatviz's People

Contributors

comprhys, danielyang59, jageo, janosh, pre-commit-ci[bot], sgbaird, tinaatucsd

pymatviz's Issues

Periodic table heatmap raises error for values = 1 for `log=True`

I get an error when using ptable_heatmap_plotly with a dataset with element prevalence = 1.

Maybe we could modify the following logic to allow displaying that?

    if log and values.dropna()[values != 0].min() <= 1:
        smaller_1 = values[values <= 1]
        raise ValueError(
            "Log color scale requires all heat map values to be > 1 since values <= 1 "
            f"map to negative log values which throws off the color scale. Got "
            f"{smaller_1.size} values <= 1: {dict(smaller_1)}"
        )

This is the error I get:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[61], line 3
      1 from pymatviz import ptable_heatmap_plotly
----> 3 ptable_heatmap_plotly(df_grouped_formula['formula'], log=True)

File /opt/conda/lib/python3.10/site-packages/pymatviz/ptable.py:552, in ptable_heatmap_plotly(values, count_mode, colorscale, showscale, heat_mode, precision, hover_props, hover_data, font_colors, gap, font_size, bg_color, color_bar, cscale_range, exclude_elements, log, fill_value, label_map, **kwargs)
    550 if log and values.dropna()[values != 0].min() <= 1:
    551     smaller_1 = values[values <= 1]
--> 552     raise ValueError(
    553         "Log color scale requires all heat map values to be > 1 since values <= 1 "
    554         f"map to negative log values which throws off the color scale. Got "
    555         f"{smaller_1.size} values <= 1: {dict(smaller_1)}"
    556     )
    558 if heat_mode in ("fraction", "percent"):
    559     # normalize heat values
    560     clean_vals = values.replace([np.inf, -np.inf], np.nan).dropna()

ValueError: Log color scale requires all heat map values to be > 1 since values <= 1 map to negative log values which throws off the color scale. Got 8 values <= 1: {'Al': 1.0, 'Ti': 1.0, 'Mn': 1.0, 'Fe': 1.0, 'Y': 1.0, 'Te': 1.0, 'Pt': 1.0, 'Au': 1.0}
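
Since log(1) = 0 rather than a negative number, one possible relaxation of the check quoted above is to reject only non-zero values strictly below 1. The sketch below illustrates that idea on a plain dict. `check_log_values` is a hypothetical helper, not part of pymatviz, and whether a color scale that starts at exactly 0 renders well would still need checking.

```python
import math


def check_log_values(values: dict) -> None:
    """Raise only if a non-zero, non-NaN value is strictly below 1.

    log10(1) == 0, so a value of exactly 1 sits at the bottom of the
    color scale instead of mapping to a negative number.
    """
    too_small = {
        elem: val
        for elem, val in values.items()
        if not math.isnan(val) and val != 0 and val < 1
    }
    if too_small:
        raise ValueError(
            "Log color scale requires all non-zero heat map values to be >= 1. "
            f"Got {len(too_small)} values < 1: {too_small}"
        )


# Element counts like the ones in the error message above now pass
check_log_values({"Al": 1.0, "Ti": 1.0, "O": 12.0})

# Values strictly between 0 and 1 are still rejected
try:
    check_log_values({"H": 0.5})
except ValueError as exc:
    error = str(exc)
```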

ERROR: Could not install packages due to an OSError: [WinError 5] Access is denied

@sp8rks and I both ran into this issue recently.

(mpds-gpt3) PS C:\Users\sterg\Documents\GitHub\ramseyissa\SSMCDAT-2023> pip install pymatviz
Collecting pymatviz
  Downloading pymatviz-0.5.2.tar.gz (46 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 46.2/46.2 kB 2.2 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Requirement already satisfied: matplotlib>=3.6.2 in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from pymatviz) (3.6.3)
Requirement already satisfied: numpy>=1.21.0 in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from pymatviz) (1.24.1)
Requirement already satisfied: pandas in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from pymatviz) (1.5.3)
Collecting plotly
  Downloading plotly-5.12.0-py2.py3-none-any.whl (15.2 MB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 15.2/15.2 MB 21.1 MB/s eta 0:00:00
Collecting pymatgen
  Downloading pymatgen-2023.1.20-cp310-cp310-win_amd64.whl (10.2 MB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 10.2/10.2 MB 12.3 MB/s eta 0:00:00
Collecting scikit-learn
  Using cached scikit_learn-1.2.0-cp310-cp310-win_amd64.whl (8.2 MB)
Requirement already satisfied: scipy in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from pymatviz) (1.10.0)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from matplotlib>=3.6.2->pymatviz) (1.4.4)
Requirement already satisfied: fonttools>=4.22.0 in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from matplotlib>=3.6.2->pymatviz) (4.38.0)
Requirement already satisfied: python-dateutil>=2.7 in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from matplotlib>=3.6.2->pymatviz) (2.8.2)
Requirement already satisfied: contourpy>=1.0.1 in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from matplotlib>=3.6.2->pymatviz) (1.0.7)
Requirement already satisfied: pillow>=6.2.0 in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from matplotlib>=3.6.2->pymatviz) (9.4.0)
Requirement already satisfied: packaging>=20.0 in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from matplotlib>=3.6.2->pymatviz) (22.0)
Requirement already satisfied: pyparsing>=2.2.1 in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from matplotlib>=3.6.2->pymatviz) (3.0.9)
Requirement already satisfied: cycler>=0.10 in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from matplotlib>=3.6.2->pymatviz) (0.11.0)
Requirement already satisfied: pytz>=2020.1 in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from pandas->pymatviz) (2022.7)
Collecting tenacity>=6.2.0
  Using cached tenacity-8.1.0-py3-none-any.whl (23 kB)
Requirement already satisfied: requests in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from pymatgen->pymatviz) (2.28.1)
Collecting pybtex
  Using cached pybtex-0.24.0-py2.py3-none-any.whl (561 kB)
Collecting uncertainties>=3.1.4
  Using cached uncertainties-3.1.7-py2.py3-none-any.whl (98 kB)
Collecting palettable>=3.1.1
  Using cached palettable-3.3.0-py2.py3-none-any.whl (111 kB)
Collecting spglib>=2.0.2
  Downloading spglib-2.0.2-cp310-cp310-win_amd64.whl (289 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 289.2/289.2 kB ? eta 0:00:00
Collecting mp-api>=0.27.3
  Downloading mp_api-0.30.8-py3-none-any.whl (71 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 71.2/71.2 kB ? eta 0:00:00
Requirement already satisfied: tqdm in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from pymatgen->pymatviz) (4.64.1)
Collecting networkx>=2.2
  Downloading networkx-3.0-py3-none-any.whl (2.0 MB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 2.0/2.0 MB 25.7 MB/s eta 0:00:00
Collecting monty>=3.0.2
  Using cached monty-2022.9.9-py3-none-any.whl (66 kB)
Collecting sympy
  Using cached sympy-1.11.1-py3-none-any.whl (6.5 MB)
Collecting tabulate
  Using cached tabulate-0.9.0-py3-none-any.whl (35 kB)
Collecting ruamel.yaml>=0.17.0
  Using cached ruamel.yaml-0.17.21-py3-none-any.whl (109 kB)
Collecting threadpoolctl>=2.0.0
  Using cached threadpoolctl-3.1.0-py3-none-any.whl (14 kB)
Collecting joblib>=1.1.1
  Using cached joblib-1.2.0-py3-none-any.whl (297 kB)
Collecting msgpack
  Downloading msgpack-1.0.4-cp310-cp310-win_amd64.whl (61 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 61.3/61.3 kB ? eta 0:00:00
Requirement already satisfied: typing-extensions>=3.7.4.1 in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from mp-api>=0.27.3->pymatgen->pymatviz) (4.4.0)
Collecting emmet-core>=0.39.8
  Downloading emmet_core-0.39.11-py3-none-any.whl (235 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 235.4/235.4   15.0 MB/s eta 0:00:00
                    kB
Requirement already satisfied: setuptools in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from mp-api>=0.27.3->pymatgen->pymatviz) (65.6.3)
Requirement already satisfied: six>=1.5 in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from python-dateutil>=2.7->matplotlib>=3.6.2->pymatviz) (1.16.0)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from requests->pymatgen->pymatviz) (2022.12.7)      
Requirement already satisfied: idna<4,>=2.5 in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from requests->pymatgen->pymatviz) (3.4)
Requirement already satisfied: charset-normalizer<3,>=2 in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from requests->pymatgen->pymatviz) (2.0.4)    
Requirement already satisfied: urllib3<1.27,>=1.21.1 in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from requests->pymatgen->pymatviz) (1.26.14)     
Collecting ruamel.yaml.clib>=0.2.6
  Using cached ruamel.yaml.clib-0.2.7-cp310-cp310-win_amd64.whl (111 kB)
Collecting future
  Downloading future-0.18.3.tar.gz (840 kB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 840.9/840.9 kB ? eta 0:00:00
  Preparing metadata (setup.py) ... done
Requirement already satisfied: PyYAML>=3.01 in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from pybtex->pymatgen->pymatviz) (6.0)
Collecting latexcodec>=1.0.4
  Using cached latexcodec-2.0.1-py2.py3-none-any.whl (18 kB)
Collecting mpmath>=0.19
  Using cached mpmath-1.2.1-py3-none-any.whl (532 kB)
Requirement already satisfied: colorama in c:\users\sterg\miniconda3\envs\mpds-gpt3\lib\site-packages (from tqdm->pymatgen->pymatviz) (0.4.6)
Collecting pydantic>=1.10.2
  Downloading pydantic-1.10.4-cp310-cp310-win_amd64.whl (2.1 MB)
     โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” 2.1/2.1 MB 26.5 MB/s eta 0:00:00
Collecting numpy>=1.21.0
  Using cached numpy-1.23.5-cp310-cp310-win_amd64.whl (14.6 MB)
Building wheels for collected packages: pymatviz, future
  Building wheel for pymatviz (setup.py) ... done
  Created wheel for pymatviz: filename=pymatviz-0.5.2-py2.py3-none-any.whl size=45963 sha256=53cc3ae18297c9aa72aba2e9f871192b9ea4c41b47b127d75398a785fc7abcdf        
  Stored in directory: c:\users\sterg\appdata\local\pip\cache\wheels\b7\3d\81\f00ea9b7928810cdb738a3abe7b22915074e86897fd96b99b8
  Building wheel for future (setup.py) ... done
  Created wheel for future: filename=future-0.18.3-py3-none-any.whl size=492025 sha256=cd417f1bb5bb574f9bb8705572d2297a4539a31b1b1f64c7b3980d1f2317aac2
  Stored in directory: c:\users\sterg\appdata\local\pip\cache\wheels\69\c0\ce\f2a18105d619f21239a048bcc58e98d8ce47ac824e0531f1a0
Successfully built pymatviz future
Installing collected packages: palettable, msgpack, mpmath, threadpoolctl, tenacity, tabulate, sympy, ruamel.yaml.clib, pydantic, numpy, networkx, monty, latexcodec, joblib, future, uncertainties, spglib, ruamel.yaml, pybtex, plotly, scikit-learn, emmet-core, mp-api, pymatgen, pymatviz
  Attempting uninstall: numpy
    Found existing installation: numpy 1.24.1
    Uninstalling numpy-1.24.1:
      Successfully uninstalled numpy-1.24.1
ERROR: Could not install packages due to an OSError: [WinError 5] Access is denied: 'C:\\Users\\sterg\\miniconda3\\envs\\mpds-gpt3\\Lib\\site-packages\\~umpy\\.libs\\libopenblas64__v0.3.21-gcc_10_3_0.dll'
Consider using the `--user` option or check the permissions.

One workaround is to open a terminal in administrator mode and run the install command.

Incompatible with Google Colab (Python 3.7)

!pip install ml-matrics
from ml_matrics.parity import density_hexbins
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-1-48d4b62e713f> in <module>()
----> 1 from ml_matrics.parity import density_hexbins

1 frames
/usr/local/lib/python3.7/dist-packages/ml_matrics/__init__.py in <module>()
      1 from .correlation import marchenko_pastur, marchenko_pastur_pdf
      2 from .cumulative import add_dropdown, cum_err, cum_res
----> 3 from .elements import (
      4     count_elements,
      5     hist_elemental_prevalence,

/usr/local/lib/python3.7/dist-packages/ml_matrics/elements.py in <module>()
      1 from __future__ import annotations
      2 
----> 3 from typing import TYPE_CHECKING, Any, Literal, Sequence
      4 
      5 import matplotlib.pyplot as plt

ImportError: cannot import name 'Literal' from 'typing' (/usr/lib/python3.7/typing.py)

---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------
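
typing.Literal was only added in Python 3.8, which is why the import fails on Colab's Python 3.7. One common workaround is a version-guarded import that falls back to the typing_extensions backport. This is a sketch that assumes typing_extensions is installed on 3.7; it is not necessarily how the package resolved the issue, and the `Backend` alias is purely illustrative.

```python
import sys

if sys.version_info >= (3, 8):
    from typing import Literal
else:  # Python 3.7: Literal only exists in the backport package
    from typing_extensions import Literal

# Hypothetical type alias showing how Literal is typically used
Backend = Literal["matplotlib", "plotly"]
```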

[Enhancement] Separate data preprocessing from plotters

Separate data preprocessing from plotters

Previously proposed in #81 (comment), it might be good to separate data preprocessing from the plotters. The preprocessing helpers could be private, so users could still pass any input format and the change would be invisible to them. This could hopefully resolve #131 (comment) too.

Suggestions

Currently almost every plotter accepts various types of data, at the cost of the plotters being very complex and containing repeated code. I would suggest making each plotter handle only a single (or very few) data type and moving the following data processing to dedicated utilities:

  • Data type conversion to numpy.array or pandas.DataFrame (or some other preferred type)
  • Missing value imputation (could wrap scikit-learn)
  • Anomalous value handling (NaN or infinity)

Potential Impact

I don't expect this to be breaking (or even visible to users), but it would certainly be a lot of work, as almost the entire code base would need to be refactored.

Include example notebooks in CI tests

Might be a good idea to set up CI testing for the example notebooks with nbmake.

This would require adding more optional deps to pyproject.toml (dash, jupyter-dash, nbmake, etc.) and running `pytest --nbmake examples/**/*.ipynb` in a GitHub workflow. Finally, we should check the results of MPRester requests into the repo so as not to burden the MP API.

[Feature] [Self-assigned] Add heatmap plotter

Hi @janosh, I happen to have some plotting scripts for my MPhil research which includes plotting heatmaps. Shall I proceed and try to merge them into your package?

For example:
headmap1

And this (I didn't find a good way to fix these overlapping labels though):
heatmap2

Inconsistency in `assets` after renaming `ptable_scatters` to `ptable_plots`

It's great to extend ptable_scatters to ptable_plots, but:

  • The following cell seems to be unused:
    # %% Scatter plots laid out as a periodic table
    data_dict = {
        elem.symbol: [
            np.random.randint(0, 20, 10),
            np.random.randint(0, 20, 10),
            np.random.randint(0, 20, 10),
        ]
        for elem in Element
    }
    fig = ptable_plots(
        data_dict,
        colormap="coolwarm",
        cbar_title="Periodic Table Scatter Plots",
        plot_kwds=dict(marker="o", linestyle=""),
    )
    save_and_compress_svg(fig, "ptable-scatters")
  • The example for ptable_plots now resides at examples/diatomics/homo-nuclear-mace-medium.svg, which differs from where the other assets live. Maybe create a symlink for consistency?

I can open a PR to address these later.

can't clone `ml-matrics` - `error: invalid path 'data/mp-n_elements<2.csv'`

(base) PS C:\Users\sterg\Documents\GitHub\sparks-baird> git clone https://github.com/janosh/ml-matrics.git
Cloning into 'ml-matrics'...
remote: Enumerating objects: 916, done.
remote: Counting objects: 100% (220/220), done.
remote: Compressing objects: 100% (157/157), done.
remote: Total 916 (delta 136), reused 134 (delta 57), pack-reused 696
Receiving objects: 100% (916/916), 3.88 MiB | 1.23 MiB/s, done.
Resolving deltas: 100% (640/640), done.
error: invalid path 'data/mp-n_elements<2.csv'
fatal: unable to checkout working tree
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'

(base) PS C:\Users\sterg\Documents\GitHub\sparks-baird> cd ml-matrics
(base) PS C:\Users\sterg\Documents\GitHub\sparks-baird\ml-matrics> git status
On branch main
Your branch is up to date with 'origin/main'.

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        deleted:    .github/workflows/publish.yml
        deleted:    .github/workflows/svgo.yml
        deleted:    .github/workflows/test.yml
        deleted:    .gitignore
        deleted:    .pre-commit-config.yaml
        deleted:    assets/cumulative_error.svg
        deleted:    assets/cumulative_residual.svg
        deleted:    assets/density_hexbin.svg
        deleted:    assets/density_hexbin_with_hist.svg
        deleted:    assets/density_scatter.svg
        deleted:    assets/density_scatter_with_hist.svg
        deleted:    assets/err_decay.svg
        deleted:    assets/err_decay_multiple.svg
        deleted:    assets/hist_elemental_prevalence.svg
        deleted:    assets/hist_elemental_prevalence_log_count.svg
        deleted:    assets/marchenko_pastur.svg
        deleted:    assets/marchenko_pastur_rank_deficient.svg
        deleted:    assets/marchenko_pastur_significant_eval.svg
        deleted:    assets/normal_prob_plot.svg
        deleted:    assets/normal_prob_plot_multiple.svg
        deleted:    assets/precision_recall_curve.svg
        deleted:    assets/ptable_heatmap.svg
        deleted:    assets/ptable_heatmap_log.svg
        deleted:    assets/ptable_heatmap_log_cbar_max.svg
        deleted:    assets/ptable_heatmap_percent.svg
        deleted:    assets/ptable_heatmap_plotly.html
        deleted:    assets/ptable_heatmap_plotly.svg
        deleted:    assets/ptable_heatmap_plotly_custom_color_scale.html
        deleted:    assets/ptable_heatmap_plotly_custom_color_scale.svg
        deleted:    assets/ptable_heatmap_plotly_more_hover_data.html
        deleted:    assets/ptable_heatmap_plotly_more_hover_data.svg
        deleted:    assets/ptable_heatmap_plotly_no_labels.html
        deleted:    assets/ptable_heatmap_plotly_no_labels.svg
        deleted:    assets/ptable_heatmap_plotly_percent_labels.html
        deleted:    assets/ptable_heatmap_plotly_percent_labels.svg
        deleted:    assets/ptable_heatmap_ratio.svg
        deleted:    assets/ptable_heatmap_ratio_inverse.svg
        deleted:    assets/residual_hist.svg
        deleted:    assets/residual_vs_actual.svg
        deleted:    assets/roc_curve.svg
        deleted:    assets/scatter_with_err_bar.svg
        deleted:    assets/spacegroup_hist.svg
        deleted:    assets/spacegroup_hist_no_counts.svg
        deleted:    assets/spacegroup_sunburst.html
        deleted:    assets/spacegroup_sunburst.svg
        deleted:    assets/spacegroup_sunburst_percent.html
        deleted:    assets/spacegroup_sunburst_percent.svg
        deleted:    assets/true_pred_hist.svg
        deleted:    data/elem_counts_1.csv
        deleted:    data/elem_counts_2.csv
        deleted:    data/ex-ensemble-roost.csv
        deleted:    data/matbench-phonons.csv
        deleted:    data/mp-n_elements<2.csv
        deleted:    data/rand_clf.csv
        deleted:    data/rand_regr.csv
        deleted:    data/rand_tall_matrix.csv
        deleted:    data/rand_wide_matrix.csv
        deleted:    license
        deleted:    ml_matrics/__init__.py
        deleted:    ml_matrics/correlation.py
        deleted:    ml_matrics/cumulative.py
        deleted:    ml_matrics/elements.csv
        deleted:    ml_matrics/elements.py
        deleted:    ml_matrics/histograms.py
        deleted:    ml_matrics/parity.py
        deleted:    ml_matrics/quantile.py
        deleted:    ml_matrics/ranking.py
        deleted:    ml_matrics/relevance.py
        deleted:    ml_matrics/sunburst.py
        deleted:    ml_matrics/utils.py
        deleted:    readme.md
        deleted:    scripts/fetch_mp_data.py
        deleted:    scripts/generate_assets.py
        deleted:    scripts/generate_rand_data.py
        deleted:    setup.cfg
        deleted:    setup.py
        deleted:    tests/__init__.py
        deleted:    tests/test_correlation.py
        deleted:    tests/test_cumulative.py
        deleted:    tests/test_elements.py
        deleted:    tests/test_histograms.py
        deleted:    tests/test_parity.py
        deleted:    tests/test_quantile.py
        deleted:    tests/test_ranking.py
        deleted:    tests/test_relevance.py
        deleted:    tests/test_sunburst.py
        deleted:    tests/test_utils.py

(base) PS C:\Users\sterg\Documents\GitHub\sparks-baird\ml-matrics> git restore --source=HEAD :/
error: invalid path 'data/mp-n_elements<2.csv'
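
The checkout fails because < is one of the characters Windows does not allow in file names (< > : " | ? * and the path separators). Below is a small sketch of how tracked paths could be scanned for such names before they are committed. `windows_invalid_chars` is a hypothetical helper, and it ignores other Windows restrictions such as reserved names like CON.

```python
# Characters that Windows rejects inside a single file or directory name
WINDOWS_INVALID_CHARS = set('<>:"\\|?*')


def windows_invalid_chars(posix_path: str) -> set:
    """Return the characters in a '/'-separated path that Windows rejects."""
    invalid = set()
    for part in posix_path.split("/"):
        invalid |= WINDOWS_INVALID_CHARS & set(part)
    return invalid


paths = ["data/rand_clf.csv", "data/mp-n_elements<2.csv"]

# Collect only the paths that would break a checkout on Windows
bad_paths = {}
for path in paths:
    chars = windows_invalid_chars(path)
    if chars:
        bad_paths[path] = chars
```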

Changing the edge color of atoms

Using from pymatviz import plot_structure_2d, we can plot atoms, but is there a way to change the edge or face color of the atoms through the arguments of plot_structure_2d()? In the source code, Wedge() is already defined with an edge color. Also, setting xlabel or ylabel seems to have no effect. Is there an option for that?

Feature: Visualizing Molecules

Hi @janosh ,

do you have any plans on supporting visualizations of molecules?

Of course, the following code block achieves the same purpose, but maybe it could be done more naturally?

from pymatgen.core import Molecule
from pymatviz.structure_viz import plot_structure_2d

mol = Molecule(species=["N", "O", "O"],
               coords=[[1.151500, -0.665600, 0.000000], [2.303000, 0.000000, 0.000000], [0.000000, 0.000000, 0.000000]])
plot_structure_2d(mol.get_boxed_structure(15, 15, 15), show_unit_cell=False)

For a porphyrin, it does look quite nice (except that my mol file does not seem to have H atoms):
image

`plot_structure_2d()` show bonds

Had a random thought. As I was looking through some of the images, I noticed that it's difficult to get a sense of depth and, by extension, the relative distances between things. Perhaps semi-transparent "tubes" that vary in size based on the bond distance would help. Or maybe that's something better suited for visualization via VESTA. I'm thinking about using plot_structure_2d extensively for an upcoming manuscript.

No pressure - just a random passing thought.

Behavior of multiple `plot_structure_2d()` in one `plt.figure`

Not really a bug, just something I noticed. Due to (I think) the following line:

ax = ax or plt.gca()

I get the somewhat unintuitive behavior where the plots overlap each other:

from pymatviz.structure_viz import plot_structure_2d
plots = train_structures.apply(plot_structure_2d)

image

plots then seems to contain empty objects.

plots
146     AxesSubplot(0.1981,0.125;0.6288x0.755)
925     AxesSubplot(0.1981,0.125;0.6288x0.755)
1282    AxesSubplot(0.1981,0.125;0.6288x0.755)
Name: structure, dtype: object
plots.iloc[0]
<AxesSubplot:>

By itself works fine:

plot_structure_2d(train_structures.iloc[0])

image

[Enhancement] Decouple nested periodic table plotters

Proposal

Decouple the periodic table projector from the nested plotters (make them modular). This would make the plotters easier for users to access (currently every ptable plotter takes too many arguments) and easier for us to maintain and extend. Ideally users could use the projector on its own and plug in any nested plotter they wish, instead of having to ask us to add a separate plotter.

This started from the discussion in #131 (comment). It's still a vague idea right now but should become clearer as the discussion goes on.

Suggestion

Separate ptable plotters into two parts:

  • The nested unit plotter (for example scatter), which carries its own style arguments (scatter size, color and so on).
  • A periodic table projector, which simply arranges the unit plotters in a periodic table layout and carries the global style arguments (colorbar, figure title, ...). Potentially make it modular so that users can plug in any plotter they wish (and we could provide some pre-built plotters).
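
To make the two parts concrete, here is a minimal sketch of the idea with strings standing in for actual plots. All names (`ELEMENT_POSITIONS`, `hist_unit`, `ptable_project`) are hypothetical and only four elements are laid out; it shows the shape of the interface, not a proposed implementation.

```python
from typing import Callable, Dict, List, Tuple

# Zero-based (row, column) tile positions for a few elements;
# a real projector would cover the whole periodic table
ELEMENT_POSITIONS: Dict[str, Tuple[int, int]] = {
    "H": (0, 0),
    "He": (0, 17),
    "Li": (1, 0),
    "Be": (1, 1),
}


def hist_unit(values: List[float]) -> str:
    """Example unit plotter: owns its own style, knows nothing about layout."""
    return f"hist(n={len(values)})"


def ptable_project(
    data: Dict[str, List[float]],
    unit_plotter: Callable[[List[float]], str],
) -> Dict[Tuple[int, int], str]:
    """Projector: place the output of any unit plotter on each element's tile."""
    return {
        ELEMENT_POSITIONS[symbol]: unit_plotter(values)
        for symbol, values in data.items()
    }


tiles = ptable_project({"H": [1.0, 2.0], "Li": [3.0]}, hist_unit)
```

Swapping `hist_unit` for any other callable with the same signature changes what is drawn in each tile without touching the layout code.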

Potential Impact

This is very likely to be a badly breaking change (but I think it's necessary and beneficial).

Rename ml-matrics to pymatviz

Not entirely happy with the name ml-matrics anymore. Thinking of changing it to mainvis for materials informatics visualizations. Not sure if it's really better, but at least it seems to represent the current focus of this package more accurately.

Let me know if you think the new name is terrible @CompRhys @sgbaird. Otherwise will change the name next week maybe.

Handle lists and arrays as `x`, `y` in `density_scatter` and siblings

This simple example

import numpy as np

from pymatviz import density_scatter


arr = np.arange(5)
lst = list(range(5))

ax = density_scatter(x=arr, y=arr)
ax = density_scatter(x=lst, y=lst)

embarrassingly raises

File ~/dev/pymatviz/pymatviz/parity.py:43, in hist_density(x, y, df, sort, bins)
     39 x, y = df_to_arrays(df, x, y)
     41 data, x_e, y_e = np.histogram2d(x, y, bins=bins)
---> 43 zs = scipy.interpolate.interpn(
     44     (0.5 * (x_e[1:] + x_e[:-1]), 0.5 * (y_e[1:] + y_e[:-1])),
     45     data,
     46     np.vstack([x, y]).T,
     47     method="splinef2d",
     48     bounds_error=False,
     49 )
...
    611 result[idx_valid] = interp.ev(xi[idx_valid, 0], xi[idx_valid, 1])
--> 612 result[np.logical_not(idx_valid)] = fill_value
    614 return result.reshape(xi_shape[:-1])

ValueError: cannot convert float NaN to integer
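
The final message is what Python raises when NaN is cast to an integer. A plausible reading (an assumption, not confirmed in the issue) is that the integer inputs produce an integer result array that cannot hold the NaN fill value. The stdlib sketch below reproduces the message and shows one possible workaround: casting inputs to float before they reach the interpolation. `to_floats` is a hypothetical helper.

```python
import math


def to_floats(values):
    """Cast a sequence of numbers to floats so NaN is representable."""
    return [float(val) for val in values]


# Casting NaN to int reproduces the message at the end of the traceback
try:
    int(float("nan"))
except ValueError as exc:
    message = str(exc)

xs = to_floats(range(5))
xs[0] = math.nan  # fine: floats can hold the NaN fill value
```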

Better docs

We should have a few example Jupyter notebooks as documentation of how to use the functions in pymatviz to do EDA e.g. on the matbench data sets.

New plot type: `ptable_scatter`

Add ptable_scatter equivalent to ptable_hists. Function signature can be largely identical except each element's data must specify both x and y data. Could potentially have a single function that toggles between hist and scatter depending on presence of y data.
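
The toggle in the last sentence could look roughly like the sketch below, which decides per element whether the data is a flat list of values or an (x, y) pair. `infer_plot_kind` is a hypothetical helper, and treating any two nested lists or tuples as (x, y) is an assumption about the input format.

```python
from typing import Sequence


def infer_plot_kind(elem_data: Sequence) -> str:
    """Return "scatter" for a pair of equal-length (x, y) sequences, else "hist"."""
    is_pair = len(elem_data) == 2 and all(
        isinstance(item, (list, tuple)) for item in elem_data
    )
    if is_pair:
        xs, ys = elem_data
        if len(xs) != len(ys):
            raise ValueError(f"x and y lengths differ: {len(xs)} != {len(ys)}")
        return "scatter"
    return "hist"


kinds = {
    "Fe": infer_plot_kind([1, 2, 3]),  # values only
    "O": infer_plot_kind([[1, 2, 3], [4, 5, 6]]),  # x and y data
}
```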

pymatviz/pymatviz/ptable.py

Lines 745 to 795 in 882a6ca

def ptable_hists(
    data: pd.DataFrame | pd.Series | dict[str, list[float]],
    bins: int = 20,
    colormap: str | None = None,
    hist_kwds: dict[str, Any]
    | Callable[[Sequence[float]], dict[str, Any]]
    | None = None,
    cbar_coords: tuple[float, float, float, float] = (0.18, 0.8, 0.42, 0.02),
    x_range: tuple[float | None, float | None] | None = None,
    symbol_kwargs: Any = None,
    symbol_text: str | Callable[[Element], str] = lambda elem: elem.symbol,
    cbar_title: str = "Values",
    cbar_title_kwds: dict[str, Any] | None = None,
    cbar_kwds: dict[str, Any] | None = None,
    symbol_pos: tuple[float, float] = (0.5, 0.8),
    log: bool = False,
    anno_kwds: dict[str, Any] | None = None,
    return_axes: bool = False,
    **kwargs: Any,
) -> plt.Figure:
    """Plot histograms of values across the periodic table of elements.

    Args:
        data (pd.DataFrame | pd.Series | dict[str, list[float]]): Map from element
            symbols to histogram values. E.g. if dict, {"Fe": [1, 2, 3], "O": [4, 5]}.
            If pd.Series, index is element symbols and values lists. If pd.DataFrame,
            column names are element symbols histograms are plotted from each column.
        bins (int): Number of bins for the histograms. Defaults to 20.
        colormap (str): Matplotlib colormap name to use. Defaults to None. See options
            at https://matplotlib.org/stable/users/explain/colors/colormaps.
        hist_kwds (dict | Callable): Keywords passed to ax.hist() for each histogram.
            If callable, it is called with the histogram values for each element and
            should return a dict of keyword arguments. Defaults to None.
        cbar_coords (tuple[float, float, float, float]): Color bar position and size:
            [x, y, width, height] anchored at lower left corner of the bar. Defaults to
            (0.25, 0.77, 0.35, 0.02).
        x_range (tuple[float | None, float | None]): x-axis range for all histograms.
            Defaults to None.
        symbol_text (str | Callable[[Element], str]): Text to display for each element
            symbol. Defaults to lambda elem: elem.symbol.
        symbol_kwargs (dict): Keyword arguments passed to plt.text() for element
            symbols. Defaults to None.
        cbar_title (str): Color bar title. Defaults to "Histogram Value".
        cbar_title_kwds (dict): Keyword arguments passed to cbar.ax.set_title().
            Defaults to dict(fontsize=12, pad=10).
        cbar_kwds (dict): Keyword arguments passed to fig.colorbar().
        symbol_pos (tuple[float, float]): Position of element symbols relative to the
            lower left corner of each tile. Defaults to (0.5, 0.8). (1, 1) is the upper
            right corner.
        log (bool): Whether to log scale y-axis of each histogram. Defaults to False.
        anno_kwds (dict): Keyword arguments passed to plt.annotate() for element
