lab-cosmo / chemiscope

An interactive structure/property explorer for materials and molecules

Home Page: http://chemiscope.org

License: BSD 3-Clause "New" or "Revised" License

TypeScript 68.17% HTML 5.06% CSS 2.42% JavaScript 0.38% Python 22.45% TeX 1.45% Shell 0.07%

visualization molecule materials-science web hacktoberfest

chemiscope's Introduction

Chemiscope: interactive structure-property explorer for materials and molecules

Chemiscope is a graphical tool for the interactive exploration of materials and molecular databases, correlating local and global structural descriptors with the physical properties of the different systems. It is also a library of reusable components for creating new interfaces.

Citing chemiscope

Chemiscope is distributed under an open-source license, and you are welcome to use it and incorporate it into your own research and software projects. If you find it useful, we would appreciate a citation to the chemiscope paper:

G. Fraux, R. K. Cersonsky, M. Ceriotti, Chemiscope: Interactive Structure-Property Explorer for Materials and Molecules. Journal of Open Source Software 5 (51), 2117 (2020)

If you incorporate chemiscope components into a software project, a link back to the chemiscope homepage (https://chemiscope.org) is the preferred form of acknowledgement.

Documentation

In particular, you may be interested in how to create a visualization of your own dataset.

If you would like to generate a simple chemiscope for your dataset, we have a Google Colab notebook that can help!

Getting help for using chemiscope

If you need help using chemiscope, either as a JavaScript/TypeScript library inside your own project or when creating input files for the default visualizer at https://chemiscope.org, you can open a GitHub issue with your question or send an email to the developers (their addresses are listed on the lab webpage: https://www.epfl.ch/labs/cosmo/people/).

Getting the python package and using chemiscope in Jupyter notebooks

Using chemiscope in a Jupyter notebook should be as easy as

pip install chemiscope

This also allows you to generate chemiscope JSON files that can be viewed on http://chemiscope.org

If you need to build and install a development version, you should have all the npm stack installed, and then just run

git clone https://github.com/lab-cosmo/chemiscope
cd chemiscope
pip install .

Getting and running the web app locally

git clone https://github.com/lab-cosmo/chemiscope
cd chemiscope
npm install
npm start

# navigate to localhost:8080

Building the code to use it in other projects

git clone https://github.com/lab-cosmo/chemiscope
cd chemiscope
npm install
npm run build

# Include dist/chemiscope.min.js or dist/molecule-viewer.min.js
# in your own web page

See [app/] or the documentation for examples of how to create a webpage using chemiscope.

License and contributions

If you are interested in contributing to chemiscope, please have a look at our contribution guidelines

Chemiscope itself is distributed under the 3-Clause BSD license. By contributing to this repository, you agree to distribute your contributions under the same license.

chemiscope's Issues

Center selected environment in viewer

Add a button (perhaps to the right of the environment indicator) that would center the JSMol widget on the selected environment.
In JSMol language, this seems to be as simple as
select (*)[index]
centerAt average

Add pinning to chemiscope to enable comparisons

Enable chemiscope to remember the last few items (or pinned items) so that we can enable comparison between structures within the projection.

1st round: click-enabled list of structure/environment id's to revisit
nth round: movable click-enabled static pngs of previously visited structures/environments (see proxy)

Enable Selective Opacity

convert all 2D colormaps to RGBA format
disable RGBA format for 3D colormaps until plotly can support

Refactor, cleanup and improvement linked to the pinning feature

Taking the list from #25 review

Do not call select({-1, -1}, guid) to indicate removal; instead add specific functions for each possible action: addPinned, removePinned, changeActivePinned. Then select always acts on the current active viewer and only moves the marker/changes the structure
Remove the starterGUID parameter for PropertyMap constructor to allow using the map without a structure viewer. This should be easy to do after the above changes
Extract code dealing with selected markers (things like classList.toggle('chsp-active-structure-marker', false)) to a separate HTMLMarker class doing all of this.

On a related note, src/map/map.ts is getting quite large, so we should try to extract functionalities and potentially move it out of this file.

Symbols get messed up with multiple selections

Using symbols to display categorical data interferes with selection highlighting, particularly when using multiple structure panels.

Expected behavior:
Highlighted structures are shown using the same symbol used for the category, which is just made a bit bigger and colored according to the structure panel color key.

Observed behavior:
in 2D, a circle is used regardless of the underlying class
in 3D (even more problematic) the symbol of the active panel is used for all the selected structures (so that a structure with a square symbol might turn into a cross if I select a cross structure as the active one)

Might be linked to #50 and #11

Make the separation between the library and chemiscope.org clearer

This should be made more visible in the documentation.

Another way to achieve this is to move the DefaultVisualizer code to /app and point to it as an example of how to link widgets together.

map color range is not updated when the text boxes are edited

Seems that changing the color range manually does not trigger an update of the plotly map

Use Plotly GroupBy instead of separate traces for symbols

See: https://plot.ly/javascript/group-by/

We currently use one additional empty trace for each symbol to be displayed. Using GroupBy instead would make the code simpler and might improve performance.

Property table is missing a scrollbar when there are too many properties

For example, when loading the QM9 KPCovR map and then clicking on the Structure XXX button, there are too many properties to fit in the space occupied by the structure viewer, and there is no scrollbar to get back to the properties at the top.

This should be a simple fix to set overflow CSS property to scroll instead of hidden.

Environment disabling setting is lost when changing structure

If one clicks on "disable" for the "Environment" setting in the structure visualizer, the setting is reset when loading a new structure.

Default visualization settings in JSON input

For the default interface, we could have a way to specify a visualization state in the input JSON file.

On the map side, this means describing which property should be used as color/size/symbols, x/y/z ranges, etc.

On the structure viewer side, this means supercell settings, visualizations settings, etc.

This can be implemented by adding serialization/de-serialization of the settings to JSON, and having an additional section for these in the JSON input file.

Loader menu in standalone viewer

Previously the standalone script would by default hide the loader when a data file was included in the html.
Now (1) the hiding code doesn't work anymore and (2) I question whether it makes sense to hide it, given that being able to save visualization state is useful. This issue is to track (1) and discuss on (2)

Active widget doesn't reset correctly on re-rendering

See videos

Add properties metadata

Adding description and units for all properties in the dataset would be good

Places where we can display them:

use name (unit) on the axis of the map (just name if unit is undefined or empty)
show a tooltip with the description when selecting properties in the map setting? Not sure if this is possible
show description in a tooltip in the "info" panel
show unit in the "info" panel

Do you see other metadata we would want to attach to the properties?
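
The axis-label rule from the list above (name (unit), or just name when the unit is undefined or empty) can be sketched as follows. This is only an illustration in Python, not chemiscope's TypeScript code, and `axis_label` is a hypothetical helper name.

```python
from typing import Optional


def axis_label(name: str, unit: Optional[str] = None) -> str:
    """Return "name (unit)", or just "name" if the unit is undefined or empty."""
    if unit is None or unit.strip() == "":
        return name
    return f"{name} ({unit})"


print(axis_label("energy", "eV"))  # energy (eV)
print(axis_label("PCA 1"))         # PCA 1
```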

Write an example/recipe for going from structure to chemiscope input

There should be an example (maybe in https://github.com/cosmo-epfl/kernel-tutorials/) on how to go from structure to chemiscope, using SOAP and PCA (maybe KPCovR).

Add chemiscope publication to reference list

Reverse size property in the map

When something like energy is used as a property for dot size in the map, the most interesting points are the ones with the lowest energy, but they end up being the smallest.

We could provide an option to use -<property> instead of <property> for the size to support this use case.
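
A minimal sketch of what such an option could compute, assuming a linear mapping from property values to marker sizes. The function name, the `reverse` flag and the size bounds are all hypothetical, and the code is Python for illustration only.

```python
from typing import List


def marker_sizes(values: List[float], reverse: bool = False,
                 min_size: float = 5.0, max_size: float = 20.0) -> List[float]:
    """Linearly map property values to marker sizes.

    With reverse=True the property is negated first, so the lowest
    values get the largest markers.
    """
    if not values:
        return []
    if reverse:
        values = [-v for v in values]
    low, high = min(values), max(values)
    if high == low:
        # all values are identical: use a single intermediate size
        return [(min_size + max_size) / 2.0] * len(values)
    scale = (max_size - min_size) / (high - low)
    return [min_size + (v - low) * scale for v in values]


energies = [-3.0, -1.0, 1.0]
print(marker_sizes(energies))                # [5.0, 12.5, 20.0]
print(marker_sizes(energies, reverse=True))  # [20.0, 12.5, 5.0]
```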

Save camera zoom and orientation as settings in structure viewer & 3D map

As it says on the tin! We already save the zoom level in 2D with axis min/max values.

Getting these values and applying them should be easy with JSmol (we already have a save/apply orientation setting), I don't know if this is feasible with plotly though.

Add clear indication that github issues can be used for feature request

Both on this repo and in the website

Issue when loading multiple structures rapidly in different viewers

When loading different structures in multiple viewers in the same page, it looks like some global state is not updated on the JSmol side, leading to the wrong structure being loaded in some viewers.

We mostly see this when applying saved settings containing multiple viewers, and slowing down loading seems to fix it:
https://github.com/cosmo-epfl/chemiscope/blob/acdae4d83cac62da0308c5b0d87046306a89c8bb/src/index.ts#L339-L347

We should still go for an actual fix as much as possible since the fix above makes everything very slow, and even 1s delay is not always enough.

Some of the structure settings are dropped when switching structure

Regardless of the choice of "trajectory" settings, some options such as packed cells, or the style for the environments, are reset when loading a new structure by clicking on the map or on the trajectory sliders.

Multi-dimensional Properties in Viewer

Multidimensional properties could be very nice to visualize. We can see two use-cases that could be implemented with the same infrastructure:

DOS plots, where the property is interesting to plot by itself as well
parametric properties (e.g. everything that depends on alpha in KPCovR), where we would want a slider to change the alpha value and have the changes reflected in the projection/coloring/size/etc.

The same code could also be used to store (and display in the future) vector/tensorial properties as 3/9-values vectors.

1st milestone:

decide & implement the data structure as stored in JSON. Something like this should work:

"properties": {
    "<name>": {
        // array of arrays, of dimension N_structure/N_environments x p where p is the size of the parameter array below
        "values": [[...], [...]],
        // list of parameters, 1 to start but potentially multiple parameters later for 2+D properties
        "parameters": ["parameter 1"]
    }
},
// parameters above refer to the values in this separate table so that multiple properties can use the same parameters
"parameters": {
    "parameter 1":  number[],  // p1 elements
    "parameter 2": number[],  // p2 elements
}

handle non-scalar properties within code (i.e. make sure it doesn't break things)
visualization - disable non-scalars for all plot styling / info bar

2nd milestone:

provide necessary checks for the structure of data
process 1D properties within code
visualize 1D properties in plot mode

3rd milestone:

provide slider for 1D property parameter --> plot styling or info bar

We may want to extend this to 2+D properties in the future, the current proposal should be forward compatible with this.
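
The "necessary checks for the structure of data" from the second milestone could look roughly like the sketch below. It assumes the JSON layout proposed above (a property with a "parameters" list referring to the top-level "parameters" table) and only handles 1D properties. `check_multidimensional` is a hypothetical name, and the code is Python for illustration.

```python
from typing import Any, Dict, List


def check_multidimensional(dataset: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means the data is consistent."""
    problems = []
    parameters = dataset.get("parameters", {})
    for name, prop in dataset.get("properties", {}).items():
        names = prop.get("parameters", [])
        if not names:
            continue  # scalar property, nothing to check here
        if len(names) != 1:
            problems.append(f"{name}: only 1D properties are handled in this sketch")
            continue
        if names[0] not in parameters:
            problems.append(f"{name}: unknown parameter '{names[0]}'")
            continue
        # every structure/environment needs one value per parameter point
        expected = len(parameters[names[0]])
        for i, row in enumerate(prop["values"]):
            if len(row) != expected:
                problems.append(
                    f"{name}: entry {i} has {len(row)} values, expected {expected}"
                )
    return problems


dataset = {
    "properties": {
        "dos": {"values": [[0.1, 0.2, 0.3], [0.4, 0.5]], "parameters": ["energy grid"]},
        "energy": {"values": [1.0, 2.0]},
    },
    "parameters": {"energy grid": [-1.0, 0.0, 1.0]},
}
print(check_multidimensional(dataset))
# ['dos: entry 1 has 2 values, expected 3']
```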

When viewed on a mobile device, the menu is not visible

If you reduce the width of the page, the top links and the loader box are collapsed into a hamburger menu. However, the menu items are invisible when you click the menu.

Reload of file with same name should trigger a refresh of the chemiscope display

Standalone viewer is broken with compressed JSON files

The standalone viewer is only able to read uncompressed JSON, and fails with a cryptic error message if one uses a compressed file instead. We should at least improve the error message in this case, or even better add pako to the standalone viewer to decompress the dataset.

restructure layout to make the standard interface fit without scrollbars

The idea would be to have two layouts depending on the aspect ratio of the window, resized automatically to fit within it. Also, the property listing for one frame should overlay the JSmol box, so that it always stays within the corresponding 1x1 block. If there are too many properties, there should be a small scrollbar within the overlay.
mockup:

Add a way to get the lib version and display it

Get the version as a git hash with git describe --tags --dirty or equivalent when compiling the TypeScript code to JavaScript, and then display it somewhere on the page.

Structure and environment info panels overlap with each other

Clicking on the "structure" info button when the "environment" info is displayed should hide the latter. Instead, it opens "behind" it.

Make creating chemiscope input easier

While #96 is a first step toward this, there are still multiple places where the process could be smoother.

I think ideally we should keep the dual workflow with a function that is
mirrored by a command-line utility. People from "my generation" still have
an instinct to go full bash onto postprocessing.
So I think we want to be able to easily combine structures (here having
something that can be read by ASE or an Atoms list seem to cover quite some
grounds) and arrays of values (that maps easily into column files) or
dicts.
One thing that often bugs me is that I want to drop info from the ASE file
so there could also be a switch that allows you to drop those fields.

Originally posted by @ceriottm in #96 (comment)

There are three main parts to a chemiscope input file: metadata, properties and structures.

The story to import structures into chemiscope is already pretty good, as long as you work with ASE =). Adding support for alternative file formats should be relatively easy and can be done on a case-by-case basis.

Properties is the harder part right now. We take the properties defined by ase in Atoms.info and Atoms.arrays, but the user may not want this (e.g. the number property), and may want more properties. For now, the only way to add other properties is to manually create the right dictionary and pass it to the function. Removing properties is also possible within python with del frame.info["whatever"] or del frame.arrays["whatever"], but not with the command line script.

Finally, the script supports basic metadata input, but again it is much easier to do this with the Python function.

One thing we can do is add support for properties stored in CSV/text/npy files. For CSV files the property name would be the CSV header; for the other formats we could just name properties 1, 2, 3, etc. We could easily guess the target (atom/structure) by counting the number of values in the property.

This will obviously not support any property metadata (description/units), but for quick & dirty command line scripting, or to separate analysis/chemiscope generation it could help.
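
A rough sketch of the CSV idea, assuming one property per column with the header as its name, and guessing the target by comparing the number of values with the number of structures and atoms. `guess_target` and `read_csv_properties` are hypothetical helpers, not part of the chemiscope package.

```python
import csv
import io
from typing import Any, Dict, TextIO


def guess_target(n_values: int, n_structures: int, n_atoms: int) -> str:
    """Guess whether a property is per-structure or per-atom from its length."""
    if n_values == n_structures:
        # if n_structures == n_atoms the guess is ambiguous; "structure" wins here
        return "structure"
    if n_values == n_atoms:
        return "atom"
    raise ValueError(
        f"got {n_values} values, expected {n_structures} (structures) or {n_atoms} (atoms)"
    )


def read_csv_properties(stream: TextIO, n_structures: int, n_atoms: int) -> Dict[str, Any]:
    """Read one property per CSV column, using the header row for property names."""
    reader = csv.reader(stream)
    header = [name.strip() for name in next(reader)]
    columns = [[] for _ in header]
    for row in reader:
        for column, value in zip(columns, row):
            column.append(float(value))
    return {
        name: {"target": guess_target(len(values), n_structures, n_atoms), "values": values}
        for name, values in zip(header, columns)
    }


text = "energy,volume\n-1.5,10.0\n-2.0,12.5\n"
properties = read_csv_properties(io.StringIO(text), n_structures=2, n_atoms=7)
print(properties["energy"])  # {'target': 'structure', 'values': [-1.5, -2.0]}
```

Raising an error when the count matches neither the structures nor the atoms seems safer than silently picking one, but that is a design choice for the real implementation.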

Highlight map points from the same structure

When in atom mode, it would be good to highlight on the map other points pertaining to the same structure as the currently selected atom.

A good way to do that would be to dim / grey out other points, using the same mechanism as #3.
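
Finding which points to keep highlighted is a small lookup, sketched here in Python under the assumption that each map point (one per atom-centered environment) knows the index of the structure it belongs to. `same_structure_mask` and its argument names are hypothetical.

```python
from typing import List


def same_structure_mask(structure_of_point: List[int], selected: int) -> List[bool]:
    """Return True for every map point in the same structure as the selected point."""
    target = structure_of_point[selected]
    return [structure == target for structure in structure_of_point]


# six environments spread over three structures; the point at index 3 is selected
print(same_structure_mask([0, 0, 1, 1, 1, 2], selected=3))
# [False, False, True, True, True, False]
```

Points where the mask is False would then be dimmed or greyed out.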

Introduce end to end testing

One of the reviewers' concerns on the JOSS paper was that we lack automated tests. We can do manual testing, but that means it can be hard to refactor without introducing bugs in corner cases. At the same time, chemiscope, being mostly a GUI, is not well suited to standard unit testing (checking public classes/functions one by one, single behavior by single behavior).

I think we should add some form of automated testing, and end-to-end testing might be the least bad option. The idea is to specify high-level behavior, such as: selecting a new property for axis x changes the plot and the axis title on the plot. Such tests might be painful to write, since we have to make them generic enough not to break on layout changes but specific enough to catch the issues; I still think they are worth it.

Cypress looks like a nice framework for such end to end testing: https://www.cypress.io/.

Closing a viewer can be blocked after trying to close it when it was the only viewer left

When you only have one viewer left:

You click on remove viewer. This will not work and a user message will appear (this is okay).
Then you duplicate the viewer
Try to close viewer which you tried to close earlier. This will not work. You can close other viewers but not that one

This will probably not happen to a lot of users, and if it happens you can just reload everything, so I would say this is low priority.

Failed to run `npm run download-example-input` when /tmp is on a different partition than chemiscope

When trying to download the example input following the install instructions, the download-example-input step fails when /tmp is on a different partition than the current working directory:

$ npm run download-example-input

> [email protected] download-example-input /home/kai/Documents/reviews/2020/JOSS-2117/chemiscope
> ts-node ./utils/download-example-input.ts

Cloning into '/tmp/tmp-173153-8SqGyGFytJyu/chemiscope'...
Error: EXDEV: cross-device link not permitted, rename '/tmp/tmp-173153-8SqGyGFytJyu/chemiscope/CSD-500.json.gz' -> './app/CSD-500.json.gz'
    at Object.renameSync (fs.js:756:3)
    at Object.<anonymous> (/home/kai/Documents/reviews/2020/JOSS-2117/chemiscope/utils/download-example-input.ts:14:8)
    at Module._compile (internal/modules/cjs/loader.js:1200:30)
    at Module.m._compile (/home/kai/Documents/reviews/2020/JOSS-2117/chemiscope/node_modules/ts-node/src/index.ts:858:23)
    at Module._extensions..js (internal/modules/cjs/loader.js:1220:10)
    at Object.require.extensions.<computed> [as .ts] (/home/kai/Documents/reviews/2020/JOSS-2117/chemiscope/node_modules/ts-node/src/index.ts:861:12)
    at Module.load (internal/modules/cjs/loader.js:1049:32)
    at Function.Module._load (internal/modules/cjs/loader.js:937:14)
    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:71:12)
    at main (/home/kai/Documents/reviews/2020/JOSS-2117/chemiscope/node_modules/ts-node/src/bin.ts:227:14)

Googling the error message leads me to a Stack Overflow post suggesting that fs.rename uses a syscall that doesn't work across partitions/mount points/filesystems.
In my case, /home is on a different filesystem than /tmp, and thus the above error happens.
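
The usual workaround is to fall back to copy-then-delete when the rename fails with EXDEV. The script in question is TypeScript; the sketch below shows the same pattern in Python, where `os.rename` fails in the same way. `move_file` is a hypothetical helper (in real Python code, `shutil.move` already implements this fallback).

```python
import errno
import os
import shutil


def move_file(src: str, dst: str) -> None:
    """Rename src to dst, falling back to copy + delete across filesystems."""
    try:
        os.rename(src, dst)
    except OSError as error:
        if error.errno != errno.EXDEV:
            raise
        # rename(2) cannot cross mount points: copy the data, then remove the source
        shutil.copy2(src, dst)
        os.remove(src)
```

With this, moving a file from a temporary directory under /tmp into ./app/ would work whether or not the two live on the same filesystem.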

Scroll to zoom uses different directions for map and structure

Scrolling in one direction increases zoom level on the map but decreases the one for the structure. This is surprising for the users.

I don't know if plotly or jsmol can be configured to use "reverse" zoom. If not, a possibility would be to intercept the wheel event and invert the sign before allowing handling by the libraries.

Sparse environment list

Using NaNs to indicate environments that are not linked to points on the map is cumbersome and inefficient. The indexing already allows for a sparse list; it only needs to be handled on the JSMol side. Also related to #4

Filter points by property values

Instead of displaying the full dataset in the map all the time, we could give the user the ability to filter which points to show based on the property values (e.g. energy < 345).

The filtered-out points should not be removed, but rather dimmed out/greyed on the map, so that selecting an environment from the structure or the slider still works.

The hard question is how the user would input the filters. A first version can use range filters like

[ MIN ] <= [ $property ▼] <= [ MAX ]

with MIN and MAX number input and $property a dropdown select element.

Then a second step would be to combine range filters on different properties with or/and.

Another solution would be to have a selection language, but this might be overkill.
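
As a sketch of the first two steps (range filters, then and/or combination), the filters can produce one boolean per point, which the map would use to dim points rather than remove them. `RangeFilter` and `combine` are hypothetical names, and the code is Python for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RangeFilter:
    """Keep points where minimum <= property value <= maximum."""
    name: str
    minimum: float
    maximum: float

    def mask(self, properties: Dict[str, List[float]]) -> List[bool]:
        return [self.minimum <= value <= self.maximum for value in properties[self.name]]


def combine(masks: List[List[bool]], mode: str = "and") -> List[bool]:
    """Combine per-filter masks point by point with 'and' or 'or'."""
    if mode not in ("and", "or"):
        raise ValueError(f"unknown mode: {mode}")
    reduce = all if mode == "and" else any
    return [reduce(flags) for flags in zip(*masks)]


properties = {"energy": [100.0, 300.0, 400.0], "volume": [10.0, 20.0, 30.0]}
low_energy = RangeFilter("energy", 0.0, 345.0).mask(properties)
large_volume = RangeFilter("volume", 15.0, 40.0).mask(properties)

print(combine([low_energy, large_volume], "and"))  # [False, True, False]
print(combine([low_energy, large_volume], "or"))   # [True, True, True]
```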

Switch between 'atom' and 'structure' display mode

If the dataset contains both atom and structure properties, we currently default to showing the 'atom' ones, but the user should have a way to switch between the two modes.

The display mode is already centralized in the EnvironmentIndexer, this mainly needs to be a setting somewhere.

Add a loading indicator in the examples

On slow connections, loading one of the examples can take multiple seconds. A small loading indicator would show the users that the code is doing something.

Improve metadata

The only metadata being used currently is the dataset name. We should add more, I think at least

author(s) string[]
references string[] (journal in which this was published/DOI)
description string

Anything else?

The metadata could be hidden by default, and displayed when clicking on the dataset name. Or this could be moved to a separate component, to be displayed on top of the other ones.

Extract selected markers from map

Extract code dealing with selected markers (things like classList.toggle('chsp-active-structure-marker', false)) to a separate HTMLMarker class doing all of this.

Add support for LaTeX rendering of text

As discussed in #88 (comment) and following comments, it would be very nice to support LaTeX syntax in user-facing data. The core use case for this is rendering units. Another appealing case is to render dataset description with some math inside. The alternative is to use unicode math characters (in particular unicode superscripts) where needed.

One solution to do this that would integrate relatively well with plotly is to use https://www.mathjax.org/ for latex rendering. Another alternative is https://katex.org/, which is usually faster and smaller than mathjax, although I don't know if it works with plotly.

If we want to do this, one thing to consider is that we would have to bundle mathjax/katex, which would increase the size of the bundled javascript and might interfere with downstream users who already have a LaTeX renderer installed (e.g. materials cloud). This is my only objection to this feature: it might not be worth the slower loading time for marginally more convenient unit & description math input.

Also, we currently render the dataset description & references using markdown syntax, so we have to make sure not to break it when rendering latex. This should be fine, we mostly have to check that it is not broken when implementing this.

Places where we may want to have latex rendering:

dataset name
dataset description
dataset references (maybe not really useful)
property name
property description
property unit

Things to look at before starting the implementation:

What is the size difference between MathJax and KaTeX?
Can we use KaTeX with Plotly?
How does markdown-it play with mathjax/katex?

install, and easy import of write_chemiscope_input utils

I often find myself using ugly code such as

import sys
from os.path import expanduser
# make the utils directory of a local chemiscope checkout importable
sys.path.append(expanduser('~')+'/lavoro/code/chemiscope/utils')
from chemiscope_input import write_chemiscope_input

to be able to write chemiscopes from my analysis notebooks.
I think it would be nice to have a pip package that installs the "utils" section of the chemiscope package, to facilitate its usage.
I was kind of torn as to whether these utils should rather live in a separate repo, but I actually think it makes sense for them to be associated with chemiscope, as that will simplify ensuring that the utils and the actual viewer stay in sync.

In-tree pip install is broken

Trying to run pip install . from a source checkout fails with a "could not find package.json" error. Installing from a pre-generated sdist (which is what happens with pip install chemiscope) is fine, so this should not impact most users.

The core of the issue is that we use the same version number for the Python package and the npm package, so python setup.py reads the version number from package.json, which is symlinked as python/package.json. pip install . tries to isolate the build and copies the python/ directory to a temporary location, where it cannot find the symlinked file.

python setup.py install and the versions uploaded to PyPI work fine, so I don't think we have to worry about this.

This is tracked upstream in pypa/pip#3500

Random errors when clicking randomly on the JSMol frame

You may say that I'm getting what I deserve, but if you randomly click (click and drag while pressing left and right button in a random sequence seems to trigger this consistently) on the JSMol window, an error message appears

Add "download as standalone" button in default interface

We already have the ability to generate a "standalone" visualization as an HTML file; the idea is to make it easier for users to download such a file including their dataset.

building the file automatically on Travis & deploying it to github pages
when the user clicks on the button, download standalone.html, stringify the dataset back to JSON (including visualization state #6) and add it at the end of the file.
open a download dialog from Javascript for the user to save the file.

Warning should be raised when log scale is used for negative values

Hide subset of points with symbol

It would be nice to be able to hide/grey out points in the map by selecting a given symbol/clicking on it on the legend.

This could be implemented in such a way to be able to also use it for #4.

Allow to save the structure viewer output as PNG

This should be easy, since all viewers are built on canvas elements, which can be rendered to an image with canvas.toDataURL.

This should be nicer to use than using screenshots to extract structures views.

Typo in tutorial "input file format for chemiscope"

In the first pseudo script under the section "Creating an input file"
the ase.io import needs to be changed to

import ase.io

same for sklearn import

import sklearn.decomposition

(or you change the usage later in the pseudo script)

Add input sanitation to strings

It is possible to set meta.name to <script>alert('got you')</script>, or any other arbitrary HTML, which is unfortunate. We should add HTML sanitation when checking the datasets before trying to display them, to remove potential visual breakage or attacks.
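
The simplest form of this is to escape HTML special characters in every user-provided string before inserting it into the page. The sketch below shows the idea in Python with the standard library's `html.escape`; the real check would live in chemiscope's TypeScript code, and `sanitize` is a hypothetical name. Fields rendered as markdown (description, references) would need a dedicated sanitizer rather than plain escaping.

```python
import html


def sanitize(text: str) -> str:
    """Escape <, >, & and quotes so the text is displayed, never interpreted as HTML."""
    return html.escape(text, quote=True)


print(sanitize("<script>alert('got you')</script>"))
# &lt;script&gt;alert(&#x27;got you&#x27;)&lt;/script&gt;
```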