smtg-ucl / galore

Gaussian and Lorentzian smearing of simulated spectra

License: GNU General Public License v3.0

Introduction

Galore is a package which applies Gaussian and Lorentzian broadening to data from ab initio calculations. The two main intended applications are:

- Gaussian and Lorentzian broadening of electronic density-of-states, with orbital weighting to simulate UPS/XPS/HAXPES measurements.
- Application of Lorentzian instrumental broadening to simulated Raman spectra from DFPT calculations.

Documentation

A brief overview is given in this README file. A full manual, including tutorials and API documentation, is available online at readthedocs.io. You can build a local version using Sphinx with make html from the docs directory of this project.

A brief formal overview of the background and purpose of this code has been published in The Journal of Open Source Software.

Usage

Broadening, weighting and plotting are accessed with the galore program. For full documentation of the command-line flags, please use the in-built help:

galore -h

Instrumental broadening

Data may be provided as a set of X,Y coordinates in a text file of comma-separated values (CSV). Whitespace-separated data is also readable, in which case a .txt file extension should be used.

To plot a CSV file to the screen with default Lorentzian broadening (2 cm^-1), use the command:

galore MY_DATA.csv -l -p

and to plot to a file with more generous 10 cm^-1 broadening:

galore MY_DATA.csv -l 10 -p MY_PLOT.png


Other file formats are supported, including IR and Raman intensity simulation output. See the Tutorials for usage examples.
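As a rough illustration of what Lorentzian broadening does to a set of X,Y points, here is a minimal standard-library sketch. It is not Galore's implementation (which uses NumPy convolution); the function names and the two peaks are made up for the example, and the width is treated as a full width at half maximum.

```python
import math

def lorentzian(x, x0, gamma):
    """Unit-area Lorentzian centred on x0, with gamma as the FWHM."""
    half = 0.5 * gamma
    return half / (math.pi * ((x - x0) ** 2 + half ** 2))

def broaden(peaks, grid, gamma=2.0):
    """Sum one Lorentzian per (position, intensity) peak on the given grid."""
    return [sum(y * lorentzian(x, x0, gamma) for x0, y in peaks) for x in grid]

# Two hypothetical Raman peaks: (position in cm^-1, intensity)
peaks = [(100.0, 1.0), (150.0, 0.5)]
grid = [0.5 * i for i in range(501)]  # 0 to 250 cm^-1 in 0.5 cm^-1 steps
spectrum = broaden(peaks, grid, gamma=10.0)
```

Summing one curve per peak is equivalent to convolving a stick spectrum with the line shape; it is simply easier to read in a short example.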

Photoelectron spectra

UPS, XPS or HAXPES spectra can be simulated using Galore. This requires several inputs:

- Orbital-projected density of states data.
  - This may be provided as an output file from the VASP or GPAW codes.
  - Formatted text files may also be used.
- Instrumental broadening parameters. The Lorentzian and Gaussian broadening widths are input by the user as before.
- Photoionization cross section data, which is used to weight the contributions of different orbitals.
  - Galore includes data for valence band orbitals at Al Kα (XPS) and He II (UPS) energies, drawn from a more extensive table computed by Yeh and Lindau (1985). An alternative dataset may be provided as a JSON file; it is only necessary to include the elements and orbitals used in the DOS input files.
  - Cross-sections for high-energy (1-1500 keV) photons have been fitted from tabulated data computed by Scofield (1973).

See the Tutorials for a walkthrough using sample data.
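The weighting step described above can be sketched as follows. This is only an illustration of the idea (scale each orbital-projected DOS by a cross-section, then sum); the function name, the data layout and the numbers are assumptions for the example, not Galore's API or real cross-sections.

```python
def weight_pdos(pdos, cross_sections):
    """Scale each orbital's DOS by its cross-section and sum to a total.

    pdos maps an orbital label to a list of DOS values on a shared energy
    grid; orbitals without a cross-section are skipped.
    """
    n_points = len(next(iter(pdos.values())))
    total = [0.0] * n_points
    for orbital, dos in pdos.items():
        if orbital not in cross_sections:
            continue  # no weighting data for this orbital
        weight = cross_sections[orbital]
        for i, value in enumerate(dos):
            total[i] += weight * value
    return total

# Made-up DOS values and weights, purely for illustration
pdos = {"s": [0.0, 1.0, 0.0], "p": [1.0, 2.0, 3.0]}
weights = {"s": 2.0, "p": 0.5}
weighted = weight_pdos(pdos, weights)
```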

The orbital data can also be accessed without working on a particular spectrum with the galore-get-cs program. For example:

galore-get-cs 4 Sn O

will print a set of valence orbital weightings for Sn and O corresponding to a 4 keV hard X-ray source. These values have been converted from atomic orbital data to per-electron cross-sections.

The galore-plot-cs program is provided for plotting over a range of energies using the high-energy fitted data:

galore-plot-cs Pb S --emin 2 --emax 10 -o PbS.pdf

generates a publication-quality plot of cross-sections which may help in the selection of appropriate HAXPES energies for experiments with a given material.

Requirements

Galore is currently compatible with Python versions 3.8 and newer. Galore uses NumPy to apply convolution operations. Matplotlib is required for plotting.

Galore uses Pip and setuptools for installation. You probably already have these; if not, your GNU/Linux package manager will be able to oblige with a package named something like python-setuptools. On Mac OS X, the Python distributed with Homebrew includes setuptools and Pip.

Installation

Windows user installation

Anaconda is recommended for managing the Python environment and dependencies on Windows. From the Anaconda shell:

pip3 install .

Linux/Mac developer installation

From the directory containing this README:

pip3 install --user -e .

which installs an editable (-e) version of galore in your userspace. The executable program galore goes to a user directory like ~/.local/bin (which may need to be added to your PATH) and the galore library should be available on your PYTHONPATH. These are links to the project source folder, which you can continue to edit and update using Git.

To import data from VASP calculations you will need the Pymatgen library. If you don't have Pymatgen yet, the requirements can be added to the Galore installation by adding [vasp] to the pip command, e.g.:

pip3 install --user -e .[vasp]

Installation for documentation

If you need to build the documentation you can add [docs] to the pip command to ensure you have all the Sphinx requirements and extensions:

pip3 install --upgrade .[docs]

Support

If you're having trouble with Galore or think you've found a bug, please report it using the GitHub issue tracker. Issues can also be used for questions and discussion about the Galore methodology/implementation.

Development

This code is developed by the Scanlon Materials Theory Group based at University College London. Suggestions and contributions are welcome; please read the CONTRIBUTING guidelines and use the GitHub issue tracker.

How to cite Galore

If you use Galore in your research, please consider citing the following work:

Adam J. Jackson, Alex M. Ganose, Anna Regoutz, Russell G. Egdell, David O. Scanlon (2018). Galore: Broadening and weighting for simulation of photoelectron spectroscopy. Journal of Open Source Software, 3(26), 773, doi: 10.21105/joss.00773

Galore includes a machine-readable citation file in an emerging standard format with citation details for the actual code, but as conventions for software citation are still developing the JOSS paper is a more reliable method of giving credit.

License

Galore is made available under the GNU General Public License, version 3.

Acknowledgements

Development work by Adam J. Jackson took place in the course of research into new transparent conducting materials, led by David O. Scanlon and funded by EPSRC (project code EP/N01572X/1). Work by Alex M. Ganose was supported by a studentship co-sponsored by the Diamond Light Source at the EPSRC Centre for Doctoral Training in Molecular Modelling and Materials Science (EP/L01582/1). Anna Regoutz was our expert advisor on all things PES, guiding the feature-set and correcting the implementation of weighting, and was supported by an Imperial College Research Fellowship.

We acknowledge useful discussions with Alexey Sokol (who proposed that a code such as this would be useful), Katie Inzani, and Tim Veal. Feature requests and user testing came from Benjamin Williamson, Christopher Savory and Winnie L. Leung.

This would have been much more painful if not for the excellent scientific Python ecosystem, and the Python Materials Genome project spared us the pain of writing Yet Another Vasp Parser.

galore's Issues

Pymatgen dependency breaks installation on Python 3.4

See SMTG-Bham/sumo#62 for an identical problem with Sumo

How to change the font size of the labels, and stuff?

Dear developer

I would like to know which Python file I should look in to change the axis size, font and other plot settings. Thanks.

with kind regards,
Peter

ENH: Colours

An entirely reasonable feature request from @cnsavory .

There should be a way for users to specify a preferred colour palette. Some kind of config file would be more practical than endless command-line strings. I suggest a hierarchy of something like: [file specified as arg] > ./galore.conf > ~/.galore.conf > [defaults]
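The proposed lookup order could be resolved with something like the sketch below. The function name and its arguments are hypothetical; only the file names galore.conf and .galore.conf come from the suggestion above, and no such feature is claimed to exist in Galore.

```python
import os

def find_config(cli_path=None, cwd=".", home=None):
    """Return the first existing config file in precedence order, else None.

    Order: file given on the command line, ./galore.conf, ~/.galore.conf.
    A None result means the caller should fall back to built-in defaults.
    """
    home = os.path.expanduser("~") if home is None else home
    candidates = [
        cli_path,
        os.path.join(cwd, "galore.conf"),
        os.path.join(home, ".galore.conf"),
    ]
    for path in candidates:
        if path and os.path.isfile(path):
            return path
    return None
```

Taking cwd and home as arguments keeps the function easy to test without touching the real home directory.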

The absolute worst-case for number of lines would be something like a quaternary system with lots of f-orbitals. In this case the automatic palette should do something clever with line styles and markers.

Input text files for spin-polarised calculations

Is your feature request related to a problem? Please describe.

The output of spin-polarised DOS from sumo doesn't play nicely with the galore lookup for cross-sections.

Example, the O_dos file :
O_dos.dat.gz

If I run galore O_dos.dat --plot --pdos --weighting alka

I get Could not find cross-section data for element O, orbital sup. Skipping this orbital.

Describe the solution you'd like

I would like a solution whereby galore would seamlessly take spin-polarised output from sumo as input, in the same way that it can read the vasprun.xml.

Describe alternatives you've considered

I can alter the header of the sumo generated file, but this raises some other problems. E.g. if I change the header to: # energy s s p p d d

The code runs and plots, but I do get the following output, which seems like something has gone wrong:

  Orbital cross-section weights per electron:
   O s: 9.500e-04
Could not find cross-section data for element O, orbital s_1. Skipping this orbital.
   O p: 6.000e-05
Could not find cross-section data for element O, orbital p_1. Skipping this orbital.
Could not find cross-section data for element O, orbital d_1. Skipping this orbital.

Additional context

n/a
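One possible workaround, sketched here under loud assumptions, is to merge the two spin channels of each orbital before the cross-section lookup. The header labels ("sup", "sdown", and so on) are guessed from the error message above, the function is hypothetical, and both channels are assumed to be stored as positive values; if the spin-down channel is written with a negative sign, the magnitudes would need to be taken first.

```python
def merge_spin_channels(header, rows):
    """Sum columns that share an orbital label once spin suffixes are removed.

    ASSUMPTION: spin channels are labelled like 'sup'/'sdown' or
    's(up)'/'s(down)'; the real sumo header layout may differ.
    """
    def base(label):
        label = label.replace("(", "").replace(")", "")
        for suffix in ("down", "up"):
            if label.endswith(suffix) and len(label) > len(suffix):
                return label[: -len(suffix)]
        return label

    labels = [base(name) for name in header]
    merged = []
    for row in rows:
        combined = {}
        for label, value in zip(labels, row):
            combined[label] = combined.get(label, 0.0) + value
        merged.append(combined)
    return merged

header = ["energy", "sup", "sdown", "pup", "pdown"]
rows = [[-1.0, 0.25, 0.5, 1.0, 2.0]]  # made-up values
merged = merge_spin_channels(header, rows)
```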

Bug plotting PDOS

Describe the bug
Plotting PDOS is broken on master (1270488) with error message

    tdos = np.zeros(len(next(pdos_data.values())['energy']))
TypeError: 'ValuesView' object is not an iterator

To Reproduce
galore vasprun.xml --pdos -p

Expected behavior
Generate a plot and not an error message

To fix

- Fix the error. This is pretty easy: we just need to change the way the first entry is accessed. A dictionary view is iterable but is not itself an iterator, and it cannot be indexed, so tdos = np.zeros(len(next(iter(pdos_data.values()))['energy'])) should work.
- Add a test. This problem should have been caught by unit testing. We need to check this code runs without raising errors, even if inspecting the plotted output is hard.
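The TypeError can be reproduced and avoided without Galore or NumPy. The snippet below is a small stand-alone illustration with a made-up pdos_data dictionary; wrapping the values view in iter() gives next() the iterator it needs.

```python
# Made-up stand-in for the PDOS dictionary: element -> data on an energy grid
pdos_data = {
    "O": {"energy": [-2.0, -1.0, 0.0]},
    "Sn": {"energy": [-2.0, -1.0, 0.0]},
}

# dict.values() returns a view, which is iterable but not an iterator,
# so next() needs an explicit iter() around it.
first_entry = next(iter(pdos_data.values()))
n_points = len(first_entry["energy"])
```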

ability to input and scale experiment xps data

I don't know how easy it is to implement this, but it could be quite handy to scale and match the simulated XPS to the experimental XPS data in a plot, i.e.

- flipping the valence band XPS data so the x axis goes negative
- matching the peak intensities of the VB-XPS to the weighted DOS

Option to save galore-plot-cs output as a .csv file

Hello,

Currently galore has the option to provide a plot of the photoionisation cross sections for a specific range of elements using the galore-plot-cs function. However, the output is only provided as a .png file and the format appears to be Matplotlib.

Would it be possible to save the plot as a .csv, .xlsx or .txt file? This would be very useful so that I could plot the graph in a different plotting software (i.e. Origin).
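For illustration, tabular output of this kind could be written with the standard csv module as sketched below. The function, column names and numbers are placeholders invented for the example; this is not an existing galore-plot-cs option and the values are not real cross-sections.

```python
import csv
import io

def write_cross_sections(handle, energies, series):
    """Write one energy column followed by one column per orbital label."""
    labels = sorted(series)
    writer = csv.writer(handle)
    writer.writerow(["energy_keV"] + labels)
    for i, energy in enumerate(energies):
        writer.writerow([energy] + [series[label][i] for label in labels])

# Placeholder numbers only, not real cross-sections
energies = [2.0, 4.0]
series = {"Pb 6s": [1.0, 0.5], "S 3p": [0.2, 0.1]}
buffer = io.StringIO()
write_cross_sections(buffer, energies, series)
```

A file opened with open("PbS.csv", "w", newline="") could be passed in place of the in-memory buffer, and the result read directly by Origin or a spreadsheet.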

Thanks!

Make project publicly-viewable

I initially made the project private as we hadn't discussed it yet. Everyone seems fairly relaxed about making this generally available; are there any key milestones we should be trying to hit before public release or can we follow the "release early, release often" philosophy and open it up now? @scanlond @utf @badw @cnsavory

automatic latex symbols with Matplotlib

would it be possible for galore to automatically default to truetype font so the figure can be edited easily later? I know you can put this in the matplotlibrc file as pdf.fonttype = 42 but I have been spoilt by sumo which has it by default. ;)

apologies for my laziness!
merci,
Ben

Consider interface with LOBSTER

Currently Galore relies on the inbuilt orbital assignment of supported DFT codes, which may not be the best approach. The LOBSTER code can also be used to generate an alternative PDOS assignment; it would be worth evaluating if this is helpful for PES simulation.

galore needs to be added to pythonpath manually

Galore needs to be added to python path in order to load module 'galore', and doesn't do this automatically.

`--fill-between` option

Similar to issue #42, I think a --fill-between CLI option would be a nice addition to the functionality of Galore, where this gives an output XPS plot with a semi-transparent fill for each orbital contribution (as with sumo-dosplot) which would aid the clarity of the output plots.

I'll try to implement this in my local version of galore and make a PR soon.

`--no-total` and `--legend-cutoff` options like `sumo`

I think CLI options --no-total (to hide the solid black line of the total XPS spectrum) and --legend-cutoff options as available with sumo-dosplot would be nice additions to the functionality of Galore.

They would be useful options to improve the clarity of output XPS plots, as the orbital character of the XPS peaks is often hidden by the solid black line (if only a single orbital contribution dominates at a given energy range, as is typical), and the legend often includes many orbital contributions that are negligible in the XPS spectrum.


MP API and a bug

Thanks for the great package. I have been using it to plot XPS and I would like to suggest a few things:

- Allow direct analysis from a CompleteDOS from the Materials Project API (this can be obtained via the pymatgen.ext.matproj.MPRester). I did try using the CompleteDOS object as an input to process_pdos (which the documentation indicated as being supported), but I think the code logic is wrong because the code expects a string. I think analysis from a pymatgen CompleteDOS is probably the most robust way to code the whole thing because you can then get the CompleteDOS from any source, including LOBSTER (which is another feature request).
- The offset argument in plot_pdos does nothing as far as I can tell. This is actually needed because otherwise the binding energies are wrong.

Galore does not handle comments in KPOINTS files

Describe the bug

If the KPOINTS file from a VASP calculation exists and contains comments beginning with #, Galore throws the following error:

  File "/usr/local/bin/galore", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/site-packages/galore/cli/galore.py", line 56, in main
    run(**args)
  File "/usr/local/lib/python3.6/site-packages/galore/cli/galore.py", line 74, in run
    simple_dos_from_files(**kwargs)
  File "/usr/local/lib/python3.6/site-packages/galore/cli/galore.py", line 144, in simple_dos_from_files
    x_values, broadened_data = galore.process_1d_data(**kwargs)
  File "/usr/local/lib/python3.6/site-packages/galore/__init__.py", line 91, in process_1d_data
    xy_data = galore.formats.read_vasprun_totaldos(input)
  File "/usr/local/lib/python3.6/site-packages/galore/formats.py", line 446, in read_vasprun_totaldos
    dos = read_vasprun(filename)
  File "/usr/local/lib/python3.6/site-packages/galore/formats.py", line 321, in read_vasprun
    band = vr.get_band_structure()
  File "/usr/local/lib/python3.6/site-packages/pymatgen/io/vasp/outputs.py", line 757, in get_band_structure
    kpoint_file = Kpoints.from_file(kpoints_filename)
  File "/usr/local/lib/python3.6/site-packages/pymatgen/io/vasp/inputs.py", line 1186, in from_file
    return Kpoints.from_string(f.read())
  File "/usr/local/lib/python3.6/site-packages/pymatgen/io/vasp/inputs.py", line 1214, in from_string
    kpts = [int(i) for i in lines[3].split()]
  File "/usr/local/lib/python3.6/site-packages/pymatgen/io/vasp/inputs.py", line 1214, in <listcomp>
    kpts = [int(i) for i in lines[3].split()]
ValueError: invalid literal for int() with base 10: '#'

To Reproduce

Using the data in the file attached below, run:

Archive.zip

galore vasprun.xml

Expected behavior

You should get the error message described above.

Images

System information:

OS distribution: OSX High Sierra
Python version: Brew 3.6
Matplotlib version: n/a

Additional context

The problem can be circumvented by removing the KPOINTS file entirely, but this does not seem satisfactory and I can certainly imagine that some users may not easily diagnose the cause of this error in the future.
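The failure comes down to int('#') being called on the comment marker. A hedged sketch of one way to pre-clean such a file is shown below; strip_comments is a hypothetical helper, not part of Galore or Pymatgen, and the KPOINTS text is a made-up example.

```python
def strip_comments(text, marker="#"):
    """Remove everything from the marker to the end of each line."""
    return "\n".join(
        line.split(marker, 1)[0].rstrip() for line in text.splitlines()
    )

# Made-up KPOINTS content with a trailing comment on the grid line
kpoints = "Automatic mesh\n0\nGamma\n4 4 4 # k-point grid\n0 0 0\n"
cleaned = strip_comments(kpoints)

# The same parsing step as in the traceback now succeeds
grid = [int(i) for i in cleaned.splitlines()[3].split()]
```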

Separate modes for spikes and distributions

When a (P)DOS is plotted with no broadening specified from the command line, the plot appears to be filled due to zig-zagging of the resampled line. A small amount of broadening, comparable to the sampling width, would be much less "surprising". If the raw unbroadened output is desired, the user could always set -g=0, say.
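A minimal sketch of this proposal, assuming a hypothetical helper that picks the Gaussian width: if the user gives no value, use the smallest sample spacing; if the user gives any value, including zero, respect it.

```python
def default_gaussian_width(x_values, requested=None):
    """Choose a Gaussian width: the user's value, else the sampling width."""
    if requested is not None:
        return requested  # an explicit 0 keeps the raw, unbroadened data
    # Smallest spacing between consecutive samples
    return min(b - a for a, b in zip(x_values, x_values[1:]))

energies = [0.0, 0.5, 1.0, 1.5]  # made-up energy grid
auto_width = default_gaussian_width(energies)
raw_width = default_gaussian_width(energies, requested=0)
```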

Add Community Guidelines

As per the JOSS Review guidelines, it would be good if you added some info on community guidelines on contribution, posting issues, and seeking support. Doesn't have to be complicated, but always good to be succinct.

It's also a good idea to customize the issue template.

Ability to load VASP files directly

Although I've already written the code to load a DOS from the DOSCAR and POSCAR files, it probably makes sense to write a quick vasprun.xml parser instead. That way only one file is needed for analysis.

The main caveat is that the vasprun.xml file is generally at least an order of magnitude larger than the DOSCAR file. Also, the DOS is given to a greater precision in the DOSCAR file. What do people think?

Not specifying -p outputs nothing

Would it be possible to write out a .dat file of the output convoluted spectrum? It would be handy to have the option to plot the output in one's own graph plotting tool, or alongside experimental data.

ENH: Overlay experimental data

As a precursor to interactive cleverness (Issue #7) we should get a basic overlay up-and-running. My initial scope for this is:

- Work with both TDOS and PDOS modes
- Enable with extra flag --overlay FILENAME
- Allow data in same CSV and space-separated formats as TDOS (re-using infrastructure for these)
- Add flags --overlay_offset and --overlay_scale for basic alignment tweaks

Working out the appropriate y scale for overlay data is probably going to be a pain. Maybe have a message reporting the automatically estimated value, which makes the maxima the same?
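The automatic estimate suggested here, matching the maxima, could look like the following sketch. The function name and arguments are hypothetical and simply mirror the proposed --overlay_offset and --overlay_scale flags; the data are made up.

```python
def align_overlay(exp_x, exp_y, sim_y, offset=0.0, scale=None):
    """Shift and scale overlay data; estimate the scale if none is given.

    The automatic scale makes the overlay maximum equal the simulated maximum.
    Returns the shifted x values, scaled y values and the scale used.
    """
    if scale is None:
        scale = max(sim_y) / max(exp_y)
    shifted_x = [x + offset for x in exp_x]
    scaled_y = [y * scale for y in exp_y]
    return shifted_x, scaled_y, scale

# Made-up experimental and simulated intensities
x, y, used_scale = align_overlay(
    exp_x=[0.0, 1.0, 2.0], exp_y=[1.0, 2.0, 4.0], sim_y=[0.5, 8.0], offset=-1.0
)
print(f"Estimated overlay scale: {used_scale}")
```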

The test case will be rutile SnO2 for comparison with Farahani et al. (2014) https://doi.org/10.1103/PhysRevB.90.155413 - I have a nice DOS calculated with PBE0 which we can use to try and reproduce the plot in the paper.

Broadening width definitions - FWHM?

At the moment the widths of the Gaussian and Lorentzian functions are the direct values (c, γ) from the functions:

g(x) = exp(-(x-x₀)² / (2 c²))
l(x) = 0.5 γ / (π ((x-x₀)² + (0.5 γ)²))

Would it be better to convert from a specified full-width-half-maximum FWHM value?
From Wikipedia, the FWHM of this Gaussian form is 2√(2 ln 2) c ≈ 2.355 c.

I think the γ for that Lorentzian form is already equal to the FWHM, but will check that.
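Both points can be checked numerically. The sketch below uses the standard relation FWHM = 2√(2 ln 2) c for a Gaussian and confirms that each function falls to half its maximum at a distance of FWHM/2 from the centre; the function names are illustrative and not taken from Galore.

```python
import math

def gaussian_c_from_fwhm(fwhm):
    """Convert a Gaussian FWHM to the standard deviation c."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def gaussian(x, x0, c):
    """Unnormalised Gaussian with standard deviation c."""
    return math.exp(-((x - x0) ** 2) / (2.0 * c ** 2))

def lorentzian(x, x0, gamma):
    """Unit-area Lorentzian; gamma is the full width at half maximum."""
    half = 0.5 * gamma
    return half / (math.pi * ((x - x0) ** 2 + half ** 2))

fwhm = 2.0
c = gaussian_c_from_fwhm(fwhm)
gaussian_ratio = gaussian(0.5 * fwhm, 0.0, c) / gaussian(0.0, 0.0, c)
lorentzian_ratio = lorentzian(0.5 * fwhm, 0.0, fwhm) / lorentzian(0.0, 0.0, fwhm)
```

Both ratios come out at one half, so γ in this Lorentzian form is already the FWHM, while the Gaussian c needs the conversion factor.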

ENH: Interactive tools for fitting

It would be pleasant to have an interactive tool for adjusting the broadening and energy offset of ab initio data and overlaid experimental data.

@utf has done some prototyping with Matplotlib widgets already. JavaScript-based plots could be a little bit prettier and more responsive, but would require the broadening functions to be re-implemented.