gouldgroup / saxspy

Synthetic data generation, processing and model training for predicting lipid phase behaviour

License: MIT License

Python 100.00%

SAXSpy

This repository contains the synthetic data generation, processing and model training code for predicting lipid phase behaviour. Details are in the paper "Machine Learning Platform for Determining Experimental Lipid Phase Behaviour from Small Angle X-ray Scattering Patterns by Pre-training on Synthetic Data".

Getting Started

Requirements

To get started, install the requirements with:

pip install -r requirements.txt

Quick Start

To begin, you need to generate some synthetic SAXS samples. Let's start with some lamellar phase samples: navigate to Generation_scripts and run

python Generate_Lamellar.py

This should automatically generate a set of lamellar I/q values and save it into Synthetic_raw. A random sample from the generated data will also be plotted.

Note: The default parameters for synthetic data generation are the ones that produced a distribution matching real, experimental SAXS patterns - feel free to experiment with different ranges!

Data Generation

There are 3 synthetic data generation scripts in Generation_scripts. Each script uses our saxspy module to generate entire datasets of SAXS patterns for a particular lipid phase. The cubic phase script, Generate_Cubic.py, can be modified to generate patterns for Primitive, Gyroid or Diamond surface cubic phases by changing the phase variable:

# Instantiate the synthetic model: 'P', 'G', or 'D'
phase = 'G'
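The three cubic phases differ in which Bragg reflections are allowed, which is what makes their SAXS patterns distinguishable. The sketch below is not taken from saxspy; it uses the standard crystallographic reflection lists for the Im3m (Primitive), Pn3m (Diamond) and Ia3d (Gyroid) space groups, and the names PEAK_INDEX_SUMS, bragg_peak_positions and peak_ratios are hypothetical helpers introduced only for illustration.

```python
import math

# Values of h^2 + k^2 + l^2 for the first allowed reflections of each phase.
# These are textbook crystallographic values, NOT read from the saxspy code.
PEAK_INDEX_SUMS = {
    "P": [2, 4, 6, 8, 10, 12],    # Primitive, space group Im3m
    "D": [2, 3, 4, 6, 8, 9],      # Diamond, space group Pn3m
    "G": [6, 8, 14, 16, 20, 22],  # Gyroid, space group Ia3d
}


def bragg_peak_positions(phase, lattice_parameter):
    """Peak positions q = 2*pi*sqrt(h^2 + k^2 + l^2) / a for a cubic lattice.

    q is returned in inverse units of lattice_parameter.
    """
    return [
        2 * math.pi * math.sqrt(s) / lattice_parameter
        for s in PEAK_INDEX_SUMS[phase]
    ]


def peak_ratios(phase):
    """Peak positions divided by the position of the first peak."""
    sums = PEAK_INDEX_SUMS[phase]
    return [math.sqrt(s / sums[0]) for s in sums]


for phase in ("P", "D", "G"):
    print(phase, [round(r, 3) for r in peak_ratios(phase)])
```

The ratios are independent of the lattice parameter, so they identify the phase, while the absolute positions scale inversely with the lattice parameter.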

Each script generates the data based on the params variable - a list of parameter ranges.

# ranges of: lattice parameter, head position, sigma head, sigma tail
params = np.array([[20, 78], [5, 30], [0.5, 3], [0.5, 5]])
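As a rough illustration of how such a list of ranges can drive dataset generation, the sketch below draws one value per parameter for each synthetic sample. It assumes independent uniform sampling within each [low, high] range, which the text above does not confirm; sample_parameters is a hypothetical helper and not part of saxspy.

```python
import numpy as np

# Same ranges as in the snippet above: one [low, high] row per parameter
# (lattice parameter, head position, sigma head, sigma tail).
params = np.array([[20, 78], [5, 30], [0.5, 3], [0.5, 5]])


def sample_parameters(ranges, n_samples, seed=0):
    """Draw n_samples parameter sets, uniformly within each range.

    Uniform, independent sampling is an assumption made for this sketch.
    """
    rng = np.random.default_rng(seed)
    low, high = ranges[:, 0], ranges[:, 1]
    # low and high broadcast across the rows of the (n_samples, n_params) output.
    return rng.uniform(low, high, size=(n_samples, ranges.shape[0]))


samples = sample_parameters(params, n_samples=5)
print(samples.shape)  # one row per sample, one column per parameter
```

Widening or narrowing a row of params changes the spread of the corresponding column only, which is a convenient way to experiment with one parameter at a time.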

The resulting datasets are saved as I/q arrays in the Synthetic_raw folder.

Verbosity

Each generation script has a verbose boolean. When it is True, a random sample from the generated data is plotted.

Data Processing

Once the synthetic data is generated, we perform multiple pre-processing steps before passing it through the model. This can be done with the scripts in Preprocessing_scripts. To pre-process the data correctly for training, ensure that your raw data files are in the Synthetic_raw directory with the phase names given by the generation scripts. The processed datafiles will be saved in

Model Training

train.py is the training/validation script. Assuming you have generated and processed all available phases, running python train.py from this repository's root should start training your model on the synthetic data. The trained model will be saved as trained_saxs_model.h5.

Contributors

hichiaty

saxspy's Issues

Question on lamellar phase

Hello,
thanks for sharing the code for your paper. I was trying to reproduce the generation of the lamellar phase scattering curve and had a couple of questions that I would appreciate some clarification on:

store_it.append(np.array([q, (recip_cell*np.conj(recip_cell)).real]))

In the line above, is the second argument to np.array the same as $F(q)^2$ in the paper?

If yes, shouldn't we somehow be multiplying $F(q)^2$ by $S(q)$ to get the intensity profile? Can you point me to where this is done in the code?

Also, does the following line simply add the "correction" to a computed I(q) profile?

temp = data[k,1,i]*dwf(data[k,0,i],dw_param)*saxspy.voigtSignal(q, random_voigt_gaussalpha, random_voigt_lorenzgamma, data[k,0,i])
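On the first question, one point can be checked independently of the repository: for any complex array z, (z*np.conj(z)).real is the squared modulus $|z|^2$. The sketch below verifies this with made-up values standing in for recip_cell. It does not establish whether recip_cell corresponds to $F(q)$ in the paper, nor where $S(q)$ enters; those remain open in this thread.

```python
import numpy as np

# Hypothetical complex amplitudes standing in for recip_cell;
# these values are illustrative and do not come from saxspy.
recip_cell = np.array([1.0 + 2.0j, -0.5 + 0.25j, 3.0 - 4.0j])

# Expression from the quoted line: z * conj(z) has zero imaginary part,
# so taking .real only drops a zero imaginary component.
squared_modulus = (recip_cell * np.conj(recip_cell)).real

# The same quantity written as |z|^2; here the values are 5, 0.3125 and 25.
reference = np.abs(recip_cell) ** 2

print(np.allclose(squared_modulus, reference))
```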