songyosk / uvvis

Automatic Prediction of Peak Optical Absorption Wavelengths in Molecules using Convolutional Neural Networks

License: MIT License

Python 100.00%
uvvis uvvis-spectroscopy ultraviolet-visable

Automatic Prediction of Peak Optical Absorption Wavelengths in Molecules using Convolutional Neural Networks

All source code and images are associated with the paper:

"Automatic Prediction of Peak Optical Absorption Wavelengths in Molecules using Convolutional Neural Networks"

J. Chem. Inf. Model. (2024)

By S. G. Jung, G. Jung & J. M. Cole

Introduction

Each file is described below:

(i) smile_descriptors.py

Script to generate descriptor features based on SMILES of chemical molecules & solvents.
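
As a rough illustration of what descriptor generation from SMILES can look like, the sketch below counts a few simple quantities directly from the string. It is not the code in smile_descriptors.py: the function names `simple_smiles_descriptors` and `featurize_pair`, the choice of descriptors, and the `mol_`/`solv_` prefixes are all assumptions made for this example, and a real workflow would more likely rely on a cheminformatics toolkit.

```python
import re

# Illustrative subset of organic-subset atoms. Two-letter symbols come first so
# that "Cl" is not read as "C" followed by an unmatched "l". Bracket atoms such
# as [Na+] and two-digit ring closures (%10) are NOT handled by this toy parser.
_ATOM_PATTERN = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def simple_smiles_descriptors(smiles: str) -> dict:
    """Return a few count-based descriptors for one SMILES string."""
    atoms = _ATOM_PATTERN.findall(smiles)
    elements = [a.capitalize() for a in atoms]  # aromatic "c" counts as carbon
    return {
        "n_heavy_atoms": len(atoms),
        "n_carbon": elements.count("C"),
        "n_nitrogen": elements.count("N"),
        "n_oxygen": elements.count("O"),
        "n_aromatic_atoms": sum(1 for a in atoms if a.islower()),
        "n_double_bonds": smiles.count("="),
        # Each ring is opened and closed by the same digit, hence the division.
        "n_ring_closures": len(re.findall(r"\d", smiles)) // 2,
    }


def featurize_pair(molecule_smiles: str, solvent_smiles: str) -> dict:
    """Combine molecule and solvent descriptors into one feature row."""
    row = {f"mol_{k}": v for k, v in simple_smiles_descriptors(molecule_smiles).items()}
    row.update(
        {f"solv_{k}": v for k, v in simple_smiles_descriptors(solvent_smiles).items()}
    )
    return row
```

For example, `featurize_pair("c1ccccc1O", "CCO")` describes phenol dissolved in ethanol as a single row of 14 features.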

(ii) smiles_feature_matrix.py

Script to generate a two-dimensional feature matrix (i.e. a 2D image) of the chemical molecules & solvents based on their SMILES representation.
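
One common way to turn a SMILES string into a 2D matrix is character-level one-hot encoding, sketched below. This README does not specify the encoding used in smiles_feature_matrix.py, so the scheme, the function names (`build_vocabulary`, `smiles_to_matrix`, `pair_to_image`) and the fixed `max_length` padding are assumptions for illustration only.

```python
def build_vocabulary(smiles_list):
    """Collect the sorted set of characters seen in a list of SMILES strings."""
    return sorted(set("".join(smiles_list)))


def smiles_to_matrix(smiles, vocabulary, max_length):
    """One-hot encode a SMILES string: rows are characters, columns are positions.

    Strings longer than max_length are truncated, shorter ones are zero-padded,
    and characters missing from the vocabulary are left as all-zero columns.
    Encoding is per character, so "Cl" is treated as "C" followed by "l".
    """
    row_of = {ch: i for i, ch in enumerate(vocabulary)}
    matrix = [[0] * max_length for _ in vocabulary]
    for col, ch in enumerate(smiles[:max_length]):
        row = row_of.get(ch)
        if row is not None:
            matrix[row][col] = 1
    return matrix


def pair_to_image(molecule_smiles, solvent_smiles, vocabulary, max_length):
    """Stack the molecule matrix on top of the solvent matrix (assumed layout)."""
    return (
        smiles_to_matrix(molecule_smiles, vocabulary, max_length)
        + smiles_to_matrix(solvent_smiles, vocabulary, max_length)
    )
```

The resulting matrix has `len(vocabulary)` rows per string and `max_length` columns, so every molecule/solvent pair yields an image of the same shape.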

(iii) deep_convolutional_representation.py

Script to create and train deep residual convolutional neural networks, taking the 2D feature matrices as input.
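
The defining feature of a residual network is the skip connection, where a block outputs its input plus a learned correction. The dependency-free sketch below shows one such block on a single-channel 2D matrix, computing `relu(x + conv(relu(conv(x))))`. It is not the architecture in deep_convolutional_representation.py: the kernels here are fixed inputs rather than trained weights, and a real model would use a deep learning framework with many channels and layers. All function names are hypothetical.

```python
def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding (output keeps the input shape).

    Assumes the kernel has odd height and width.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    pad_h, pad_w = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    r, c = i + di - pad_h, j + dj - pad_w
                    if 0 <= r < h and 0 <= c < w:  # outside the image counts as zero
                        total += image[r][c] * kernel[di][dj]
            out[i][j] = total
    return out


def relu(image):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in image]


def residual_block(image, kernel_a, kernel_b):
    """Two convolutions plus a skip connection: relu(x + conv_b(relu(conv_a(x))))."""
    hidden = relu(conv2d_same(image, kernel_a))
    branch = conv2d_same(hidden, kernel_b)
    summed = [[x + f for x, f in zip(row_x, row_f)] for row_x, row_f in zip(image, branch)]
    return relu(summed)
```

With all-zero kernels the convolutional branch contributes nothing and the block reduces to `relu(x)`. This is the property that lets deep residual stacks start close to an identity mapping.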

(iv) GBFS.py

Script to perform gradient boosted feature selection, generate feature ranking, and carry out recursive feature selection. See "Gradient Boosted and Statistical Feature Selection" in https://github.com/Songyosk/GBSFS4MPP.

(v) Multicollinearity_reduction.py

Script to perform multicollinearity reduction, which includes correlation analysis and hierarchical clustering analysis. Correlation and linkage thresholds are defined to eliminate features.
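
The correlation-threshold part of this step can be sketched as a greedy filter: a feature is kept only if its absolute Pearson correlation with every already-kept feature stays at or below a threshold. The example covers only that part, since the hierarchical clustering stage and its linkage threshold are omitted. The function names, the default threshold of 0.9, and the keep-first ordering rule are assumptions rather than the behaviour of Multicollinearity_reduction.py.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no defined correlation; treat as 0
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.9):
    """Return the names of features to keep.

    `features` maps a feature name to its column of values. Features are
    visited in insertion order, so earlier (e.g. higher-ranked) features win
    when two columns are strongly correlated.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept
```

Because the filter is order-dependent, passing the features sorted by an importance ranking means that the less important member of each correlated pair is the one removed.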

(vi) optimization.py

Script to perform Bayesian optimization, which determines the architecture of the predictive model based on a defined hyperparameter space.

(vii) evaluate_model.py

Script to evaluate models by returning performance metrics and plots.
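
This README does not list the metrics that evaluate_model.py returns. As an assumption, the sketch below computes three that are commonly reported for regression targets such as peak wavelengths: mean absolute error, root-mean-square error, and the coefficient of determination. The function name `regression_metrics` is hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R^2 for paired true and predicted values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        # R^2 is undefined when all true values are identical.
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }
```

MAE and RMSE carry the units of the target (nm for wavelengths), while R² is dimensionless.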

Model Architecture

Figure 1: Overview of the project pipeline.

Figure 2: The Deep-CNN architecture.

The GBFS pipeline can be found in: https://github.com/Songyosk/GBSFS4MPP

Results

The final results of the multi-fidelity prediction of the optical peaks are shown below:

Figure 3: Multi-Fidelity Prediction of the Optical Peaks (Random Split)

Figure 4: Multi-Fidelity Prediction of the Optical Peaks (Scaffold Split)
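
The two figures differ in how the data are divided into training and test sets. A random split assigns molecules independently, whereas a scaffold split keeps all molecules that share a core structure on the same side, so the test set contains only unseen scaffolds. The sketch below illustrates the difference under stated assumptions: the scaffold label of each molecule is taken as a precomputed input (in practice it would come from a cheminformatics toolkit), and the function names, the 20% test fraction and the largest-group-first rule are hypothetical rather than taken from this repository.

```python
import random
from collections import defaultdict


def random_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and cut off the first test_fraction as test set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def scaffold_split(scaffolds, test_fraction=0.2):
    """Split so that no scaffold label appears in both train and test.

    `scaffolds[i]` is the (precomputed) scaffold label of sample i. Groups are
    visited from largest to smallest and go to train while they still fit the
    training budget; groups that do not fit go to test. The achieved test
    fraction is therefore only approximate.
    """
    groups = defaultdict(list)
    for index, label in enumerate(scaffolds):
        groups[label].append(index)
    n_train_target = len(scaffolds) - round(len(scaffolds) * test_fraction)
    train, test = [], []
    for members in sorted(groups.values(), key=len, reverse=True):
        if len(train) + len(members) <= n_train_target:
            train.extend(members)
        else:
            test.extend(members)
    return sorted(train), sorted(test)
```

Scaffold splits are generally the harder test, because the model must extrapolate to core structures it has not seen during training.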

Acknowledgements

J.M.C. conceived the overarching project. The study was designed by S.G.J. and J.M.C. S.G.J. created the workflow, designed the CNN architecture, performed data pre-processing, featurization, and hyperparameter optimization, and analysed the data under the supervision of J.M.C. G.J. assisted with the design of the CNN architecture and contributed to the hyperparameter optimization. S.G.J. drafted the manuscript with assistance from J.M.C. The final manuscript was read and approved by all authors.

J.M.C. is grateful for the BASF/Royal Academy of Engineering Research Chair in Data-Driven Molecular Engineering of Functional Materials, which is partly sponsored by the Science and Technology Facilities Council (STFC) via the ISIS Neutron and Muon Source; this Chair also supports a PhD studentship (for S.G.J.). STFC is also thanked for a PhD studentship that is sponsored by its Scientific Computing Department (for G.J.).

License

License: MIT
