giocoal / cluster-analysis-on-monte-carlo-simulations-of-water-adsorption-on-nacl-atmospheric-particulate

Code for the paper "Theoretical Investigation of Inorganic Particulate Matter: The Case of Water Adsorption on a NaCl Particle Model Studied Using Grand Canonical Monte Carlo Simulations" and Bachelor's Thesis "Cluster Analysis on the Results of Molecular Simulation of the Water Adsorption Process on Atmospheric Particulate Models"

Home Page: https://doi.org/10.3390/inorganics11110421

License: MIT License

Python 100.00%
scikit-learn clustering computational-chemistry python dbscan molecular-mechanics particulate-matter pdb

cluster-analysis-on-monte-carlo-simulations-of-water-adsorption-on-nacl-atmospheric-particulate's Introduction

Code for the Paper:

"Theoretical Investigation of Inorganic Particulate Matter: The Case of Water Adsorption on a NaCl Particle Model Studied Using Grand Canonical Monte Carlo Simulations"

and the Bachelor's Thesis:

"Cluster Analysis on the Results of Molecular Simulation of the Water Adsorption Process on Atmospheric Particulate Models."

Research Traineeship - BSc in Chemical Science and Technology [L-27] - University of Milano-Bicocca.

Introduction

This repository contains the cluster and data analysis code for the paper "Theoretical Investigation of Inorganic Particulate Matter: The Case of Water Adsorption on a NaCl Particle Model Studied Using Grand Canonical Monte Carlo Simulations" (F. Rizza, A. Rovaletti, G. Carbone, T. Miyake, C. Greco, U. Cosentino), published in Inorganics, an international, peer-reviewed, open-access journal by MDPI, and for my Bachelor's Thesis: "Cluster Analysis on the Results of Molecular Simulation of the Water Adsorption Process on Atmospheric Particulate Models."

My research internship activity was part of a research project concerning the study, by means of computational simulations, of the adsorption process of water on model surfaces of sodium chloride (NaCl) atmospheric particulate matter of marine origin.
To gain a molecular-level understanding of the adsorption process of water vapor on the NaCl surface, Monte Carlo simulations performed in the Grand Canonical ensemble were carried out, considering the water adsorption at different water pressures on a NaCl(001) surface.

During the research internship at the Computational Physical Chemistry Laboratory at the University of Milano-Bicocca, under the supervision of Professor Claudio Greco and Professor Ugo Cosentino, I worked on my Bachelor's Thesis project.
I analyzed 3-D molecular mechanics simulations of the water adsorption process on atmospheric particulate matter, using unsupervised machine learning (DBSCAN) to detect water clusters.
Specifically, I developed a Python script (NumPy, pandas, scikit-learn) that performs an automated, frame-by-frame analysis of the configurations (atomic coordinates of water molecules) generated during each simulation, each conducted at a specific water pressure.
The script's main task is a cluster analysis (DBSCAN) of the configurations, aimed at studying the aggregation phenomena involving the water molecules adsorbed on the surface. The identified clusters are then classified as "islands" or "layers" according to their size, and their properties are studied.
The results of my study are collected in my Bachelor's thesis: "Cluster Analysis on the Results of Molecular Simulation of the Water Adsorption Process on Atmospheric Particulate Models."

In 2023, as part of a voluntary collaboration with the corresponding authors' research groups, I also contributed to the paper "Theoretical Investigation of Inorganic Particulate Matter: The Case of Water Adsorption on a NaCl Particle Model Studied Using Grand Canonical Monte Carlo Simulations" (F. Rizza, A. Rovaletti, G. Carbone, T. Miyake, C. Greco, U. Cosentino), published in Inorganics, an international, peer-reviewed, open-access journal by MDPI.
In particular, I was involved in the investigation, formal analysis and data curation phases.

Requirements

  • matplotlib 3.5.2
  • numpy 1.22.4
  • pandas 1.5.3
  • scikit_learn 1.1.1
  • scipy 1.7.3

Usage

Data

Simulation data (atomic coordinates of H, O, Na and Cl) are not publicly available. To use these scripts, the data must be in .pdb format.

Preprocessing Scripts

The preprocessing scripts allow the extraction of .pdb files containing the atomic coordinates of the atoms in the different simulation frames from the simulation HISTORY files.

  • From_ARCHIV_to_PYMOL_NaCl.py: generates a .pdb file containing the atomic coordinates of only the Na and Cl atoms. Each frame is preceded by the string MODEL *frame_NUM* and ends with *ENDMDL*. NX and CX denote the Sodium and Chlorine atoms of the 4 mobile layers, while NA and CL denote those of the fixed layer.
  • From_ARCHIV_to_PYMOL_SoloH2O: generates a .pdb file containing the atomic coordinates of only the H and O atoms from the water molecules. Each frame is preceded by the string MODEL *frame_NUM* and ends with *ENDMDL*. With OW and HW we denote the Oxygen and Hydrogen atoms.
  • From_ARCHIV_to_PYMOL_SoloOssigeno_NaCl: generates a .pdb file containing the atomic coordinates of only O atoms from the water molecules. Each frame is preceded by the string MODEL *frame_NUM* and ends with *ENDMDL*.
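
The MODEL / ENDMDL framing described above makes it easy to read such a file back frame by frame. The following is only a minimal sketch, not the repository's own reader: the function name `read_pdb_frames` is hypothetical, and it assumes the standard fixed-column PDB layout (atom name in columns 13-16, x, y, z in columns 31-54), which the generated files may or may not follow exactly.

```python
import numpy as np


def read_pdb_frames(lines):
    """Split multi-model PDB text into one (atom_names, coords) pair per frame.

    Assumes standard fixed-column ATOM/HETATM records (an assumption,
    not something confirmed by the repository).
    """
    frames, names, coords = [], [], []
    for line in lines:
        record = line[:6].strip()
        if record == "MODEL":
            # A new frame starts: reset the per-frame buffers.
            names, coords = [], []
        elif record in ("ATOM", "HETATM"):
            names.append(line[12:16].strip())
            coords.append(
                [float(line[30:38]), float(line[38:46]), float(line[46:54])]
            )
        elif record == "ENDMDL":
            # The frame is complete: store it as (names, N x 3 array).
            xyz = np.array(coords, dtype=float).reshape(-1, 3)
            frames.append((names, xyz))
    return frames
```

With a file on disk, this could be called as `read_pdb_frames(open(path))`, where `path` is whatever .pdb file the preprocessing step produced. Selecting only the rows whose name is OW would then give the oxygen coordinates of each frame.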

Cluster and data analysis scripts

cluster_analysis.py is the main script of the repository. It applies cluster and orientational analysis to .pdb files containing the hydrogen and oxygen atomic coordinates of the water molecules for a simulation conducted at a specific partial pressure of water.
The first part of the analysis relates to the identification and classification of clusters of water molecules:

  • Using the DBSCAN clustering algorithm, it identifies, for each frame, the number of clusters in the system and the number of water molecules in each cluster.
  • It assigns each molecule in each frame a label identifying the cluster it belongs to.
  • It generates, for each frame, 3D representations of the identified clusters.
  • It classifies each cluster as an "island" or a "layer" according to its size and shape.
  • Determines, for each simulation, the average number of clusters per frame and their average size.

Schematic representation of cluster analysis steps.
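
The clustering steps listed above can be illustrated with scikit-learn's `DBSCAN`. This is a hedged sketch rather than the code of cluster_analysis.py: the function names are hypothetical, the values of `eps`, `min_samples` and `layer_min_size` are placeholders and not the parameters used in the paper, the island/layer rule here uses cluster size only (the script also considers shape), and periodic boundary conditions are ignored.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def cluster_frame(oxygen_xyz, eps=3.5, min_samples=2, layer_min_size=50):
    """Cluster the water oxygens of one frame and classify each cluster.

    eps, min_samples and layer_min_size are illustrative placeholders.
    """
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(oxygen_xyz)
    sizes = {}
    for label in labels:
        if label != -1:  # DBSCAN marks unclustered (noise) points with -1
            sizes[int(label)] = sizes.get(int(label), 0) + 1
    # Size-only rule (an assumption): large clusters count as layers.
    kinds = {
        label: "layer" if n >= layer_min_size else "island"
        for label, n in sizes.items()
    }
    return labels, sizes, kinds


def simulation_averages(per_frame_sizes):
    """Average number of clusters per frame and average cluster size."""
    n_clusters = [len(sizes) for sizes in per_frame_sizes]
    all_sizes = [n for sizes in per_frame_sizes for n in sizes.values()]
    mean_clusters = float(np.mean(n_clusters)) if n_clusters else 0.0
    mean_size = float(np.mean(all_sizes)) if all_sizes else 0.0
    return mean_clusters, mean_size
```

Calling `cluster_frame` on the N x 3 oxygen coordinates of each frame and passing the collected `sizes` dictionaries to `simulation_averages` gives the per-simulation averages. Molecules labelled -1 are treated as isolated and are not counted as clusters in this sketch.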

The second part of the analysis, on the other hand, involves an orientational study of the water molecules with respect to the NaCl model surface, in particular:

  • Distance sectors from the NaCl surface are defined
  • For each sector, the orientation of the water molecules is studied in terms of the angle between the dipole moment vector of the water and the normal vector to the NaCl surface.
  • Histograms representing the distribution of water molecules in each distance sector are generated.
  • The average orientation of the water molecules is determined for each distance sector.
  • Scatter plots are generated representing the trend of the average orientation of the water molecules moving away from the surface.

Schematic representation of the orientational analysis steps.
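
The orientational steps above amount to computing, for each molecule, the angle θ between its dipole direction and the surface normal, then averaging θ within distance sectors. The sketch below is illustrative only. It assumes the dipole direction can be approximated by the vector from the oxygen to the midpoint of the two hydrogens, that the NaCl(001) surface normal lies along z, and that the height of each molecule above the surface is already known; the function names and the sector edges are hypothetical.

```python
import numpy as np


def dipole_angles(o_xyz, h1_xyz, h2_xyz, normal=(0.0, 0.0, 1.0)):
    """Angle (degrees) between each water dipole direction and the normal.

    The dipole direction is approximated by the O -> H-H midpoint vector
    (an assumption of this sketch).
    """
    o = np.asarray(o_xyz, dtype=float)
    h1 = np.asarray(h1_xyz, dtype=float)
    h2 = np.asarray(h2_xyz, dtype=float)
    dipole = 0.5 * (h1 + h2) - o
    dipole = dipole / np.linalg.norm(dipole, axis=1, keepdims=True)
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    # Clip guards against rounding slightly outside [-1, 1].
    cos_theta = np.clip(dipole @ n, -1.0, 1.0)
    return np.degrees(np.arccos(cos_theta))


def mean_angle_by_sector(heights, angles, edges):
    """Mean angle in each distance sector [edges[i], edges[i+1])."""
    heights = np.asarray(heights, dtype=float)
    angles = np.asarray(angles, dtype=float)
    idx = np.digitize(heights, edges) - 1
    means = []
    for i in range(len(edges) - 1):
        selected = angles[idx == i]
        means.append(selected.mean() if selected.size else np.nan)
    return np.array(means)
```

A histogram of the angles in each sector could then be drawn with `numpy.histogram` or matplotlib, and the array returned by `mean_angle_by_sector` plotted against the sector centers to show the trend moving away from the surface.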

The molecules_density_analysis.ipynb notebook, in turn, studies the density of Na and Cl atoms in the different layers of the simulation box.
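
One common way to obtain a layer-resolved density is to histogram the atoms along the axis perpendicular to the surface. The sketch below shows that generic approach and is not taken from the notebook: the function name, the choice of z as the binning axis, and the `area` argument (the cross-section of the simulation box) are all assumptions.

```python
import numpy as np


def density_profile(z, z_min, z_max, n_bins, area):
    """Number density (atoms per unit volume) in slabs along z.

    Generic slab-histogram approach, assumed for illustration.
    """
    counts, edges = np.histogram(z, bins=n_bins, range=(z_min, z_max))
    slab_volume = area * (z_max - z_min) / n_bins
    return edges, counts / slab_volume
```

Running it separately on the z coordinates of the Na and of the Cl atoms would give one profile per species.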

Status

Project is: Done

Contact

Feel free to contact me!

Citation

If you find the paper or the source code useful for your projects, please cite it using the following BibTeX entry:

@Article{inorganics11110421,
    AUTHOR = {Rizza, Fabio and Rovaletti, Anna and Carbone, Giorgio and Miyake, Toshiko and Greco, Claudio and Cosentino, Ugo},
    TITLE = {Theoretical Investigation of Inorganic Particulate Matter: The Case of Water Adsorption on a NaCl Particle Model Studied Using Grand Canonical Monte Carlo Simulations},
    JOURNAL = {Inorganics},
    VOLUME = {11},
    YEAR = {2023},
    NUMBER = {11},
    ARTICLE-NUMBER = {421},
    URL = {https://www.mdpi.com/2304-6740/11/11/421},
    ISSN = {2304-6740},
    DOI = {10.3390/inorganics11110421}
}
