ajkerr0 / kappa
A Python package to calculate thermal conductivity across molecular interfaces.
License: MIT License
Equivocal issue but solution has yet to be determined.
Right now the user has to manually look at the atoms using 3D matplotlib, drag the image around with the mouse, find an atom and go "Hey that looks good I think I'll chain something to that atom" and mentally take note of that atom index. This is a clumsy, hacked-together method.
It would be nice to signify interfaces on the molecule, then plot the interface(s) in 2D with relevant information. The less the mouse is used the better. Maybe for each interface, have a default atom that will be the attachment point (non-default choices must be explicitly chosen by the user) to streamline things. The question is how to develop this without falling into the trap that exists now of manually looking at the molecule in 3D to define the interfaces.
If the pre-generated molecules have their interfaces pre-defined this issue can be avoided. But what about users making their own molecules? Should that be a focus anyway?
We could write class definitions for Interface objects. Define them with orientation vectors and effective areas?
_combine doesn't need that functionality and makes the function less self-contained. That check can simply be in operation[_plus].chain (since that's the only situation in which it would be needed)
To future-proof the code we need to make sure all of it is compatible with Python 3. I think most of the work involves the print functions, but it might also affect the string formatting.
The package needs code to generate the forcefield parameters in the numpy '.npy' format.
Code in setup.py or __init__.py to run the modules in /param/?
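A minimal sketch of writing parameter tables to '.npy' files with numpy. The helper name save_parameters and the numbers are placeholders for illustration, not the package's actual forcefield data:

```python
import os
import tempfile

import numpy as np


def save_parameters(directory, **arrays):
    """Write each named array to <directory>/<name>.npy and return the paths."""
    paths = {}
    for name, values in arrays.items():
        path = os.path.join(directory, name + ".npy")
        np.save(path, np.asarray(values, dtype=float))
        paths[name] = path
    return paths


# Placeholder values only; real tables would come from the modules in /param/.
out_dir = tempfile.mkdtemp()
paths = save_parameters(out_dir, kb=[[300.0, 310.0], [310.0, 320.0]])
kb = np.load(paths["kb"])
```

Whether this runs at install time or on first import is the open question above; the sketch only covers the file format.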
There's no point that I can think of to have single method classes here over simple functions to build the premade molecules (like graphene, cnt, etc.) in molecule.py
It just muddles class/inheritance structure and I bet it's faster and requires less memory too.
For example on testing on the default dingus molecule, the numerical gradients and analytical gradients do not come close to matching. In fact the numerical gradients seem to be more accurate as they lead the dingus molecule to minimization faster. This has led me to believe something is not quite right with the bond angle analytical gradients.
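One way to narrow this down is to compare the two gradients on a single angle instead of a whole molecule. The sketch below assumes a harmonic bending energy ka*(theta - theta0)^2 on atoms 0, 1, 2; the function names and the energy form are assumptions, not the package's code:

```python
import numpy as np


def numerical_gradient(energy, pos, h=1e-6):
    """Central-difference gradient of energy(pos) for every coordinate."""
    grad = np.zeros_like(pos, dtype=float)
    for idx in np.ndindex(*pos.shape):
        step = np.zeros_like(pos, dtype=float)
        step[idx] = h
        grad[idx] = (energy(pos + step) - energy(pos - step)) / (2.0 * h)
    return grad


def angle_energy(pos, ka=1.0, theta0=np.pi / 2):
    """Assumed form: ka*(theta - theta0)^2 for the angle i-j-k = 0-1-2."""
    u = pos[0] - pos[1]
    v = pos[2] - pos[1]
    cos_t = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return ka * (np.arccos(cos_t) - theta0) ** 2


def angle_gradient(pos, ka=1.0, theta0=np.pi / 2):
    """Analytical gradient of angle_energy via the chain rule through cos(theta)."""
    u = pos[0] - pos[1]
    v = pos[2] - pos[1]
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    uh, vh = u / nu, v / nv
    cos_t = np.dot(uh, vh)
    theta = np.arccos(cos_t)
    # dE/dtheta times dtheta/dcos(theta)
    pref = 2.0 * ka * (theta - theta0) * (-1.0 / np.sin(theta))
    grad = np.zeros_like(pos, dtype=float)
    grad[0] = pref * (vh - cos_t * uh) / nu
    grad[2] = pref * (uh - cos_t * vh) / nv
    grad[1] = -(grad[0] + grad[2])  # translation invariance
    return grad


pos = np.array([[1.0, 0.2, 0.0], [0.0, 0.0, 0.0], [0.3, 1.1, 0.4]])
max_diff = np.max(np.abs(angle_gradient(pos) - numerical_gradient(angle_energy, pos)))
```

If max_diff is small for one angle but the full molecule still disagrees, the problem is more likely in how the per-angle terms are summed onto atoms than in the formula itself.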
When interfaces are deleted (like when single atom interfaces are combined) the code currently gives no consideration to this in terms of adding to the facetracking attribute. As it stands the code will input numbers for arrays that don't exist anymore in those cases, causing errors.
The analytical gradients do not yet work. First off, the dimensionality isn't correct yet: gradients are 3*J, where J is the number of bonds, angles, dihedral angles, or whatever. Need to write it so that, for example, the forces due to angle ijk go to atoms i, j and k correctly.
I had a version of this working months ago but I cannot find it! The merits of version control become more apparent every day.
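A sketch of the bookkeeping, assuming each of the J terms already has its gradient with respect to each of its own atoms stored in a (J, m, 3) array. The name scatter_gradients and the array layout are assumptions:

```python
import numpy as np


def scatter_gradients(n_atoms, index_list, term_grads):
    """Accumulate per-term gradients onto the atoms taking part in each term.

    index_list : (J, m) integers, the m atom indices of each of the J terms
    term_grads : (J, m, 3) floats, gradient of each term w.r.t. each of its atoms
    Returns an (n_atoms, 3) gradient.
    """
    index_list = np.asarray(index_list)
    term_grads = np.asarray(term_grads, dtype=float)
    grad = np.zeros((n_atoms, 3))
    # np.add.at accumulates correctly when an atom appears in several terms;
    # plain fancy-index "+=" would silently drop the repeats.
    np.add.at(grad, index_list, term_grads)
    return grad


# Two angles (0-1-2 and 1-2-3) with dummy unit gradients.
angles = [[0, 1, 2], [1, 2, 3]]
grad = scatter_gradients(4, angles, np.ones((2, 3, 3)))
```

The result has one row per atom, so it can be added directly to the gradients from the other interaction types.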
If we can get the analytical gradients working then it will be much faster to calculate them leading to the viability of using first derivative information to find good step sizes in energy minimization.
For example, when amine molecules are generated and _configure_parameters is called, there is an error because amine.dihList is empty (as expected). The error occurs when vn is being assigned to the molecule object. The 0th element of self.dihList doesn't exist! Should wrap those lines with try/except blocks to handle the cases where bond/angle/dihList is empty.
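As an alternative to try/except, an explicit emptiness check could be used. A sketch, where gather_parameters and the lookup callable are hypothetical stand-ins for whatever _configure_parameters does internally:

```python
import numpy as np


def gather_parameters(index_list, lookup):
    """Return one parameter per entry of index_list, or an empty array if none exist."""
    index_list = np.asarray(index_list)
    if index_list.size == 0:
        # Nothing to look up (e.g. a molecule with no dihedrals).
        return np.empty(0, dtype=float)
    return np.array([lookup(tuple(entry)) for entry in index_list], dtype=float)


# An amine-like case with no dihedrals, and a case with a single dihedral.
vn_empty = gather_parameters(np.empty((0, 4), dtype=int), lambda dih: 1.0)
vn_one = gather_parameters([[0, 1, 2, 3]], lambda dih: 1.0)
```

The explicit check keeps genuine lookup errors visible, which a broad try/except might hide.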
This python package needs a new name in the future.
There already exists a 'kappa' in the PyPI, so we need a unique one for future submission there.
Maybe move the code to a new repository? (don't believe you can change names).
The c4s tube lattice generation is embarrassingly bad right now. It needs construction similar to CNT, in which a strip is rolled into a tube. The circular starting shapes will hopefully minimize to the physical structures we are looking at.
Right now the current calculate_thermal_conductivity() code only works if the specified drivers are ON interfaces. We want arbitrary placements of heat baths fully across the interfaces. We want arbitrary numbers of them. If users don't specify their location(s), assign them randomly.
A software license is needed since the code is now publicly available.
.pdb functionality is installed for certain molecules. Works using molecule.posList, molecule.zList, molecule.bondList. However more code is required to directly interact with gromacs and generate more files. Specifically .top files, containing position data and all interactions and strengths. Should this be part of this package? A config file and SLURM script will also be needed for the supercomputer.
A mix of golden section search and parabolic interpolation.
How will this method compare to backtracking?
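For reference, a sketch of the golden section half on its own, in plain Python. The parabolic interpolation part is left out, and the function is assumed to be unimodal on the bracket. SciPy's minimize_scalar offers a Brent method that combines the two.

```python
import math


def golden_section_search(f, a, b, tol=1e-8):
    """Minimize a unimodal function f on [a, b] by golden section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # 1/phi, about 0.618
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while abs(b - a) > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


# Toy line-search objective with its minimum at a step size of 0.3.
best_step = golden_section_search(lambda s: (s - 0.3) ** 2, 0.0, 2.0)
```

Each iteration costs one energy evaluation and shrinks the bracket by a fixed factor, whereas backtracking stops at the first acceptable step.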
What is effectively a single-method class is unneeded here. Can be a single function with the same kwargs. Scipy calls a function: https://github.com/scipy/scipy/blob/master/scipy/optimize/_minimize.py
This also eliminates the awkward action of the user instantiating a Minimizer. Also rename minimize.py to _minimize.py and import 'minimize' in the package's __init__.py.
This package requires numpy and matplotlib to run, so the code in setup.py needs to reflect this so that Python installers check whether they are already installed for the user, and download them automatically if they are not.
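A sketch of what that could look like with setuptools. The name "kappa" is just the current repository name, and no version pins are given because the minimum supported versions are not stated here:

```python
# setup.py (sketch)
from setuptools import setup, find_packages

setup(
    name="kappa",
    packages=find_packages(),
    # pip installs these automatically if they are missing
    install_requires=["numpy", "matplotlib"],
)
```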
Instead of using nList that is indexed like posList to determine bonds, why not make the user input bondLists explicitly?
Pros:
-That way there can be no mistake in determining the bondList.
-It is more explicit; no need to interpret what nearest neighbor means in the context of kappa
-Keeps all the molecule attributes that need to join in the same ndarray format.
Cons:
-The code is already there to use nLists; we would have to go back through the pre-generated molecules to build their bondLists straight up
-Python lists are easy in this regard that you can just 'append' new neighbors when molecules are chained together
Maybe give the user an option to input either?
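Accepting either input could be as simple as the following sketch. The function names are hypothetical, and an explicit bond list is assumed to take precedence over a neighbor list:

```python
import numpy as np


def bonds_from_neighbors(n_list):
    """Convert a per-atom neighbor list into an (n_bonds, 2) array of unique bonds."""
    bonds = {tuple(sorted((i, j))) for i, neighbors in enumerate(n_list) for j in neighbors}
    return np.array(sorted(bonds), dtype=int).reshape(-1, 2)


def as_bond_list(n_list=None, bond_list=None):
    """Accept either input; an explicit bond_list wins if both are given."""
    if bond_list is not None:
        return np.asarray(bond_list, dtype=int).reshape(-1, 2)
    if n_list is not None:
        return bonds_from_neighbors(n_list)
    raise ValueError("provide either n_list or bond_list")


# A three-atom chain 0-1-2 given as neighbor lists.
bonds = as_bond_list(n_list=[[1], [0, 2], [1]])
```

Each bond is stored once with the lower index first, so the two input styles produce the same array.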
In case the user doesn't input a list as a numpy ndarray, every array-like parameter should be converted to a numpy array so that they all get the nice vectorized properties and don't run into errors.
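A sketch of the conversion for positions using np.asarray, which leaves an existing float ndarray uncopied. The helper name as_positions and the shape check are assumptions:

```python
import numpy as np


def as_positions(pos_list):
    """Return pos_list as an (n_atoms, 3) float array, whatever array-like was passed."""
    pos = np.asarray(pos_list, dtype=float)
    if pos.ndim != 2 or pos.shape[1] != 3:
        raise ValueError("posList must have shape (n_atoms, 3), got %s" % (pos.shape,))
    return pos


# Mixed lists and tuples of ints and floats are fine.
pos = as_positions([[0, 0, 0], (1.4, 0, 0)])
```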
Need to make available the thermal conductivity code but applied to the Molecule objects.
Things to note:
As an example, when a dingus molecule (count=5, angle=160.) is minimized with numerical gradients, its energy actually increases by a few factors after taking the 4th step. I have seen this problem pop up before occasionally.
I think the problem is in the step size calculation. There may be a point where the code stops checking if the step size leads to a lower energy or not. Probably need to make the code more robust anyway.
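One robust option is a backtracking line search that never returns an uphill step. This is a sketch of that standard technique with an Armijo sufficient-decrease test, not the package's current step size code:

```python
import numpy as np


def backtracking_step(energy, pos, grad, step=1.0, shrink=0.5, c=1e-4, max_halvings=50):
    """Return a step size along -grad that passes the Armijo sufficient-decrease test.

    Returns 0.0 if no acceptable step is found, so the caller never moves uphill.
    """
    e0 = energy(pos)
    g2 = np.sum(grad * grad)
    for _ in range(max_halvings):
        if energy(pos - step * grad) <= e0 - c * step * g2:
            return step
        step *= shrink
    return 0.0


def toy_energy(x):
    """Quadratic bowl used only to exercise the line search."""
    return np.sum(x ** 2)


x = np.array([1.0, -2.0])
s = backtracking_step(toy_energy, x, 2.0 * x)
```

Because every accepted step is checked against the starting energy, the energy cannot increase from one iteration to the next.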
See notes for structure. Need to write a build function in molecule.py. These will be building blocks for more molecules.
When an interaction is turned on or off in a forcefield, it would be nice to change a state variable that indicates that the forcefield has fundamentally changed. This would be used in Molecules using ff's as attributes; if an interaction has been changed then a signal needs to go off in Molecule that new energy/gradient functions need to be built because the old ones are invalid. Then there's the problem of keeping track of this for all of the molecules that use FF as an attribute.
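A possible way around the tracking problem is a version counter on the forcefield that each molecule compares against lazily, so the forcefield never needs to know which molecules use it. All class, attribute and method names below are hypothetical:

```python
class Forcefield(object):
    """Holds on/off switches and a counter that changes whenever a switch changes."""

    def __init__(self):
        self.interactions = {"lengths": True, "angles": True}
        self.version = 0

    def set_interaction(self, name, on):
        if name not in self.interactions:
            raise KeyError("unknown interaction: %s" % name)
        if self.interactions[name] != on:
            self.interactions[name] = on
            self.version += 1  # signals that cached functions are stale


class Molecule(object):
    """Rebuilds its energy function only when the forcefield version has moved."""

    def __init__(self, ff):
        self.ff = ff
        self._built_version = None
        self._energy = None

    def energy_function(self):
        if self._built_version != self.ff.version:
            self._energy = self._build_energy()
            self._built_version = self.ff.version
        return self._energy

    def _build_energy(self):
        # Placeholder: a real build would assemble the active energy terms.
        active = tuple(sorted(k for k, v in self.ff.interactions.items() if v))
        return lambda: active


ff = Forcefield()
mol_a, mol_b = Molecule(ff), Molecule(ff)
first = mol_a.energy_function()
ff.set_interaction("angles", False)
```

After the switch, both molecules rebuild on their next call without the forcefield keeping a list of them.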
Many parts of the code open text files. They are currently accessed through relative paths, which don't work if the user is not in the same directory as the kappa package. Incorporate code like:
import os
thisdir = os.path.dirname(os.path.abspath(__file__))  # directory containing this module
amber_path = os.path.join(thisdir, "AMBER.txt")  # absolute path to the data file
The thisdir variable should probably be defined in __init__.py.
For different standard forcefield interactions (ie bond lengths, bond angles, etc.) it only makes sense to use standards for the parameter list names (kbList, kaList, t0List,...) and possibly tie these into the Forcefield base class.
My idea to resolve:
The configure parameters method that only exists in the Amber subclass now should be in the Forcefield base class, with checks to determine which parameter lists are assigned to the instance as well as from which directory to pull the parameters from.
For now the forces due to bond interactions are added to the initialized gradient via for loops instead of doing it by numpy vectorization. This method appears to work in bond length gradients for example, but may not necessarily be ideal.
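For comparison, a vectorized version could look like the sketch below. It assumes the bond energy is a sum of kb*(r - b0)^2 terms with no factor of 1/2, and the function name and argument layout are hypothetical:

```python
import numpy as np


def bond_gradient(pos, bonds, kb, b0):
    """Gradient of sum_b kb*(|r_i - r_j| - b0)^2 with respect to all positions."""
    pos = np.asarray(pos, dtype=float)
    bonds = np.asarray(bonds, dtype=int)
    i, j = bonds[:, 0], bonds[:, 1]
    rij = pos[i] - pos[j]                       # (n_bonds, 3) bond vectors
    r = np.linalg.norm(rij, axis=1)             # (n_bonds,) bond lengths
    g = (2.0 * np.asarray(kb) * (r - np.asarray(b0)) / r)[:, None] * rij
    grad = np.zeros_like(pos)
    # Unbuffered accumulation: an atom in several bonds gets every contribution.
    np.add.at(grad, i, g)
    np.add.at(grad, j, -g)
    return grad


# One bond stretched from its rest length 1.0 to 2.0 along x.
grad = bond_gradient([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]], [[0, 1]], [1.0], [1.0])
```

The for-loop version remains a useful reference to test against, since the two should agree to rounding error.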
Need to formalize a network of interfaces so that we can calculate thermal conductivity concerning the two 'root' interfaces. Might have to pair certain interfaces (there has to be flow).
This is related to the comments of Issue #37
Need to add factory function to forcefield subclasses (Amber, etc.) to 'manufacture' energy calculation functions of different sources (bond length, bond angles, nonbonded interactions, etc). Preliminary code stored in amber_e.py
Forcefield class needs a method to change which interactions are 'on' or 'off'.
At this time there is nothing in place to explicitly stop users from chaining molecules with atoms of different atomic number, chaining 'occupied' interfacial atoms, possibly trying to bond atoms of the same interface, etc. Right now the user will only run into errors such as if the specified atomic indices aren't found in faces.atoms.
At the bare minimum an error should be raised if the user is making an obvious error in the way he/she is chaining molecules.
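A sketch of the bare-minimum checks. The function name and its arguments are hypothetical, and only two of the mistakes listed above are covered:

```python
def validate_chain(z1, z2, index1, index2, face1_atoms, face2_atoms):
    """Raise ValueError for obviously invalid chaining requests.

    z1, z2       : atomic numbers of the two molecules
    index1/2     : atom index chosen on each molecule
    face1/2_atoms: atom indices belonging to the chosen interface of each molecule
    """
    if index1 not in face1_atoms:
        raise ValueError("atom %d is not on the chosen interface of molecule 1" % index1)
    if index2 not in face2_atoms:
        raise ValueError("atom %d is not on the chosen interface of molecule 2" % index2)
    if z1[index1] != z2[index2]:
        raise ValueError(
            "cannot chain atoms with different atomic numbers (%d and %d)"
            % (z1[index1], z2[index2])
        )


# Carbon onto carbon passes silently; carbon onto nitrogen raises.
validate_chain([6, 6], [6, 7], 0, 0, [0, 1], [0, 1])
try:
    validate_chain([6, 6], [6, 7], 0, 1, [0, 1], [0, 1])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
```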
The analytical gradients have been calculated by hand, but previous attempts to implement them in the code have failed. That is why numerical gradients are in the code for now. Analytical gradients would improve the speed of minimization considerably.
In build functions that call the chain operation, particularly build_imine_chain and build_polyethylene, interface objects get created on the base chain ONLY and don't go away when the chain gets longer.
I think the problem is new Interface objects don't get created on 'sub' build functions. Also run a check in the chain function when an interface has only 1 atom, remove it?
I probably also have to pay attention to what happens to interfaces when molecules combine... new interfaces don't exist in the new, base molecules.
The 3D plots use
Of course, these plots would then only be physical if minimization took place.
In some systems, particularly imine functional groups, the dihedral and improper torsions are not being found correctly. Why are they being found in some molecules (such as graphene) and not in the smaller systems?
The package needs more docstrings and one-line comments to help the readability.
README needs to be updated
Either through the RESP or AM1-BCC methods for example. Refer to the Antechamber paper. For electric monopoles.
Need code that calculates the thermal conductivity of the molecules.
In particular, when a 6-member chain is driven at the end by a cosine force of angular frequency 1.5, the displacements get huge around 200 time units. There are also amplitudes near 100 time units that don't seem to be present in Dr. Mullen's plots, an indication that things are going to blow up at later t?
Maybe this is an indication that my calculation isn't totally correct (seems to be closer to correct than before). Maybe this frequency is a resonant frequency that moves the chain large distances since the chain isn't 'tied down'? Could try adding a small potential to one of the inner atoms.
The displacements of the atoms do not return to zero (assuming a driving force would let them do so, and considering drag), and do not match Dr. Mullen's old solutions at all.
It would be nice to have a 'Calculation' class (we can work on a better name) that, for example, instantiates with a base molecule like a CNT, then the user specifies what molecules will be attached, and where. Also have options for the heat bath drivers, etc. Maybe also have a log of details that can be exported with a method call.
When we perform the thermal conductivity calculations we need certain atoms to be attached to heat baths. Heat bath objects need to be defined with attributes such as temperature, which atoms they are attached to, etc. From these the effective forces on the atoms will be determined for the sake of finding the positions of the atoms in the heat driven differential equation.
When analytical gradients are mixed (in particular, when bond lengths and bond bending are turned on together with nothing else), the total gradient does not point in the same direction as the numerical gradient. When alone, these interactions have near identical (up to a certain step) step histories when comparing the numerical and analytical gradients. This leads me to believe the 'relative' vectors aren't correct. (One might be multiplied by a factor of 2 for example.) This can explain why when alone the analytical gradients work (the only thing that would change with such an error is the force magnitude, not the direction of the force because stepsizes can handle that).
Be careful not to judge from errors related to #38. The minimization routine needs a bit of an overhaul anyway, at least in terms of line searching.
The energy minimization routine does not print out a statement that signifies what condition was met to stop the minimization, whether it was force precision or step size precision. It seems like what is stopping it in a lot of cases is that either the forces aren't pointing in the right direction, therefore returning a really small step size via the line search, or the line search is missing handling of certain cases. Maybe the Step Precision limit should be lowered?
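A sketch of a minimizer that reports its stopping condition. It uses a fixed step factor instead of a line search to keep it short, and the tolerance names and default values are assumptions:

```python
import numpy as np


def steepest_descent(gradient, pos, step=0.1, f_tol=1e-6, s_tol=1e-10, max_iter=1000):
    """Fixed-factor steepest descent that returns the reason it stopped."""
    pos = np.asarray(pos, dtype=float)
    for n in range(max_iter):
        grad = gradient(pos)
        if np.max(np.abs(grad)) < f_tol:
            return pos, "force precision reached after %d steps" % n
        move = step * grad
        if np.max(np.abs(move)) < s_tol:
            return pos, "step size precision reached after %d steps" % n
        pos = pos - move
    return pos, "maximum number of iterations reached"


# Quadratic bowl with gradient 2x; the force criterion fires first here.
final_pos, reason = steepest_descent(lambda x: 2.0 * x, [1.0, -2.0])
print(reason)
```

Returning the reason alongside the positions lets callers log it or branch on it instead of only seeing it printed.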
A total rewrite probably isn't necessary, but right now it's a mess. Maybe the function should be separated into sections with comments. Comments explaining what each section and subsection is doing would also make it less of a mess.
Facetracking is pointlessly complicated right now. While I am proud that it works, in practice it doesn't help much. In practice what I imagine happening with the Calculation class:
We also can delete interfaces (which we might already be doing to single atom interfaces). We don't need the extra interfaces in imine chains for example.
Need input from @tab10.
Need a way to export kappa's molecules for use in other codes, namely GROMACS.
It would also be nice if we had an easy way to carry molecule objects around.
At some point a bug appeared that changed the i index in successive _combine calls in operation.chain. Right now users cannot input >2 molecules in the molList argument of chain. Users have to chain 2 molecules at a time which defeats the purpose of the chain function.
There is no treatment for the atoms of Interfaces that have been previously chained at that spot. There is no consideration in the plotting of the faces (they look the same as the others), and there is nothing stopping users from chaining additional molecules at those points.
Maybe add an attribute like occupied to the Interfaces that's a list of the occupied indices. Draw them differently with plot.faces (with a different color like purple)? Also run a check in the chain code that would disallow chaining at those atoms by raising an error?
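A sketch of how an occupied list might behave. This Interface class is a simplified stand-in, not the package's actual class, and the color names are arbitrary:

```python
class Interface(object):
    """Simplified stand-in that tracks which of its atoms have been chained to."""

    def __init__(self, atoms):
        self.atoms = list(atoms)
        self.occupied = []  # indices of atoms that already have something attached

    def occupy(self, atom):
        """Mark an atom as used, refusing atoms that are unknown or already taken."""
        if atom not in self.atoms:
            raise ValueError("atom %d is not part of this interface" % atom)
        if atom in self.occupied:
            raise ValueError("atom %d is already occupied" % atom)
        self.occupied.append(atom)

    def plot_colors(self, free="blue", taken="purple"):
        """One color per atom, so occupied atoms stand out in a face plot."""
        return [taken if a in self.occupied else free for a in self.atoms]


face = Interface([3, 7, 9])
face.occupy(7)
try:
    face.occupy(7)
    second_attempt_raised = False
except ValueError:
    second_attempt_raised = True
```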
It may be necessary to 'weld' molecules together, that is, combine multiple atoms from 2 different molecules at once. Right now only single atom combinations are treated.