
Deep Learning for Metasurface Optimization

Optimization of single-element metasurface parameters using deep learning with TensorFlow/Keras and ~5600 Lumerical simulations as training data. The simulations were performed under normally incident light. The features that define the metasurface are (1) length (L), (2) width (W), (3) height (H), (4) x-direction periodicity (Ux), and (5) y-direction periodicity (Uy). The output is the phase spectrum across (and slightly beyond) the visible range, from 450 nm to 800 nm in 5 nm increments.
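To make the data shapes concrete, here is a minimal loading sketch. It is not code from this repo: the header/column handling of the CSVs is an assumption; only the five features and the 71 wavelength points follow from the description above.

```python
# Minimal sketch of the data layout described above (header/column handling is assumed).
import numpy as np
import pandas as pd

wavelengths = np.arange(450, 805, 5)   # 450-800 nm in 5 nm steps -> 71 points

X = pd.read_csv("Input.csv")    # expected: ~5600 rows x 5 features (L, W, H, Ux, Uy)
Y = pd.read_csv("Output.csv")   # expected: ~5600 rows x 71 phase values (one per wavelength)

print(X.shape, Y.shape, wavelengths.size)
```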

The PowerPoint contains animations, so I recommend watching it in slide show mode.

Everything published in this repo has been shared with permission.

I will be presenting this work at Photonica-2019.

Background

Metasurfaces are used to manipulate light in various ways for a plethora of applications. Current state-of-the-art methods to design these nanostructures are quite medieval and rely on a brute-force strategy. That is, given a desired output, what combination of metasurface parameters gives values closest to what we seek? To answer this question, researchers rely on simulation software and perform thousands of parameter sweeps in the hope of finding the best combination. The cost of these simulations is high, both in time and in computing power, especially in a research setting where many people are simultaneously working on the same cluster (see slide 18). This is the forward design approach.

In performing these vast parameter sweeps, researchers unknowingly built a powerful dataset. During my time at Harvard, I realized this and decided to aggregate a few thousand of my simulations and use them as a proof of concept that there exists a better and more efficient way to optimize metasurface parameters: deep learning.

Attached is part of the presentation for the talk I gave to my colleagues in the Capasso Group in Harvard University's Applied Physics Department, along with some results using the Scatternet repo from a neighbouring group at MIT. A component of this presentation covers some of the basics of neural networks, which might be helpful. Given that the talk was oral and the slides were merely visual aids, I've added notes at the bottom of some slides for context and clarity.

Due to the success with the Scatternet repo (Nanophotonic Particle Simulation and Inverse Design), I have now built a better/simpler version designed for metasurface parameter optimization (based on their initial code). Currently, there is only code for the forward design; the inverse design is the next step. I will not be developing my inverse design code further due to time limitations, but I highly recommend looking at the inverse design scheme developed in the Scatternet repo for ideas on how to do this. Note that even though this repo covers only the forward design, and thus does not eliminate the brute-force problem, it does offer two things:

  1. Since a trained network is merely vector/matrix operations, all computations are analytical and up to 1200x faster than the simulation methods currently used, which are numerical (finite-difference time-domain). So even though this is still brute force, it is a significantly faster brute force!

  2. This code for the forward design sets the foundation for the inverse design.

Definitions:

Forward Design: Given a set of input parameters that define a metasurface, predict the phase spectrum.

Inverse Design: Given a desired phase spectrum, work backwards to determine which parameters will produce that phase spectrum. This would alleviate the need for brute-force strategies altogether, whether numerical or analytical.
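To make the two directions concrete, here is a conceptual sketch using a small Keras network as a stand-in for a trained forward model. Nothing below is taken from this repo's code: the architecture, the parameter values, and `target_phase` are placeholders, and the gradient-descent-on-inputs loop is just one possible inverse-design scheme, in the spirit of the Scatternet approach.

```python
import numpy as np
import tensorflow as tf

# Untrained stand-in for a trained forward model (5 parameters -> 71 phase values).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(5,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(71),
])

# Forward design: geometric parameters (L, W, H, Ux, Uy) -> predicted phase spectrum.
params = np.array([[300.0, 150.0, 600.0, 400.0, 400.0]], dtype=np.float32)  # illustrative values
predicted_phase = model(params)

# One possible inverse-design scheme (not this repo's code): gradient descent on the
# inputs through the frozen forward model, driving the prediction toward a target spectrum.
target_phase = tf.zeros((1, 71))        # hypothetical desired spectrum
x = tf.Variable(params)
opt = tf.keras.optimizers.Adam(learning_rate=1.0)
for _ in range(200):
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(tf.square(model(x) - target_phase))
    opt.apply_gradients([(tape.gradient(loss, x), x)])
```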

Results

[Image: model accuracy on the validation and test sets]

Comparing the accuracy on the test set and the validation set, we see similar values, suggesting that we have likely not overfit the model, which is positive. Ideally, the accuracy would be higher, but it should be noted that the dataset is exceptionally small and wasn't originally designed for this purpose. Rather, I developed this dataset for a different project, and this project was born out of tangential curiosity. Additionally, one of the challenges of designing in the metasurface realm is the issue of coupling and other effects that are very hard to predict. Based on my experience, a larger and better-designed dataset could really help with this. Another reason the accuracy might not be higher is that phase wraps around 2π, meaning that 0 is the same as 2π is the same as 12π. It is likely (verified by the image below) that the network predicts a phase that, when wrapped to 2π, is the same value as the actual phase. So even though the actual and predicted values differ, their physical interpretation is exactly the same. I think this is something to be very excited about, and I briefly explain it below.

Here are some other candidate results that show the potential of the network (validation):

[Image: predicted vs. actual phase spectra, validation set]

Here are some other candidate results that show the potential of the network (test):

[Image: predicted vs. actual phase spectra, test set]

One exciting result, shown in the images below, comes from the test set (top) and validation set (bottom). This is interesting because the network predicted a phase of 0 radians where the FDTD software converged on -2π. In reality these are the same phase value, but it appears the network is able to learn that phase wraps, without this being explicitly put in the code (likely a consequence of minimizing the loss). The way to justify this is via the non-linear regression process. An easy example is the case of isotropic structures, for which we know the phase is 0 across the entire spectrum. However, the FDTD software will sometimes return 2π radians as the solution and sometimes 0 radians. This means the training data contains isotropic structures labelled with both 0 radians and 2π radians, and through the non-linear regression process the network detected this pattern because it helped minimize the loss. The same can be said about non-isotropic structures. For this reason, training/testing/validating/predicting should be performed with the phase wrapped to 2π (though this can make interpreting the phase spectrum more challenging, so for analysis purposes unwrapping it might be helpful); a minimal wrapping sketch follows the images below.

[Images: Validation Set | Test Set]
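As a small NumPy sketch of the wrapping idea (not the repo's own preprocessing code), phase values can be mapped onto [0, 2π) before training or scoring, and unwrapped again for analysis:

```python
import numpy as np

def wrap_phase(phi):
    """Map any phase onto [0, 2*pi), so 0, 2*pi and -2*pi all coincide."""
    return np.mod(phi, 2 * np.pi)

print(wrap_phase(np.array([0.0, 2 * np.pi, -2 * np.pi, 12 * np.pi])))  # all ~0
# For plotting/analysis, a continuous spectrum can be recovered with np.unwrap:
# continuous = np.unwrap(wrap_phase(predicted_spectrum))
```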

Understanding this Repo

I performed some optimization to fine-tune the hyperparameters. The default values in core.py represent the optimized values.
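For context, below is a rough sketch of the kind of tuning meant here. It is not the code in core.py: the architecture, the candidate grids, and the assumption that the split .txt files listed below load cleanly with np.loadtxt are all placeholders.

```python
# Illustrative hyperparameter search only; the tuned values live in core.py.
import numpy as np
from tensorflow import keras

train_x, train_y = np.loadtxt("train_x.txt"), np.loadtxt("train_y.txt")  # file format assumed
val_x, val_y = np.loadtxt("val_x.txt"), np.loadtxt("val_y.txt")

def build_model(units, lr):
    model = keras.Sequential([
        keras.Input(shape=(5,)),                 # L, W, H, Ux, Uy
        keras.layers.Dense(units, activation="relu"),
        keras.layers.Dense(units, activation="relu"),
        keras.layers.Dense(71),                  # phase at 450-800 nm in 5 nm steps
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=lr), loss="mse")
    return model

best = None
for units in (64, 128, 256):
    for lr in (1e-2, 1e-3):
        hist = build_model(units, lr).fit(
            train_x, train_y, validation_data=(val_x, val_y),
            epochs=100, batch_size=32, verbose=0)
        score = min(hist.history["val_loss"])
        if best is None or score < best[0]:
            best = (score, units, lr)
print("best (val_loss, units, lr):", best)
```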

Files:

Input.csv: input features

Output.csv: corresponding phase spectrum

Presentation.pptx.zip: part of my presentation at Harvard SEAS for the Capasso Group (annotated with notes, recommended to view in slideshow mode)

core.py: main script

my_model.h5: saved model

my_model_weights.h5: saved model weights

test_results.zip: test-set result graphs comparing actual vs. predicted phase. Note: when going through the images, keep the y-axis values in mind; curves may be closer than they appear since the axis limits are set automatically.

validation_results.zip: validation-set result graphs comparing actual vs. predicted phase. Note: when going through the images, keep the y-axis values in mind; curves may be closer than they appear since the axis limits are set automatically.

test_x.txt: test-set input features (15% of the data)

test_y.txt: test-set phase spectra (15% of the data)

train_x.txt: training-set input features (70% of the data)

train_y.txt: training-set phase spectra (70% of the data)

val_x.txt: validation-set input features (15% of the data)

val_y.txt: validation-set phase spectra (15% of the data); a sketch of how such a split can be produced follows this list
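One way the 70/15/15 train/validation/test split could be produced is shown below with scikit-learn. This is only a sketch: the CSV delimiter/header handling and the random seed are assumptions, and the repo's own split logic in core.py may differ.

```python
# Two-stage split: 70% train, then the remaining 30% halved into validation and test.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.genfromtxt("Input.csv", delimiter=",")   # delimiter/header handling is an assumption
Y = np.genfromtxt("Output.csv", delimiter=",")

train_x, rest_x, train_y, rest_y = train_test_split(X, Y, test_size=0.30, random_state=0)
val_x, test_x, val_y, test_y = train_test_split(rest_x, rest_y, test_size=0.50, random_state=0)
print(train_x.shape, val_x.shape, test_x.shape)  # roughly 70% / 15% / 15%
```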
