
petermattia / revisit-severson-et-al


Repository for the paper "Statistical learning for accurate and interpretable battery lifetime prediction"

License: MIT License

Languages: MATLAB 0.13%, Jupyter Notebook 99.04%, Python 0.83%
Topics: batteries, prediction

revisit-severson-et-al's Introduction

petermattia

I'm currently working on a new venture. Previously, I worked for Tesla. Prior to Tesla, I was a graduate student in the Chueh group in the Stanford Department of Materials Science and Engineering.

All code on GitHub is my own and not my employer's.

My personal website can be found at petermattia.com.

revisit-severson-et-al's People

Contributors

dependabot[bot], petermattia


revisit-severson-et-al's Issues

Reproducing the "discharge" and "full" model

Hi, thanks for this clean and well-organized codebase.

Following your code, the "variance" model is easy to reproduce. However, I am having trouble reproducing the performance of the "discharge" and "full" models. Do you know what detail I may be missing?

My results are:

(103.5718828346258, 138.2982760624032, 195.8641688228285)    # var
(62.6994702955215, 181.47910490241762, 194.8456325157642)    # discharge
(62.61943481775296, 141.89261700359316, 170.8711847567794)    # full

Here is my code for feature generation:

import numpy as np
from scipy.stats import skew, kurtosis
from sklearn.linear_model import LinearRegression
from sklearn import preprocessing

def var_feature(batch, to_skip=None):
    feats = []
    if isinstance(to_skip, int):
        to_skip = [to_skip]  # handle an int before the truthiness check so to_skip=0 works
    to_skip = to_skip or []
    for indx, cell in enumerate(batch):
        if indx in to_skip:
            continue
        # delta-Q between cycle keys '99' and '9'; feature is log10 of its variance
        x = cell['cycles']['99']['Qdlin'] - cell['cycles']['9']['Qdlin']
        feats.append([np.log10(np.var(x))])
    return np.array(feats)

def discharge_feature(batch, to_skip=None, freq=10):
    feats = []
    if isinstance(to_skip, int):
        to_skip = [to_skip]  # handle an int before the truthiness check so to_skip=0 works
    to_skip = to_skip or []
    for indx, cell in enumerate(batch):
        if indx in to_skip:
            continue
        cell_feat = []
        x = cell['cycles']['99']['Qdlin'] - cell['cycles']['9']['Qdlin']
        cell_feat.append(np.var(x[::freq]))
        cell_feat.append(np.abs(np.min(x[::freq])))
        cell_feat.append(np.abs(skew(x[::freq])))
        cell_feat.append(np.abs(kurtosis(x[::freq])))
        cell_feat.append(cell['summary']['QD'][1])
        cell_feat.append(cell['summary']['QD'][1:100].max() - cell['summary']['QD'][1])
        feats.append(np.log10(cell_feat))  # log10 applied elementwise to all six features
    return np.array(feats)

def full_feature(batch, to_skip=None):
    feats = []
    if isinstance(to_skip, int):
        to_skip = [to_skip]  # handle an int before the truthiness check so to_skip=0 works
    to_skip = to_skip or []
    for indx, cell in enumerate(batch):
        if indx in to_skip:
            continue
        cell_feat = []
        x = cell['cycles']['99']['Qdlin'] - cell['cycles']['9']['Qdlin']
        cell_feat.append(np.log10(np.var(x)))
        cell_feat.append(np.log10(np.abs(np.min(x))))
        m = LinearRegression().fit(
            np.ones((99, 1)),
            cell['summary']['QD'][1:100])
        cell_feat.append(np.abs(float(m.coef_[0])))  # coef_ is a length-1 array
        cell_feat.append(np.abs(m.intercept_))
        cell_feat.append(np.log10(cell['summary']['QD'][1]))
        cell_feat.append(np.log10(cell['summary']['chargetime'][:5].mean()))
        cell_feat.append(np.log10(cell['summary']['Tavg'][1:100].sum()))
        cell_feat.append(np.log10(cell['summary']['IR'][1:100].min() + 1e-8))
        cell_feat.append(np.log10(abs(cell['summary']['IR'][100] - cell['summary']['IR'][1])))
        feats.append(cell_feat)

    return np.array(feats)
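One detail worth double-checking in the snippet above: `LinearRegression().fit(np.ones((99, 1)), ...)` regresses capacity on a constant column, so the fitted slope carries no information. The slope/intercept features are usually described as a linear fit of discharge capacity against cycle number; a sketch of that fit on toy data (illustrative names and values, not the repository's code):

```python
import numpy as np

# Toy capacity-fade curve: cycles 2-100 with a small linear fade.
cycles = np.arange(2, 101)
QD = 1.1 - 1e-4 * cycles

# Fit QD against the cycle index (not a constant column of ones).
slope, intercept = np.polyfit(cycles, QD, deg=1)
```

With the constant-column fit, `m.coef_` is meaningless and `m.intercept_` is just the mean capacity, which could plausibly explain part of the gap.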

I just use a linear regression after the standard scaler:

def train(x_train, x_test1, x_test2):
    scaler = preprocessing.StandardScaler().fit(x_train)
    x_train = scaler.transform(x_train)
    x_test1 = scaler.transform(x_test1)
    x_test2 = scaler.transform(x_test2)

    # Define and fit linear regression via enet
    # l1_ratios = [0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1]
    # enet = ElasticNetCV(l1_ratio=l1_ratios, cv=5, random_state=0)
    enet = LinearRegression()
    enet.fit(x_train, y_train)  # y_train: cycle-life labels from the enclosing scope

    # Predict on test sets
    y_train_pred = enet.predict(x_train)
    y_test1_pred = enet.predict(x_test1)
    y_test2_pred = enet.predict(x_test2)

    # Evaluate error
    return get_RMSE_for_all_datasets(y_train_pred, y_test1_pred, y_test2_pred)
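`get_RMSE_for_all_datasets` is not shown above; a minimal stand-in (the name and signature are assumptions, with the ground-truth cycle lives passed explicitly rather than read from an enclosing scope) would be:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-squared error between two 1-D arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical stand-in for the helper used above; the original presumably
# obtains the ground-truth labels from the surrounding notebook scope.
def get_RMSE_for_all_datasets(y_train_pred, y_test1_pred, y_test2_pred,
                              y_train, y_test1, y_test2):
    return (rmse(y_train, y_train_pred),
            rmse(y_test1, y_test1_pred),
            rmse(y_test2, y_test2_pred))
```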

Standardization of discharge curve

Hello, how is the capacity-voltage curve standardized across all of the 4C discharge curves during cycling, and what are the steps of the smoothing-spline fitting? Thank you very much.
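For what it's worth, the Qdlin-style standardization can be sketched as resampling each discharge capacity-voltage curve onto a fixed 1000-point voltage grid from 3.6 V down to 2.0 V, so curves from different cells and cycles can be subtracted point by point. The sketch below uses plain linear interpolation in place of the paper's smoothing-spline fit, and all names and toy values are illustrative:

```python
import numpy as np

def linearly_interpolate_Qd(V, Qd, v_max=3.6, v_min=2.0, n_points=1000):
    """Resample capacity onto a uniform voltage grid from v_max down to v_min.

    V, Qd: voltage and discharge capacity measured during one discharge
    (voltage monotonically decreasing). np.interp requires increasing x,
    so both arrays are flipped before interpolating and flipped back after.
    """
    v_grid = np.linspace(v_max, v_min, n_points)  # decreasing, like Qdlin's grid
    return np.interp(v_grid[::-1], V[::-1], Qd[::-1])[::-1]

# Toy discharge: capacity grows linearly as voltage falls from 3.6 V to 2.0 V.
V = np.linspace(3.6, 2.0, 50)
Qd = 1.1 * (3.6 - V) / 1.6
Qdlin = linearly_interpolate_Qd(V, Qd)
```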

There is a gap between the spline function and the real data

Thanks for sharing the great work!
Qdlin comes from sampling the spline function fitted to the Qd-V points, so the Qdlin-V curve and the Qd-V curve should be consistent. However, the figure shows an obvious gap between the two curves. Why does this happen?

[screenshot: Qd-V ("original") vs. Qdlin-V ("spline") curves showing the gap]

Here is the code:

import numpy as np
import matplotlib.pyplot as plt

key = 'b1c1'  # also happens for b1c32, b2c25, ...
cycle = 2
plt.figure(figsize=(10, 10))
# boolean mask: current below -1.1/50 A; np.diff locates its edges
diff = np.diff(train[key]['cycles'][f'{cycle}']['I'] < -1.1 / 50)
indices = np.where(diff != 0)[0]
start_ind, end_ind = indices[0] + 1, indices[1]
print(start_ind, end_ind, len(train[key]['cycles'][f'{cycle}']['I']))
plt.scatter(train[key]['cycles'][f'{cycle}']['Qd'][start_ind:end_ind],
            train[key]['cycles'][f'{cycle}']['V'][start_ind:end_ind],
            s=0.1, color='blue', label='original')
plt.scatter(train[key]['cycles'][f'{cycle}']['Qdlin'],
            np.linspace(3.6, 2.0, 1000),
            s=0.1, color='red', label='spline')
plt.legend()

The data I used was generated by Load Data.ipynb in this repo.

Readme.md - Update 'featuregeneration.m' to 'generate_voltage_arrays.m'

The Readme.md file contains the following:

Our key scripts and functions are summarized here:

featuregeneration.m: MATLAB script that generates capacity arrays from the [battery dataset](https://data.matr.io/1/projects/5c48dd2bc625d700019f3204) and exports them to csvs (stored in /data).

However, featuregeneration.m now appears in the repository as generate_voltage_arrays.m, so the README reference should be updated.

Inconsistency found

Thanks for sharing the great work!
I found two inconsistencies.

First one

[screenshot 1]

[screenshot 2]

Second one

Using the script BuildPkl_Batch1.ipynb to load the mat file and plot the discharge capacity vs. time curve:

import numpy as np
import matplotlib.pyplot as plt

channel = 18  # b1c18
print(f[batch['policy_readable'][channel, 0]][:].tobytes()[::2].decode())  # 5.4C(70%)-3C
print(f[batch['cycle_life'][channel, 0]][:])  # 691
cycles = f[batch['cycles'][channel, 0]]
Qd = np.hstack(f[cycles['Qd'][9, 0]][:])
t = np.hstack(f[cycles['t'][9, 0]][:])
plt.plot(t, Qd)

[screenshot: Qd plotted against t, with the curve doubling back on itself]
This is weird.

plt.plot(range(len(Qd)), Qd) # change 't' to 'range(len(Qd))'

then the figure looks like this:

[screenshot: Qd plotted against sample index]

np.sort(Qd) may make the figure look more normal, but the horizontal segment remains, which is weird.

It seems something is wrong in the data conversion from csv to mat.
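A quick way to confirm the first suspicion (a non-monotonic time vector) is to check `np.diff(t)` and re-sort both arrays by time before plotting. The arrays below are toy data mimicking the zig-zag plot, not values from the dataset:

```python
import numpy as np

# Toy data with one out-of-order timestamp.
t = np.array([0.0, 1.0, 3.0, 2.0, 4.0])
Qd = np.array([0.00, 0.10, 0.30, 0.20, 0.40])

is_monotonic = bool(np.all(np.diff(t) >= 0))  # False for this toy data

# Sorting by time removes the back-and-forth lines when plotting Qd vs. t
# (it will not, by itself, explain a flat segment in the capacity curve).
order = np.argsort(t, kind="stable")
t_sorted, Qd_sorted = t[order], Qd[order]
```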
