airoldilab / ai-sgd

Towards stability and optimality in stochastic gradient descent

R 79.27% Makefile 3.39% C++ 17.34%

ai-sgd's Issues

Modularity in sgd.R

I've been tweaking sgd.R over the past few days and it has gotten absurdly lengthy (have not pushed the latest yet).

It would be natural to use update functions, e.g., in the main loop you would have something like theta.new <- sgd.update(...) in each branch of the if statement. However, the number of parameters each update function takes makes them very convoluted.

Here's an example of what the main loop would look like:

# Run the stochastic gradient method.
# Main loop: i indexes the iteration.
for (i in seq_len(n * npass)) {
  # Update the y matrix if the method uses the least squares estimate.
  if (sgd.method %in% c("LS-SGD", "LS-ISGD")) {
    y[, i] <- ls.update(i, data, theta.sgd, n)
  }
  # Update the parameters with the rule that matches the method.
  if (sgd.method %in% c("SGD", "ASGD", "LS-SGD")) {
    theta.new <- sgd.update(i, data, theta.sgd, lr, lambda, n, ...)
  } else if (sgd.method %in% c("ISGD", "AI-SGD", "LS-ISGD")) {
    theta.new <- isgd.update(i, data, theta.sgd, lr, lambda, n, ...)
  } else if (sgd.method == "SVRG") {
    theta.new <- svrg.update(i, data, theta.sgd, lr, lambda, n, ...)
  } else {
    # Fail loudly instead of reusing a stale or undefined theta.new.
    stop("Unknown sgd.method: ", sgd.method)
  }
  theta.sgd[, i + 1] <- theta.new
}

Is this an acceptable solution? It makes the sgd() function more readable at the cost of making the update functions themselves a bit convoluted.
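
One way to keep the update functions' signatures short might be to bundle the shared arguments into a single list and look the update function up by method name. The following is only a minimal sketch under assumptions, not the repository's code: the control list, the updates table, run.sgd(), and the toy least-squares update bodies are all hypothetical stand-ins.

```r
# Hypothetical sketch: every update function shares one signature,
# f(i, data, theta.old, control), so the main loop needs no if/else chain.
sgd.update <- function(i, data, theta.old, control) {
  # Explicit SGD step for least squares (toy stand-in).
  idx <- (i - 1) %% nrow(data$X) + 1
  x <- data$X[idx, ]
  resid <- data$Y[idx] - sum(x * theta.old)
  theta.old + control$lr(i) * resid * x
}

isgd.update <- function(i, data, theta.old, control) {
  # Implicit SGD step for least squares, which has a closed form.
  idx <- (i - 1) %% nrow(data$X) + 1
  x <- data$X[idx, ]
  a <- control$lr(i)
  resid <- data$Y[idx] - sum(x * theta.old)
  theta.old + a / (1 + a * sum(x^2)) * resid * x
}

# Map each method name to its update function.
updates <- list("SGD" = sgd.update, "ISGD" = isgd.update)

run.sgd <- function(data, sgd.method, control, npass = 1) {
  update <- updates[[sgd.method]]
  if (is.null(update)) stop("Unknown sgd.method: ", sgd.method)
  n <- nrow(data$X)
  theta.sgd <- matrix(0, nrow = ncol(data$X), ncol = n * npass + 1)
  for (i in seq_len(n * npass)) {
    theta.sgd[, i + 1] <- update(i, data, theta.sgd[, i], control)
  }
  theta.sgd
}

# Toy data with true parameters (1, -1).
set.seed(1)
X <- matrix(rnorm(2000), ncol = 2)
data <- list(X = X, Y = as.numeric(X %*% c(1, -1) + rnorm(1000, sd = 0.1)))
control <- list(lr = function(i) 1 / (1 + i))
theta <- run.sgd(data, "ISGD", control)
print(theta[, ncol(theta)])
```

With this layout, adding a method means adding one entry to the table, and new tuning parameters go into control rather than into every signature.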

Need separate folder/better naming for experiments.

It would be better to move the experiments into a separate folder and adopt a more descriptive naming scheme such as exp_model_params.R. For example, the name exp_normal_n5p2.R indicates an experiment on the normal model with n=1e5 examples and p=1e2 parameters.
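
The naming scheme could be generated rather than typed by hand. Below is a small sketch; exp.filename() is a hypothetical helper, not something in the repository, and it assumes n and p are exact powers of ten, as in the example above.

```r
# Hypothetical helper: build an experiment file name from the model name
# and the orders of magnitude of n and p, e.g. n = 1e5, p = 1e2 -> "n5p2".
exp.filename <- function(model, n, p) {
  n.exp <- round(log10(n))
  p.exp <- round(log10(p))
  # The scheme only encodes exact powers of ten.
  stopifnot(isTRUE(all.equal(10^n.exp, n)), isTRUE(all.equal(10^p.exp, p)))
  sprintf("exp_%s_n%dp%d.R", model, as.integer(n.exp), as.integer(p.exp))
}

print(exp.filename("normal", n = 1e5, p = 1e2))
```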

Run experiments on covertype data

See Bach and Moulines (2013). Compare AI-SGD against SVRG and against Bach and Moulines' method for logistic regression.

Implement averaged implicit SGD (AI-SGD) in sgd.R

Benchmark using svrg.update() vs putting the code directly inside sgd()

svrg.update() may be slower because the entire DATA object is passed to it on every iteration of the main loop. The overhead is probably negligible unless the number of passes is very large.
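
A benchmark along these lines could use system.time(). The sketch below is an assumption-laden toy: a plain least-squares SGD step stands in for the SVRG update, and step.fn(), run.with.function(), and run.inline() are hypothetical names. Note that R does not normally copy an argument unless the function modifies it, so any difference may come mostly from the function-call overhead itself; measuring is the reliable way to find out.

```r
# Toy comparison: the same per-iteration step, once through a function that
# receives the whole data list and once written inline in the loop.
set.seed(1)
data <- list(X = matrix(rnorm(1e5 * 10), ncol = 10), Y = rnorm(1e5))
niters <- 2e4

step.fn <- function(i, data, theta, lr) {
  x <- data$X[i, ]
  theta + lr * (data$Y[i] - sum(x * theta)) * x
}

run.with.function <- function() {
  theta <- rep(0, 10)
  for (i in seq_len(niters)) theta <- step.fn(i, data, theta, 1e-3)
  theta
}

run.inline <- function() {
  theta <- rep(0, 10)
  for (i in seq_len(niters)) {
    x <- data$X[i, ]
    theta <- theta + 1e-3 * (data$Y[i] - sum(x * theta)) * x
  }
  theta
}

# Time both variants; the results themselves should agree.
time.fn <- system.time(theta.fn <- run.with.function())["elapsed"]
time.inline <- system.time(theta.inline <- run.inline())["elapsed"]
cat("function:", time.fn, "s; inline:", time.inline, "s\n")
```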

Unit testing

Move the time benchmark function outside of exp_normal_correlated.R and make it generic

Reorganize functions.R and disperse functions over multiple files

Functions of roughly the same category should live in the same file (or directory), but no single file should collect everything.

Add signal-to-noise ratio for other GLMs in generate.data()

Only the Gaussian case currently incorporates the snr argument.
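
For reference, here is a sketch of how an snr argument might work in the Gaussian case. It assumes the definition snr = Var(X theta) / sigma^2, which the source does not confirm, and generate.gaussian() is a hypothetical stand-in for the relevant branch of generate.data(). How to carry the idea over to other GLMs is left open here.

```r
# Assumed definition (not confirmed by the source):
#   snr = Var(X theta) / sigma^2
generate.gaussian <- function(n, theta, snr) {
  X <- matrix(rnorm(n * length(theta)), nrow = n)
  signal <- as.numeric(X %*% theta)
  # Choose the noise scale so the sample signal variance over the
  # noise variance equals the requested snr.
  sigma <- sqrt(var(signal) / snr)
  list(X = X, Y = signal + rnorm(n, sd = sigma), sigma = sigma, signal = signal)
}

set.seed(1)
d <- generate.gaussian(n = 1000, theta = c(1, 2, 3), snr = 4)
print(var(d$signal) / d$sigma^2)
```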

Add routine to search heuristically for optimal rates for AI-SGD

This should be a script, optimal_aisgd, in the theory/ folder.
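
One simple heuristic is a grid search over the learning-rate constant, scored on held-out data. The sketch below is illustrative only: it assumes a Gaussian linear model (where the implicit step has a closed form), a learning rate of the form a_i = c0 * i^(-alpha), and hypothetical function names ai.sgd() and search.rate(); none of these come from the repository.

```r
# AI-SGD for a Gaussian linear model: implicit updates plus iterate averaging.
# Assumed learning-rate form (hypothetical): a_i = c0 * i^(-alpha).
ai.sgd <- function(X, Y, c0, alpha = 2/3) {
  theta <- rep(0, ncol(X))
  theta.bar <- theta
  for (i in seq_len(nrow(X))) {
    x <- X[i, ]
    a <- c0 * i^(-alpha)
    # Closed-form implicit step for least squares.
    theta <- theta + a / (1 + a * sum(x^2)) * (Y[i] - sum(x * theta)) * x
    # Running average of the iterates.
    theta.bar <- theta.bar + (theta - theta.bar) / i
  }
  theta.bar
}

# Grid search: pick the constant c0 with the lowest held-out squared error.
search.rate <- function(X, Y, X.val, Y.val, grid) {
  errs <- sapply(grid, function(c0) {
    theta <- ai.sgd(X, Y, c0)
    mean((Y.val - X.val %*% theta)^2)
  })
  list(best = grid[which.min(errs)], errors = errs)
}

set.seed(1)
theta.true <- c(1, -2, 0.5)
X <- matrix(rnorm(3000 * 3), ncol = 3)
Y <- as.numeric(X %*% theta.true + rnorm(3000))
X.val <- matrix(rnorm(500 * 3), ncol = 3)
Y.val <- as.numeric(X.val %*% theta.true + rnorm(500))
grid <- 10^seq(-3, 1, by = 1)
res <- search.rate(X, Y, X.val, Y.val, grid)
print(res$best)
```

A coarse logarithmic grid like this could be refined around the best value in a second pass.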

Remove need for defining sgd/batch in exp_normal_n5p2.R

sgd.R should be generic enough to work for all experiments.

lambda parameter in sgd not working

Not sure why, but in my experiments setting lambda to be anything other than zero leads to worse performance. Also the implicit case simply doesn't run.

batch learning needs update

The DATA object was changed to include a definition of the GLM model that generates the data. Batch learning needs to change so that it uses this information and passes it to a call glm(..., family=...), where family depends on the GLM model.
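
A possible shape for this change is sketched below. The layout of the DATA object is an assumption: the field data$model$name and the model labels ("gaussian", "poisson", "logistic") are hypothetical, as is this version of batch(). Only glm() and the family constructors are standard R.

```r
# Hypothetical layout: data$model$name names the GLM that generated the data.
batch <- function(data) {
  # Translate the model name into a glm() family object.
  family <- switch(data$model$name,
    gaussian = gaussian(),
    poisson  = poisson(),
    logistic = binomial(),
    stop("Unsupported model: ", data$model$name))
  # "- 1" drops the intercept, assuming X already carries all covariates.
  fit <- glm(data$Y ~ data$X - 1, family = family)
  unname(coef(fit))
}

# Toy Poisson data with true parameters (0.5, -0.3).
set.seed(1)
X <- matrix(rnorm(500 * 2), ncol = 2)
data <- list(X = X,
             Y = rpois(500, as.numeric(exp(X %*% c(0.5, -0.3)))),
             model = list(name = "poisson"))
est <- batch(data)
print(est)
```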

On the slope argument for the batch function

To address the comment on f9395ac:

NOTE(ptoulis): This was a bit confusing. Where are we using no-slope?

This is used in examples/exp_normal_n5p2.R. Agreed that it's confusing though.
