dataslingers / moma

MoMA: Modern Multivariate Analysis in R

Home Page: https://DataSlingers.github.io/MoMA

License: GNU General Public License v2.0

R 60.78% C++ 39.22%
multivariate-analysis multivariate-statistics statistical-learning statistics principal-component-analysis partial-least-squares canonical-correlation-analysis sparsity smoothness r

moma's Introduction

MoMA: Modern Multivariate Analysis

Project Status: WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.

MoMA is a penalized SVD framework that supports a wide range of sparsity-inducing penalties. For a matrix X, MoMA gives the solution to the following optimization problem:
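
Following the sparse and functional PCA formulation of Allen [21], on which MoMA builds, the rank-one problem can be written as

$$\max_{u, v} \; u^\top X v - \lambda_u P_u(u) - \lambda_v P_v(v) \quad \text{subject to} \quad u^\top S_u u \le 1, \; v^\top S_v v \le 1,$$

where $S_u = I + \alpha_u \Omega_u$ and $S_v = I + \alpha_v \Omega_v$ encode smoothness, the $P$'s are sparsity-inducing penalties, and $\lambda_u, \lambda_v, \alpha_u, \alpha_v \ge 0$ are tuning parameters; higher-rank solutions are obtained by deflation.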


The penalties (the P functions) we support so far include

  • moma_lasso(): LASSO (least absolute shrinkage and selection operator).

  • moma_scad(): SCAD (smoothly clipped absolute deviation).

  • moma_mcp(): MCP (minimax concave penalty).

  • moma_slope(): SLOPE (sorted $\ell$-one penalized estimation).

  • moma_grplasso(): Group LASSO.

  • moma_fusedlasso(): Fused LASSO.

  • moma_spfusedlasso(): Sparse fused LASSO.

  • moma_l1tf(): $\ell_1$ trend filtering.

  • moma_cluster(): Cluster penalty.

With this at hand, we can easily extend classical multivariate models:

  • moma_sfpca() performs penalized principal component analysis.

  • moma_sfcca() performs penalized canonical correlation analysis.

  • moma_sflda() performs penalized linear discriminant analysis.

We also provide Shiny App support to facilitate interaction with the results. If you are new to MoMA, the best place to start is vignette("MoMA").

Installation

The newest version of the package can be installed from GitHub:

library(devtools)
install_github("DataSlingers/MoMA", ref = "master")

Usage

Perform sparse linear discriminant analysis on the Iris data set.

library(MoMA)

## collect data
X <- iris[, 1:4]
Y_factor <- as.factor(rep(c("s", "c", "v"), rep(50, 3)))

## range of penalty
lambda <- seq(0, 1, 0.1)

## run!
a <- moma_sflda(
    X = X,
    Y_factor = Y_factor,
    x_sparse = moma_lasso(lambda = lambda),
    rank = 3
)

plot(a) # start a Shiny app and play with it!

Background

Multivariate analysis – the study of finding meaningful patterns in datasets – is a key technique in any data scientist’s toolbox. Beyond its use for Exploratory Data Analysis (“EDA”), multivariate analysis also allows for principled Data-Driven Discovery: finding meaningful, actionable, and reproducible structure in large data sets. Classical techniques for multivariate analysis have proven immensely successful through history, but modern Data-Driven Discovery requires new techniques to account for the specific complexities of modern data. This package provides a new unified framework for Modern Multivariate Analysis (“MoMA”), which will provide a unified and flexible baseline for future research in multivariate analysis. Even more importantly, we anticipate that this easy-to-use R package will increase adoption of these powerful new models by end users and, in conjunction with R’s rich graphics libraries, position R as the leading platform for modern exploratory data analysis and data-driven discovery.

Multivariate analysis techniques date back to the earliest days of statistics, pre-dating other foundational concepts like hypothesis testing by several decades. Classical techniques such as Principal Components Analysis (“PCA”) [1, 2], Partial Least Squares (“PLS”), Canonical Correlation Analysis (“CCA”) [3], and Linear Discriminant Analysis (“LDA”), have a long and distinguished history of use in statistics and are still among the most widely used methods for EDA. Their importance is reflected in the CRAN Task View dedicated to Multivariate Analysis [4], as well as the specialized implementations available for a range of application areas. Somewhat surprisingly, each of these techniques can be interpreted as a variant of the well-studied eigendecomposition problem, allowing statisticians to build upon a rich mathematical and computational literature.

In the early 2000s, researchers noted that naive extensions of classical multivariate techniques to the high-dimensional setting produced unsatisfactory results, a finding later confirmed by advances in random matrix theory [5]. In response to these findings, multivariate analysis experienced a renaissance as researchers developed a wide array of new techniques to incorporate sparsity, smoothness, and other structure into classical techniques [6,7,8,9,10,11,12,13,14 among many others], resulting in a rich literature on “modern multivariate analysis.” Around the same time, theoretical advances showed that these techniques avoided many of the pitfalls associated with naive extensions [15,16,17,18,19,20].

While this literature is vast, it relies on a single basic principle: for multivariate analysis to succeed, it is essential to adapt classical techniques to account for the known characteristics and complexities of the dataset at hand. For example, a neuroscientist investigating the brain’s response to an external stimulus may expect a response which is simultaneously spatially smooth and sparse: spatially smooth because the brain processes related stimuli in well-localized areas (e.g., the visual cortex) and sparse because not all regions of the brain are used to respond to a given stimulus. Alternatively, a statistical factor model used to understand financial returns may be significantly improved by incorporating known industry sector data, motivating a form of group sparsity. A sociologist studying how pollution leads to higher levels of respiratory illnesses may combine spatial smoothness and sparsity (indicating “pockets” of activity) with a non-negativity constraint, knowing that the association between pollution and illness is positive.

To incorporate these different forms of prior knowledge into multivariate analysis, a wide variety of algorithms and approaches have been proposed. In 2013, Allen proposed a general framework that unified existing techniques for “modern” PCA, as well as proposing a number of novel extensions [21]. The recently developed MoMA algorithm builds on this work, allowing more forms of regularization and structure, as well as supporting more forms of multivariate analysis.

The principal aim of this package is to make modern multivariate analysis available to a wide audience. This package will allow for fitting PCA, PLS, CCA, and LDA with all of the modern “bells-and-whistles:” sparsity, smoothness, ordered and unordered fusion, orthogonalization with respect to arbitrary bases, and non-negativity constraints. Uniting this wide literature under a single umbrella using the MoMA algorithm will provide a unified and flexible platform for data-driven discovery in R.

Authors

  • Michael Weylandt

    Department of Statistics, Rice University

  • Genevera Allen

    Departments of Statistics, CS, and ECE, Rice University

    Jan and Dan Duncan Neurological Research Institute, Baylor College of Medicine and Texas Children’s Hospital

  • Luofeng “Luke” Liao

    School of Data Science, Fudan University

Acknowledgements

  • MW was funded by an NSF Graduate Research Fellowship (#1450681).
  • LL was funded by Google Summer of Code 2019.

References

[1] K. Pearson. “On Lines and Planes of Closest Fit to Systems of Points in Space.” The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 2, p.559-572, 1901. https://doi.org/10.1080/14786440109462720

[2] H. Hotelling. “Analysis of a Complex of Statistical Variables into Principal Components.” Journal of Educational Psychology 24(6), p.417-441, 1933. http://dx.doi.org/10.1037/h0071325

[3] H. Hotelling. “Relations Between Two Sets of Variates.” Biometrika 28(3-4), p.321-377, 1936. https://doi.org/10.1093/biomet/28.3-4.321

[4] See CRAN Task View: Multivariate Statistics

[5] I. Johnstone, A. Lu. “On Consistency and Sparsity for Principal Components Analysis in High Dimensions.” Journal of the American Statistical Association: Theory and Methods 104(486), p.682-693, 2009. https://doi.org/10.1198/jasa.2009.0121

[6] B. Silverman. “Smoothed Functional Principal Components Analysis by Choice of Norm.” Annals of Statistics 24(1), p.1-24, 1996. https://projecteuclid.org/euclid.aos/1033066196

[7] J. Huang, H. Shen, A. Buja. “Functional Principal Components Analysis via Penalized Rank One Approximation.” Electronic Journal of Statistics 2, p.678-695, 2008. https://projecteuclid.org/euclid.ejs/1217450800

[8] I.T. Jolliffe, N.T. Trendafilov, M. Uddin. “A Modified Principal Component Technique Based on the Lasso.” Journal of Computational and Graphical Statistics 12(3), p.531-547, 2003. https://doi.org/10.1198/1061860032148

[9] H. Zou, and T. Hastie, and R. Tibshirani. “Sparse Principal Component Analysis.” Journal of Computational and Graphical Statistics 15(2), p.265-286, 2006. https://doi.org/10.1198/106186006X113430

[10] A. d’Aspremont, L. El Ghaoui, M.I. Jordan, G.R.G. Lanckriet. “A Direct Formulation for Sparse PCA Using Semidefinite Programming.” SIAM Review 49(3), p.434-448, 2007. https://doi.org/10.1137/050645506

[11] A. d’Aspremont, F. Bach, L. El Ghaoui. “Optimal Solutions for Sparse Principal Component Analysis.” Journal of Machine Learning Research 9, p.1269-1294, 2008. http://www.jmlr.org/papers/v9/aspremont08a.htm

[12] D. Witten, R. Tibshirani, T. Hastie. “A Penalized Matrix Decomposition, with Applications to Sparse Principal Components and Canonical Correlation Analysis.” Biostatistics 10(3), p.515-534, 2009. https://doi.org/10.1093/biostatistics/kxp008

[13] R. Jenatton, G. Obozinski. F. Bach. “Structured Sparse Principal Component Analysis.” Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS) 2010. http://proceedings.mlr.press/v9/jenatton10a.html

[14] G.I. Allen, M. Maletic-Savatic. “Sparse Non-Negative Generalized PCA with Applications to Metabolomics.” Bioinformatics 27(21), p.3029-3035, 2011. https://doi.org/10.1093/bioinformatics/btr522

[15] A.A. Amini, M.J. Wainwright. “High-Dimensional Analysis of Semidefinite Relaxations for Sparse Principal Components.” Annals of Statistics 37(5B), p.2877-2921, 2009. https://projecteuclid.org/euclid.aos/1247836672

[16] S. Jung, J.S. Marron. “PCA Consistency in High-Dimension, Low Sample Size Context.” Annals of Statistics 37(6B), p.4104-4130, 2009. https://projecteuclid.org/euclid.aos/1256303538

[17] Z. Ma. “Sparse Principal Component Analysis and Iterative Thresholding.” Annals of Statistics 41(2), p.772-801, 2013. https://projecteuclid.org/euclid.aos/1368018173

[18] T.T. Cai, Z. Ma, Y. Wu. “Sparse PCA: Optimal Rates and Adaptive Estimation.” Annals of Statistics 41(6), p.3074-3110, 2013. https://projecteuclid.org/euclid.aos/1388545679

[19] V.Q. Vu, J. Lei. “Minimax Sparse Principal Subspace Estimation in High Dimensions.” Annals of Statistics 41(6), p.2905-2947, 2013. https://projecteuclid.org/euclid.aos/1388545673

[20] D. Shen, H. Shen, J.S. Marron. “Consistency of Sparse PCA in High Dimension, Low Sample Size Contexts.” Journal of Multivariate Analysis 115, p.317-333, 2013. https://doi.org/10.1016/j.jmva.2012.10.007

[21] G.I. Allen. “Sparse and Functional Principal Components Analysis.” ArXiv Pre-Print 1309.2895 (2013). https://arxiv.org/abs/1309.2895


moma's Issues

Support Correspondence Analysis

Correspondence Analysis is another multivariate analysis technique based on a (generalized) SVD which can probably be cast in the MoMA framework. I don’t know much about it (yet), but it may be worth supporting alongside PCA, PLS, CCA, LDA, etc.

Absorb parameter values into sparsity-type/smoothness-type specification

Perhaps a similar sort of thing:

u_smoothness = second_order_difference(select = TRUE)

I used a similar pattern for fusion weights in the clustRviz package: it's a bit tricky to get your head around, but check it out - it might have some useful ideas.

Related, but separate, question: if we go this route, should parameter values be in the object or as a second argument?

Originally posted by @michaelweylandt in #42

GSoC (Google Summer of Code) 2019 Report

Work done in GSoC 2019

  • Code formatting #33

  • Code coverage #49

  • PG loop argument wrapper #36

  • Design improvements #37

  • R6 PCA wrappers #42

  • Put parameter values in the sparsity / smoothness specification #48

  • Add SFLDA / SFCCA, and documentation #54

How to use it

Please refer to the GitHub Pages site of the repo. It contains detailed documentation of the functions and a couple of illustrative examples.

TODO

  • Extend the package to allow more penalty choices and multivariate methods #52, #19

  • More helper R6 methods to facilitate exploration of the results.

  • Support caching and frame smoothing in Shiny apps.

Citation File

Add a citation file so that citation("MoMA") returns something useful. For now, the DSW SFPCA paper should be fine.
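
A minimal sketch of what inst/CITATION could look like (the entry below reuses reference [21] from the README purely as a placeholder; the actual SFPCA reference intended here should be substituted):

citHeader("To cite the MoMA package in publications, use:")

bibentry(
    bibtype = "Misc",
    title   = "Sparse and Functional Principal Components Analysis",
    author  = person("Genevera", "Allen"),
    year    = "2013",
    note    = "arXiv pre-print 1309.2895",
    url     = "https://arxiv.org/abs/1309.2895"
)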

Accessor with interpolation

#37 (comment)

Longer term, we should think about making an accessor which takes alpha_u, lambda_u, etc (values not indices) and does the extraction. If we don't have exactly the right value in the saved list, we should (by default) interpolate with an option for an exact solve.

(Something similar for the coef function in Michael's ExclusiveLasso package.)
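
A rough sketch of what such an accessor could look like, with purely hypothetical names (fit$lambda_u is assumed to be the saved grid and fit$U[[i]] the solution at fit$lambda_u[i]):

get_u_at <- function(fit, lambda_u, exact = FALSE) {
    grid <- fit$lambda_u
    if (lambda_u < min(grid) || lambda_u > max(grid)) {
        stop("lambda_u is outside the saved grid")
    }
    hit <- which(abs(grid - lambda_u) < 1e-10)
    if (length(hit) > 0) {
        return(fit$U[[hit[1]]]) # the exact value was saved
    }
    if (exact) {
        stop("lambda_u is not on the saved grid; an exact re-solve would go here")
    }
    # Default: linear interpolation between the two bracketing grid points
    lo <- max(which(grid < lambda_u))
    hi <- min(which(grid > lambda_u))
    w <- (lambda_u - grid[lo]) / (grid[hi] - grid[lo])
    (1 - w) * fit$U[[lo]] + w * fit$U[[hi]]
}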

Add generalized lasso to the toolset

Generalized lasso [1] meets the requirements for a sparsity penalty in our SFPCA framework:

  1. it is of order 1;
  2. it is convex, or can be decomposed as a difference of two convex functions.

However, existing algorithms are not fast enough to solve many generalized lasso problems quickly. Some special cases do have efficient algorithms, though, e.g., ℓ1 trend filtering [2].

[1] “The Solution Path of the Generalized Lasso.” https://arxiv.org/pdf/1005.1971.pdf
[2] Koh, K., Kim, S. and Boyd, S. (2007), An interior-point method for large-scale l1-regularized logistic regression, Journal of Machine Learning Research

SLOPE Penalty

Recently, the "SLOPE" penalty (sorted L1-norm) has been shown to have good theoretical properties and attractive performance in simulations [1]. It is a first-order penalty, so it would fit within the MoMA framework. It theoretically has several tuning parameters (one per coefficient), so we might take a reduced case, e.g., the BH-type rule discussed in [1]. Algorithm 4 in [1] gives a good algorithm for the proximal operator.

[1] https://projecteuclid.org/euclid.aoas/1446488733
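
A minimal R sketch of that proximal operator, using isotonic regression for the non-increasing projection step (prox_slope is illustrative only and not part of MoMA; lambda is assumed to be non-negative and sorted in decreasing order):

prox_slope <- function(y, lambda) {
    stopifnot(length(lambda) == length(y), !is.unsorted(rev(lambda)))
    ord <- order(abs(y), decreasing = TRUE)
    z <- abs(y)[ord] - lambda
    # Project onto the non-increasing cone via isotonic regression on the reversed sequence
    fit <- rev(isoreg(seq_along(z), rev(z))$yf)
    x <- pmax(fit, 0) # then clip at zero
    out <- numeric(length(y))
    out[ord] <- x     # undo the sort
    sign(y) * out     # restore the signs
}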

Handle passing deflation scheme encoding between R and C++

Hmmm.... it looks like the answer is "sort of." The following works, but isn't totally type-safe. We could probably do a more general solution in the future. Issue?

#include "Rcpp.h"

enum DeflationMethod {
  PCA = 0, 
  PLS = 1,  
  CCA = 2
};

namespace Rcpp {
  SEXP wrap(DeflationMethod dm){
    Rcpp::IntegerVector dm_int = Rcpp::wrap(static_cast<int>(dm));
    dm_int.attr("class") = "DeflationMethod"; 
    return dm_int;
  }

  template <> DeflationMethod as(SEXP dm_sexp){
    Rcpp::IntegerVector dm_iv(dm_sexp);
    int dm_int = dm_iv(0); 
    DeflationMethod dm = static_cast<DeflationMethod>(dm_int);
    return dm; 
  }
}

// [[Rcpp::plugins(cpp11)]]
// [[Rcpp::export]]
DeflationMethod make_pls(){
  DeflationMethod pls = DeflationMethod::PLS;
  return pls;
}

// [[Rcpp::export]]
void take_pls(DeflationMethod x){
  Rcpp::Rcout << " x = " << x << std::endl; 
}

Originally posted by @michaelweylandt in #54

End-to-End Tests

Let's add some end-to-end tests where we reproduce the examples (either simulated or EEG) from the SFPCA paper.

Nothing fancy - just save the "known good" results (from GA's Matlab code) and then run our functions on the same data and check that everything (estimated values and selected BICs) is close to what we expect.

Possibly should wait until after the SFPCA wrappers are written.

Code formatting

Create a clang-format file for C++ code. Use the styler package for R code formatting.
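
On the R side, the whole package can be restyled in one call (assuming the styler package is installed):

# install.packages("styler")
styler::style_pkg() # restyles every .R file in the package, in place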

Improve BIC Search Algorithm

When doing a nested BIC search, do we need to update both U and V? (The current code holds V constant while optimizing over U, and vice versa.) Formally, in the BIC expression, v is a function of u_hat, so it seems weird not to change v. On the flip side, we're just optimizing a regression (with a constraint). My concern is that, if we're just solving the penalized regression problem and Xv is a sub-unit vector, then we can always choose u to be exactly Xv, which gives zero residual variance and stumbles into a log(0) problem. (Or at least, that's what I got out of a quick skim. Please correct me if I'm wrong @Banana1530.)

For a well-initialized V, this doesn't make a big difference, so we can usually get away with it: in particular, if V is well initialized (maybe because we're already using a full grid, or the tuning parameters are small enough that the SVD is already close to correct), V_init and the stationary point of V are close, so BIC(U, V_init) and BIC(U, V_stationary) will be similar. That seems like too strong an assumption to make in practice, though.

The experiments below indicate what I mean. If we set UPDATE_V = 0 but leave INITIALIZE_SVD = 1, we get saved by the fact that v_star and V1 are pretty close; but if we set both flags to 0, corresponding to a random initialization of v_hat without updating, things go poorly. If we set INITIALIZE_SVD = 0 but keep UPDATE_V = 1, we do better, but not nearly as well as with the SVD initialization.

UPDATE_V = 1;        % Set to 0 to fix V at initialization value
INITIALIZE_SVD = 1;  % Set to 0 to initialize U, V to random unit vector

n = 50; p = 25; s = 5; 
u_star = [ones(s, 1); zeros(n - s, 1)]; v_star = [zeros(p - s, 1); ones(s, 1)]; d = 3; 

N = randn(n, p);
S = d * u_star * v_star'; 

X = S + N;  
[U, ~, V] = svd(X); 

st = @(x, lambda) sign(x) .* max(abs(x) - lambda, 0); 

% Initialize SFPCA
U1 = U(:, 1); V1 = V(:, 1);


% SFPCA search on lambda_U
% Keep v parameters fixed for now...
lambda_u_range = linspace(0, 5, 51);
n_lambda_u     = size(lambda_u_range, 2);

sigma_hat_holder = zeros(size(lambda_u_range)); 
bic_holder     = zeros(size(lambda_u_range));
df_holder      = zeros(size(lambda_u_range)); 

for lu_ix=1:n_lambda_u
  lu = lambda_u_range(lu_ix);
  
  % Quick and dirty SFPCA - only doing sparsity in U
  
  if INITIALIZE_SVD
      u_hat = U1; 
      v_hat = V1; 
  else 
      u_hat = randn(n, 1); u_hat = u_hat / norm(u_hat); 
      v_hat = randn(p, 1); v_hat = v_hat / norm(v_hat);
  end
  
  u_hat_old = u_hat + 5000; v_hat_old = v_hat + 5000; 
  
  while norm(u_hat - u_hat_old) + norm(v_hat - v_hat_old) > 1e-6
      while norm(u_hat - u_hat_old) > 1e-6
          u_hat_old = u_hat; 
          u_hat = st(X * v_hat, lu); 
          u_hat = u_hat / norm(u_hat); 
      end
      
      if UPDATE_V
          while norm(v_hat - v_hat_old) > 1e-6
              v_hat_old = v_hat;
              v_hat = st(u_hat' * X, 0);
              v_hat = v_hat / norm(v_hat);
              v_hat = v_hat'; % Keep sizes correct
          end
      else
          v_hat_old = v_hat;
      end
  end
  
  sigma_hat_sq = mean((X * v_hat - u_hat).^2); 
  
  sigma_hat_holder(lu_ix) = sigma_hat_sq; 
  df_holder(lu_ix)  = sum(u_hat ~= 0); 
  bic_holder(lu_ix) = log(sigma_hat_sq / n) + 1 / n * log(n) * sum(u_hat ~= 0); 
end

[min_bic, min_bic_ind] = min(bic_holder);

lambda_u_optimal = lambda_u_range(min_bic_ind);
  
% Quick and dirty SFPCA - only doing sparsity in U

if INITIALIZE_SVD
    u_hat = U1; 
    v_hat = V1; 
else 
    u_hat = randn(n, 1); u_hat = u_hat / norm(u_hat); 
    v_hat = randn(p, 1); v_hat = v_hat / norm(v_hat);
end
  
  u_hat_old = u_hat + 5000; v_hat_old = v_hat + 5000;
  
while norm(u_hat - u_hat_old) + norm(v_hat - v_hat_old) > 1e-6
    while norm(u_hat - u_hat_old) > 1e-6
        u_hat_old = u_hat; 
        u_hat = st(X * v_hat, lambda_u_optimal); 
        u_hat = u_hat / norm(u_hat); 
    end
     
    if UPDATE_V
        while norm(v_hat - v_hat_old) > 1e-6
            v_hat_old = v_hat;
            v_hat = st(u_hat' * X, 0);
            v_hat = v_hat / norm(v_hat);
            v_hat = v_hat'; % Keep sizes correct
        end
    else
        v_hat_old = v_hat;
    end
end

snr = max(svd(X)) / max(svd(N)); 

u_hat_supp = u_hat ~= 0; 
u_star_supp = u_star ~= 0; 
u_star_nonsupp = u_star == 0; 

tpr = mean(u_hat_supp(u_star_supp)); 
fpr = mean(u_hat_supp(u_star_nonsupp)); 

Running 1000 replicates, I see

| Setting | TPR | FPR | SNR |
| --- | --- | --- | --- |
| UPDATE_V = 1, INITIALIZE_SVD = 1 | 96% | 0% | 1.5 |
| UPDATE_V = 1, INITIALIZE_SVD = 0 | 42% | 26% | 1.5 |
| UPDATE_V = 0, INITIALIZE_SVD = 1 | 96% | 0.1% | 1.5 |
| UPDATE_V = 0, INITIALIZE_SVD = 0 | 36% | 18% | 1.5 |

My hunch is that the UPDATE_V = 0, INITIALIZE_SVD = 1 case would suffer more in harder settings than this one: the angle between V1 (the leading right singular vector of S + N) and v_star isn't very large for this problem.

A collection of degrees of freedom

We need to find good estimates of degrees of freedom for different penalties if we are using BIC for model selection.

  • Lasso, fused lasso, sparse fused lasso: see Table 1 of “The Solution Path of the Generalized Lasso”.

  • many others
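
For the plain lasso, for example, the number of non-zero coefficients is the standard unbiased estimate of the degrees of freedom (this is also what the BIC experiment above uses, via sum(u_hat ~= 0)):

$$\widehat{\mathrm{df}}(\lambda) = \#\{\, i : \hat{u}_i(\lambda) \neq 0 \,\}.$$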

Momentum step size

I notice there are two different versions of the FISTA algorithm. To minimize $g + h$, where $h$ is non-smooth and $g$ is smooth, the FISTA update is

$$x_k = \operatorname{prox}_{t h}\!\left(y_k - t \nabla g(y_k)\right), \qquad y_{k+1} = x_k + \beta_k (x_k - x_{k-1}),$$

where $\beta_k$ is the momentum step size.

[1] uses a schedule of the form (up to indexing conventions)

$$\beta_k = \frac{k - 1}{k + 2},$$

while the original paper [2] uses

$$\beta_k = \frac{t_k - 1}{t_{k+1}}, \qquad \text{where} \quad t_1 = 1, \qquad t_{k+1} = \frac{1 + \sqrt{1 + 4 t_k^2}}{2}.$$

[1] http://www.seas.ucla.edu/~vandenbe/236C/lectures/fista.pdf Page 3
[2] https://people.rennes.inria.fr/Cedric.Herzet/Cedric.Herzet/Sparse_Seminar/Entrees/2012/11/12_A_Fast_Iterative_Shrinkage-Thresholding_Algorithmfor_Linear_Inverse_Problems_(A._Beck,_M._Teboulle)_files/Breck_2009.pdf
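
A tiny R illustration of the two momentum schedules (these helpers are for illustration only; MoMA's actual PG loop lives in the C++ code):

# Simple polynomial schedule of the (k - 1) / (k + 2) type
beta_simple <- function(k) (k - 1) / (k + 2)

# Original FISTA schedule [2]: t_1 = 1, t_{k+1} = (1 + sqrt(1 + 4 * t_k^2)) / 2,
# beta_k = (t_k - 1) / t_{k+1}
beta_original <- function(k) {
    t <- 1
    for (i in seq_len(k)) {
        t_next <- (1 + sqrt(1 + 4 * t^2)) / 2
        if (i == k) return((t - 1) / t_next)
        t <- t_next
    }
}

Both start at beta_1 = 0 and approach 1 as k grows.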

Clean up existing entry points

Currently we have three R wrappers. They differ in functionality, in the abstraction level of the arguments they take, and in where they are used in the test suite. Eventually their functionality will be a subset of the SFPCA wrappers', and thus they should be removed.

1. sfpca

https://github.com/michaelweylandt/MoMA/blob/7c8fd20fbd18d9cbfe21837bacd8ad401853efa6/R/sfpca.R#L1

It is simply an R interface for the C++ function cpp_sfpca (https://github.com/michaelweylandt/MoMA/blob/7c8fd20fbd18d9cbfe21837bacd8ad401853efa6/src/moma_R_function.cpp#L6), which repeatedly uses MoMA::solve and MoMA::deflate. We need to explicitly specify all parameters.

What it does: solves the penalized SVD for fixed alpha_u/v and lambda_u/v. It also finds the rank-k SVD by repeatedly deflating the matrix and then rerunning the algorithm. Note we don't have tests for the latter functionality yet.

Where it is used in the test suite: it is used to test the correctness of the PG algorithm. To do this, we inspect special cases where closed-form solutions exist and check the results obtained by our algorithm against them. See https://github.com/michaelweylandt/MoMA/blob/7c8fd20fbd18d9cbfe21837bacd8ad401853efa6/tests/testthat/test_sfpca.R#L1.

2. moma_svd

https://github.com/michaelweylandt/MoMA/blob/7c8fd20fbd18d9cbfe21837bacd8ad401853efa6/R/moma_svd.R#L61

What it does: it supports the following three use cases. Note that it cooperates with prox argument wrappers like lasso() and scad(), and with the PG loop settings wrapper (not merged yet). Essentially, what it does is a proper subset of MoMA::select_nestedBIC, described in section 3.

  1. Find the rank-k penalized SVD with fixed alpha_u/v and lambda_u/v by calling cpp_sfpca, described above;

  2. Run a nested-BIC search on 2-D grids, whose axes can be any combination of two parameters, by calling cpp_sfpca_nestedBIC, which does some sanity checks and then calls MoMA::select_nestedBIC;
    https://github.com/michaelweylandt/MoMA/blob/7c8fd20fbd18d9cbfe21837bacd8ad401853efa6/src/moma_R_function.cpp#L179

  3. Run a grid search on 2-D grids by calling cpp_sfpca_grid, which uses MoMA::reset and MoMA::solve;
    https://github.com/michaelweylandt/MoMA/blob/7c8fd20fbd18d9cbfe21837bacd8ad401853efa6/src/moma_R_function.cpp#L80

Where it is used in the test suite: it tests that prox arguments are correctly passed to the C++ side (see test_arguments.R, https://github.com/michaelweylandt/MoMA/blob/7c8fd20fbd18d9cbfe21837bacd8ad401853efa6/tests/testthat/test_arguments.R#L1). We also test that cpp_sfpca_grid and cpp_sfpca give identical results (see test_grid.R, https://github.com/michaelweylandt/MoMA/blob/7c8fd20fbd18d9cbfe21837bacd8ad401853efa6/tests/testthat/test_grid.R#L1).

3. MoMA::grid_BIC_mix

This will become the core of SFPCA wrappers (in progress). It supports finding the first k pairs of singular vectors, and the combination of nested-BIC search and grid search.

Where it is used in the test suite: we test that it gives correctly sized lists. See https://github.com/michaelweylandt/MoMA/blob/7c8fd20fbd18d9cbfe21837bacd8ad401853efa6/tests/testthat/test_BIC_gird_mixed.R#L1.

Remove MoMA::u, MoMA::v

At the very beginning, we included u and v as members of MoMA to facilitate warm starts. They are initialized from the SVD of the data matrix X, and they are updated in real time as the PG loop runs. Ideally, we want

problem = MoMA(X, lambda_u=0)
problem.solve()
arma::vec u1 = problem.u
arma::vec v1 = problem.v

problem.reset(lambda_v=0.1)
problem.solve() // warm-start
arma::vec u2 = problem.u
arma::vec v2 = problem.v

However, as the mix of BIC search and grid search enters our design, the concept of a warm start becomes intricate. Furthermore, depending on MoMA::u and MoMA::v as the solutions to the current or last penalized SVD problem might be a gotcha.

So let's remove MoMA::u/v and let code outside the MoMA class take care of warm starts.

LDA Example Data: Rockhoppers

The iris data for LDA / classification is overused and typically mis-applied [1].

Let's use a new data set for our LDA examples and include it in the package. Steinfurth et al. have a paper on classifying penguins by sex using various body measurements [2], which seems like it would make a great example.

Idea from [3]; see also [4-5].

[1] http://www.dicook.org/files/jsm19/slides#1
[2] https://www.int-res.com/abstracts/esr/v39/p293-302/
[3] https://twitter.com/dan_p_simpson/status/1164581393516527616
[4] http://www.publish.csiro.au/mu/MU16027
[5] https://figshare.com/articles/Data_from_Using_measurements_to_predict_laying_order_in_harvested_Northern_Rockhopper_Penguin_Eudyptes_moseleyi_eggs/3384109
