featurehashing's Introduction

FeatureHashing

Implement feature hashing with R

Introduction

Feature hashing, also called the hashing trick, is a method for transforming features into a vector. Instead of looking the indices up in an associative array, it applies a hash function to the features and uses their hash values directly as indices.
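The idea can be sketched in a few lines of base R. This is only an illustration: `toy_hash`, `hash_index` and `hash_vector` are hypothetical helpers, and the polynomial hash is a stand-in for the MurmurHash3 function that the package actually uses.

```r
# Toy string hash (polynomial rolling hash); NOT the package's MurmurHash3.
toy_hash <- function(s) {
  h <- 0
  for (code in utf8ToInt(s)) {
    h <- (h * 31 + code) %% 2^31  # stays exact in double precision
  }
  h
}

# Map feature names straight to column indices in 1..hash_size,
# without building a dictionary of known features first.
hash_index <- function(features, hash_size) {
  vapply(features, function(s) toy_hash(s) %% hash_size + 1, numeric(1),
         USE.NAMES = FALSE)
}

# Build a fixed-length vector by adding 1 at each hashed index.
hash_vector <- function(features, hash_size) {
  v <- numeric(hash_size)
  for (i in hash_index(features, hash_size)) v[i] <- v[i] + 1
  v
}

doc <- c("the", "quick", "brown", "fox", "the")
v <- hash_vector(doc, 16)
length(v)  # 16, regardless of how many distinct words exist
sum(v)     # 5, one count per token (colliding tokens share a slot)
```

Different features may land on the same index; choosing a larger hash size makes such collisions less frequent.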

The package FeatureHashing implements the method in (Weinberger, Dasgupta, Langford, Smola, and Attenberg, 2009) to transform a data.frame into a sparse matrix. The package provides a formula interface similar to model.matrix in R and Matrix::sparse.model.matrix in the package Matrix. It also supports splitting concatenated data during the construction of the model matrix; see the help of test.tag for an explanation of concatenated data.

Installation

To install the stable version from CRAN, run this command:

install.packages("FeatureHashing")

For the up-to-date version, please install from GitHub. Windows users will need to install RTools first.

devtools::install_github('wush978/FeatureHashing')

When should we use Feature Hashing?

Feature hashing is useful when the dimension of the feature vector is not easy to know in advance. For example, the bag-of-words representation in a document classification problem requires scanning the entire dataset to know how many words there are, i.e. the dimension of the feature vector.

In general, feature hashing is useful in the following environments, where it is expensive or impossible to know the real dimension of the feature vector in advance:

  • Streaming Environment
  • Distributed Environment
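A small base R sketch of the streaming case: two batches arrive one after the other, the second contains words never seen before, and both are still mapped to matrices with the same number of columns without rescanning anything. The helpers `toy_index` and `hash_batch` are hypothetical and use a toy hash rather than the package's own hash function.

```r
# Assumption: a toy polynomial hash stands in for the package's hash function.
toy_index <- function(s, hash_size) {
  h <- 0
  for (code in utf8ToInt(s)) h <- (h * 31 + code) %% 2^31
  h %% hash_size + 1
}

# Turn one batch (a list of token vectors) into a dense count matrix.
hash_batch <- function(batch, hash_size) {
  m <- matrix(0, nrow = length(batch), ncol = hash_size)
  for (r in seq_along(batch)) {
    for (token in batch[[r]]) {
      j <- toy_index(token, hash_size)
      m[r, j] <- m[r, j] + 1
    }
  }
  m
}

batch1 <- list(c("red", "apple"), c("green", "pear"))
batch2 <- list(c("blue", "plum", "apple"))  # new words appear later in the stream

m1 <- hash_batch(batch1, 32)
m2 <- hash_batch(batch2, 32)
dim(m1)  # 2 32
dim(m2)  # 1 32
```

Because the column of a token depends only on the token and the hash size, "apple" falls in the same column in both batches, so a model trained on the first batch can score the second directly.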

Getting Started

The following scripts show how to use FeatureHashing to construct a Matrix::dgCMatrix and train a model with other packages that support Matrix::dgCMatrix as input.

The dataset is a sample from the iPinYou dataset, which is described in (Zhang, Yuan, Wang, and Shen, 2014).

Logistic Regression with glmnet

# The following script assumes that the data.frame
# of the training dataset and testing dataset are 
# assigned to variable `ipinyou.train` and `ipinyou.test`
# respectively

library(FeatureHashing)
## Loading required package: methods
# Checking version.
stopifnot(packageVersion("FeatureHashing") >= package_version("0.9"))

data(ipinyou)
f <- ~ IP + Region + City + AdExchange + Domain +
  URL + AdSlotId + AdSlotWidth + AdSlotHeight +
  AdSlotVisibility + AdSlotFormat + CreativeID +
  Adid + split(UserTag, delim = ",")
# if the version of FeatureHashing is 0.8, please use the following command:
# m.train <- as(hashed.model.matrix(f, ipinyou.train, 2^16, transpose = FALSE), "dgCMatrix")
m.train <- hashed.model.matrix(f, ipinyou.train, 2^16)
m.test <- hashed.model.matrix(f, ipinyou.test, 2^16)

# logistic regression with glmnet

library(glmnet)
## Loading required package: Matrix
## Loading required package: foreach
## Loaded glmnet 2.0-16
cv.g.lr <- cv.glmnet(m.train, ipinyou.train$IsClick,
  family = "binomial")#, type.measure = "auc")
p.lr <- predict(cv.g.lr, m.test, s="lambda.min")
auc(ipinyou.test$IsClick, p.lr)
## [1] 0.5187244

Gradient Boosted Decision Tree with xgboost

Following the script above,

# GBDT with xgboost

library(xgboost)

cv.g.gdbt <- xgboost(m.train, ipinyou.train$IsClick, max.depth=7, eta=0.1,
  nround = 100, objective = "binary:logistic", verbose = ifelse(interactive(), 1, 0))
p.lm <- predict(cv.g.gdbt, m.test)
glmnet::auc(ipinyou.test$IsClick, p.lm)
## [1] 0.6555304

Per-Coordinate FTRL-Proximal with $L_1$ and $L_2$ Regularization for Logistic Regression

The following script uses an implementation of FTRL-Proximal for logistic regression, published in (McMahan, Holt, Sculley, Young, Ebner, Grady, Nie, Phillips, Davydov, Golovin, Chikkerur, Liu, Wattenberg, Hrafnkelsson, Boulos, and Kubica, 2013), to predict the probability (1-step prediction) and update the model simultaneously.
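For readers unfamiliar with the algorithm, here is a minimal base R sketch of the per-coordinate update as described in McMahan et al. (2013). The function names `ftrl_init`, `ftrl_weights`, `ftrl_step` and `ftrl_predict` are hypothetical and are not the API of the package's ftprl.R; the parameter values and the two-feature toy data are chosen only for illustration.

```r
# Hypothetical helpers; a sketch of the published update, not the package's code.
ftrl_init <- function(alpha, beta, lambda1, lambda2, dim) {
  list(alpha = alpha, beta = beta, lambda1 = lambda1, lambda2 = lambda2,
       z = numeric(dim), n = numeric(dim))
}

# Closed-form weights for the active coordinates; L1 zeroes small |z|.
ftrl_weights <- function(model, idx) {
  z <- model$z[idx]
  n <- model$n[idx]
  w <- -(z - sign(z) * model$lambda1) /
    ((model$beta + sqrt(n)) / model$alpha + model$lambda2)
  w[abs(z) <= model$lambda1] <- 0
  w
}

ftrl_predict <- function(model, idx, x) {
  1 / (1 + exp(-sum(ftrl_weights(model, idx) * x)))
}

# One example: predict first, then update z and n for the active coordinates.
# idx must not contain duplicates in this simple vectorised version.
ftrl_step <- function(model, idx, x, y) {
  w <- ftrl_weights(model, idx)
  p <- 1 / (1 + exp(-sum(w * x)))
  g <- (p - y) * x                      # gradient of the log loss
  sigma <- (sqrt(model$n[idx] + g^2) - sqrt(model$n[idx])) / model$alpha
  model$z[idx] <- model$z[idx] + g - sigma * w
  model$n[idx] <- model$n[idx] + g^2
  model$last_prediction <- p            # the 1-step prediction
  model
}

# Toy stream: feature 1 always comes with a click, feature 2 never does.
model <- ftrl_init(alpha = 0.1, beta = 1, lambda1 = 0.1, lambda2 = 0.1, dim = 4)
for (step in 1:200) {
  model <- ftrl_step(model, idx = 1, x = 1, y = 1)
  model <- ftrl_step(model, idx = 2, x = 1, y = 0)
}
ftrl_predict(model, 1, 1)  # above 0.5
ftrl_predict(model, 2, 1)  # below 0.5
```

Each step makes its prediction before the label is used for the update, which is what "1-step prediction" refers to above.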

source(system.file("ftprl.R", package = "FeatureHashing"))

m.train <- hashed.model.matrix(f, ipinyou.train, 2^16, transpose = TRUE)
ftprl <- initialize.ftprl(0.1, 1, 0.1, 0.1, 2^16)
ftprl <- update.ftprl(ftprl, m.train, ipinyou.train$IsClick, predict = TRUE)
auc(ipinyou.train$IsClick, attr(ftprl, "predict"))
## [1] 0.5993447

If we use the same algorithm to predict the click-through rate of the 3rd season of iPinYou, the overall AUC is 0.77, which is comparable to the overall AUC of 0.76 for the 3rd season reported in (Zhang, Yuan, Wang, et al., 2014).
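The AUC values quoted in this document can be reproduced without any extra package. The sketch below computes the AUC from the rank-sum (Mann-Whitney) statistic; `auc_rank` is a hypothetical helper, not a function of glmnet or FeatureHashing, and it assumes labels coded as 0 and 1.

```r
# AUC from the rank-sum statistic; tied scores receive average ranks.
auc_rank <- function(label, score) {
  pos <- label == 1
  n_pos <- sum(pos)
  n_neg <- sum(!pos)
  (sum(rank(score)[pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

auc_rank(c(0, 0, 1, 1), c(0.1, 0.4, 0.35, 0.8))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1 to a perfect ranking of clicks above non-clicks.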

Supported Data Structure

  • character and factor
  • numeric and integer
  • array, i.e. concatenated strings such as c("a,b", "a,b,c", "a,c", "")
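To make "concatenated strings" concrete, the following base R sketch splits the example vector above and builds the indicator columns that each element expands to. It only illustrates the data layout; it is not how the package performs the split internally.

```r
tags <- c("a,b", "a,b,c", "a,c", "")

# Split every element on the delimiter; the empty string yields no tokens.
parts <- strsplit(tags, ",", fixed = TRUE)

# One indicator column per distinct token.
tokens <- sort(unique(unlist(parts)))
indicator <- t(vapply(parts, function(p) as.numeric(tokens %in% p),
                      numeric(length(tokens))))
colnames(indicator) <- tokens
indicator
```

With feature hashing, the `tokens` dictionary is not needed: each split token is hashed to its column directly.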

Reference

[1] H. B. McMahan, G. Holt, D. Sculley, et al. "Ad click prediction: a view from the trenches". In: The 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2013, Chicago, IL, USA, August 11-14, 2013. Ed. by I. S. Dhillon, Y. Koren, R. Ghani, T. E. Senator, P. Bradley, R. Parekh, J. He, R. L. Grossman and R. Uthurusamy. ACM, 2013, pp. 1222-1230. DOI: 10.1145/2487575.2488200. <URL: http://doi.acm.org/10.1145/2487575.2488200>.

[2] K. Q. Weinberger, A. Dasgupta, J. Langford, et al. "Feature hashing for large scale multitask learning". In: Proceedings of the 26th Annual International Conference on Machine Learning, ICML 2009, Montreal, Quebec, Canada, June 14-18, 2009. Ed. by A. P. Danyluk, L. Bottou and M. L. Littman. 2009, pp. 1113-1120. DOI: 10.1145/1553374.1553516. <URL: http://doi.acm.org/10.1145/1553374.1553516>.

[3] W. Zhang, S. Yuan, J. Wang, et al. "Real-Time Bidding Benchmarking with iPinYou Dataset". In: arXiv preprint arXiv:1407.7073 (2014).

featurehashing's People

Contributors

formwork, junjiemao, pommedeterresautee, wush-bridgewell, wush978

featurehashing's Issues

Review of the new design of `hashed.model.matrix`

Currently, I am modifying the function `hashed.model.matrix` to make it more familiar to R users.

I am hesitant now because the new design is based only on my own thinking. Therefore, it would be great if anyone could review the reasoning behind the new design.

The main changes are #7 , #35 and #39. A less important change is #49. Any comments and suggestions are welcome!

About English Writing

Actually, I am not very sure about this either, but

When do I use Feature Hashing?

would it be a little better to change "do" to "will" in this sentence?

Unexpected hashing result of interaction term on Solaris Sparc

Running the tests in ‘tests/test-hashing.R’ failed.
Last 13 lines of output:
+ "TypeQuebec:Treatmentnonchilled", "TypeMississippi:Treatmentnonchilled", 
+ "PlantQn3:Treatmentnonchilled", "PlantMn2:uptake", "TypeMississippi:uptake", 
+ "PlantQn3:uptake", "PlantQc2:uptake", "Treatmentchilled:uptake", 
+ "PlantQc1:uptake", "PlantMn1:uptake", "PlantMc2:TypeMississippi", 
+ "PlantMc1:TypeMississippi", "PlantMn2:TypeMississippi", "PlantMn1:TypeMississippi", 
+ "TypeMississippi"))
> 
> checkTrue(isTRUE(all.equal(mapping_value[names(mapping_value.expected)], mapping_value.expected)),
+ "Unexpected hashing result of interaction term")
Error in checkTrue(isTRUE(all.equal(mapping_value[names(mapping_value.expected)], : 
Test not TRUE

clang++ warning unused variable

clang++ -std=c++11 -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG  -I/usr/local/include -I/usr/local/include/freetype2 -I/opt/X11/include -I"/Users/wush/Library/R/3.1/library/Rcpp/include"  -g  -fPIC  -Wall -mtune=core2 -g -O2 -c as.cpp -o as.o
as.cpp:79:29: warning: unused variable 'pnew_p' [-Wunused-variable]
  int *pnew_i = &new_i[0], *pnew_p = &new_p[0];
                            ^
as.cpp:72:21: warning: unused variable 'pp' [-Wunused-variable]
  int *pi = &i[0], *pp(&p[0]);
                    ^
2 warnings generated.

Undefined behavior

> ### Name: CSCMatrix-class
> ### Title: CSCMatrix
> ### Aliases: CSCMatrix-class [,CSCMatrix,missing,numeric,ANY-method
> ###   [,CSCMatrix,numeric,missing,ANY-method
> ###   [,CSCMatrix,numeric,numeric,ANY-method %*%,CSCMatrix,numeric-method
> ###   %*%,numeric,CSCMatrix-method dim,CSCMatrix-method
> ###   dim<-,CSCMatrix-method
> 
> ### ** Examples
> 
> # construct a CSCMatrix
> m <- hashed.model.matrix(~ ., CO2, 8)
/usr/local/bin/../include/c++/v1/__tree:834:16: runtime error: downcast of address 0x7fff2ac3cff8 with insufficient space for an object of type 'std::__1::__tree_node<int, void *>'
0x7fff2ac3cff8: note: pointer points here
 ff 7f 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  12 70 61 72 73 65 5f 74  61 67 00 00
              ^ 
MurmurHash3.cpp:57:10: runtime error: load of misaligned address 0x0000049b5b69 for type 'const uint32_t' (aka 'const unsigned int'), which requires 4 byte alignment
0x0000049b5b69: note: pointer points here
 00 00 00  08 63 6f 6e 63 00 6d 65  6e 74 00 04 00 00 00 00  00 00 00 00 00 00 00 00  04 00 00 00 00
              ^ 

English Correction

Title: Implement Feature Hashing on Model Matrix

'a Model Matrix'.

Author(s): Wush Wu [aut, cre], Austin Appleby [ctb] (for the included
Murmurhash3 sources)
Maintainer: Wush Wu [email protected]
Depends: methods
Suggests: pack
Description: Feature hashing, also called as the hashing trick, is a
method to transform features to vector. Without looking the

'Without looking up the indices'

indices up in an associative array, it applies a hash
function to the features and uses their hash values as
indices directly. This package implements the method of

'This package implements' is redundant.

feature hashing proposed in Weinberger et. al. (2009) with
Murmurhash3 and provides a formula interface in R. See the

omit 'the' (or say 'the file')

README.md for more information.

segfault when the column is not matched on rocker/r-base

Rscript tests/test-missing.R 

 *** caught segfault ***
address 0xfffffffffffffff0, cause 'memory not mapped'

Traceback:
 1: .Call("FeatureHashing_hashed_model_matrix_dataframe", PACKAGE = "FeatureHashing",     tf, data, hash_size, retval, keep_hashing_mapping)
 2: .hashed.model.matrix.dataframe(tf, data, hash_size, retval, keep.hashing_mapping)
 3: hashed.model.matrix(~PlAnT, CO2, 8)
aborting ...
Segmentation fault (core dumped)
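Until the crash is fixed in the package, a caller can guard against it by checking that every variable in the formula exists in the data before calling `hashed.model.matrix`. The sketch below is one possible caller-side check in base R; `missing_columns` is a hypothetical helper and not part of FeatureHashing.

```r
# Return the formula variables that are not columns of `data`.
# The "." placeholder (all columns) is ignored.
missing_columns <- function(formula, data) {
  setdiff(all.vars(formula), c(".", names(data)))
}

missing_columns(~ PlAnT, CO2)           # "PlAnT": the name is misspelled
missing_columns(~ Plant + uptake, CO2)  # character(0): safe to proceed

# Possible usage before building the matrix:
# stopifnot(length(missing_columns(f, data)) == 0)
```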

New names for functions

Neither function name is easy to understand. Some ideas:
hash_h -> hashed_value (as opposed to the matrix)
hash_xi -> hash_sign
hashed.model.matrix -> hashed_matrix (switch the dot to an underscore for consistency)
tag -> expand

Memcheck errors

> # The tag-like feature
> data(test.tag)
> df <- data.frame(a = test.tag, b = rnorm(length(test.tag)))
> m <- hashed.model.matrix(~ tag(a, split = ",", type = "existence"):b, df, 2^6,
+  keep.hashing_mapping = TRUE)
==3134== Conditional jump or move depends on uninitialised value(s)
==3134==    at 0x13DFCECF: std::vector<std::shared_ptr<VectorConverter>, std::allocator<std::shared_ptr<VectorConverter> > > const get_converters<Rcpp::DataFrame_Impl<Rcpp::PreserveStorage> >(std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, Rcpp::RObject_Impl<Rcpp::PreserveStorage>, Rcpp::DataFrame_Impl<Rcpp::PreserveStorage>, HashFunction*, HashFunction*) (packages/tests-vg/FeatureHashing/src/hashed_model_matrix.cpp:564)
==3134==    by 0x13DFFB8A: SEXPREC* hashed_model_matrix<Rcpp::DataFrame_Impl<Rcpp::PreserveStorage> >(Rcpp::RObject_Impl<Rcpp::PreserveStorage>, Rcpp::DataFrame_Impl<Rcpp::PreserveStorage>, unsigned long, bool, Rcpp::S4_Impl<Rcpp::PreserveStorage>, bool) (packages/tests-vg/FeatureHashing/src/hashed_model_matrix.cpp:687)
==3134==    by 0x13DF480B: hashed_model_matrix_dataframe(Rcpp::RObject_Impl<Rcpp::PreserveStorage>, Rcpp::DataFrame_Impl<Rcpp::PreserveStorage>, unsigned long, bool, Rcpp::S4_Impl<Rcpp::PreserveStorage>, bool) (packages/tests-vg/FeatureHashing/src/hashed_model_matrix.cpp:788)
==3134==    by 0x13DED50D: FeatureHashing_hashed_model_matrix_dataframe (packages/tests-vg/FeatureHashing/src/RcppExports.cpp:110)
==3134==    by 0x47D6B7: do_dotcall (svn/R-devel/src/main/dotcode.c:1251)
==3134==    by 0x4B932E: Rf_eval (svn/R-devel/src/main/eval.c:655)
==3134==    by 0x4BB258: do_begin (svn/R-devel/src/main/eval.c:1642)
==3134==    by 0x4B9198: Rf_eval (svn/R-devel/src/main/eval.c:627)
==3134==    by 0x4BA37E: Rf_applyClosure (svn/R-devel/src/main/eval.c:1039)
==3134==    by 0x4B8FC8: Rf_eval (svn/R-devel/src/main/eval.c:674)
==3134==    by 0x4BB258: do_begin (svn/R-devel/src/main/eval.c:1642)
==3134==    by 0x4B9198: Rf_eval (svn/R-devel/src/main/eval.c:627)
==3134==  Uninitialised value was created by a heap allocation
==3134==    at 0x4A0645D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==3134==    by 0x4E4D9F: GetNewPage (svn/R-devel/src/main/memory.c:864)
==3134==    by 0x4E6826: Rf_allocVector3 (svn/R-devel/src/main/memory.c:2584)
==3134==    by 0x5275F3: ReadItem (svn/R-devel/src/include/Rinlinedfuns.h:189)
==3134==    by 0x52650E: ReadBC1 (svn/R-devel/src/main/serialize.c:1867)
==3134==    by 0x526597: ReadBC1 (svn/R-devel/src/main/serialize.c:1840)
==3134==    by 0x526887: ReadItem (svn/R-devel/src/main/serialize.c:1880)
==3134==    by 0x526E5C: ReadItem (svn/R-devel/src/main/serialize.c:1630)
==3134==    by 0x528382: R_Unserialize (svn/R-devel/src/main/serialize.c:1923)
==3134==    by 0x528CE0: R_unserialize (svn/R-devel/src/main/serialize.c:2553)
==3134==    by 0x528F7D: do_lazyLoadDBfetch (svn/R-devel/src/main/serialize.c:2842)
==3134==    by 0x4B92AF: Rf_eval (svn/R-devel/src/main/eval.c:659)
==3134== 
==3134== Conditional jump or move depends on uninitialised value(s)
==3134==    at 0x13DFCE90: std::vector<std::shared_ptr<VectorConverter>, std::allocator<std::shared_ptr<VectorConverter> > > const get_converters<Rcpp::DataFrame_Impl<Rcpp::PreserveStorage> >(std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, Rcpp::RObject_Impl<Rcpp::PreserveStorage>, Rcpp::DataFrame_Impl<Rcpp::PreserveStorage>, HashFunction*, HashFunction*) (packages/tests-vg/FeatureHashing/src/hashed_model_matrix.cpp:564)
==3134==    by 0x13DFFB8A: SEXPREC* hashed_model_matrix<Rcpp::DataFrame_Impl<Rcpp::PreserveStorage> >(Rcpp::RObject_Impl<Rcpp::PreserveStorage>, Rcpp::DataFrame_Impl<Rcpp::PreserveStorage>, unsigned long, bool, Rcpp::S4_Impl<Rcpp::PreserveStorage>, bool) (packages/tests-vg/FeatureHashing/src/hashed_model_matrix.cpp:687)
==3134==    by 0x13DF480B: hashed_model_matrix_dataframe(Rcpp::RObject_Impl<Rcpp::PreserveStorage>, Rcpp::DataFrame_Impl<Rcpp::PreserveStorage>, unsigned long, bool, Rcpp::S4_Impl<Rcpp::PreserveStorage>, bool) (packages/tests-vg/FeatureHashing/src/hashed_model_matrix.cpp:788)
==3134==    by 0x13DED50D: FeatureHashing_hashed_model_matrix_dataframe (packages/tests-vg/FeatureHashing/src/RcppExports.cpp:110)
==3134==    by 0x47D6B7: do_dotcall (svn/R-devel/src/main/dotcode.c:1251)
==3134==    by 0x4B932E: Rf_eval (svn/R-devel/src/main/eval.c:655)
==3134==    by 0x4BB258: do_begin (svn/R-devel/src/main/eval.c:1642)
==3134==    by 0x4B9198: Rf_eval (svn/R-devel/src/main/eval.c:627)
==3134==    by 0x4BA37E: Rf_applyClosure (svn/R-devel/src/main/eval.c:1039)
==3134==    by 0x4B8FC8: Rf_eval (svn/R-devel/src/main/eval.c:674)
==3134==    by 0x4BB258: do_begin (svn/R-devel/src/main/eval.c:1642)
==3134==    by 0x4B9198: Rf_eval (svn/R-devel/src/main/eval.c:627)
==3134==  Uninitialised value was created by a heap allocation
==3134==    at 0x4A0645D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==3134==    by 0x4E4D9F: GetNewPage (svn/R-devel/src/main/memory.c:864)
==3134==    by 0x4E6826: Rf_allocVector3 (svn/R-devel/src/main/memory.c:2584)
==3134==    by 0x5275F3: ReadItem (svn/R-devel/src/include/Rinlinedfuns.h:189)
==3134==    by 0x52650E: ReadBC1 (svn/R-devel/src/main/serialize.c:1867)
==3134==    by 0x526597: ReadBC1 (svn/R-devel/src/main/serialize.c:1840)
==3134==    by 0x526887: ReadItem (svn/R-devel/src/main/serialize.c:1880)
==3134==    by 0x526E5C: ReadItem (svn/R-devel/src/main/serialize.c:1630)
==3134==    by 0x528382: R_Unserialize (svn/R-devel/src/main/serialize.c:1923)
==3134==    by 0x528CE0: R_unserialize (svn/R-devel/src/main/serialize.c:2553)
==3134==    by 0x528F7D: do_lazyLoadDBfetch (svn/R-devel/src/main/serialize.c:2842)
==3134==    by 0x4B92AF: Rf_eval (svn/R-devel/src/main/eval.c:659)
==3134== 
> # The column `a` is splitted by "," and have an interaction with "b":
> mapping <- unlist(as.list(attr(m, "mapping")))
> names(mapping)
 [1] "a20"    "a"      "a21"    "atn:b"  "ant:b"  "a19:b"  "akh:b"  "a25:b" 
 [9] "a16:b"  "atw:b"  "b"      "a4:b"   "a10:b"  "a1:b"   "ahc"    "atc"   
[17] "a23"    "a24"    "a25"    "a26"    "a27"    "antw"   "akh"    "a29"   
[25] "atn"    "a30"    "atp"    "a1"     "a11:b"  "a8:b"   "a23:b"  "ach:b" 
[33] "a20:b"  "ahc:b"  "atc:b"  "a29:b"  "a26:b"  "a17:b"  "a3"     "ant"   
[41] "a4"     "a6"     "atw"    "a8"     "ach"    "aty"    "a9"     "ail"   
[49] "a10"    "a11"    "ail:b"  "a3:b"   "a9:b"   "atp:b"  "a12"    "a27:b" 
[57] "a:b"    "antw:b" "aty:b"  "a15:b"  "a24:b"  "a21:b"  "a6:b"   "a12:b" 
[65] "a30:b"  "a15"    "a16"    "a17"    "a19"   
> 
> 
> 
> ### * <FOOTER>
> ###
> options(digits = 7L)
> base::cat("Time elapsed: ", proc.time() - base::get("ptime", pos = 'CheckExEnv'),"\n")
Time elapsed:  100.219 1.296 101.76 0 0 
> grDevices::dev.off()
null device 
          1 
> ###
> ### Local variables: ***
> ### mode: outline-minor ***
> ### outline-regexp: "\\(> \\)?### [*]+" ***
> ### End: ***
> quit('no')
==3134== 
==3134== HEAP SUMMARY:
==3134==     in use at exit: 97,763,859 bytes in 51,968 blocks
==3134==   total heap usage: 112,852 allocs, 60,884 frees, 213,884,655 bytes allocated
==3134== 
==3134== LEAK SUMMARY:
==3134==    definitely lost: 0 bytes in 0 blocks
==3134==    indirectly lost: 0 bytes in 0 blocks
==3134==      possibly lost: 0 bytes in 0 blocks
==3134==    still reachable: 97,763,859 bytes in 51,968 blocks
==3134==         suppressed: 0 bytes in 0 blocks
==3134== Reachable blocks (those to which a pointer was found) are not shown.
==3134== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==3134== 
==3134== For counts of detected and suppressed errors, rerun with: -v
==3134== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 2 from 2)

Delete README.Rmd file

I don't see the purpose of this file, as GitHub uses only the .md one.
I have not deleted it myself, as I may be missing something.

Kind regards,
Michael

Installation Error on r-oldrel-windows-ix86+x86_64

* installing *source* package 'FeatureHashing' ...
** package 'FeatureHashing' successfully unpacked and MD5 sums checked
** libs

*** arch - i386
make[1]: Entering directory `/cygdrive/d/temp/RtmpqSwUZD/R.INSTALL21f02adf7a9d/FeatureHashing/src-i386'
g++  -I"D:/Rcompile/recent/R-3.0.3/include"            -I"d:/RCompile/CRANpkg/lib/3.0/Rcpp/include" -I"d:/Rcompile/CRANpkg/extralibs215/local215/include"     -O3 -Wall  -mtune=core2            -c MurmurHash3.cpp -o MurmurHash3.o
g++  -I"D:/Rcompile/recent/R-3.0.3/include"            -I"d:/RCompile/CRANpkg/lib/3.0/Rcpp/include" -I"d:/Rcompile/CRANpkg/extralibs215/local215/include"     -O3 -Wall  -mtune=core2            -c RcppExports.cpp -o RcppExports.o
g++  -I"D:/Rcompile/recent/R-3.0.3/include"            -I"d:/RCompile/CRANpkg/lib/3.0/Rcpp/include" -I"d:/Rcompile/CRANpkg/extralibs215/local215/include"     -O3 -Wall  -mtune=core2            -c as.cpp -o as.o
as.cpp: In function 'void pair_sort(int*, double*, size_t)':
as.cpp:8:83: warning: lambda expressions only available with -std=c++0x or -std=gnu++0x [enabled by default]
as.cpp:8:84: error: no matching function for call to 'sort(std::vector<unsigned int>::iterator, std::vector<unsigned int>::iterator, pair_sort(int*, double*, size_t)::<lambda(size_t, size_t)>)'
as.cpp:8:84: note: candidates are:
d:\compiler\gcc-4.6.3\bin\../lib/gcc/i686-w64-mingw32/4.6.3/../../../../include/c++/4.6.3/bits/stl_algo.h:5394:5: note: template<class _RAIter> void std::sort(_RAIter, _RAIter)
d:\compiler\gcc-4.6.3\bin\../lib/gcc/i686-w64-mingw32/4.6.3/../../../../include/c++/4.6.3/bits/stl_algo.h:5430:5: note: template<class _RAIter, class _Compare> void std::sort(_RAIter, _RAIter, _Compare)
as.cpp: In function 'size_t merge(int*, double*, size_t)':
as.cpp:32:16: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
as.cpp:39:31: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
as.cpp: In function 'SEXPREC* todgCMatrix(Rcpp::S4)':
as.cpp:67:7: warning: 'auto' will change meaning in C++0x; please remove it [-Wc++0x-compat]
as.cpp:67:12: error: 'pname' does not name a type
as.cpp:67:39: error: expected ';' before 'pname'
as.cpp:67:39: error: 'pname' was not declared in this scope
as.cpp:82:7: warning: 'auto' will change meaning in C++0x; please remove it [-Wc++0x-compat]
as.cpp:82:12: error: 'col' does not name a type
as.cpp:82:20: error: expected ';' before 'col'
as.cpp:82:26: error: 'Dim' cannot appear in a constant-expression
as.cpp:82:31: error: an array reference cannot appear in a constant-expression
as.cpp:82:20: error: parse error in template argument list
as.cpp:82:20: error: cannot resolve overloaded function 'col' based on conversion to type 'bool'
as.cpp:82:36: error: no post-increment operator for type
as.cpp:83:14: error: no match for 'operator[]' in 'new_p[Rcpp::col]'
as.cpp:83:14: note: candidates are:
d:\compiler\gcc-4.6.3\bin\../lib/gcc/i686-w64-mingw32/4.6.3/../../../../include/c++/4.6.3/bits/stl_vector.h:695:7: note: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = int, _Alloc = std::allocator<int>, std::vector<_Tp, _Alloc>::reference = int&, std::vector<_Tp, _Alloc>::size_type = unsigned int]
d:\compiler\gcc-4.6.3\bin\../lib/gcc/i686-w64-mingw32/4.6.3/../../../../include/c++/4.6.3/bits/stl_vector.h:695:7: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'unsigned int'
d:\compiler\gcc-4.6.3\bin\../lib/gcc/i686-w64-mingw32/4.6.3/../../../../include/c++/4.6.3/bits/stl_vector.h:710:7: note: std::vector<_Tp, _Alloc>::const_reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) const [with _Tp = int, _Alloc = std::allocator<int>, std::vector<_Tp, _Alloc>::const_reference = const int&, std::vector<_Tp, _Alloc>::size_type = unsigned int]
d:\compiler\gcc-4.6.3\bin\../lib/gcc/i686-w64-mingw32/4.6.3/../../../../include/c++/4.6.3/bits/stl_vector.h:710:7: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'unsigned int'
as.cpp:84:5: warning: 'auto' will change meaning in C++0x; please remove it [-Wc++0x-compat]
as.cpp:84:10: error: 'src_len' does not name a type
as.cpp:85:36: error: no match for 'operator[]' in 'p[Rcpp::col]'
as.cpp:85:36: note: candidates are:
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:287:18: note: Rcpp::Vector<RTYPE, StoragePolicy>::Proxy Rcpp::Vector<RTYPE, StoragePolicy>::operator[](int) [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::Proxy = int&]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:287:18: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'int'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:288:24: note: Rcpp::Vector<RTYPE, StoragePolicy>::const_Proxy Rcpp::Vector<RTYPE, StoragePolicy>::operator[](int) const [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::const_Proxy = const int&]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:288:24: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'int'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:304:22: note: Rcpp::Vector<RTYPE, StoragePolicy>::NameProxy Rcpp::Vector<RTYPE, StoragePolicy>::operator[](const string&) [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::NameProxy = Rcpp::internal::simple_name_proxy<13>, std::string = std::basic_string<char>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:304:22: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const string& {aka const std::basic_string<char>&}'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:311:22: note: Rcpp::Vector<RTYPE, StoragePolicy>::NameProxy Rcpp::Vector<RTYPE, StoragePolicy>::operator[](const string&) const [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::NameProxy = Rcpp::internal::simple_name_proxy<13>, std::string = std::basic_string<char>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:311:22: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const string& {aka const std::basic_string<char>&}'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:325:5: note: template<int RHS_RTYPE, bool RHS_NA, class RHS_T> Rcpp::SubsetProxy<RTYPE, StoragePolicy, RHS_RTYPE, RHS_NA, RHS_T> Rcpp::Vector::operator[](const Rcpp::VectorBase<RHS_RTYPE, RHS_NA, RHS_T>&) [with int RHS_RTYPE = RHS_RTYPE, bool RHS_NA = RHS_NA, RHS_T = RHS_T, int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:334:5: note: template<int RHS_RTYPE, bool RHS_NA, class RHS_T> const Rcpp::SubsetProxy<RTYPE, StoragePolicy, RHS_RTYPE, RHS_NA, RHS_T> Rcpp::Vector::operator[](const Rcpp::VectorBase<RHS_RTYPE, RHS_NA, RHS_T>&) const [with int RHS_RTYPE = RHS_RTYPE, bool RHS_NA = RHS_NA, RHS_T = RHS_T, int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:469:20: note: Rcpp::Vector<RTYPE, StoragePolicy>::Indexer Rcpp::Vector<RTYPE, StoragePolicy>::operator[](const Rcpp::Range&) [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::Indexer = Rcpp::internal::RangeIndexer<13, true, Rcpp::Vector<13, Rcpp::PreserveStorage> >]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:469:20: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const Rcpp::Range&'
as.cpp:85:53: error: 'src_len' was not declared in this scope
as.cpp:86:36: error: no match for 'operator[]' in 'p[Rcpp::col]'
as.cpp:86:36: note: candidates are:
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:287:18: note: Rcpp::Vector<RTYPE, StoragePolicy>::Proxy Rcpp::Vector<RTYPE, StoragePolicy>::operator[](int) [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::Proxy = int&]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:287:18: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'int'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:288:24: note: Rcpp::Vector<RTYPE, StoragePolicy>::const_Proxy Rcpp::Vector<RTYPE, StoragePolicy>::operator[](int) const [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::const_Proxy = const int&]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:288:24: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'int'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:304:22: note: Rcpp::Vector<RTYPE, StoragePolicy>::NameProxy Rcpp::Vector<RTYPE, StoragePolicy>::operator[](const string&) [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::NameProxy = Rcpp::internal::simple_name_proxy<13>, std::string = std::basic_string<char>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:304:22: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const string& {aka const std::basic_string<char>&}'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:311:22: note: Rcpp::Vector<RTYPE, StoragePolicy>::NameProxy Rcpp::Vector<RTYPE, StoragePolicy>::operator[](const string&) const [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::NameProxy = Rcpp::internal::simple_name_proxy<13>, std::string = std::basic_string<char>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:311:22: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const string& {aka const std::basic_string<char>&}'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:325:5: note: template<int RHS_RTYPE, bool RHS_NA, class RHS_T> Rcpp::SubsetProxy<RTYPE, StoragePolicy, RHS_RTYPE, RHS_NA, RHS_T> Rcpp::Vector::operator[](const Rcpp::VectorBase<RHS_RTYPE, RHS_NA, RHS_T>&) [with int RHS_RTYPE = RHS_RTYPE, bool RHS_NA = RHS_NA, RHS_T = RHS_T, int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:334:5: note: template<int RHS_RTYPE, bool RHS_NA, class RHS_T> const Rcpp::SubsetProxy<RTYPE, StoragePolicy, RHS_RTYPE, RHS_NA, RHS_T> Rcpp::Vector::operator[](const Rcpp::VectorBase<RHS_RTYPE, RHS_NA, RHS_T>&) const [with int RHS_RTYPE = RHS_RTYPE, bool RHS_NA = RHS_NA, RHS_T = RHS_T, int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:469:20: note: Rcpp::Vector<RTYPE, StoragePolicy>::Indexer Rcpp::Vector<RTYPE, StoragePolicy>::operator[](const Rcpp::Range&) [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::Indexer = Rcpp::internal::RangeIndexer<13, true, Rcpp::Vector<13, Rcpp::PreserveStorage> >]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:469:20: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const Rcpp::Range&'
as.cpp: In function 'SEXPREC* tomatrix(Rcpp::S4)':
as.cpp:105:7: warning: 'auto' will change meaning in C++0x; please remove it [-Wc++0x-compat]
as.cpp:105:12: error: 'col' does not name a type
as.cpp:105:20: error: expected ';' before 'col'
as.cpp:105:26: error: 'Dim' cannot appear in a constant-expression
as.cpp:105:31: error: an array reference cannot appear in a constant-expression
as.cpp:105:20: error: parse error in template argument list
as.cpp:105:20: error: cannot resolve overloaded function 'col' based on conversion to type 'bool'
as.cpp:105:36: error: no post-increment operator for type
as.cpp:106:9: warning: 'auto' will change meaning in C++0x; please remove it [-Wc++0x-compat]
as.cpp:106:14: error: 'j' does not name a type
as.cpp:106:25: error: expected ';' before 'j'
as.cpp:106:25: error: 'j' was not declared in this scope
as.cpp:106:37: error: invalid operands of types '<unresolved overloaded function type>' and 'int' to binary 'operator+'
as.cpp:107:7: warning: 'auto' will change meaning in C++0x; please remove it [-Wc++0x-compat]
as.cpp:107:12: error: 'row' does not name a type
as.cpp:108:7: warning: 'auto' will change meaning in C++0x; please remove it [-Wc++0x-compat]
as.cpp:108:12: error: 'value' does not name a type
as.cpp:109:22: error: no match for call to '(Rcpp::NumericMatrix {aka Rcpp::Matrix<14>}) (<unresolved overloaded function type>, <unresolved overloaded function type>)'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:28:7: note: candidates are:
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:131:18: note: Rcpp::Matrix<RTYPE, StoragePolicy>::Proxy Rcpp::Matrix<RTYPE, StoragePolicy>::operator()(const size_t&, const size_t&) [with int RTYPE = 14, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Matrix<RTYPE, StoragePolicy>::Proxy = double&, size_t = unsigned int]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:131:18: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const size_t& {aka const unsigned int&}'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:134:24: note: Rcpp::Matrix<RTYPE, StoragePolicy>::const_Proxy Rcpp::Matrix<RTYPE, StoragePolicy>::operator()(const size_t&, const size_t&) const [with int RTYPE = 14, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Matrix<RTYPE, StoragePolicy>::const_Proxy = const double&, size_t = unsigned int]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:134:24: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const size_t& {aka const unsigned int&}'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:138:16: note: Rcpp::Matrix<RTYPE, StoragePolicy>::Row Rcpp::Matrix<RTYPE, StoragePolicy>::operator()(int, Rcpp::internal::NamedPlaceHolder) [with int RTYPE = 14, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Matrix<RTYPE, StoragePolicy>::Row = Rcpp::MatrixRow<14>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:138:16: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'int'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:141:19: note: Rcpp::Matrix<RTYPE, StoragePolicy>::Column Rcpp::Matrix<RTYPE, StoragePolicy>::operator()(Rcpp::internal::NamedPlaceHolder, int) [with int RTYPE = 14, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Matrix<RTYPE, StoragePolicy>::Column = Rcpp::MatrixColumn<14>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:141:19: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'Rcpp::internal::NamedPlaceHolder'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:144:19: note: Rcpp::Matrix<RTYPE, StoragePolicy>::Column Rcpp::Matrix<RTYPE, StoragePolicy>::operator()(Rcpp::internal::NamedPlaceHolder, int) const [with int RTYPE = 14, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Matrix<RTYPE, StoragePolicy>::Column = Rcpp::MatrixColumn<14>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:144:19: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'Rcpp::internal::NamedPlaceHolder'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:147:16: note: Rcpp::Matrix<RTYPE, StoragePolicy>::Sub Rcpp::Matrix<RTYPE, StoragePolicy>::operator()(const Rcpp::Range&, const Rcpp::Range&) [with int RTYPE = 14, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Matrix<RTYPE, StoragePolicy>::Sub = Rcpp::SubMatrix<14>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:147:16: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const Rcpp::Range&'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:150:16: note: Rcpp::Matrix<RTYPE, StoragePolicy>::Sub Rcpp::Matrix<RTYPE, StoragePolicy>::operator()(Rcpp::internal::NamedPlaceHolder, const Rcpp::Range&) [with int RTYPE = 14, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Matrix<RTYPE, StoragePolicy>::Sub = Rcpp::SubMatrix<14>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:150:16: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'Rcpp::internal::NamedPlaceHolder'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:153:16: note: Rcpp::Matrix<RTYPE, StoragePolicy>::Sub Rcpp::Matrix<RTYPE, StoragePolicy>::operator()(const Rcpp::Range&, Rcpp::internal::NamedPlaceHolder) [with int RTYPE = 14, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Matrix<RTYPE, StoragePolicy>::Sub = Rcpp::SubMatrix<14>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:153:16: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const Rcpp::Range&'
as.cpp:109:27: error: 'value' was not declared in this scope
make[1]: *** [as.o] Error 1
make[1]: Leaving directory `/cygdrive/d/temp/RtmpqSwUZD/R.INSTALL21f02adf7a9d/FeatureHashing/src-i386'
Warning: running command 'make -f "Makevars.win" -f "D:/Rcompile/recent/R-3.0.3/etc/i386/Makeconf" -f "D:/Rcompile/recent/R-3.0.3/etc/i386/Makevars.site" -f "D:/Rcompile/recent/R-3.0.3/share/make/winshlib.mk" -f "C:\Users\CRAN\Documents/.R/Makevars" SHLIB_LDFLAGS='$(SHLIB_CXXLDFLAGS)' SHLIB_LD='$(SHLIB_CXXLD)' SHLIB="FeatureHashing.dll" OBJECTS="MurmurHash3.o RcppExports.o as.o hash_internal.o hashed_model_matrix.o product.o subsetting.o tag.o"' had status 2
make[1]: Entering directory `/cygdrive/d/temp/RtmpqSwUZD/R.INSTALL21f02adf7a9d/FeatureHashing/src-i386'
g++  -I"D:/Rcompile/recent/R-3.0.3/include"            -I"d:/RCompile/CRANpkg/lib/3.0/Rcpp/include" -I"d:/Rcompile/CRANpkg/extralibs215/local215/include"     -O3 -Wall  -mtune=core2            -c as.cpp -o as.o
as.cpp: In function 'void pair_sort(int*, double*, size_t)':
as.cpp:8:83: warning: lambda expressions only available with -std=c++0x or -std=gnu++0x [enabled by default]
as.cpp:8:84: error: no matching function for call to 'sort(std::vector<unsigned int>::iterator, std::vector<unsigned int>::iterator, pair_sort(int*, double*, size_t)::<lambda(size_t, size_t)>)'
as.cpp:8:84: note: candidates are:
d:\compiler\gcc-4.6.3\bin\../lib/gcc/i686-w64-mingw32/4.6.3/../../../../include/c++/4.6.3/bits/stl_algo.h:5394:5: note: template<class _RAIter> void std::sort(_RAIter, _RAIter)
d:\compiler\gcc-4.6.3\bin\../lib/gcc/i686-w64-mingw32/4.6.3/../../../../include/c++/4.6.3/bits/stl_algo.h:5430:5: note: template<class _RAIter, class _Compare> void std::sort(_RAIter, _RAIter, _Compare)
as.cpp: In function 'size_t merge(int*, double*, size_t)':
as.cpp:32:16: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
as.cpp:39:31: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
as.cpp: In function 'SEXPREC* todgCMatrix(Rcpp::S4)':
as.cpp:67:7: warning: 'auto' will change meaning in C++0x; please remove it [-Wc++0x-compat]
as.cpp:67:12: error: 'pname' does not name a type
as.cpp:67:39: error: expected ';' before 'pname'
as.cpp:67:39: error: 'pname' was not declared in this scope
as.cpp:82:7: warning: 'auto' will change meaning in C++0x; please remove it [-Wc++0x-compat]
as.cpp:82:12: error: 'col' does not name a type
as.cpp:82:20: error: expected ';' before 'col'
as.cpp:82:26: error: 'Dim' cannot appear in a constant-expression
as.cpp:82:31: error: an array reference cannot appear in a constant-expression
as.cpp:82:20: error: parse error in template argument list
as.cpp:82:20: error: cannot resolve overloaded function 'col' based on conversion to type 'bool'
as.cpp:82:36: error: no post-increment operator for type
as.cpp:83:14: error: no match for 'operator[]' in 'new_p[Rcpp::col]'
as.cpp:83:14: note: candidates are:
d:\compiler\gcc-4.6.3\bin\../lib/gcc/i686-w64-mingw32/4.6.3/../../../../include/c++/4.6.3/bits/stl_vector.h:695:7: note: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) [with _Tp = int, _Alloc = std::allocator<int>, std::vector<_Tp, _Alloc>::reference = int&, std::vector<_Tp, _Alloc>::size_type = unsigned int]
d:\compiler\gcc-4.6.3\bin\../lib/gcc/i686-w64-mingw32/4.6.3/../../../../include/c++/4.6.3/bits/stl_vector.h:695:7: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'unsigned int'
d:\compiler\gcc-4.6.3\bin\../lib/gcc/i686-w64-mingw32/4.6.3/../../../../include/c++/4.6.3/bits/stl_vector.h:710:7: note: std::vector<_Tp, _Alloc>::const_reference std::vector<_Tp, _Alloc>::operator[](std::vector<_Tp, _Alloc>::size_type) const [with _Tp = int, _Alloc = std::allocator<int>, std::vector<_Tp, _Alloc>::const_reference = const int&, std::vector<_Tp, _Alloc>::size_type = unsigned int]
d:\compiler\gcc-4.6.3\bin\../lib/gcc/i686-w64-mingw32/4.6.3/../../../../include/c++/4.6.3/bits/stl_vector.h:710:7: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'unsigned int'
as.cpp:84:5: warning: 'auto' will change meaning in C++0x; please remove it [-Wc++0x-compat]
as.cpp:84:10: error: 'src_len' does not name a type
as.cpp:85:36: error: no match for 'operator[]' in 'p[Rcpp::col]'
as.cpp:85:36: note: candidates are:
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:287:18: note: Rcpp::Vector<RTYPE, StoragePolicy>::Proxy Rcpp::Vector<RTYPE, StoragePolicy>::operator[](int) [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::Proxy = int&]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:287:18: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'int'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:288:24: note: Rcpp::Vector<RTYPE, StoragePolicy>::const_Proxy Rcpp::Vector<RTYPE, StoragePolicy>::operator[](int) const [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::const_Proxy = const int&]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:288:24: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'int'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:304:22: note: Rcpp::Vector<RTYPE, StoragePolicy>::NameProxy Rcpp::Vector<RTYPE, StoragePolicy>::operator[](const string&) [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::NameProxy = Rcpp::internal::simple_name_proxy<13>, std::string = std::basic_string<char>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:304:22: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const string& {aka const std::basic_string<char>&}'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:311:22: note: Rcpp::Vector<RTYPE, StoragePolicy>::NameProxy Rcpp::Vector<RTYPE, StoragePolicy>::operator[](const string&) const [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::NameProxy = Rcpp::internal::simple_name_proxy<13>, std::string = std::basic_string<char>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:311:22: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const string& {aka const std::basic_string<char>&}'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:325:5: note: template<int RHS_RTYPE, bool RHS_NA, class RHS_T> Rcpp::SubsetProxy<RTYPE, StoragePolicy, RHS_RTYPE, RHS_NA, RHS_T> Rcpp::Vector::operator[](const Rcpp::VectorBase<RHS_RTYPE, RHS_NA, RHS_T>&) [with int RHS_RTYPE = RHS_RTYPE, bool RHS_NA = RHS_NA, RHS_T = RHS_T, int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:334:5: note: template<int RHS_RTYPE, bool RHS_NA, class RHS_T> const Rcpp::SubsetProxy<RTYPE, StoragePolicy, RHS_RTYPE, RHS_NA, RHS_T> Rcpp::Vector::operator[](const Rcpp::VectorBase<RHS_RTYPE, RHS_NA, RHS_T>&) const [with int RHS_RTYPE = RHS_RTYPE, bool RHS_NA = RHS_NA, RHS_T = RHS_T, int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:469:20: note: Rcpp::Vector<RTYPE, StoragePolicy>::Indexer Rcpp::Vector<RTYPE, StoragePolicy>::operator[](const Rcpp::Range&) [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::Indexer = Rcpp::internal::RangeIndexer<13, true, Rcpp::Vector<13, Rcpp::PreserveStorage> >]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:469:20: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const Rcpp::Range&'
as.cpp:85:53: error: 'src_len' was not declared in this scope
as.cpp:86:36: error: no match for 'operator[]' in 'p[Rcpp::col]'
as.cpp:86:36: note: candidates are:
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:287:18: note: Rcpp::Vector<RTYPE, StoragePolicy>::Proxy Rcpp::Vector<RTYPE, StoragePolicy>::operator[](int) [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::Proxy = int&]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:287:18: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'int'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:288:24: note: Rcpp::Vector<RTYPE, StoragePolicy>::const_Proxy Rcpp::Vector<RTYPE, StoragePolicy>::operator[](int) const [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::const_Proxy = const int&]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:288:24: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'int'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:304:22: note: Rcpp::Vector<RTYPE, StoragePolicy>::NameProxy Rcpp::Vector<RTYPE, StoragePolicy>::operator[](const string&) [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::NameProxy = Rcpp::internal::simple_name_proxy<13>, std::string = std::basic_string<char>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:304:22: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const string& {aka const std::basic_string<char>&}'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:311:22: note: Rcpp::Vector<RTYPE, StoragePolicy>::NameProxy Rcpp::Vector<RTYPE, StoragePolicy>::operator[](const string&) const [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::NameProxy = Rcpp::internal::simple_name_proxy<13>, std::string = std::basic_string<char>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:311:22: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const string& {aka const std::basic_string<char>&}'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:325:5: note: template<int RHS_RTYPE, bool RHS_NA, class RHS_T> Rcpp::SubsetProxy<RTYPE, StoragePolicy, RHS_RTYPE, RHS_NA, RHS_T> Rcpp::Vector::operator[](const Rcpp::VectorBase<RHS_RTYPE, RHS_NA, RHS_T>&) [with int RHS_RTYPE = RHS_RTYPE, bool RHS_NA = RHS_NA, RHS_T = RHS_T, int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:334:5: note: template<int RHS_RTYPE, bool RHS_NA, class RHS_T> const Rcpp::SubsetProxy<RTYPE, StoragePolicy, RHS_RTYPE, RHS_NA, RHS_T> Rcpp::Vector::operator[](const Rcpp::VectorBase<RHS_RTYPE, RHS_NA, RHS_T>&) const [with int RHS_RTYPE = RHS_RTYPE, bool RHS_NA = RHS_NA, RHS_T = RHS_T, int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:469:20: note: Rcpp::Vector<RTYPE, StoragePolicy>::Indexer Rcpp::Vector<RTYPE, StoragePolicy>::operator[](const Rcpp::Range&) [with int RTYPE = 13, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Vector<RTYPE, StoragePolicy>::Indexer = Rcpp::internal::RangeIndexer<13, true, Rcpp::Vector<13, Rcpp::PreserveStorage> >]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Vector.h:469:20: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const Rcpp::Range&'
as.cpp: In function 'SEXPREC* tomatrix(Rcpp::S4)':
as.cpp:105:7: warning: 'auto' will change meaning in C++0x; please remove it [-Wc++0x-compat]
as.cpp:105:12: error: 'col' does not name a type
as.cpp:105:20: error: expected ';' before 'col'
as.cpp:105:26: error: 'Dim' cannot appear in a constant-expression
as.cpp:105:31: error: an array reference cannot appear in a constant-expression
as.cpp:105:20: error: parse error in template argument list
as.cpp:105:20: error: cannot resolve overloaded function 'col' based on conversion to type 'bool'
as.cpp:105:36: error: no post-increment operator for type
as.cpp:106:9: warning: 'auto' will change meaning in C++0x; please remove it [-Wc++0x-compat]
as.cpp:106:14: error: 'j' does not name a type
as.cpp:106:25: error: expected ';' before 'j'
as.cpp:106:25: error: 'j' was not declared in this scope
as.cpp:106:37: error: invalid operands of types '<unresolved overloaded function type>' and 'int' to binary 'operator+'
as.cpp:107:7: warning: 'auto' will change meaning in C++0x; please remove it [-Wc++0x-compat]
as.cpp:107:12: error: 'row' does not name a type
as.cpp:108:7: warning: 'auto' will change meaning in C++0x; please remove it [-Wc++0x-compat]
as.cpp:108:12: error: 'value' does not name a type
as.cpp:109:22: error: no match for call to '(Rcpp::NumericMatrix {aka Rcpp::Matrix<14>}) (<unresolved overloaded function type>, <unresolved overloaded function type>)'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:28:7: note: candidates are:
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:131:18: note: Rcpp::Matrix<RTYPE, StoragePolicy>::Proxy Rcpp::Matrix<RTYPE, StoragePolicy>::operator()(const size_t&, const size_t&) [with int RTYPE = 14, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Matrix<RTYPE, StoragePolicy>::Proxy = double&, size_t = unsigned int]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:131:18: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const size_t& {aka const unsigned int&}'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:134:24: note: Rcpp::Matrix<RTYPE, StoragePolicy>::const_Proxy Rcpp::Matrix<RTYPE, StoragePolicy>::operator()(const size_t&, const size_t&) const [with int RTYPE = 14, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Matrix<RTYPE, StoragePolicy>::const_Proxy = const double&, size_t = unsigned int]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:134:24: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const size_t& {aka const unsigned int&}'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:138:16: note: Rcpp::Matrix<RTYPE, StoragePolicy>::Row Rcpp::Matrix<RTYPE, StoragePolicy>::operator()(int, Rcpp::internal::NamedPlaceHolder) [with int RTYPE = 14, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Matrix<RTYPE, StoragePolicy>::Row = Rcpp::MatrixRow<14>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:138:16: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'int'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:141:19: note: Rcpp::Matrix<RTYPE, StoragePolicy>::Column Rcpp::Matrix<RTYPE, StoragePolicy>::operator()(Rcpp::internal::NamedPlaceHolder, int) [with int RTYPE = 14, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Matrix<RTYPE, StoragePolicy>::Column = Rcpp::MatrixColumn<14>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:141:19: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'Rcpp::internal::NamedPlaceHolder'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:144:19: note: Rcpp::Matrix<RTYPE, StoragePolicy>::Column Rcpp::Matrix<RTYPE, StoragePolicy>::operator()(Rcpp::internal::NamedPlaceHolder, int) const [with int RTYPE = 14, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Matrix<RTYPE, StoragePolicy>::Column = Rcpp::MatrixColumn<14>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:144:19: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'Rcpp::internal::NamedPlaceHolder'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:147:16: note: Rcpp::Matrix<RTYPE, StoragePolicy>::Sub Rcpp::Matrix<RTYPE, StoragePolicy>::operator()(const Rcpp::Range&, const Rcpp::Range&) [with int RTYPE = 14, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Matrix<RTYPE, StoragePolicy>::Sub = Rcpp::SubMatrix<14>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:147:16: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const Rcpp::Range&'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:150:16: note: Rcpp::Matrix<RTYPE, StoragePolicy>::Sub Rcpp::Matrix<RTYPE, StoragePolicy>::operator()(Rcpp::internal::NamedPlaceHolder, const Rcpp::Range&) [with int RTYPE = 14, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Matrix<RTYPE, StoragePolicy>::Sub = Rcpp::SubMatrix<14>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:150:16: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'Rcpp::internal::NamedPlaceHolder'
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:153:16: note: Rcpp::Matrix<RTYPE, StoragePolicy>::Sub Rcpp::Matrix<RTYPE, StoragePolicy>::operator()(const Rcpp::Range&, Rcpp::internal::NamedPlaceHolder) [with int RTYPE = 14, StoragePolicy = Rcpp::PreserveStorage, Rcpp::Matrix<RTYPE, StoragePolicy>::Sub = Rcpp::SubMatrix<14>]
d:/RCompile/CRANpkg/lib/3.0/Rcpp/include/Rcpp/vector/Matrix.h:153:16: note:   no known conversion for argument 1 from '<unresolved overloaded function type>' to 'const Rcpp::Range&'
as.cpp:109:27: error: 'value' was not declared in this scope
make[1]: *** [as.o] Error 1
make[1]: Leaving directory `/cygdrive/d/temp/RtmpqSwUZD/R.INSTALL21f02adf7a9d/FeatureHashing/src-i386'
Warning: running command 'make -f "Makevars.win" -f "D:/Rcompile/recent/R-3.0.3/etc/i386/Makeconf" -f "D:/Rcompile/recent/R-3.0.3/etc/i386/Makevars.site" -f "D:/Rcompile/recent/R-3.0.3/share/make/winshlib.mk" -f "C:\Users\CRAN\Documents/.R/Makevars" SHLIB_LDFLAGS='$(SHLIB_CXXLDFLAGS)' SHLIB_LD='$(SHLIB_CXXLD)' SHLIB="FeatureHashing.dll" OBJECTS="MurmurHash3.o RcppExports.o as.o hash_internal.o hashed_model_matrix.o product.o subsetting.o tag.o" symbols.rds' had status 2
ERROR: compilation failed for package 'FeatureHashing'
* removing 'd:/Rcompile/CRANpkg/lib/3.0/FeatureHashing'
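
A note on the log above: the warnings ("lambda expressions only available with -std=c++0x", "'auto' will change meaning in C++0x") suggest that as.cpp uses C++11 features while g++ 4.6.3 was invoked without -std=c++0x. Besides enabling that flag, one option is to avoid the C++11 constructs. The sketch below is only an illustration of that idea, not the package's actual code: `pair_sort_portable` and `IndexLess` are hypothetical names, and the behaviour (sort the keys, permute the values alongside them) is assumed from the `pair_sort(int*, double*, size_t)` signature in the log.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Comparator functor: orders positions by the integer key they refer to.
// It stands in for a C++11 lambda so the code also compiles as C++98.
struct IndexLess {
  const int* keys;
  explicit IndexLess(const int* k) : keys(k) {}
  bool operator()(std::size_t a, std::size_t b) const {
    return keys[a] < keys[b];
  }
};

// Hypothetical stand-in for pair_sort: sorts `keys` in ascending order and
// applies the same permutation to `values`. All types are spelled out
// explicitly instead of using `auto`.
void pair_sort_portable(int* keys, double* values, std::size_t len) {
  assert(len == 0 || (keys != NULL && values != NULL));

  // Build the identity permutation, then sort it by key.
  std::vector<std::size_t> order(len);
  for (std::size_t i = 0; i < len; ++i) order[i] = i;
  std::sort(order.begin(), order.end(), IndexLess(keys));

  // Apply the permutation through temporary buffers.
  std::vector<int> sorted_keys(len);
  std::vector<double> sorted_values(len);
  for (std::size_t i = 0; i < len; ++i) {
    sorted_keys[i] = keys[order[i]];
    sorted_values[i] = values[order[i]];
  }
  std::copy(sorted_keys.begin(), sorted_keys.end(), keys);
  std::copy(sorted_values.begin(), sorted_values.end(), values);
}
```

The same pattern (explicit types, named functors) would apply to the other `auto` declarations reported in todgCMatrix and tomatrix.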

Example not working

I am not sure I understand how the algorithm is supposed to work, but the example doesn't seem to work as expected.

> # Detail of the hashing
> ## The main effect is hashed via `hash_h`
> all(hash_h(names(mapping)) %% 2^6 == mapping %% 2^6)
[1] TRUE
> ## The sign is corrected by `hash_xi`
> hash_xi(names(mapping))
 [1] -1  1  1 -1 -1  1 -1 -1  1  1  1  1 -1  1 -1  1  1  1
> ## The interaction term is implemented as follow:
> m2 <- hashed.model.matrix(~ .^2, CO2, 2^6, keep.hashing_mapping = TRUE)
> mapping2 <- unlist(as.list(attr(m2, "mapping")))
> mapping2[2] # PlantQn2:uptake
  PlantQn1 
3789462177 
> h1 <- mapping2["PlantQn2"]
> h2 <- mapping2["uptake"]
> library(pack)
> hash_h(rawToChar(c(numToRaw(h1, 4), numToRaw(h2, 4)))) # should be mapping2[2]
[1] 974267571
> h1
  PlantQn2 
4122940517 
> h2
    uptake 
1505155248 
> mapping2[2]
  PlantQn1 
3789462177 

The result of the hash_h call is not mapping2[2].

I want to update the feature importance function I wrote for the xgboost package to handle the hashing trick, but I am not able to map the splits in the trees back to the original feature names and values. (I have only been trying for half an hour and am new to this, so I may be missing something big.)

One unrelated thing (not important enough to deserve its own issue): I noticed that in the mapping you produce, you paste the name of the data.frame column directly onto the value. Why not separate them with a character such as a period for easier reading? The Matrix package behaves the same way when doing one-hot encoding, so maybe there is a reason I am missing.

Kind regards,
Michael

Big memory consumption

In @3195623, the implementation is based on Matrix::sparse.model.matrix. The sparse matrix conversion exhausts the memory.

We need to implement a direct way to create the hashed sparse model matrix.

Default space

For Vowpal Wabbit, the default space is 2^18:
https://github.com/JohnLangford/vowpal_wabbit/wiki/Feature-Hashing-and-Extraction

Is there a reason why the default in this package is 2^26? I don't know if there is a reason for Vowpal Wabbit's default value, but I like the idea of following them, as they have plenty of experience with this trick (and the Vowpal Wabbit author is one of the authors of the main paper introducing the hashing trick).
Is there a maximum that should be documented? I think so, and it should be the same maximum as documented for Vowpal Wabbit.

Kind regards,
Michael

Boost failed on sparc solaris

make: Fatal error: Command failed for target `hashed_model_matrix.o'
Current working directory /tmp/Rtmp3Daynz/R.INSTALL326b7704716c/FeatureHashing/src
/opt/csw/gcc4/bin/g++ -std=c++11 -I/home/ripley/R/gcc/include -DNDEBUG -I/opt/csw/include -I/usr/local/include -I"/home/ripley/R/Lib32/Rcpp/include" -I"/home/ripley/R/Lib32/digest/include" -I"/home/ripley/R/Lib32/BH/include" -fPIC -g -O2 -mcpu=niagara2 -c hashed_model_matrix.cpp -o hashed_model_matrix.o
In file included from /home/ripley/R/Lib32/BH/include/boost/predef/architecture.h:22:0,
from /home/ripley/R/Lib32/BH/include/boost/predef/other/endian.h:142,
from /home/ripley/R/Lib32/BH/include/boost/predef/detail/endian_compat.h:11,
from /home/ripley/R/Lib32/BH/include/boost/detail/endian.hpp:9,
from hashed_model_matrix.cpp:3:
/home/ripley/R/Lib32/BH/include/boost/predef/architecture/sparc.h:40:37: error: operator '&&' has no right operand

if !defined(BOOST_ARCH_SPARC) &&

                                 ^

murmurHash

Shall we rework the interface of murmurHash which I just added to digest in 0.6.7 so that you can use it here?

Hashed value is varied on solaris sparc

checking tests ... ERROR
Running the tests in ‘tests/test-hashing_result.R’ failed.
Last 13 lines of output:
> # test consistency of hashing
> 
> mapping_value <- structure(c(3789462177, 4122940517, 1079927366, 1505155248, 4103768016, 
+ 1576910802, 248868694, 2189134401, 1321560276, 2636986885, 1980993114, 
+ 3588767725, 3873367263, 3437882550, 1125161513, 875000041, 1178743966, 
+ 1791688646), .Names = c("PlantQn1", "PlantQn2", "PlantQn3", "uptake", 
+ "TypeMississippi", "Treatmentchilled", "PlantMn1", "PlantMn2", 
+ "PlantMn3", "PlantQc1", "PlantQc2", "PlantQc3", "Treatmentnonchilled", 
+ "PlantMc1", "PlantMc2", "PlantMc3", "conc", "TypeQuebec"))
> 
> stopifnot(all(hash_h(names(mapping_value)) %% 2^32 == mapping_value))
Error: all(hash_h(names(mapping_value))%%2^32 == mapping_value) is not TRUE
Execution halted
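
A possible explanation, not confirmed in this report: SPARC is big-endian, and MurmurHash3-style hashes consume the key in 32-bit blocks. If those blocks are read in host byte order, the same string produces different hash values on little-endian and big-endian machines. A minimal sketch of a byte-order-independent block read follows; `getblock32_le` is a hypothetical helper name, not a function taken from the package.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>

// Read the `block`-th 32-bit word of `data` as little-endian, independent of
// the host byte order, by assembling it from individual bytes.
inline uint32_t getblock32_le(const unsigned char* data, std::size_t block) {
  assert(data != NULL);
  const unsigned char* p = data + 4 * block;
  return  static_cast<uint32_t>(p[0])
       | (static_cast<uint32_t>(p[1]) << 8)
       | (static_cast<uint32_t>(p[2]) << 16)
       | (static_cast<uint32_t>(p[3]) << 24);
}
```

Because the word is built byte by byte, it also avoids unaligned 32-bit loads, which some architectures do not allow.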

Add a function to compute easily best size for the space

It seems there is an easy way to compute the best space size (not too big, but still large enough to limit collisions). From the Vowpal Wabbit documentation:

Ensuring the hash is large enough

The hash table, by default, can hold 2^18 or 262144 entries. For many problems this is plenty but in some cases, more space is needed to avoid collisions. To count unique features after hashing, add the parameter --readable_model then use wc -l . Calculate the nearest power of two to the result and use this to set the -b parameter.

Counting the unique features per column in a data.frame is fairly simple with the unique() function and a call to apply(). Does it make sense to include such a function?
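
To make the proposal concrete, here is a rough sketch of the computation: count the distinct (column, value) pairs, then take the smallest power of two that is at least that count. The names `count_features` and `bits_for` are hypothetical, and the sketch makes simplifying assumptions: every column is treated as categorical, interaction terms are ignored, and the result is rounded up rather than to the "nearest" power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Count distinct (column index, value) pairs; each pair is one feature
// before hashing.
std::size_t count_features(
    const std::vector<std::vector<std::string> >& columns) {
  std::set<std::pair<std::size_t, std::string> > seen;
  for (std::size_t i = 0; i < columns.size(); ++i) {
    for (std::size_t j = 0; j < columns[i].size(); ++j) {
      seen.insert(std::make_pair(i, columns[i][j]));
    }
  }
  return seen.size();
}

// Smallest b such that 2^b >= n (0 when n <= 1).
unsigned bits_for(std::size_t n) {
  if (n <= 1) return 0;
  unsigned bits = 0;
  // Shift n - 1 right until nothing is left; the bound on `bits` keeps the
  // shift amount below the width of std::size_t.
  while (bits < 8 * sizeof(std::size_t) && ((n - 1) >> bits) != 0) ++bits;
  return bits;
}
```

A hash space of exactly this size still produces collisions, since distinct features can hash to the same index, so in practice one might add a few extra bits on top of the computed value.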

Renaming argument `keep.hashing_mapping`

For naming consistency, I think is.mapping is a better name.

Also, users should not be seriously affected, because this argument is designed for demonstration; they should not need to use it.

Rewrite the README.md

The new README should include:

  • Use hashed matrix to fit a linear regression model with glmnet
  • Use hashed matrix to fit a logistic regression model with glmnet
  • Use hashed matrix to fit a gradient boosted decision tree with xgboost

line too long in \examples

* checking Rd line widths ... NOTE
Rd file 'interpret.tag.Rd':
  \examples lines wider than 100 characters:
      data <- data.frame(a = c("1,2,3", "2,3,3", "1,3", "3"), type = c("a", "b", "a", "a"), stringsAsFactors = FALSE)

These lines will be truncated in the PDF manual.
