
Read excel files (.xls and .xlsx) into R 🖇

Home Page: https://readxl.tidyverse.org

License: Other

R 18.70% C++ 47.81% C 33.49%
r excel xlsx xls spreadsheet

readxl's Introduction

readxl


Overview

The readxl package makes it easy to get data out of Excel and into R. Compared to many of the existing packages (e.g. gdata, xlsx, xlsReadWrite) readxl has no external dependencies, so it’s easy to install and use on all operating systems. It is designed to work with tabular data.

readxl supports both the legacy .xls format and the modern xml-based .xlsx format. The libxls C library is used to support .xls, which abstracts away many of the complexities of the underlying binary format. To parse .xlsx, we use the RapidXML C++ library.

Installation

The easiest way to install the latest released version from CRAN is to install the whole tidyverse.

install.packages("tidyverse")

NOTE: you will still need to load readxl explicitly, because it is not a core tidyverse package loaded via library(tidyverse).

Alternatively, install just readxl from CRAN:

install.packages("readxl")

Or install the development version from GitHub:

#install.packages("pak")
pak::pak("tidyverse/readxl")

Cheatsheet

You can see how to read data with readxl in the data import cheatsheet, which also covers similar functionality in the related packages readr and googlesheets4.

Usage

library(readxl)

readxl includes several example files, which we use throughout the documentation. Use the helper readxl_example() with no arguments to list them or call it with an example filename to get the path.

readxl_example()
#>  [1] "clippy.xls"    "clippy.xlsx"   "datasets.xls"  "datasets.xlsx"
#>  [5] "deaths.xls"    "deaths.xlsx"   "geometry.xls"  "geometry.xlsx"
#>  [9] "type-me.xls"   "type-me.xlsx"
readxl_example("clippy.xls")
#> [1] "/private/tmp/RtmpM1GkLC/temp_libpatha8e46f7f62bf/readxl/extdata/clippy.xls"

read_excel() reads both xls and xlsx files and detects the format from the extension.

xlsx_example <- readxl_example("datasets.xlsx")
read_excel(xlsx_example)
#> # A tibble: 150 × 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>  
#> 1          5.1         3.5          1.4         0.2 setosa 
#> 2          4.9         3            1.4         0.2 setosa 
#> 3          4.7         3.2          1.3         0.2 setosa 
#> # ℹ 147 more rows

xls_example <- readxl_example("datasets.xls")
read_excel(xls_example)
#> # A tibble: 150 × 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>  
#> 1          5.1         3.5          1.4         0.2 setosa 
#> 2          4.9         3            1.4         0.2 setosa 
#> 3          4.7         3.2          1.3         0.2 setosa 
#> # ℹ 147 more rows

List the sheet names with excel_sheets().

excel_sheets(xlsx_example)
#> [1] "iris"     "mtcars"   "chickwts" "quakes"

Specify a worksheet by name or number.

read_excel(xlsx_example, sheet = "chickwts")
#> # A tibble: 71 × 2
#>   weight feed     
#>    <dbl> <chr>    
#> 1    179 horsebean
#> 2    160 horsebean
#> 3    136 horsebean
#> # ℹ 68 more rows
read_excel(xls_example, sheet = 4)
#> # A tibble: 1,000 × 5
#>     lat  long depth   mag stations
#>   <dbl> <dbl> <dbl> <dbl>    <dbl>
#> 1 -20.4  182.   562   4.8       41
#> 2 -20.6  181.   650   4.2       15
#> 3 -26    184.    42   5.4       43
#> # ℹ 997 more rows

There are various ways to control which cells are read. You can even specify the sheet here, if providing an Excel-style cell range.

read_excel(xlsx_example, n_max = 3)
#> # A tibble: 3 × 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>  
#> 1          5.1         3.5          1.4         0.2 setosa 
#> 2          4.9         3            1.4         0.2 setosa 
#> 3          4.7         3.2          1.3         0.2 setosa
read_excel(xlsx_example, range = "C1:E4")
#> # A tibble: 3 × 3
#>   Petal.Length Petal.Width Species
#>          <dbl>       <dbl> <chr>  
#> 1          1.4         0.2 setosa 
#> 2          1.4         0.2 setosa 
#> 3          1.3         0.2 setosa
read_excel(xlsx_example, range = cell_rows(1:4))
#> # A tibble: 3 × 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>  
#> 1          5.1         3.5          1.4         0.2 setosa 
#> 2          4.9         3            1.4         0.2 setosa 
#> 3          4.7         3.2          1.3         0.2 setosa
read_excel(xlsx_example, range = cell_cols("B:D"))
#> # A tibble: 150 × 3
#>   Sepal.Width Petal.Length Petal.Width
#>         <dbl>        <dbl>       <dbl>
#> 1         3.5          1.4         0.2
#> 2         3            1.4         0.2
#> 3         3.2          1.3         0.2
#> # ℹ 147 more rows
read_excel(xlsx_example, range = "mtcars!B1:D5")
#> # A tibble: 4 × 3
#>     cyl  disp    hp
#>   <dbl> <dbl> <dbl>
#> 1     6   160   110
#> 2     6   160   110
#> 3     4   108    93
#> # ℹ 1 more row

If NAs are represented by something other than blank cells, set the na argument.

read_excel(xlsx_example, na = "setosa")
#> # A tibble: 150 × 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>  
#> 1          5.1         3.5          1.4         0.2 <NA>   
#> 2          4.9         3            1.4         0.2 <NA>   
#> 3          4.7         3.2          1.3         0.2 <NA>   
#> # ℹ 147 more rows

If you are new to the tidyverse conventions for data import, you may want to consult the data import chapter in R for Data Science. readxl will become increasingly consistent with other packages, such as readr.

Articles

Broad topics are explained in these articles:

We also have some focused articles that address specific aggravations presented by the world’s spreadsheets:

Features

  • No external dependency on, e.g., Java or Perl.

  • Re-encodes non-ASCII characters to UTF-8.

  • Loads datetimes into POSIXct columns. Both Windows (1900) and Mac (1904) date specifications are processed correctly.

  • Discovers the minimal data rectangle and returns that, by default. User can exert more control with range, skip, and n_max.

  • Column names and types are determined from the data in the sheet, by default. User can also supply via col_names and col_types and control name repair via .name_repair.

  • Returns a tibble, i.e. a data frame with an additional tbl_df class. Among other things, this provides nicer printing.

Other relevant packages

Here are some other packages with functionality that is complementary to readxl and that also avoid a Java dependency.

Writing Excel files: The example files datasets.xlsx and datasets.xls were created with the help of openxlsx (and Excel). openxlsx provides “a high level interface to writing, styling and editing worksheets”.

l <- list(iris = iris, mtcars = mtcars, chickwts = chickwts, quakes = quakes)
openxlsx::write.xlsx(l, file = "inst/extdata/datasets.xlsx")

writexl is a new option in this space, first released on CRAN in August 2017. It’s a portable and lightweight way to export a data frame to xlsx, based on libxlsxwriter. It is much more minimalistic than openxlsx, but on simple examples, appears to be about twice as fast and to write smaller files.

Non-tabular data and formatting: tidyxl is focused on importing awkward and non-tabular data from Excel. It also “exposes cell content, position and formatting in a tidy structure for further manipulation”.

readxl's People

Contributors

batpigandme, boshek, bquast, dchiu911, eringrand, fermumen, fvd, gaborcsardi, gergness, gl-eb, hadley, jakeruss, jennybc, jeroen, jimhester, jirkalewandowski, kaiaragaki, kevinushey, krlmlr, kwstat, michaelchirico, mkuhn, nacnudus, pedramnavid, rohan-shah, sbearrows, stephenc, struckma, tklebel, zeehio


readxl's Issues

License

libxls is BSD-licensed. So exell must be BSD-licensed as well if libxls is included.

Minor note: How does that work with Rcpp? Can an Rcpp package have a license other than GPL? With MIT I assume it's okay, but with BSD?

Numeric column containing formulae with NA not recognised as numeric even if na option set

When importing a spreadsheet column that contains a missing element designated with "na", the column is recognised as text despite the na="na" option. Using col_types to force recognition of the column as numeric results in the following warning:

> test<-read_excel("test.xlsx",sheet="toR",na="na",col_types=c("text","numeric"))
Warning message:
In read_xlsx_(path, sheet, col_names = col_names, col_types = col_types,  :
  [3, 2]: expecting numeric: got 'na'

The data looks like this:
image

The input Excel file can be found here.

Also, there is a workaround: removing the formulae in the toR sheet by copy / paste-value the relevant data seems to resolve the problem -- but also means that the toR sheet doesn't update automatically when the rest of the spreadsheet is changed.
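A possible interim workaround that keeps the formulae in place, sketched under the assumption that the affected column can be read as text and contains the literal string "na" where values are missing. The helper name text_to_numeric is hypothetical and not part of readxl.

```r
# Hypothetical helper (not part of readxl): convert a character vector to
# numeric, treating the given marker strings as missing values.
text_to_numeric <- function(x, na = "na") {
  x[x %in% na] <- NA_character_
  as.numeric(x)
}

converted <- text_to_numeric(c("1.5", "na", "3"))
converted
```

With the file from this report, the column would first be read with col_types = "text" and then passed through the helper.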

Optional anchor and ncol

It would be nice to be able to specify where the rectangular data part is located within a sheet (some people like to have "non-data" related things in their sheets too). One way could be to specify the top-left anchor cell, say "D5", and optionally the number of columns to use.
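The range argument described in the README can express this kind of anchoring. The following is a minimal sketch using the bundled datasets.xlsx rather than the reporter's file; it assumes the anchored() and cell_limits() helpers that readxl re-exports from cellranger, and the anchor cell and dimensions are chosen purely for illustration.

```r
library(readxl)

path <- readxl_example("datasets.xlsx")

# Anchor at B1 and read a fixed rectangle of 4 rows by 3 columns
# (one header row plus three data rows).
fixed <- read_excel(path, range = anchored("B1", dim = c(4, 3)))

# Anchor at B1 but leave the lower-right corner unspecified (NA), so the
# rectangle extends as far as the data does.
open_ended <- read_excel(path, range = cell_limits(c(1, 2), c(NA, NA)))
```

For the layout described in this request, the anchor would be "D5" and the second element of dim would be the number of columns to use.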

Column names like "First column" result in "First.column", why not "First_column"?

When reading sheets whose column names contain restricted characters, readxl replaces those characters with ".". Since the use of "." in names is a little ambiguous in R, why not use "_" instead, as all of Hadley's packages and books would suggest anyway?

Maybe an option for choosing the replacement character for characters that cannot be used in column names would be a way to go.
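The .name_repair argument listed under Features accepts a function, which allows this kind of replacement. A minimal sketch follows; the function name underscore_names is hypothetical, and the regular expression is only one possible definition of "restricted characters".

```r
library(readxl)

# Hypothetical repair function: replace every run of characters that is not
# a letter, digit, or underscore with a single "_".
# Note: it does not guarantee that the resulting names are unique.
underscore_names <- function(nms) gsub("[^A-Za-z0-9_]+", "_", nms)

repaired <- underscore_names(c("First column", "Has kids", "Date of birth"))

# Apply it while reading a small range of the bundled deaths.xlsx example.
deaths <- read_excel(
  readxl_example("deaths.xlsx"),
  range = "arts!A5:F8",
  .name_repair = underscore_names
)
names(deaths)
```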

Missing columns due to blank cells underneath column header

Thanks so much for a great (and speedy) package!

Just to let you know that the read_excel function doesn't seem to import columns if the initial data in those columns are blank. The xlsx in the link has 295 columns but only the first column is imported. The rows aren't blank as the first column does contain data.

I see you're discussing anchors and specifying ranges which would make it possible to manually identify, but it might still be useful if the default brought it all in.

library(readxl)
download.file("https://www.resbank.co.za/Lists/News%20and%20Publications/Attachments/6648/01Kbp1%20%E2%80%93%20Money%20and%20Banking%20%E2%80%93%20March%202015.zip", "Banking_data.zip")
unzip("Banking_data.zip")
ncol(read_excel("Kbp1MB-March2015.xlsx", sheet = "M1"))
#2

Thanks again for all your hard work!

[feature request] read in region of sheet

A key feature of XLConnect in my work is being able to read in an arbitrary region of cells from a worksheet using startRow, startCol, endRow, and endCol arguments in the readWorksheet function. This functionality is very handy for me as I often receive Excel files which contain many smaller tables on a single sheet. Would an equivalent feature be in scope for this project? I'd love to ditch rJava if I could.
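With the range argument shown in the README, a region read of this kind could be sketched as below. The wrapper read_region and its argument names are hypothetical, chosen to mirror the XLConnect-style arguments mentioned above; cell_limits() is assumed to take c(row, column) pairs for the upper-left and lower-right corners.

```r
library(readxl)

# Hypothetical wrapper (not part of readxl) that maps XLConnect-style
# arguments onto a cell_limits() rectangle.
read_region <- function(path, sheet = NULL,
                        startRow, startCol, endRow, endCol, ...) {
  limits <- cell_limits(
    ul = as.integer(c(startRow, startCol)),  # upper-left corner
    lr = as.integer(c(endRow, endCol))       # lower-right corner
  )
  read_excel(path, sheet = sheet, range = limits, ...)
}

# Same rectangle as range = "mtcars!B1:D5" in the README.
region <- read_region(
  readxl_example("datasets.xlsx"), sheet = "mtcars",
  startRow = 1, startCol = 2, endRow = 5, endCol = 4
)
```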

Error in install.

Hi, I just tried installing this on my mac, and received the issues below. I successfully installed the package on my RStudio server, however.

Downloading github repo hadley/readxl@master
Installing readxl
Installing dependencies for readxl:
Rcpp
trying URL 'http://mran.revolutionanalytics.com/snapshot/2014-10-01/bin/macosx/mavericks/contrib/3.1/Rcpp_0.11.3.tgz'
Content type 'application/octet-stream' length 2717447 bytes (2.6 MB)
opened URL
==================================================
downloaded 2.6 MB


The downloaded binary packages are in
    /var/folders/yb/f6mkd27553j90cdnlwk7yz800000gn/T//RtmpKd5Vgt/downloaded_packages
'/Library/Frameworks/R.framework/Resources/bin/R' --vanilla CMD INSTALL  \
  '/private/var/folders/yb/f6mkd27553j90cdnlwk7yz800000gn/T/RtmpKd5Vgt/devtoolsb0df38a6f206/hadley-readxl-0d444af'  \
  --library='/Library/Frameworks/R.framework/Versions/3.1/Resources/library'  \
  --install-tests 

* installing *source* package ‘readxl’ ...
** libs
clang++ -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG -Iunix -I. -I/usr/local/include -I/usr/local/include/freetype2 -I/opt/X11/include -I"/Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rcpp/include"  -std=c++11 -fPIC  -Wall -mtune=core2 -g -O2  -c RcppExports.cpp -o RcppExports.o
clang++ -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG -Iunix -I. -I/usr/local/include -I/usr/local/include/freetype2 -I/opt/X11/include -I"/Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rcpp/include"  -std=c++11 -fPIC  -Wall -mtune=core2 -g -O2  -c XlsWorkBook.cpp -o XlsWorkBook.o
In file included from XlsWorkBook.cpp:2:
In file included from ./XlsWorkBook.h:6:
./CellType.h:31:13: error: no member named 'warning' in namespace 'Rcpp'
      Rcpp::warning("Unknown type '%s' at position %i. Using text instead.",
      ~~~~~~^
In file included from XlsWorkBook.cpp:2:
./XlsWorkBook.h:28:39: error: too many arguments to function call, expected single argument 'message', have 2 arguments
      Rcpp::stop("Failed to open %s", path);
      ~~~~~~~~~~                      ^~~~
/Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rcpp/include/Rcpp/exceptions.h:195:5: note: 'stop' declared here
    inline void stop(const std::string& message) {
    ^
In file included from XlsWorkBook.cpp:3:
./XlsWorkSheet.h:127:19: error: no member named 'warning' in namespace 'Rcpp'
            Rcpp::warning("Expecting numeric in [%i, %i] got `%s`",
            ~~~~~~^
./XlsWorkSheet.h:138:19: error: no member named 'warning' in namespace 'Rcpp'
            Rcpp::warning("Expecting date in [%i, %i] got %d",
            ~~~~~~^
./XlsWorkSheet.h:146:19: error: no member named 'warning' in namespace 'Rcpp'
            Rcpp::warning("Expecting date in [%i, %i] got '%s'",
            ~~~~~~^
XlsWorkBook.cpp:16:3: error: no matching function for call to 'stop'
  stop("Couldn't find sheet called '%s'", name);
  ^~~~
/Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rcpp/include/Rcpp/exceptions.h:195:17: note: candidate function not viable: requires single argument 'message', but 2 arguments were provided
    inline void stop(const std::string& message) {
                ^
6 errors generated.
make: *** [XlsWorkBook.o] Error 1
ERROR: compilation failed for package ‘readxl’
* removing ‘/Library/Frameworks/R.framework/Versions/3.1/Resources/library/readxl’
Error: Command failed (1)



> sessionInfo()
R version 3.1.3 (2015-03-09)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.2 (Yosemite)

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] devtools_1.6   httr_0.5       RCurl_1.95-4.3 stringr_0.6.2 
[5] tools_3.1.3   

Segmentation fault

I'm trying to load this UK government workbook but get the following error:

df <- read_excel(file, sheet="101")

 *** caught segfault ***
address 0x0, cause 'memory not mapped'

Traceback:
 1: .Call("readxl_xlsx_sheets", PACKAGE = "readxl", path)
 2: xlsx_sheets(path)
 3: match(x, table, nomatch = 0L)
 4: sheet %in% sheet_names
 5: standardise_sheet(sheet, xlsx_sheets(path))
 6: read_xlsx(path, sheet, col_names, col_types, na, skip)
 7: read_excel(file, sheet = "101")
> sessionInfo()
R version 3.1.3 (2015-03-09)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.2 (Yosemite)

locale:
[1] C

attached base packages:
[1] datasets  utils     stats     graphics  grDevices methods   base     

other attached packages:
[1] readxl_0.0.0.9000 jkr_0.1.0         arm_1.7-07        lme4_1.1-7       
[5] Rcpp_0.11.5       Matrix_1.1-5      MASS_7.3-39      

loaded via a namespace (and not attached):
 [1] abind_1.4-3      coda_0.17-1      colorspace_1.2-6 compiler_3.1.3  
 [5] digest_0.6.8     ggplot2_1.0.0    grid_3.1.3       gtable_0.1.2    
 [9] lattice_0.20-30  minqa_1.2.4      munsell_0.4.2    nlme_3.1-120    
[13] nloptr_1.0.4     plyr_1.8.1       proto_0.3-10     randtoolbox_1.16
[17] reshape2_1.4.1   rngWELL_0.10-3   scales_0.2.4     splines_3.1.3   
[21] stringr_0.6.2    tools_3.1.3   

local install of package fail

I am running in an offline environment and downloaded the master zip file to run
devtools::install_local("readxl-master.zip")
It looks for dependencies (Rcpp) and tries to download them by default, which fails even though Rcpp is already installed.

installed.packages()%>%data.frame(row.names = NULL)%>%filter(str_detect(Package,glob2rx("Rc*")))%>%select(Package,Version)

        Package   Version
1          Rcpp    0.11.4
2 RcppArmadillo 0.4.320.0
3     RcppEigen 0.3.2.1.2

If I try to run
devtools::install_local("readxl-master.zip",dependencies=F)
I also get an error:
> library("devtools", lib.loc="/R/win-library/3.1")
> install_local("w:\r_packages\readxl-master.zip",dependencies=F)
Installing package from w:\r_packages\readxl-master.zip
Installing readxl
"C:/Users/u243/DOCUME~1/R/R-31~1.2/bin/i386/R" --vanilla CMD INSTALL
"C:\Users\u243\AppData\Local\Temp\2\RtmpOYZ3j3\devtoolsf082dac658\readxl-master"
--library="C:/Users/u243/Documents/R/win-library/3.1" --install-tests

* installing *source* package 'readxl' ...
** libs
Warning: running command 'make -f "Makevars.win" -f "C:/Users/u243/DOCUME~1/R/R-31~1.2/etc/i386/Makeconf" -f "C:/Users/u243/DOCUME~1/R/R-31~1.2/share/make/winshlib.mk" SHLIB_LDFLAGS='$(SHLIB_CXXLDFLAGS)' SHLIB_LD='$(SHLIB_CXXLD)' SHLIB="readxl.dll" OBJECTS="RcppExports.o XlsWorkBook.o XlsWorkSheet.o XlsxWorkBook.o XlsxWorkSheet.o benchmarks.o endian.o ole.o xls.o xlstool.o zip.o"' had status 127
ERROR: compilation failed for package 'readxl'
* removing 'C:/Users/u243/Documents/R/win-library/3.1/readxl'
Error: Command failed (1)

how to read in specific rows and columns only?

In the xlsx package (which is hell to install so I chose readxl), you can specify certain rows and columns to read in with the arguments colIndex and rowIndex. readxl seems to lack this - how to accomplish the same with readxl?

read_excel does not read the last column

read_excel does not read the last column in sheet

read_excel("twocolumns.xlsx")
##    A
## 1  2
## 2  4
## 3  8
## 4 16
## 5 32
# try with read.xls from the gdata package:
library(gdata)
read.xls("twocolumns.xlsx")
##    A  B
## 1  2  2
## 2  4  3
## 3  8  5
## 4 16  7
## 5 32 11

sessionInfo()
## R version 3.1.3 (2015-03-09)
## Platform: x86_64-unknown-linux-gnu (64-bit)
## Running under: Ubuntu 14.04.2 LTS
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] gdata_2.13.3      readxl_0.0.0.9000
## 
## loaded via a namespace (and not attached):
## [1] gtools_3.4.1 Rcpp_0.11.5  tools_3.1.3

Unable to install

Great to see you working on this problem (ODS too at some point?).

I just tried to install it but no joy. I haven't done much C++ so maybe it's an easy fix?

Downloading github repo hadley/readxl@master
Installing readxl
'/Library/Frameworks/R.framework/Resources/bin/R' --vanilla CMD INSTALL  \
  '/private/var/folders/18/jlpn9mf51154wkft000dfbkw0000gn/T/RtmpiM0rim/devtoolsc41b14fcb2ad/hadley-readxl-35005aa'  \
  --library='/Library/Frameworks/R.framework/Versions/3.1/Resources/library'  \
  --install-tests 

* installing *source* package 'readxl' ...
** libs
clang++ -arch x86_64 -ftemplate-depth-256 -I/Library/Frameworks/R.framework/Resources/include    -Iunix -I. -I/usr/local/include -I/usr/local/include/freetype2 -I/opt/X11/include -I"/Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rcpp/include"   -fPIC  -Wall -mtune=core2 -O3    -c RcppExports.cpp -o RcppExports.o
clang++ -arch x86_64 -ftemplate-depth-256 -I/Library/Frameworks/R.framework/Resources/include    -Iunix -I. -I/usr/local/include -I/usr/local/include/freetype2 -I/opt/X11/include -I"/Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rcpp/include"   -fPIC  -Wall -mtune=core2 -O3    -c XlsWorkBook.cpp -o XlsWorkBook.o
In file included from XlsWorkBook.cpp:2:
In file included from ./XlsWorkBook.h:6:
./CellType.h:31:13: error: no member named 'warning' in namespace 'Rcpp'
      Rcpp::warning("Unknown type '%s' at position %i. Using text instead.",
      ~~~~~~^
In file included from XlsWorkBook.cpp:2:
./XlsWorkBook.h:28:39: error: too many arguments to function call, expected single argument 'message', have 2 arguments
      Rcpp::stop("Failed to open %s", path);
      ~~~~~~~~~~                      ^~~~
/Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rcpp/include/Rcpp/exceptions.h:195:5: note: 'stop' declared here
    inline void stop(const std::string& message) {
    ^
In file included from XlsWorkBook.cpp:3:
./XlsWorkSheet.h:127:19: error: no member named 'warning' in namespace 'Rcpp'
            Rcpp::warning("Expecting numeric in [%i, %i] got `%s`",
            ~~~~~~^
./XlsWorkSheet.h:138:19: error: no member named 'warning' in namespace 'Rcpp'
            Rcpp::warning("Expecting date in [%i, %i] got %d",
            ~~~~~~^
./XlsWorkSheet.h:146:19: error: no member named 'warning' in namespace 'Rcpp'
            Rcpp::warning("Expecting date in [%i, %i] got '%s'",
            ~~~~~~^
XlsWorkBook.cpp:16:3: error: no matching function for call to 'stop'
  stop("Couldn't find sheet called '%s'", name);
  ^~~~
/Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rcpp/include/Rcpp/exceptions.h:195:17: note: candidate function not viable:
      requires single argument 'message', but 2 arguments were provided
    inline void stop(const std::string& message) {
                ^
6 errors generated.
make: *** [XlsWorkBook.o] Error 1
ERROR: compilation failed for package 'readxl'
* removing '/Library/Frameworks/R.framework/Versions/3.1/Resources/library/readxl'
Error: Command failed (1)
> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)

locale:
[1] C

attached base packages:
[1] datasets  utils     stats     graphics  grDevices methods   base     

other attached packages:
[1] devtools_1.6.1 jkr_0.1.0      arm_1.7-07     lme4_1.1-7     Rcpp_0.11.3   
[6] Matrix_1.1-4   MASS_7.3-35   

loaded via a namespace (and not attached):
 [1] RCurl_1.95-4.5   abind_1.4-0      bitops_1.0-6     coda_0.16-1     
 [5] colorspace_1.2-4 compiler_3.1.2   digest_0.6.6     ggplot2_1.0.0   
 [9] grid_3.1.2       gtable_0.1.2     httr_0.6.0       lattice_0.20-29 
[13] minqa_1.2.4      munsell_0.4.2    nlme_3.1-118     nloptr_1.0.4    
[17] plyr_1.8.1       proto_0.3-10     randtoolbox_1.16 reshape2_1.4.1  
[21] rngWELL_0.10-3   scales_0.2.4     splines_3.1.2    stringr_0.6.2   
[25] tools_3.1.2 

Time without date error

XLS files with times without dates, such as "12:34:56", give warnings like

1: In xls_cols(path, sheet, col_names = col_names, col_types = col_types, :
Expecting date in [1, 2] got 0.524259

and result in NA.

For XLSX files, "12:34:56" results in "1899-12-30 12:34:56", which is awkward but allowable.
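The value 0.524259 in the warning is the time of day expressed as a fraction of a 24-hour day, which is how Excel stores times. A minimal base R sketch of recovering the clock time in both cases follows; the helper name fraction_to_hms is hypothetical.

```r
# Hypothetical helper: convert an Excel day fraction to "HH:MM:SS".
fraction_to_hms <- function(x) {
  secs <- as.integer(round(x * 86400))  # 86400 seconds per day
  sprintf("%02d:%02d:%02d",
          secs %/% 3600L, (secs %% 3600L) %/% 60L, secs %% 60L)
}

xls_time <- fraction_to_hms(0.524259)
xls_time
#> [1] "12:34:56"

# For the XLSX result, drop the placeholder date by formatting the time part.
xlsx_value <- as.POSIXct("1899-12-30 12:34:56", tz = "UTC")
xlsx_time <- format(xlsx_value, "%H:%M:%S")
```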

Compiling issues in Mac

I am having issues compiling the package using devtools on a Mac (Yosemite - 10.10.2). Here is the error log:

In file included from XlsWorkBook.cpp:2:
In file included from ./XlsWorkBook.h:6:
./CellType.h:31:13: error: no member named 'warning' in namespace 'Rcpp'
      Rcpp::warning("Unknown type '%s' at position %i. Using text instead.",
      ~~~~~~^
In file included from XlsWorkBook.cpp:2:
./XlsWorkBook.h:28:39: error: too many arguments to function call, expected single argument 'message', have 2 arguments
      Rcpp::stop("Failed to open %s", path);
      ~~~~~~~~~~                      ^~~~
/Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rcpp/include/Rcpp/exceptions.h:195:5: note: 'stop' declared here
    inline void stop(const std::string& message) {
    ^
In file included from XlsWorkBook.cpp:3:
./XlsWorkSheet.h:127:19: error: no member named 'warning' in namespace 'Rcpp'
            Rcpp::warning("Expecting numeric in [%i, %i] got `%s`",
            ~~~~~~^
./XlsWorkSheet.h:138:19: error: no member named 'warning' in namespace 'Rcpp'
            Rcpp::warning("Expecting date in [%i, %i] got %d",
            ~~~~~~^
./XlsWorkSheet.h:146:19: error: no member named 'warning' in namespace 'Rcpp'
            Rcpp::warning("Expecting date in [%i, %i] got '%s'",
            ~~~~~~^
XlsWorkBook.cpp:16:3: error: no matching function for call to 'stop'
  stop("Couldn't find sheet called '%s'", name);
  ^~~~
/Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rcpp/include/Rcpp/exceptions.h:195:17: note: candidate function not viable: requires single argument 'message', but 2 arguments were provided
    inline void stop(const std::string& message) {
                ^
6 errors generated.
make: *** [XlsWorkBook.o] Error 1
ERROR: compilation failed for package ‘readxl’
* removing ‘/Library/Frameworks/R.framework/Versions/3.1/Resources/library/readxl’
Error: Command failed (1)

Issue with compilation, asprintf

Hi @hadley,

I just tried to install this package (devtools::install_github("hadley/readxl")), and got a compilation error:

clang -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG -Iunix -I. -I/usr/local/include -I/usr/local/include/freetype2 -I/opt/X11/include -I"/Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rcpp/include"   -fPIC  -Wall -mtune=core2 -g -O2  -c xlstool.c -o xlstool.o
xlstool.c:167:12: error: static declaration of 'asprintf' follows non-static declaration
static int asprintf(char **ret, const char *format, ...)
           ^
/usr/include/stdio.h:453:6: note: previous declaration is here
int      asprintf(char ** __restrict, const char * __restrict, ...) __printflike(2, 3);
         ^
1 error generated.
make: *** [xlstool.o] Error 1
ERROR: compilation failed for package ‘readxl’
* removing ‘/Library/Frameworks/R.framework/Versions/3.1/Resources/library/readxl’
Error: Command failed (1)

Here is the output from version

> version
               _                           
platform       x86_64-apple-darwin13.4.0   
arch           x86_64                      
os             darwin13.4.0                
system         x86_64, darwin13.4.0        
status                                     
major          3                           
minor          1.3                         
year           2015                        
month          03                          
day            09                          
svn rev        67962                       
language       R                           
version.string R version 3.1.3 (2015-03-09)
nickname       Smooth Sidewalk   

I'm on OS X 10.9, if that's helpful info too.

Something on my end?

Missing values set to empty string not NA or supplied value

Here's an example file to test. On my Windows 7 machine the missing values are being set to empty strings not NA or a supplied value.

one <- read_excel(path = "path to readxl_test.xlsx")
two <- read_excel(path = "path to readxl_test.xlsx", na = "9999")

sessionInfo()
R version 3.1.3 (2015-03-09)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] devtools_1.5 digest_0.6.4 evaluate_0.5.5 httr_0.5 memoise_0.2.1 parallel_3.1.3 RCurl_1.95-4.3 stringr_0.6.2
[9] tools_3.1.3 whisker_0.3-2

Failure in guessing col_types results in warnings and NA values

When a column in the Excel file has numeric values in the first rows and text further down, the col_types guess sets the column type to numeric. The import then gives warnings and the non-numeric values become NA.
One suggestion is to re-evaluate the content of the whole column if the warning appears, and then import with a suitable type.
Another solution may be to fall back to col_types "text" for that column and re-import whenever the data types do not match.

I've linked 2 example docs from my dropbox, one working and one failing.
https://www.dropbox.com/sh/8mzk6z12f99ye4w/AABe-C7fxCzm1lomNyge1pw_a?dl=0

The file "working_numeric_import_small.xlsx" has non-numeric data in early rows, so the import works by just running read_excel("working_numeric_import_small.xlsx").

The file "error_numeric_import.xlsx" has numeric data in the first rows, but other characters in later rows.
Running read_excel("error_numeric_import.xlsx") gives the following warnings:

Warning messages:
1: In read_xlsx_(path, sheet, col_names = col_names, col_types = col_types, ... :
[181, 3]: expecting numeric: got '555-55555'
2: In read_xlsx_(path, sheet, col_names = col_names, col_types = col_types, ... :
[184, 3]: expecting numeric: got 'tel +46-70-55555888'
3: In read_xlsx_(path, sheet, col_names = col_names, col_types = col_types, ... :
[294, 2]: expecting numeric: got '123736/1''
...
...
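The second suggestion above (fall back to text and re-import) can be approximated from user code. This is a rough sketch, not how readxl itself behaves: the wrapper name read_excel_text_fallback is hypothetical, it reacts to any warning rather than only type-mismatch warnings, and it re-reads every column as text rather than just the offending one. It is demonstrated on the bundled datasets.xlsx, which reads cleanly, because the linked example files are not available here.

```r
library(readxl)

# Hypothetical wrapper: try the default type guessing first; if read_excel()
# signals a warning, re-read the sheet with all columns as text.
# Do not pass col_types through ..., since the fallback sets it itself.
read_excel_text_fallback <- function(path, ...) {
  tryCatch(
    read_excel(path, ...),
    warning = function(w) read_excel(path, ..., col_types = "text")
  )
}

clean <- read_excel_text_fallback(readxl_example("datasets.xlsx"))
```

In current readxl the guess_max argument of read_excel() can also raise the number of rows used for guessing, which may avoid the mismatch without a second read.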

Add an option to limit the number of rows to read

Users of Excel files often add extra information above the data (which can already be skipped with current functionality) and below the data we need to read (which I would like to skip too).
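The n_max argument shown in the README covers this when combined with skip. A small sketch, assuming the bundled deaths.xlsx example has four lines of notes above the header row and further notes below the data:

```r
library(readxl)

# Skip the notes above the header, then read at most 10 data rows so the
# notes below the table are never reached.
deaths <- read_excel(readxl_example("deaths.xlsx"), skip = 4, n_max = 10)
deaths
```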

Won't read non-lowercase .XLSX or .xlsm files

Looks like the code for identifying the file type is being case-sensitive with extensions, so it will open .xlsx files but not .XLSX.

> read_excel("epoch-1900.XLSX")
Error: Don't know how to parse extension XLSX

Since the extension-based file type detection on platforms Excel runs on is typically case-insensitive, it seems like read_excel should be too, with respect to detecting file formats.

It won't read .xlsm files, either. They're the same format as .xlsx, just with macros enabled.

> read_excel("with-macro.xlsm")
Error: Don't know how to parse extension xlsm
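
A minimal sketch of case-insensitive extension handling in base R. parser_for_extension() is a hypothetical helper written for illustration, not readxl's actual detection code; it lower-cases the extension and treats .xlsm like .xlsx, as proposed above.

```r
# Hypothetical helper: map a file extension to a parser name, ignoring case
# and sending .xlsm files to the .xlsx parser.
parser_for_extension <- function(path) {
  ext <- tolower(tools::file_ext(path))
  switch(ext,
    xls = "xls",
    xlsx = ,          # falls through to the next entry
    xlsm = "xlsx",
    stop("Don't know how to parse extension ", ext, call. = FALSE)
  )
}

parser_for_extension("epoch-1900.XLSX")
#> [1] "xlsx"
parser_for_extension("with-macro.xlsm")
#> [1] "xlsx"
```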

check.names default (and reading from URL)

Great package. Two minor issues:

  1. Can we add a check.names option (set to FALSE as default) for read_excel? Multi-word column names are currently being read "as is" from Excel, rather than having spaces replaced with periods.

  2. I may have missed something, but is it possible to read .xls files directly from a URL?
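
Both points can be approximated outside read_excel. The sketch below uses base R's make.names() for the check.names-style conversion and downloads a remote file to a temporary location before reading it. syntactic_names() and read_excel_url() are hypothetical helpers; the second assumes readxl is installed, the URL is reachable, and the URL ends in the file extension.

```r
# 1. check.names-style cleaning with base R: spaces become periods.
syntactic_names <- function(df) {
  names(df) <- make.names(names(df), unique = TRUE)
  df
}

df <- data.frame(`unit price` = 1:2, `item id` = 3:4, check.names = FALSE)
names(syntactic_names(df))
#> [1] "unit.price" "item.id"

# 2. Hypothetical helper: download to a temporary file, then read that file.
read_excel_url <- function(url, ...) {
  tmp <- tempfile(fileext = paste0(".", tools::file_ext(url)))
  on.exit(unlink(tmp))
  # mode = "wb" keeps the binary file intact on Windows.
  utils::download.file(url, tmp, mode = "wb", quiet = TRUE)
  readxl::read_excel(tmp, ...)
}
```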

Read worksheet with only one row

Thanks for this great package, which reads Excel files fast. In my Excel file, some worksheets contain only one row (the header). This code gives me an error message:

sheet_i <- read_excel(file, 'log')

Error: Skipped over all data

It would be better to return a data table with 0 rows.
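
The requested result, a table with the header's names and no rows, could look like the sketch below. empty_frame() is a hypothetical helper and the column names are invented for illustration. All columns are character here because there are no cells from which to guess a type.

```r
# Hypothetical helper: build a zero-row data frame from header names only.
empty_frame <- function(col_names) {
  cols <- lapply(col_names, function(nm) character(0))
  names(cols) <- col_names
  structure(cols, class = "data.frame", row.names = integer(0))
}

log_sheet <- empty_frame(c("time", "event", "user"))
dim(log_sheet)
#> [1] 0 3
```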

Installation messages on Windows 7

The package installed successfully, but I received more messages than I'm used to. They are copied and pasted below.

> devtools::install_github("hadley/readxl")
Installing github repo readxl/master from hadley
Downloading master.zip from https://github.com/hadley/readxl/archive/master.zip
Installing package from C:\Users\Rubemode\AppData\Local\Temp\RtmpcR4RBJ/master.zip
Installing readxl
Installing dependencies for readxl:
Rcpp
Installing package into ‘C:/Users/Rubemode/Documents/R/win-library/3.1’
(as ‘lib’ is unspecified)
trying URL 'http://lib.stat.cmu.edu/R/CRAN/bin/windows/contrib/3.1/Rcpp_0.11.5.zip'
Content type 'application/zip' length 3191453 bytes (3.0 MB)
opened URL
downloaded 3.0 MB

package ‘Rcpp’ successfully unpacked and MD5 sums checked

The downloaded binary packages are in
    C:\Users\Rubemode\AppData\Local\Temp\RtmpcR4RBJ\downloaded_packages
"C:/PROGRA~1/R/R-31~1.3/bin/x64/R" --vanilla CMD INSTALL  \
  "C:\Users\Rubemode\AppData\Local\Temp\RtmpcR4RBJ\devtools1f4442c0750a\readxl-master"  \
  --library="C:/Users/Rubemode/Documents/R/win-library/3.1" --install-tests 

* installing *source* package 'readxl' ...
** libs

*** arch - i386
g++ -m32 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -mtune=core2 -c RcppExports.cpp -o RcppExports.o
g++ -m32 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -mtune=core2 -c XlsWorkBook.cpp -o XlsWorkBook.o
In file included from XlsWorkBook.h:6:0,
                 from XlsWorkBook.cpp:2:
CellType.h: In function 'bool isDateTime(int, std::set<int>)':
CellType.h:97:25: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:98:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:99:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:100:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:101:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h: In function 'bool isDateFormat(std::string)':
CellType.h:112:30: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
In file included from XlsWorkBook.cpp:3:0:
XlsWorkSheet.h: In member function 'Rcpp::List XlsWorkSheet::readCols(Rcpp::CharacterVector, std::vector<CellType>, std::string, int)':
XlsWorkSheet.h:92:50: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
g++ -m32 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -mtune=core2 -c XlsWorkSheet.cpp -o XlsWorkSheet.o
In file included from XlsWorkBook.h:6:0,
                 from XlsWorkSheet.cpp:3:
CellType.h: In function 'bool isDateTime(int, std::set<int>)':
CellType.h:97:25: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:98:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:99:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:100:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:101:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h: In function 'bool isDateFormat(std::string)':
CellType.h:112:30: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
In file included from XlsWorkSheet.cpp:4:0:
XlsWorkSheet.h: In member function 'Rcpp::List XlsWorkSheet::readCols(Rcpp::CharacterVector, std::vector<CellType>, std::string, int)':
XlsWorkSheet.h:92:50: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
XlsWorkSheet.cpp: In function 'Rcpp::CharacterVector xls_col_types(std::string, std::string, int, int, int)':
XlsWorkSheet.cpp:15:34: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
g++ -m32 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -mtune=core2 -c XlsxWorkBook.cpp -o XlsxWorkBook.o
In file included from XlsxWorkBook.h:6:0,
                 from XlsxWorkBook.cpp:2:
CellType.h: In function 'bool isDateTime(int, std::set<int>)':
CellType.h:97:25: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:98:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:99:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:100:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:101:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h: In function 'bool isDateFormat(std::string)':
CellType.h:112:30: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
g++ -m32 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -mtune=core2 -c XlsxWorkSheet.cpp -o XlsxWorkSheet.o
In file included from XlsxWorkBook.h:6:0,
                 from XlsxWorkSheet.h:6,
                 from XlsxWorkSheet.cpp:2:
CellType.h: In function 'bool isDateTime(int, std::set<int>)':
CellType.h:97:25: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:98:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:99:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:100:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:101:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h: In function 'bool isDateFormat(std::string)':
CellType.h:112:30: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
In file included from XlsxWorkSheet.h:7:0,
                 from XlsxWorkSheet.cpp:2:
XlsxCell.h: In member function 'Rcpp::RObject XlsxCell::stringFromTable(const char*, const string&, const std::vector<std::basic_string<char> >&)':
XlsxCell.h:147:42: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
In file included from XlsxWorkSheet.cpp:2:0:
XlsxWorkSheet.h: In member function 'Rcpp::List XlsxWorkSheet::readCols(Rcpp::CharacterVector, const std::vector<CellType>&, const string&, int)':
XlsxWorkSheet.h:126:50: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
XlsxWorkSheet.cpp: In function 'Rcpp::CharacterVector xlsx_col_types(std::string, int, std::string, int, int)':
XlsxWorkSheet.cpp:32:34: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
g++ -m32 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -mtune=core2 -c benchmarks.cpp -o benchmarks.o
In file included from XlsxWorkBook.h:6:0,
                 from benchmarks.cpp:3:
CellType.h: In function 'bool isDateTime(int, std::set<int>)':
CellType.h:97:25: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:98:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:99:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:100:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:101:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h: In function 'bool isDateFormat(std::string)':
CellType.h:112:30: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
gcc -m32 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O3 -Wall  -std=gnu99 -mtune=core2 -c endian.c -o endian.o
endian.c: In function 'xls_is_bigendian':
endian.c:43:2: warning: #warning NO ENDIAN [-Wcpp]
gcc -m32 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O3 -Wall  -std=gnu99 -mtune=core2 -c ole.c -o ole.o
ole.c: In function 'sector_read':
ole.c:462:3: warning: unknown conversion type character 'z' in format [-Wformat]
ole.c:462:3: warning: unknown conversion type character 'z' in format [-Wformat]
ole.c:462:3: warning: unknown conversion type character 'z' in format [-Wformat]
ole.c:462:3: warning: too many arguments for format [-Wformat-extra-args]
ole.c:468:3: warning: unknown conversion type character 'z' in format [-Wformat]
ole.c:468:3: warning: unknown conversion type character 'z' in format [-Wformat]
ole.c:468:3: warning: too many arguments for format [-Wformat-extra-args]
gcc -m32 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O3 -Wall  -std=gnu99 -mtune=core2 -c xls.c -o xls.o
xls.c: In function 'xls_appendSST':
xls.c:228:18: warning: unknown conversion type character 'z' in format [-Wformat]
xls.c:228:18: warning: too many arguments for format [-Wformat-extra-args]
xls.c: In function 'xls_parseWorkBook':
xls.c:714:4: warning: unknown conversion type character 'z' in format [-Wformat]
xls.c:714:4: warning: unknown conversion type character 'z' in format [-Wformat]
xls.c:714:4: warning: unknown conversion type character 'z' in format [-Wformat]
xls.c:714:4: warning: too many arguments for format [-Wformat-extra-args]
xls.c: In function 'xls_preparseWorkSheet':
xls.c:980:10: warning: variable 'read' set but not used [-Wunused-but-set-variable]
xls.c: In function 'xls_parseWorkSheet':
xls.c:1087:4: warning: unknown conversion type character 'z' in format [-Wformat]
xls.c:1087:4: warning: too many arguments for format [-Wformat-extra-args]
gcc -m32 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O3 -Wall  -std=gnu99 -mtune=core2 -c xlstool.c -o xlstool.o
xlstool.c: In function 'unicode_decode':
xlstool.c:294:17: warning: passing argument 2 of 'libiconv' from incompatible pointer type [enabled by default]
C:/PROGRA~1/R/R-31~1.3/include/iconv.h:53:8: note: expected 'const char **' but argument is of type 'char **'
xlstool.c: At top level:
xlstool.c:60:13: warning: 'xls_showBOUNDSHEET' declared 'static' but never defined [-Wunused-function]
g++ -m32 -shared -s -static-libgcc -o readxl.dll tmp.def RcppExports.o XlsWorkBook.o XlsWorkSheet.o XlsxWorkBook.o XlsxWorkSheet.o benchmarks.o endian.o ole.o xls.o xlstool.o -lRiconv -Ld:/RCompile/CRANpkg/extralibs64/local/lib/i386 -Ld:/RCompile/CRANpkg/extralibs64/local/lib -LC:/PROGRA~1/R/R-31~1.3/bin/i386 -lR
installing to C:/Users/Rubemode/Documents/R/win-library/3.1/readxl/libs/i386

*** arch - x64
g++ -m64 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -mtune=core2 -c RcppExports.cpp -o RcppExports.o
g++ -m64 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -mtune=core2 -c XlsWorkBook.cpp -o XlsWorkBook.o
In file included from XlsWorkBook.h:6:0,
                 from XlsWorkBook.cpp:2:
CellType.h: In function 'bool isDateTime(int, std::set<int>)':
CellType.h:97:25: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:98:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:99:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:100:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:101:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h: In function 'bool isDateFormat(std::string)':
CellType.h:112:30: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
In file included from XlsWorkBook.cpp:3:0:
XlsWorkSheet.h: In member function 'Rcpp::List XlsWorkSheet::readCols(Rcpp::CharacterVector, std::vector<CellType>, std::string, int)':
XlsWorkSheet.h:92:50: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
g++ -m64 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -mtune=core2 -c XlsWorkSheet.cpp -o XlsWorkSheet.o
In file included from XlsWorkBook.h:6:0,
                 from XlsWorkSheet.cpp:3:
CellType.h: In function 'bool isDateTime(int, std::set<int>)':
CellType.h:97:25: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:98:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:99:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:100:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:101:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h: In function 'bool isDateFormat(std::string)':
CellType.h:112:30: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
In file included from XlsWorkSheet.cpp:4:0:
XlsWorkSheet.h: In member function 'Rcpp::List XlsWorkSheet::readCols(Rcpp::CharacterVector, std::vector<CellType>, std::string, int)':
XlsWorkSheet.h:92:50: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
XlsWorkSheet.cpp: In function 'Rcpp::CharacterVector xls_col_types(std::string, std::string, int, int, int)':
XlsWorkSheet.cpp:15:34: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
g++ -m64 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -mtune=core2 -c XlsxWorkBook.cpp -o XlsxWorkBook.o
In file included from XlsxWorkBook.h:6:0,
                 from XlsxWorkBook.cpp:2:
CellType.h: In function 'bool isDateTime(int, std::set<int>)':
CellType.h:97:25: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:98:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:99:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:100:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:101:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h: In function 'bool isDateFormat(std::string)':
CellType.h:112:30: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
g++ -m64 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -mtune=core2 -c XlsxWorkSheet.cpp -o XlsxWorkSheet.o
In file included from XlsxWorkBook.h:6:0,
                 from XlsxWorkSheet.h:6,
                 from XlsxWorkSheet.cpp:2:
CellType.h: In function 'bool isDateTime(int, std::set<int>)':
CellType.h:97:25: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:98:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:99:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:100:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:101:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h: In function 'bool isDateFormat(std::string)':
CellType.h:112:30: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
In file included from XlsxWorkSheet.h:7:0,
                 from XlsxWorkSheet.cpp:2:
XlsxCell.h: In member function 'Rcpp::RObject XlsxCell::stringFromTable(const char*, const string&, const std::vector<std::basic_string<char> >&)':
XlsxCell.h:147:42: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
In file included from XlsxWorkSheet.cpp:2:0:
XlsxWorkSheet.h: In member function 'Rcpp::List XlsxWorkSheet::readCols(Rcpp::CharacterVector, const std::vector<CellType>&, const string&, int)':
XlsxWorkSheet.h:126:50: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
XlsxWorkSheet.cpp: In function 'Rcpp::CharacterVector xlsx_col_types(std::string, int, std::string, int, int)':
XlsxWorkSheet.cpp:32:34: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
g++ -m64 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -mtune=core2 -c benchmarks.cpp -o benchmarks.o
In file included from XlsxWorkBook.h:6:0,
                 from benchmarks.cpp:3:
CellType.h: In function 'bool isDateTime(int, std::set<int>)':
CellType.h:97:25: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:98:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:99:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:100:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h:101:23: warning: suggest parentheses around comparison in operand of '&' [-Wparentheses]
CellType.h: In function 'bool isDateFormat(std::string)':
CellType.h:112:30: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
gcc -m64 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -std=gnu99 -mtune=core2 -c endian.c -o endian.o
endian.c: In function 'xls_is_bigendian':
endian.c:43:2: warning: #warning NO ENDIAN [-Wcpp]
gcc -m64 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -std=gnu99 -mtune=core2 -c ole.c -o ole.o
ole.c: In function 'sector_read':
ole.c:462:3: warning: unknown conversion type character 'z' in format [-Wformat]
ole.c:462:3: warning: unknown conversion type character 'z' in format [-Wformat]
ole.c:462:3: warning: unknown conversion type character 'z' in format [-Wformat]
ole.c:462:3: warning: too many arguments for format [-Wformat-extra-args]
ole.c:468:3: warning: unknown conversion type character 'z' in format [-Wformat]
ole.c:468:3: warning: unknown conversion type character 'z' in format [-Wformat]
ole.c:468:3: warning: too many arguments for format [-Wformat-extra-args]
gcc -m64 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -std=gnu99 -mtune=core2 -c xls.c -o xls.o
xls.c: In function 'xls_appendSST':
xls.c:228:18: warning: unknown conversion type character 'z' in format [-Wformat]
xls.c:228:18: warning: too many arguments for format [-Wformat-extra-args]
xls.c: In function 'xls_parseWorkBook':
xls.c:714:4: warning: unknown conversion type character 'z' in format [-Wformat]
xls.c:714:4: warning: unknown conversion type character 'z' in format [-Wformat]
xls.c:714:4: warning: unknown conversion type character 'z' in format [-Wformat]
xls.c:714:4: warning: too many arguments for format [-Wformat-extra-args]
xls.c: In function 'xls_preparseWorkSheet':
xls.c:980:10: warning: variable 'read' set but not used [-Wunused-but-set-variable]
xls.c: In function 'xls_parseWorkSheet':
xls.c:1087:4: warning: unknown conversion type character 'z' in format [-Wformat]
xls.c:1087:4: warning: format '%d' expects argument of type 'int', but argument 3 has type 'size_t' [-Wformat]
xls.c:1087:4: warning: too many arguments for format [-Wformat-extra-args]
gcc -m64 -I"C:/PROGRA~1/R/R-31~1.3/include" -DNDEBUG -Iwindows -I.   -I"C:/Users/Rubemode/Documents/R/win-library/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/extralibs64/local/include"     -O2 -Wall  -std=gnu99 -mtune=core2 -c xlstool.c -o xlstool.o
xlstool.c: In function 'unicode_decode':
xlstool.c:294:17: warning: passing argument 2 of 'libiconv' from incompatible pointer type [enabled by default]
C:/PROGRA~1/R/R-31~1.3/include/iconv.h:53:8: note: expected 'const char **' but argument is of type 'char **'
xlstool.c: At top level:
xlstool.c:60:13: warning: 'xls_showBOUNDSHEET' declared 'static' but never defined [-Wunused-function]
g++ -m64 -shared -s -static-libgcc -o readxl.dll tmp.def RcppExports.o XlsWorkBook.o XlsWorkSheet.o XlsxWorkBook.o XlsxWorkSheet.o benchmarks.o endian.o ole.o xls.o xlstool.o -lRiconv -Ld:/RCompile/CRANpkg/extralibs64/local/lib/x64 -Ld:/RCompile/CRANpkg/extralibs64/local/lib -LC:/PROGRA~1/R/R-31~1.3/bin/x64 -lR
installing to C:/Users/Rubemode/Documents/R/win-library/3.1/readxl/libs/x64
** R
** tests
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
*** arch - i386
*** arch - x64
* DONE (readxl)

And the session info 

> sessionInfo()
R version 3.1.3 (2015-03-09)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] devtools_1.5   digest_0.6.4   evaluate_0.5.5 httr_0.5       memoise_0.2.1  parallel_3.1.3 RCurl_1.95-4.3 stringr_0.6.2 
 [9] tools_3.1.3    whisker_0.3-2 

Controlling output of read_excel()

Once you give user control of cell specification (#8), the next question will be if user can control the form of the output. I suspect right now it is always a tbl_df with default to a stringsAsFactors = FALSE mentality?

Re: header row, will you stick with col_names = instead of the more read-table-y header = TRUE/FALSE? Of course, what you've done with col_names = is more powerful.

What about dropping dimensions and returning (named) atomic vector instead of tbl_df?

I ask because, as in #8, I'd like to be as consistent as possible in gspreadr.

import with swedish characters å,ä,ö

The Swedish characters å, ä, ö in the Excel file change to ,, in the imported data.
The same file saved as .csv and imported by read.csv2() passes the original å,ä,ö into the data.

Rounding error when numbers are interpreted as characters

If I create a simple Excel worksheet like

col1
73.1

read_excel('minimal.xlsx')

works as expected, interpreting col1 as type num and the value as 73.1

But if I then add a character value "x" in the next row like this

col1
73.1
x

now col1 is interpreted as type char but the value changes to 73.099999999999994

The strange rounding happens to some numbers but not others.

I'm using Excel 2010 on 32-bit Windows 7 and readxl 0.0.0.9000
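
The value 73.099999999999994 is what the double closest to 73.1 looks like when written with 17 significant digits, so this is probably a formatting difference rather than a change in the stored number. The sketch below shows that, plus one possible workaround on the R side. tidy_numeric_strings() is a hypothetical helper, not part of readxl.

```r
# 73.1 has no exact binary representation; extra digits expose that.
sprintf("%.15f", 73.1)
#> [1] "73.099999999999994"

# Hypothetical helper: re-format strings that parse as numbers using R's
# default of 15 significant digits, and leave everything else untouched.
# Caveat: this also rewrites other number-like strings, e.g. "007" becomes "7".
tidy_numeric_strings <- function(x) {
  num <- suppressWarnings(as.numeric(x))
  is_num <- !is.na(num)
  x[is_num] <- vapply(num[is_num], as.character, character(1))
  x
}

tidy_numeric_strings(c("73.099999999999994", "x"))
#> [1] "73.1" "x"
```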

Error message when trying to install on OS X Yosemite

This is a new system, so I'm not 100% sure the problem is with readxl rather than something amiss with my system. However, I have been able to install other packages from GitHub OK. FYI, when I try to install readxl, I see these messages:

Downloading github repo hadley/readxl@master
Installing readxl
'/Library/Frameworks/R.framework/Resources/bin/R' --vanilla CMD INSTALL
'/private/var/folders/l9/[specific path]'
--library='/Library/Frameworks/R.framework/Versions/3.1/Resources/library' --install-tests

  • installing source package ‘readxl’ ...
    ** libs
    clang++ -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG -Iunix -I. -I/usr/local/include -I"/Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rcpp/include" -fPIC -mtune=core2 -g -O2 -c RcppExports.cpp -o RcppExports.o
    In file included from RcppExports.cpp:4:
    In file included from /Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rcpp/include/Rcpp.h:27:
    In file included from /Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rcpp/include/RcppCommon.h:29:
    /Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rcpp/include/Rcpp/platform/compiler.h:95:10: fatal error: 'cmath' file not found
    #include <cmath>
    ^
    1 error generated.
    make: *** [RcppExports.o] Error 1
    ERROR: compilation failed for package ‘readxl’
  • removing ‘/Library/Frameworks/R.framework/Versions/3.1/Resources/library/readxl’
    Error: Command failed (1)

Erreur : Names and types must be same size

Related to #9

Now I get other errors:

Sheet 1

> df = read_excel("data-science-at-the-command-line/book/ch03/data/imdb-250.xlsx", sheet=1)
Erreur : Names and types must be same size

Cells with partial colouring read the first colour chunk?

I have an excel sheet with a cell containing "H2395" where the "H" is in red, and the "2395" is in black.

On reading in, this reads as "H" and drops the number. This file for example:

https://dl.dropboxusercontent.com/u/41868071/Results%20Plate%20SF325.xlsx

(On a sidenote, is it possible to retrieve the colour of a cell entry at all? I have a bunch of sheets where (unfortunately) someone decided to store data based on the colour of the text in the cell... Just "colour of first text" would do the trick.)

Address duplicate column names

Hi,

Currently, if the column names in the first row of a spreadsheet have duplicates, read_excel allows that. It would be nice to have an option (or default behavior), similar to gdata::read.xls, whereby instead it appends a numeric suffix. For example, if the names in the first row include c("xvar", "xvar") the respective column names would become c("xvar", "xvar.1").

Don
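
Until such an option exists, the suffixing can be applied after the import with base R. A minimal sketch, where dedupe_names() is a hypothetical helper:

```r
# make.unique() appends ".1", ".2", ... to repeated names, which matches the
# suffix style described above.
dedupe_names <- function(df) {
  names(df) <- make.unique(names(df), sep = ".")
  df
}

df <- data.frame(1:2, 3:4)
names(df) <- c("xvar", "xvar")
names(dedupe_names(df))
#> [1] "xvar"   "xvar.1"
```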

Ignore total lines

It's possible to add total lines to Excel tables, and these can hold aggregations whose type differs from that of the column being aggregated.

At the moment total lines are read and cause errors on loading when the data types differ.

Ideally these should not be read at all (or should be loaded only optionally).

Issues with special characters on Windows machine.

Reading a table with names containing special characters, such as "Günther", results in something like "G\xf6tz", which causes problems along the way. For example, neither grepl() nor str_extract() works as expected ...

... while something like this works on handwritten data ...

str_extract("G\xfcnther", "ü")
## ü 

str_extract("G\xfcnther", "\xfc")
## "ü"

... it does not work on the data read in by readxl. My current workaround is to use iconv(column) without any further parameters to get values from "G\xf6tz" to "Günther". From then on everything works fine.

My gut tells me that this is probably one of those R-Windows-locale-encoding issues, as I use Windows and am used to strange things happening with characters when they are read into R.
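
The iconv() workaround can be made explicit by naming the source encoding. A sketch, assuming the strings arrive as Latin-1 bytes without an encoding mark; that is a guess about the cause and is not confirmed in this report.

```r
# A Latin-1 byte string like the handwritten example above.
x <- "G\xfcnther"

# Convert explicitly from Latin-1 to UTF-8 instead of relying on the locale.
fixed <- iconv(x, from = "latin1", to = "UTF-8")

Encoding(fixed)
grepl("\u00fc", fixed, fixed = TRUE)
```

Applied to a column, this would be df$name <- iconv(df$name, from = "latin1", to = "UTF-8"), where df and name are placeholder names.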

Error: Invalid sheet xml (no <dimension>) from some xlsx files created by Google Sheets

I've gotten the error message Invalid sheet xml (no <dimension>) several times when trying to open xlsx files created from spreadsheets on Google Sheets. This may be because they aren't following the spec, but I'm not sure. It appears to be a Google Sheets problem, because I was able to open the spreadsheet using LibreOffice, and when I saved it back to an xlsx file from LibreOffice, read_excel opened it without error.

Here's an example spreadsheet (it's just the iris data) for which I get this error. Open the spreadsheet https://drive.google.com/open?id=10lVYeaRV1hK8aPmhNuTyx5YXSFQC3cL5VmNppZ0u8_Q&authuser=0 and download it as Excel or download the xlsx file which was created from that spreadsheet: https://drive.google.com/open?id=0B3SVyot2nUtGWkhiekRSUjZENkk&authuser=0.
Then,

> read_excel("readxl_test.xlsx")
Error: Invalid sheet xml (no <dimension>)

Erreur : vector::_M_range_check

Related to #9

Sheet 2

> df = read_excel("/data-science-at-the-command-line/book/ch03/data/imdb-250.xlsx", sheet=2)
Erreur : vector::_M_range_check

plus some warnings:

> head(warnings())
1: In xlsx_col_types(path, sheet, na = na, nskip = skip) :
  Unknown type 'e' in [1, 8]
2: In xlsx_col_types(path, sheet, na = na, nskip = skip) :
  Unknown type 'e' in [2, 8]
3: In xlsx_col_types(path, sheet, na = na, nskip = skip) :
  Unknown type 'e' in [3, 8]
4: In xlsx_col_types(path, sheet, na = na, nskip = skip) :
  Unknown type 'e' in [4, 8]
5: In xlsx_col_types(path, sheet, na = na, nskip = skip) :
  Unknown type 'e' in [5, 8]
6: In xlsx_col_types(path, sheet, na = na, nskip = skip) :
  Unknown type 'e' in [6, 8]

read_workbook()

Awesome package!

Have you thought of a read_workbook() that would bring in an entire workbook as a list of data.frames? The use case would basically be doing exploratory work on a workbook which you haven't seen before when you don't want to open the workbook itself. Sometimes those big gnarly excel sheets can be very slow to open or will crash on opening. It might be nice to be able to flip through sheets in R even when you didn't know which sheet you were looking for.
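
A rough sketch of what such a helper might look like, built only on readxl's excel_sheets() and read_excel(). The name read_workbook() comes from the request above; the implementation is an assumption, not an existing readxl function.

```r
# Sketch of the proposed helper: returns a named list with one data frame
# per worksheet.
read_workbook <- function(path, ...) {
  sheets <- readxl::excel_sheets(path)
  out <- lapply(sheets, function(s) readxl::read_excel(path, sheet = s, ...))
  names(out) <- sheets
  out
}

if (requireNamespace("readxl", quietly = TRUE)) {
  wb <- read_workbook(readxl::readxl_example("datasets.xlsx"))
  print(names(wb))
}
```

Because the list is named after the sheets, wb[["some sheet"]] or str(wb, max.level = 1) gives a quick way to flip through a workbook without opening it in Excel.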

Warning "unknown type 'inlineStr'" leads to NA column with NA header

In a larger excel sheet there is one tab that results in an additional column with NA as a header. A function like dplyr::distinct will fail on this column later. A warning for each row in the excel sheet is generated:

Warning messages:
1: In read_xlsx_(path, sheet, col_names = col_names, col_types = col_types,  ... :
  [3, 4]: unknown type 'inlineStr'
2: In read_xlsx_(path, sheet, col_names = col_names, col_types = col_types,  ... :
  [4, 4]: unknown type 'inlineStr'
3: In read_xlsx_(path, sheet, col_names = col_names, col_types = col_types,  ... :

It may have something to do with working in both Microsoft Excel and LibreOffice with the same document. There are references to similar issues in the python-excel project, and a description of the conflicting xml here. The test file in that last link will repeat the issue in readxl.

col_names = FALSE gives an error

Using col_names = FALSE to not read column names from the Excel file fails:

library(readxl)
read_excel("d7x3_nohead.xlsx")
A 1
1 B 1
2 C 2
3 D 3
4 A 5
5 B 8
6 C 13
read_excel("d7x3_nohead.xlsx", col_names = FALSE)
Error: Need one name and type for each column

sessionInfo()
R version 3.1.3 (2015-03-09)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: Ubuntu 14.04.2 LTS

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] readxl_0.0.0.9000

loaded via a namespace (and not attached):
[1] Rcpp_0.11.5 tools_3.1.3

Blank cells become 0

Blank cells are read as 0 (numeric) or as the epoch "1970-01-01 00:00:00" (time), rather than as NA.

Also, time values without a date, such as "12:34:56", become "1899-12-30 12:34:56".

readxl adds a date to time-only values

With the file times_1904.xls, read_excel adds the date 1904-01-01:

> read_excel("times_1904.xls")
Source: local data frame [11 x 1]

                  Time
1  1904-01-01 01:02:03
2  1904-01-01 02:45:56
3  1904-01-01 04:29:49
4  1904-01-01 06:13:42
5  1904-01-01 07:57:35
6  1904-01-01 09:41:28
7  1904-01-01 11:25:21
8  1904-01-01 13:09:14
9  1904-01-01 14:53:07
10 1904-01-01 16:37:00
11 1904-01-01 18:20:53

It should behave like Excel instead.

pandas gets this right (these are its test files ;) ):

In [5]: pd.read_excel("times_1904.xls")
Out[5]:
               Time
0          01:02:03
1   02:45:56.100000
2   04:29:49.200000
3   06:13:42.300000
4   07:57:35.400000
5   09:41:28.500000
6   11:25:21.600000
7   13:09:14.700000
8   14:53:07.800000
9   16:37:00.900000
10         18:20:54

Same issue with times_1900.xls
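
As a stopgap on the R side, the date part can be stripped after reading. The sketch below uses only base R; strip_date() is a hypothetical helper, and the input values are typed in by hand to mirror the outputs quoted above rather than read from the test files.

```r
# Hypothetical helper: keep only the time-of-day part of a date-time column,
# returned as a character vector in HH:MM:SS form.
strip_date <- function(x, tz = "UTC") {
  stopifnot(inherits(x, "POSIXct"))
  format(x, "%H:%M:%S", tz = tz)
}

times <- as.POSIXct(c("1904-01-01 01:02:03", "1899-12-30 12:34:56"), tz = "UTC")
strip_date(times)
#> [1] "01:02:03" "12:34:56"
```

The result is text, not a time class, and fractional seconds are dropped, so this is only suitable for display or for re-parsing with a dedicated time type.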

readxl reads n-1 columns

I ran into a problem where readxl omits the last column when reading an xlsx file. Below, please find the code; here is a link to the file.

library(readxl)
library(openxlsx)
> dsin <- read_excel('/Users/MaciejBeresewicz/Documents/data_example.xlsx')
> dim(dsin)
[1] 9 6
> dsin2 <- readWorkbook('/Users/MaciejBeresewicz/Documents/data_example.xlsx')
> dim(dsin2)
[1] 9 7
> head(dsin)
Source: local data frame [6 x 6]

  Count   City    Date     Median    Average Rooms
1     2 poznan 2011-01   541992.5   541992.5     0
2     1 poznan 2011-01 99999999.0 99999999.0     0
3     1 poznan 2011-01      135.0      135.0     0
4    47 poznan 2011-01   169000.0   185026.7     0
5   193 poznan 2011-01   219000.0   226291.5     0
6   237 poznan 2011-01   270000.0   286274.1     0
> head(dsin2)
  Count   City    Date     Median    Average Rooms Area
1     2 poznan 2011-01   541992.5   541992.5     0    0
2     1 poznan 2011-01 99999999.0 99999999.0     0   10
3     1 poznan 2011-01      135.0      135.0     0   20
4    47 poznan 2011-01   169000.0   185026.7     0   30
5   193 poznan 2011-01   219000.0   226291.5     0   40
6   237 poznan 2011-01   270000.0   286274.1     0   50

Session Info:

R version 3.1.3 (2015-03-09)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.9.5 (Mavericks)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggplot2_1.0.1     dplyr_0.4.1       openxlsx_2.4.0    readxl_0.0.0.9000

loaded via a namespace (and not attached):
 [1] assertthat_0.1   colorspace_1.2-6 DBI_0.3.1        digest_0.6.8    
 [5] grid_3.1.3       gtable_0.1.2     magrittr_1.5     MASS_7.3-40     
 [9] munsell_0.4.2    parallel_3.1.3   plyr_1.8.1       proto_0.3-10    
[13] Rcpp_0.11.5      reshape2_1.4.1   scales_0.2.4     stringr_0.6.2   
[17] tools_3.1.3

`#NV` (Excel function error) translates to the string `error` instead of `NA`

I read in values from a table where some cells containing functions have an error and show #NV. Reading this data works, but those function errors are translated to the string error in R. This is surprising and might lead to further errors down the line. I think NA (the value for undefined or unavailable data) should be used instead.
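
A possible workaround after reading is to replace the marker string with NA in every character column. This is a base R sketch; error_to_na() is a hypothetical helper, and the marker "error" is taken from the description above. It cannot tell a genuine cell containing the text "error" apart from a formula error.

```r
# Hypothetical helper: turn the marker string into NA in character columns.
error_to_na <- function(df, marker = "error") {
  df[] <- lapply(df, function(col) {
    if (is.character(col)) col[col %in% marker] <- NA_character_
    col
  })
  df
}

# Toy data standing in for a sheet with one failed formula
df <- data.frame(id = 1:3, value = c("1.5", "error", "2.0"),
                 stringsAsFactors = FALSE)
cleaned <- error_to_na(df)
cleaned$value
#> [1] "1.5" NA    "2.0"
```

Columns polluted by the marker are read as text, so a follow-up as.numeric() on the cleaned column would be needed to recover numbers.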
