r-lib / archive
R bindings to libarchive, supporting a large variety of archive formats
Home Page: https://archive.r-lib.org/
License: Other
Hi,
I tried to pack some files into a ".rar" archive with the archive R package, but I was not able to. It would be great if this feature could be added to the archive package.
Best regards,
Emmanuel
Hi, thanks for the awesome software. Here's a minimal reproducible example that walks through a file that fails with archive_extract but not with 7-zip on Unix. The problem does not occur on Windows.
## warning: large download
# example from the PISA 2015 website to show there's no download or site-specific problem
fn <- "http://vs-web-fs-1.oecd.org/pisa/PUF_SAS_COMBINED_CMB_SCH_QQQ.zip"
tf <- tempfile()
tf2 <- tempfile()
download.file( fn , tf , mode = 'wb' )
file.info( tf )
file.copy( tf , tf2 )
# works fine
system(paste0("7za e -o'" , tempdir() , "' '" , tf2 , "'"))
R.utils::countLines( file.path( tempdir() , "cy6_ms_cmb_sch_qqq.sas7bdat" ) )
file.remove( file.path( tempdir() , "cy6_ms_cmb_sch_qqq.sas7bdat" ) )
# works fine
archive::archive_extract( tf , dir = tempdir() )
R.utils::countLines( file.path( tempdir() , "cy6_ms_cmb_sch_qqq.sas7bdat" ) )
file.remove( tf )
file.remove( tf2 )
# much larger file from the PISA 2015
fn <- "http://vs-web-fs-1.oecd.org/pisa/PUF_SAS_COMBINED_CMB_STU_QQQ.zip"
download.file( fn , tf , mode = 'wb' )
file.info( tf )
file.copy( tf , tf2 )
# works fine
system(paste0("7za e -o'" , tempdir() , "' '" , tf2 , "'"))
R.utils::countLines( file.path( tempdir() , "cy6_ms_cmb_stu_qq2.sas7bdat" ) )
file.remove( file.path( tempdir() , "cy6_ms_cmb_stu_qq2.sas7bdat" ) )
# FAILS on unix WORKS on windows
archive::archive_extract( tf , dir = tempdir() )
# archive_extract() appears to have worked, but the extracted files are all blank
R.utils::countLines( file.path( tempdir() , "cy6_ms_cmb_stu_qq2.sas7bdat" ) )
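Since archive_extract() appears to succeed here while leaving blank files behind, a post-extraction check can surface the problem early. The following is a minimal base R sketch; `check_extracted()` is a hypothetical helper, not part of the archive package, and the demo files are stand-ins for real extraction output.

```r
# Hypothetical helper (not part of archive): flag extracted files that are
# missing or empty, since a silent failure can leave 0-byte files behind.
check_extracted <- function(paths) {
  sizes <- file.size(paths)            # NA for files that do not exist
  bad <- is.na(sizes) | sizes == 0
  if (any(bad)) {
    warning("missing or empty after extraction: ",
            paste(basename(paths[bad]), collapse = ", "))
  }
  invisible(!bad)
}

# Demo with stand-in files: one with content, one empty.
demo_dir <- tempfile("extracted_")
dir.create(demo_dir)
good_file <- file.path(demo_dir, "good.csv")
empty_file <- file.path(demo_dir, "empty.sas7bdat")
write.csv(head(mtcars), good_file)
file.create(empty_file)

ok <- suppressWarnings(check_extracted(c(good_file, empty_file)))
```

In the report above, the same check could be applied to the paths under tempdir() right after the archive_extract() call.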
I'm trying to install on Ubuntu 16.04 and get the following error:
*** installing help indices
** building package indices
** testing if installed package can be loaded
Error: package or namespace load failed for ‘archive’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '/home/austin/lib/R/library/archive/libs/archive.so':
/home/austin/lib/R/library/archive/libs/archive.so: undefined symbol: archive_write_set_format_raw
Error: loading failed
Execution halted
ERROR: loading failed
I've tried installing with devtools directly and also cloning the archive. Any suggestions?
Prepare for release:
devtools::build_readme()
urlchecker::url_check()
devtools::check(remote = TRUE, manual = TRUE)
devtools::check_win_devel()
rhub::check_for_cran()
rhub::check(platform = 'ubuntu-rchk')
rhub::check_with_sanitizers()
revdepcheck::revdep_check(num_workers = 4)
cran-comments.md
Submit to CRAN:
usethis::use_version('minor')
devtools::submit_cran()
Wait for CRAN...
usethis::use_github_release()
usethis::use_dev_version()
Hi Jim,
Not sure if this is related to 'archive' or 'readRDS' or an interaction between the two.
This crash only seems to occur with the tar or cpio formats. zip works fine.
# Using save/readRDS with format = 'tar' or 'cpio' will crash
saveRDS(mtcars, archive_write(archive = "archive.file", file = 'first', format = 'tar'))
for (i in 1:100) {
zz <- readRDS(archive_read(archive = "archive.file", format='tar'))
}
R(9561,0x7fffac92c380) malloc: *** error for object 0x10a6fd200: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
Abort trap: 6
If saveRDS/readRDS is replaced by write.csv/read.csv, it works fine. No crash.
# Using write.csv/read.csv works
write.csv(mtcars, archive_write(archive = "archive.file", file = 'first', format = 'tar'))
for (i in 1:100) {
zz <- read.csv(archive_read(archive = "archive.file", format='tar'))
}
# * installing *source* package ‘archive’ ...
# PKG_CFLAGS=-I/usr/local/Cellar/libarchive/3.3.3/include -I/usr/local/Cellar/xz/5.2.4/include
# PKG_LIBS=-L/usr/local/Cellar/libarchive/3.3.3/lib -L/usr/local/Cellar/xz/5.2.4/lib -larchive -lexpat -llzma -lzstd -llz4 -lbz2 -lz -llzma -D_THREAD_SAFE -pthread
> devtools::session_info()
Session info ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
setting value
version R version 3.5.1 (2018-07-02)
system x86_64, darwin15.6.0
ui RStudio (1.2.907)
language (EN)
collate en_AU.UTF-8
date 2018-09-26
Packages ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
package * version date source
archive * 1.0.0 2018-09-25 Github (jimhester/archive@11e65d7)
base * 3.5.1 2018-07-05 local
compiler 3.5.1 2018-07-05 local
crayon 1.3.4 2017-09-16 CRAN (R 3.5.0)
datasets * 3.5.1 2018-07-05 local
devtools 1.13.6 2018-06-27 CRAN (R 3.5.0)
digest 0.6.15 2018-01-28 CRAN (R 3.5.0)
glue 1.3.0 2018-09-25 Github (tidyverse/glue@4e74901)
graphics * 3.5.1 2018-07-05 local
grDevices * 3.5.1 2018-07-05 local
memoise 1.1.0 2017-04-21 CRAN (R 3.5.0)
methods * 3.5.1 2018-07-05 local
packrat 0.4.9-3 2018-06-01 CRAN (R 3.5.0)
pillar 1.3.0 2018-07-14 CRAN (R 3.5.0)
Rcpp 0.12.18 2018-07-23 cran (@0.12.18)
rlang 0.2.1.9000 2018-08-10 Github (r-lib/rlang@8dc87a9)
rstudioapi 0.7 2017-09-07 CRAN (R 3.5.0)
stats * 3.5.1 2018-07-05 local
tibble 1.4.2 2018-01-22 CRAN (R 3.5.0)
tools 3.5.1 2018-07-05 local
utils * 3.5.1 2018-07-05 local
withr 2.1.2 2018-03-15 CRAN (R 3.5.0)
AFAICT there is no functionality equivalent to utils::unzip(..., list = TRUE), to list the files in the archive. This can be quite useful for examining the contents of an archive and then selecting a subset of files to extract. This could be an argument to archive_extract(), or maybe a function like archive_ls().
For now, .lz (lzip) files are reported as an unrecognized format. Would it be possible to change this?
download.file(url = "https://parltrack.org/dumps/ep_votes.json.lz",
destfile = "ep_votes.json.lz",
mode = "wb")
archive("ep_votes.json.lz")
# Error: archive.cpp:37 archive_read_open1(): Unrecognized archive format
Hi @jimhester, this may not be practical given the constraints of the library, but I just wanted to ask anyway, since I've always really appreciated the very precise implementation of progress bars in vroom etc. Would it be possible for functions like archive::archive_extract() to display progress in some way?
Do we need to install libarchive-dev before installing this R package? Is there a way for all the dependencies to be built as part of installing this R package?
Currently, when I try to install the package from GitHub, I get the following message:
devtools::install_github("jimhester/archive")
Note: no visible binding for global variable '.Data'
Downloading GitHub repo jimhester/archive@master
from URL https://api.github.com/repos/jimhester/archive/zipball/master
Installing archive
'/usr/lib/R/bin/R' --no-site-file --no-environ --no-save --no-restore --quiet CMD INSTALL
'/tmp/RtmpyKaB6I/devtools4a93024acee/jimhester-archive-a203312'
--library='/home/rstudio/R/x86_64-pc-linux-gnu-library/3.2' --install-tests
ERROR: configuration failed for package ‘archive’
Currently archive_write always creates a new archive. It would be useful to add new files to an existing archive as well as appending to a file within the archive.
Hi, this package works wonders for 7z products in R. Thank you!
Are there any plans for it to return to CRAN? It would be useful as a dependency for packages aiming for a CRAN submission.
Mike
Prepare for release:
devtools::build_readme()
urlchecker::url_check()
devtools::check(remote = TRUE, manual = TRUE)
devtools::check_win_devel()
rhub::check_for_cran()
rhub::check(platform = 'ubuntu-rchk')
rhub::check_with_sanitizers()
revdepcheck::cloud_check()
cran-comments.md
Submit to CRAN:
usethis::use_version('patch')
devtools::submit_cran()
Wait for CRAN...
usethis::use_github_release()
usethis::use_dev_version()
Feature request: Expose the individual format/filter configuration options to the R interface, i.e. archive_compressor_[name]_options and archive_filter_[name]_options.
For example, zstd defaults to compression level 3 and there's no way to configure this from R.
When I try to use archive, my R session crashes.
See the reproducible example below:
library(RCurl)
library(archive)
library(readr)
# format(Sys.Date(), "%Y") gives the current year without needing lubridate::year()
caged_url <- paste0("ftp://ftp.mtps.gov.br/pdet/microdados/CAGED/",format(Sys.Date(), "%Y"),"/")
caged_arquivos <- getURL(url = caged_url,
verbose=TRUE,ftp.use.epsv=TRUE, dirlistonly = TRUE)
caged_arquivos <- unlist(strsplit(caged_arquivos, "\r\n"))
caged_arquivos <- sort(caged_arquivos, decreasing = FALSE)
# Set the name of the latest file to be downloaded
ultimo_arquivo <- tail(caged_arquivos,n=1)
endereco_ultimo_arquivo <- paste0(caged_url,ultimo_arquivo)
bin <- getBinaryURL(endereco_ultimo_arquivo,ssl.verifypeer=FALSE)
con <- file(ultimo_arquivo, open = "wb")
writeBin(bin, con)
close(con)
caged <- archive("CAGEDEST_092017.7z")
caged <- read_csv(file = archive_read(caged),col_types = cols())
Many thanks!
Hi, archive_extract fails on Unix and Windows, but R's default unzip() function succeeds. Thank you.
tf <- tempfile()
download.file( "http://download.inep.gov.br/microdados/micro_enem1998.zip" , tf , mode = 'wb' )
first_zipped_file <- unzip( tf , exdir = tempdir() )
second_zipped_file <- grep( "\\.zip$" , first_zipped_file , value = TRUE )
# works
windows_unzip <- unzip( second_zipped_file , exdir = tempdir() )
R.utils::countLines( windows_unzip )
file.remove( windows_unzip )
# works
system(paste0("7za e -o'" , tempdir() , "' '" , second_zipped_file , "'"))
R.utils::countLines( windows_unzip )
file.remove( windows_unzip )
# fails
archive::archive_extract( second_zipped_file , dir = tempdir() )
R.utils::countLines( windows_unzip )
I ran the following reproducible code
library(archive)
tf <- tempfile() ; td <- tempdir()
file.path <- "https://www.kaggle.com/c/favorita-grocery-sales-forecasting/download/holidays_events.csv.7z"
download.file(url = file.path, destfile = tf, mode = "wb")
archive(tf)
I get the error
Error in archive_metadata(path) : Unrecognized archive format
I was wondering if unzipping .csv.7z files was supported.
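One way to narrow down an "Unrecognized archive format" error is to check whether the downloaded file starts with an archive signature at all. One possible cause, which is an assumption and not confirmed in the report above, is that the URL returned an HTML page instead of the archive. The sketch below uses base R only; `detect_archive_magic()` is a hypothetical helper, and the demo files are synthetic stand-ins.

```r
# Hypothetical helper: classify a file by its leading "magic" bytes.
# 7z files start with 37 7A BC AF 27 1C; zip local headers with 50 4B 03 04.
detect_archive_magic <- function(path) {
  header <- readBin(path, what = "raw", n = 6L)
  starts_with <- function(signature) {
    length(header) >= length(signature) &&
      identical(header[seq_along(signature)], signature)
  }
  if (starts_with(as.raw(c(0x37, 0x7a, 0xbc, 0xaf, 0x27, 0x1c)))) {
    "7z"
  } else if (starts_with(as.raw(c(0x50, 0x4b, 0x03, 0x04)))) {
    "zip"
  } else {
    "unknown"
  }
}

# Synthetic demo files: a fake 7z header and an HTML page.
fake_7z <- tempfile(fileext = ".7z")
writeBin(as.raw(c(0x37, 0x7a, 0xbc, 0xaf, 0x27, 0x1c, 0x00, 0x04)), fake_7z)
html_page <- tempfile(fileext = ".7z")
writeLines("<!DOCTYPE html><html><body>Sign in</body></html>", html_page)

kind_7z <- detect_archive_magic(fake_7z)
kind_html <- detect_archive_magic(html_page)
```

If the real download is classified as "unknown", inspecting its first lines with readLines() may show what the server actually sent.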
I have a WinRAR file which is a group of 7 csv files. In the code below, if I read one file at a time, it works, but when I use a for loop over the set of 7 files, R hangs and has to be restarted.
Code:
library(archive)
library(readr)
setwd("C:/Users/debsush/Documents/IEOD/")
EOD <- data.frame(matrix(NA, nrow = 1, ncol = 11),stringsAsFactors = FALSE)
colnames(EOD) <- c("TICKER","NAME","PRODUCT","EXCHANGE","DATE","OPEN","HIGH","LOW","CLOSE","VOLUME","IO")
#Remove all NA rows from GStocksFinancials
EOD<- EOD[which(!is.na(EOD$TICKER)), ]
#Process Data
doclist = list.files()
archive <- archive(doclist[[1]])
arcFile=NULL
for (counter in 1:6){
fileName = toupper(archive$path[counter])
if (length(grep(".CSV",fileName))>0 && grep(".CSV",fileName)==1) {
if (length(grep("BSE",fileName))>0 && grep("BSE",fileName)==1){
exchange="BSE"
} else {
exchange ="NSE"
}
if (length(grep("INDICES",fileName))>0 && grep("INDICES",fileName)==1){
product="EQINDEX"
} else if (length(grep("OPTIONS",fileName))>0 && grep("OPTIONS",fileName)==1){
product="OPTIONS"
} else if (length(grep("FOREX",fileName))>0 && grep("FOREX",fileName)==1){
product ="FOREXFUT"
} else if (length(grep("FUT",fileName))>0 && grep("FUT",fileName)==1){
product ="EQFUT"
} else {
product ="EQCASH"
}
arcFile = archive_read(archive,archive$path[counter])
file=read_csv(arcFile,col_type=cols())
arcFile=NULL
cat(counter)
file$exchange=exchange
file$product=product
colnames(file)=c("TICKER","NAME","DATE","OPEN","HIGH","LOW","CLOSE","VOLUME","IO","EXCHANGE","PRODUCT")
file=file[,c("TICKER","NAME","PRODUCT","EXCHANGE","DATE","OPEN","HIGH","LOW","CLOSE","VOLUME","IO")]
EOD = rbind(EOD,file)
}
}
I am not sure if it is an issue with archive or readr. The behavior is very random and unpredictable.
PS: I can share the WinRAR file if that helps. GitHub does not allow attaching WinRAR files, so I would have to email it to you.
Regards
SD
First release:
usethis::use_cran_comments()
Title: and Description:
@returns and @examples
Authors@R: includes a copyright holder (role 'cph')
Prepare for release:
devtools::build_readme()
urlchecker::url_check()
devtools::check(remote = TRUE, manual = TRUE)
devtools::check_win_devel()
rhub::check_for_cran()
rhub::check(platform = 'ubuntu-rchk')
rhub::check_with_sanitizers()
Submit to CRAN:
usethis::use_version('major')
devtools::submit_cran()
Wait for CRAN...
usethis::use_github_release()
usethis::use_dev_version()
archive returns the following error when extracting a .rar file:
> archive_extract("frota_por_municipio_e_tipo_1-2015.rar")
Error in archive_extract_(attr(archive, "path"), file) :
archive_read_data_block(): Parsing filters is unsupported.
The file is extracted, but its size is 0 bytes
The extraction is done correctly with:
unrar e frota_por_municipio_e_tipo_1-2015.rar
R version 4.0.4
libarchive 3.5.1-1
unrar 1:6.0.3-1
Here:
Line 19 in c290517
Apparently zip is one of the few formats that doesn't require knowing the file size up front, so we could stream into zip directly from the connection rather than using the scratch file.
Not sure if this is worth the effort...
Looks like a great package, facing this currently when trying to install:
ERROR: configuration failed for package ‘archive’
fatal error: 'archive.h' file not found
Any other package dependencies for archive?
Hi
I get a note about an undefined collapse_quote_transformer when checking after building from source.
It happens here: https://github.com/jimhester/archive/blob/master/R/utils.R#L28
It seems that archive is having trouble dealing with my system libarchive installation. I am on Ubuntu 16.04 and I initially tried to apt-get libarchive-dev (version 3.1.2).
The archive installation failed almost immediately with the message:
archive.cpp: In function ‘Rcpp::IntegerVector archive_filters()’:
archive.cpp:61:26: error: ‘ARCHIVE_FILTER_LZ4’ was not declared in this scope
, Rcpp::_["lz4"] = ARCHIVE_FILTER_LZ4
Then, I built libarchive from source (version 3.3.1). Now the installation proceeds further but ends with this error message:
Error: package or namespace load failed for ‘archive’ in dyn.load(file, DLLpath = DLLpath, ...):
unable to load shared object '~/R/x86_64-pc-linux-gnu-library/3.4/archive/libs/archive.so':
~/R/x86_64-pc-linux-gnu-library/3.4/archive/libs/archive.so: undefined symbol: archive_write_set_format_raw
Error: loading failed
Any ideas?
Hi,
Is it possible to extract a specific file to a specific directory without recreating the directories stored in the archive file?
Example
Create new zip file with one subdirectory
a <- archive(system.file(package = "archive", "extdata", "data.zip"))
d <- tempfile()
archive_extract(a, d, c("iris.csv", "airquality.csv"))
# file.path() builds the paths portably instead of hard-coding "\\"
dir.create(file.path(d, "TEST"))
file.rename(file.path(d, "iris.csv"), file.path(d, "TEST", "iris.csv"))
z <- tempfile(fileext = ".zip")
archive_write_dir(z,d)
archive(z)
Result:
> archive(z)
# A tibble: 2 x 3
path size date
<chr> <dbl> <dttm>
1 airquality.csv 142 2017-04-28 21:55:29
2 TEST/iris.csv 192 2017-04-28 21:55:29
Extract file iris.csv
td <- tempfile()
archive_extract(z,td,"TEST/iris.csv")
list.files(td,recursive = T)
Result:
> list.files(td,recursive = T)
[1] "TEST/iris.csv"
But I only want iris.csv in my temporary directory, like this:
> list.files(td,recursive = T)
[1] "iris.csv"
Thx
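As a workaround for the question above, the extracted tree can be flattened after the fact. This is a minimal base R sketch, not a feature of the archive package: `flatten_dir()` is a hypothetical helper, and the demo builds the TEST/iris.csv layout by hand instead of calling archive_extract().

```r
# Hypothetical helper: move every file under `dir` to its top level and
# remove the sub-directories that become empty as a result.
flatten_dir <- function(dir) {
  files <- list.files(dir, recursive = TRUE, full.names = TRUE)
  targets <- file.path(dir, basename(files))
  if (anyDuplicated(targets)) {
    stop("flattening would overwrite files that share a name")
  }
  nested <- files != targets
  if (any(nested)) {
    file.rename(files[nested], targets[nested])
  }
  # Longest paths first, so children are removed before their parents.
  subdirs <- list.dirs(dir, recursive = TRUE, full.names = TRUE)[-1]
  for (d in subdirs[order(nchar(subdirs), decreasing = TRUE)]) {
    if (length(list.files(d, all.files = TRUE, no.. = TRUE)) == 0) {
      unlink(d, recursive = TRUE)
    }
  }
  invisible(targets)
}

# Demo: recreate the layout from the example (TEST/iris.csv), then flatten.
td <- tempfile()
dir.create(file.path(td, "TEST"), recursive = TRUE)
write.csv(iris, file.path(td, "TEST", "iris.csv"))
flatten_dir(td)
remaining <- list.files(td, recursive = TRUE)
```

The helper stops if two files would collide after flattening, which avoids silently overwriting data.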
The master branch of this repository will soon be renamed to main, as part of a coordinated change across several GitHub organizations (including, but not limited to: tidyverse, r-lib, tidymodels, and sol-eng). We anticipate this will happen by the end of September 2021.
That will be preceded by a release of the usethis package, which will gain some functionality around detecting and adapting to a renamed default branch. There will also be a blog post at the time of this master --> main change.
The purpose of this issue is to:
message id: euphoric_snowdog
This may be related to #17.
saveRDS() with archive, it eventually results in a warning about unclosed connections.
write.csv().
format set to tar, cpio and zip.
for (i in 1:100) {
saveRDS(mtcars, archive::archive_write(archive = "archive.file", file = 'first', format = 'tar'))
}
head(warnings())
Warning messages:
1: In .Internal(textConnection(nm, object, open, env, type)) :
closing unused connection 127 (input)
2: In .Internal(textConnection(nm, object, open, env, type)) :
closing unused connection 126 (input)
3: In .Internal(textConnection(nm, object, open, env, type)) :
closing unused connection 125 (input)
4: In .Internal(textConnection(nm, object, open, env, type)) :
closing unused connection 124 (input)
5: In .Internal(textConnection(nm, object, open, env, type)) :
closing unused connection 123 (input)
6: In .Internal(textConnection(nm, object, open, env, type)) :
closing unused connection 122 (input)
# * installing *source* package ‘archive’ ...
# PKG_CFLAGS=-I/usr/local/Cellar/libarchive/3.3.3/include -I/usr/local/Cellar/xz/5.2.4/include
# PKG_LIBS=-L/usr/local/Cellar/libarchive/3.3.3/lib -L/usr/local/Cellar/xz/5.2.4/lib -larchive -lexpat -llzma -lzstd -llz4 -lbz2 -lz -llzma -D_THREAD_SAFE -pthread
> devtools::session_info()
Session info ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
setting value
version R version 3.5.1 (2018-07-02)
system x86_64, darwin15.6.0
ui RStudio (1.2.907)
language (EN)
collate en_AU.UTF-8
date 2018-09-26
Packages ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
package * version date source
archive * 1.0.0 2018-09-25 Github (jimhester/archive@11e65d7)
base * 3.5.1 2018-07-05 local
compiler 3.5.1 2018-07-05 local
crayon 1.3.4 2017-09-16 CRAN (R 3.5.0)
datasets * 3.5.1 2018-07-05 local
devtools 1.13.6 2018-06-27 CRAN (R 3.5.0)
digest 0.6.15 2018-01-28 CRAN (R 3.5.0)
glue 1.3.0 2018-09-25 Github (tidyverse/glue@4e74901)
graphics * 3.5.1 2018-07-05 local
grDevices * 3.5.1 2018-07-05 local
memoise 1.1.0 2017-04-21 CRAN (R 3.5.0)
methods * 3.5.1 2018-07-05 local
packrat 0.4.9-3 2018-06-01 CRAN (R 3.5.0)
pillar 1.3.0 2018-07-14 CRAN (R 3.5.0)
Rcpp 0.12.18 2018-07-23 cran (@0.12.18)
rlang 0.2.1.9000 2018-08-10 Github (r-lib/rlang@8dc87a9)
rstudioapi 0.7 2017-09-07 CRAN (R 3.5.0)
stats * 3.5.1 2018-07-05 local
tibble 1.4.2 2018-01-22 CRAN (R 3.5.0)
tools 3.5.1 2018-07-05 local
utils * 3.5.1 2018-07-05 local
withr 2.1.2 2018-03-15 CRAN (R 3.5.0)
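A general pattern for avoiding leftover connections in a loop is to keep the connection in a variable and close it explicitly. The sketch below uses a plain base R file() connection as a stand-in for archive::archive_write(); whether the same pattern removes the warnings reported above is an assumption that has not been verified against archive.

```r
rds_path <- tempfile(fileext = ".rds")
open_before <- nrow(showConnections())

for (i in 1:100) {
  # Stand-in for archive::archive_write(...); opened explicitly in binary mode.
  con <- file(rds_path, open = "wb")
  tryCatch(
    saveRDS(mtcars, con),
    finally = close(con)   # always release the connection, even on error
  )
}

open_after <- nrow(showConnections())
restored <- readRDS(rds_path)
```

Comparing the connection count before and after the loop gives a quick way to confirm that nothing was left open.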
I'm using archive v1.1.2 running on Garuda Linux, an Arch derivative. As a rolling release, it is completely up-to-date, including R 4.1.1.
When I use archive to generate a list of files in a .7z archive, it works, but generates a warning that is odd to me. It says:
Setting UTF-8 locale failed
I have never changed locales in any way in R or Linux, so I'm confused. Using the locale command in the terminal gives me all UTF-8, with the exception of LC_ALL:
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=en_US.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8
LC_ALL=
Is this something I should be worried about?
See also the CI
Last 13 lines of output:
== Failed tests ================================================================
-- Failure (test-archive.R:258:5): archive_write_files: can write a zip file ---
read.csv(unz("data.zip", files[["iris"]]), row.names = 1) not equal to `iris`.
Component "Species": Modes: character, numeric
Component "Species": Attributes: < target is NULL, current is list >
Component "Species": target is character, current is factor
-- Failure (test-archive.R:280:5): archive_write_dir: can write a zip file -----
read.csv(unz("data.zip", files[["iris"]]), row.names = 1) not equal to `iris`.
Component "Species": Modes: character, numeric
Component "Species": Attributes: < target is NULL, current is list >
Component "Species": target is character, current is factor
[ FAIL 2 | WARN 0 | SKIP 0 | PASS 87 ]
Error: Test failures
Execution halted
I am running Ubuntu 20.04, and had successfully installed archive with devtools in the previous 4.0.x versions of R. But now, I'm getting a strange installation failure. I definitely have libarchive-dev installed.
sudo apt install libarchive-dev
Reading package lists... Done
Building dependency tree
Reading state information... Done
libarchive-dev is already the newest version (3.4.0-2ubuntu1).
Any ideas what I can do?
Running `R CMD build`...
* checking for file ‘/tmp/RtmpSXhw5i/remotes4a59f2fd4cfb7/jimhester-archive-8ce0ba7/DESCRIPTION’ ... OK
* preparing ‘archive’:
* checking DESCRIPTION meta-information ... OK
* cleaning src
* running ‘cleanup’
* installing the package to process help pages
-----------------------------------
* installing *source* package ‘archive’ ...
** using staged installation
Found pkg-config cflags and libs!
'config' variable 'CXXCPP' is defunct
PKG_CFLAGS=
PKG_LIBS=-larchive
./configure: line 53: -g: command not found
------------------------- ANTICONF ERROR ---------------------------
Configuration failed because libarchive was not found. Try installing:
* deb: libarchive-dev (Debian, Ubuntu, etc)
* rpm: libarchive-devel (Fedora, CentOS, RHEL)
* csw: libarchive_dev (Solaris)
* brew: libarchive (Mac OSX)
If libarchive is already installed, check that 'pkg-config' is in your
PATH and PKG_CONFIG_PATH contains a libarchive.pc file. If pkg-config
is unavailable you can set INCLUDE_DIR and LIB_DIR manually via:
R CMD INSTALL --configure-vars='INCLUDE_DIR=... LIB_DIR=...'
--------------------------------------------------------------------
ERROR: configuration failed for package ‘archive’
* removing ‘/tmp/RtmpTk4Goh/Rinst4a8833af25591/archive’
-----------------------------------
ERROR: package installation failed
STDERR:
Error: Failed to install 'archive' from GitHub:
Failed to `R CMD build` package, try `build = FALSE`.
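The ANTICONF message above suggests checking that pkg-config is on the PATH and can find a libarchive.pc file. A minimal sketch of that check from within R follows; `has_libarchive()` is a hypothetical helper, and it only reports what pkg-config sees, which may differ from what the package's configure script ends up using.

```r
# Hypothetical helper: ask pkg-config whether it can locate libarchive.
# Returns NA when pkg-config itself is not on the PATH.
has_libarchive <- function() {
  pkg_config <- unname(Sys.which("pkg-config"))
  if (!nzchar(pkg_config)) {
    return(NA)
  }
  status <- suppressWarnings(
    system2(pkg_config, c("--exists", "libarchive"),
            stdout = FALSE, stderr = FALSE)
  )
  identical(as.integer(status), 0L)
}

found <- has_libarchive()
```

A result of FALSE or NA points at the PKG_CONFIG_PATH or INCLUDE_DIR / LIB_DIR settings mentioned in the message; a result of TRUE suggests the failure lies elsewhere in the configure step.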
Specifying a non-existent format when creating an archive_write connection will still create the connection, but will cause a segfault.
In the following minimal reprex, I'm using a double close(con) to cause a segfault. The first close(con) throws an error we expect (i.e. No such format). The second close(con) causes the segfault.
Not sure if the bad format should be caught early to stop this from happening at all, or if the segfault is indicative of a sneaky memory error elsewhere.
con = archive::archive_write(archive = "archive.something", file="Robject", format='bad_and_wrong')
open(con)
write.csv(mtcars, con)
close(con)
Error in close.connection(con) : No such format
close(con)
*** caught segfault ***
address 0x0, cause 'unknown'
Traceback:
1: close.connection(con)
2: close(con)
Running latest archive from GitHub.
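Catching the bad format early, as suggested above, could look roughly like the sketch below. It is plain base R and makes no claim about how archive validates formats internally: `validate_format()` is a hypothetical helper, and `illustrative_formats` is an assumed list limited to formats named in these reports, not the set libarchive actually supports.

```r
# Assumed, illustrative list only; not the real set of supported formats.
illustrative_formats <- c("zip", "tar", "cpio")

# Hypothetical helper: fail with a clear R error before any connection exists.
validate_format <- function(format, supported) {
  if (!is.character(format) || length(format) != 1L || is.na(format)) {
    stop("`format` must be a single string")
  }
  if (!format %in% supported) {
    stop(sprintf("No such format: '%s'. Expected one of: %s",
                 format, paste(supported, collapse = ", ")))
  }
  format
}

result_ok <- validate_format("tar", illustrative_formats)
result_bad <- tryCatch(
  validate_format("bad_and_wrong", illustrative_formats),
  error = function(e) conditionMessage(e)
)
```

Validating before the connection is created means there is nothing half-initialized left to close, which is the situation the double close(con) exposes.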
> devtools::session_info()
Session info ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
setting value
version R version 3.5.1 (2018-07-02)
system x86_64, darwin15.6.0
ui RStudio (1.2.907)
language (EN)
collate en_AU.UTF-8
date 2018-09-28
Packages -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
package * version date source
archive * 1.0.0 2018-09-28 local
base * 3.5.1 2018-07-05 local
codetools 0.2-15 2016-10-05 CRAN (R 3.5.1)
compiler 3.5.1 2018-07-05 local
crayon 1.3.4 2017-09-16 CRAN (R 3.5.0)
datasets * 3.5.1 2018-07-05 local
devtools 1.13.6 2018-06-27 CRAN (R 3.5.0)
digest 0.6.15 2018-01-28 CRAN (R 3.5.0)
glue 1.3.0 2018-09-28 Github (tidyverse/glue@4e74901)
graphics * 3.5.1 2018-07-05 local
grDevices * 3.5.1 2018-07-05 local
magrittr 1.5 2014-11-22 CRAN (R 3.5.0)
memoise 1.1.0 2017-04-21 CRAN (R 3.5.0)
methods * 3.5.1 2018-07-05 local
packrat 0.4.9-3 2018-06-01 CRAN (R 3.5.0)
pillar 1.3.0 2018-07-14 CRAN (R 3.5.0)
pryr 0.1.4 2018-02-18 CRAN (R 3.5.0)
Rcpp 0.12.18 2018-07-23 cran (@0.12.18)
rlang 0.2.1.9000 2018-08-10 Github (r-lib/rlang@8dc87a9)
rstudioapi 0.7 2017-09-07 CRAN (R 3.5.0)
stats * 3.5.1 2018-07-05 local
stringi 1.2.4 2018-07-20 cran (@1.2.4)
stringr 1.3.1 2018-05-10 CRAN (R 3.5.0)
tibble 1.4.2 2018-01-22 CRAN (R 3.5.0)
tools 3.5.1 2018-07-05 local
utils * 3.5.1 2018-07-05 local
withr 2.1.2 2018-03-15 CRAN (R 3.5.0)
libarchive supports reading over network sockets if you define your own IO functions. https://github.com/libarchive/libarchive/wiki/LibarchiveIO.
However doing this seems somewhat non-trivial and I am not sure it is worth the implementation effort.
I've successfully installed archive in R 3.6.x (Windows) with the previous Rtools package. Now archive installation fails with R 4.0.2 and the new Rtools40. Perhaps something is missing in the new toolchain?
The error is as follows:
** using staged installation
** libs
"C:/Programs/R/R-40~1.2/bin/x64/Rscript.exe" "../tools/winlibs.R"
"C:/Programs/rtools40/mingw64/bin/"g++ -std=gnu++11 -I"C:/Programs/R/R-40~1.2/include" -DNDEBUG -I../windows/libarchive-3.2.2/include -I. -I'C:/Programs/R/R-4.0.2/library/Rcpp/include' -O3 -march=native -c RcppExports.cpp -o RcppExports.o
"C:/Programs/rtools40/mingw64/bin/"g++ -std=gnu++11 -I"C:/Programs/R/R-40~1.2/include" -DNDEBUG -I../windows/libarchive-3.2.2/include -I. -I'C:/Programs/R/R-4.0.2/library/Rcpp/include' -O3 -march=native -c archive.cpp -o archive.o
"C:/Programs/rtools40/mingw64/bin/"g++ -std=gnu++11 -I"C:/Programs/R/R-40~1.2/include" -DNDEBUG -I../windows/libarchive-3.2.2/include -I. -I'C:/Programs/R/R-4.0.2/library/Rcpp/include' -O3 -march=native -c extract.cpp -o extract.o
"C:/Programs/rtools40/mingw64/bin/"g++ -std=gnu++11 -I"C:/Programs/R/R-40~1.2/include" -DNDEBUG -I../windows/libarchive-3.2.2/include -I. -I'C:/Programs/R/R-4.0.2/library/Rcpp/include' -O3 -march=native -c r_archive.cpp -o r_archive.o
"C:/Programs/rtools40/mingw64/bin/"g++ -std=gnu++11 -I"C:/Programs/R/R-40~1.2/include" -DNDEBUG -I../windows/libarchive-3.2.2/include -I. -I'C:/Programs/R/R-4.0.2/library/Rcpp/include' -O3 -march=native -c read.cpp -o read.o
"C:/Programs/rtools40/mingw64/bin/"g++ -std=gnu++11 -I"C:/Programs/R/R-40~1.2/include" -DNDEBUG -I../windows/libarchive-3.2.2/include -I. -I'C:/Programs/R/R-4.0.2/library/Rcpp/include' -O3 -march=native -c read_file.cpp -o read_file.o
"C:/Programs/rtools40/mingw64/bin/"g++ -std=gnu++11 -I"C:/Programs/R/R-40~1.2/include" -DNDEBUG -I../windows/libarchive-3.2.2/include -I. -I'C:/Programs/R/R-4.0.2/library/Rcpp/include' -O3 -march=native -c write.cpp -o write.o
"C:/Programs/rtools40/mingw64/bin/"g++ -std=gnu++11 -I"C:/Programs/R/R-40~1.2/include" -DNDEBUG -I../windows/libarchive-3.2.2/include -I. -I'C:/Programs/R/R-4.0.2/library/Rcpp/include' -O3 -march=native -c write_file.cpp -o write_file.o
C:/Programs/rtools40/mingw64/bin/g++ -shared -s -static-libgcc -o archive.dll tmp.def RcppExports.o archive.o extract.o r_archive.o read.o read_file.o write.o write_file.o -L../windows/libarchive-3.2.2/lib/x64 -larchive -lcrypto -lnettle -lregex -lexpat -llzo2 -llzma -llz4 -lbz2 -lz -LC:/Programs/R/R-40~1.2/bin/x64 -lR
C:/Programs/rtools40/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ../windows/libarchive-3.2.2/lib/x64/libarchive.a(archive_string.o):(.text+0x2e1): undefined reference to `locale_charset'
C:/Programs/rtools40/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ../windows/libarchive-3.2.2/lib/x64/libarchive.a(archive_string.o):(.text+0x366): undefined reference to `libiconv_close'
C:/Programs/rtools40/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ../windows/libarchive-3.2.2/lib/x64/libarchive.a(archive_string.o):(.text+0x375): undefined reference to `libiconv_close'
C:/Programs/rtools40/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ../windows/libarchive-3.2.2/lib/x64/libarchive.a(archive_string.o):(.text+0x1318): undefined reference to `libiconv_open'
C:/Programs/rtools40/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ../windows/libarchive-3.2.2/lib/x64/libarchive.a(archive_string.o):(.text+0x133f): undefined reference to `libiconv_open'
C:/Programs/rtools40/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ../windows/libarchive-3.2.2/lib/x64/libarchive.a(archive_string.o):(.text+0x1382): undefined reference to `libiconv_open'
C:/Programs/rtools40/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ../windows/libarchive-3.2.2/lib/x64/libarchive.a(archive_string.o):(.text+0x157d): undefined reference to `libiconv_open'
C:/Programs/rtools40/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ../windows/libarchive-3.2.2/lib/x64/libarchive.a(archive_string.o):(.text+0x159b): undefined reference to `libiconv_open'
C:/Programs/rtools40/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ../windows/libarchive-3.2.2/lib/x64/libarchive.a(archive_string.o):(.text+0x4990): undefined reference to `libiconv'
C:/Programs/rtools40/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ../windows/libarchive-3.2.2/lib/x64/libarchive.a(archive_string.o):(.text+0x317): undefined reference to `locale_charset'
collect2.exe: error: ld returned 1 exit status
no DLL was created
ERROR: compilation failed for package 'archive'
- removing 'C:/Users/Boris/AppData/Local/Temp/Rtmpct45WA/Rinst8db46030695b/archive'
-----------------------------------
ERROR: package installation failed
Error: Failed to install 'archive' from GitHub:
System command 'Rcmd.exe' failed, exit status: 1, stdout + stderr (last 10 lines):
E> C:/Programs/rtools40/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ../windows/libarchive-3.2.2/lib/x64/libarchive.a(archive_string.o):(.text+0x157d): undefined reference to `libiconv_open'
E> C:/Programs/rtools40/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ../windows/libarchive-3.2.2/lib/x64/libarchive.a(archive_string.o):(.text+0x159b): undefined reference to `libiconv_open'
E> C:/Programs/rtools40/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ../windows/libarchive-3.2.2/lib/x64/libarchive.a(archive_string.o):(.text+0x4990): undefined reference to `libiconv'
E> C:/Programs/rtools40/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.3.0/../../../../x86_64-w64-mingw32/bin/ld.exe: ../windows/libarchive-3.2.2/lib/x64/libarchiv
Prepare for release:
devtools::build_readme()
urlchecker::url_check()
devtools::check(remote = TRUE, manual = TRUE)
devtools::check_win_devel()
rhub::check_for_cran()
rhub::check(platform = 'ubuntu-rchk')
rhub::check_with_sanitizers()
revdepcheck::revdep_check(num_workers = 4)
cran-comments.md
Submit to CRAN:
usethis::use_version('patch')
devtools::submit_cran()
Wait for CRAN...
usethis::use_github_release()
usethis::use_dev_version()
Empty archive created when dir is a symlinked location.
Workaround is to normalizePath(), but might be worth having the package automatically normalize the path for the user.
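A minimal sketch of the workaround in base R, assuming a platform where symlinks can be created. The temporary directory names are made up for illustration; the resolved path is what would then be supplied as dir.

```r
# Base R only: resolve a symlinked directory to its real location.
real_dir <- tempfile("real_")
dir.create(real_dir)
writeLines("hello", file.path(real_dir, "a.txt"))

link_dir <- tempfile("link_")
# file.symlink() returns FALSE where symlinks are not permitted
linked <- file.symlink(real_dir, link_dir)

# Use the link when it could be created, otherwise fall back to the real directory
dir_arg <- if (isTRUE(linked)) link_dir else real_dir

# normalizePath() follows the symlink and returns the canonical path
resolved <- normalizePath(dir_arg, mustWork = TRUE)

# `resolved` (not `link_dir`) is the value to pass on as `dir`
list.files(resolved)
```

Passing mustWork = TRUE makes a missing directory fail early instead of silently producing an unusable path.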
Hello. First of all, thank you for your work and the generosity of making it publicly available.
I'm trying to open a .rar.001 file and I get this: Error in archive$path[[file]] : subscript out of bounds. I guess it's because of the extension. I'm not used to debugging in R, so the following might not be helpful. However, I'm attaching the code just in case.
If multi-part rar files are not supported, it'd be helpful to add a brief error message explaining that.
Thanks!
#I've set the wd appropriately in the following example
> arch <- "compressed_file_name.rar.001"
> debug(archive_read)
> read_csv(archive_read(arch), col_types = cols())
debugging in: archive_read(arch)
debug: {
archive <- as_archive(archive)
if (is_number(file)) {
file <- archive$path[[file]]
}
assert("`file` must be a length one character vector or numeric",
length(file) == 1 && (is.character(file) || is.numeric(file)))
assert(paste0("`file` {file} not found in `archive` {archive}"),
file %in% archive$path)
read_connection(attr(archive, "path"), mode = mode,
file, archive_formats()[format], archive_filters()[filter])
}
Browse[2]>
debug: archive <- as_archive(archive)
Browse[2]>
debug: if (is_number(file)) {
file <- archive$path[[file]]
}
Browse[2]>
debug: file <- archive$path[[file]]
Browse[2]>
Error in archive$path[[file]] : subscript out of bounds
> debug(as_archive)
Error in debug(as_archive) : object 'as_archive' not found
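The failing step in the trace is archive$path[[file]], which errors when the numeric index exceeds the number of listed entries (presumably zero here). Below is a sketch of the kind of bounds check that would give a clearer message. pick_entry() is a hypothetical helper, not a function in the package.

```r
# Hypothetical helper: choose an archive entry by index or by name,
# failing with a descriptive message instead of "subscript out of bounds".
pick_entry <- function(paths, file) {
  if (length(file) != 1 || is.na(file)) {
    stop("`file` must be a single non-missing name or index", call. = FALSE)
  }
  if (is.numeric(file)) {
    if (file < 1 || file > length(paths)) {
      stop(sprintf(
        "`file` index %s is out of range: the archive lists %d entries",
        format(file), length(paths)
      ), call. = FALSE)
    }
    return(paths[[file]])
  }
  if (!file %in% paths) {
    stop(sprintf("`file` '%s' not found in the archive", file), call. = FALSE)
  }
  file
}

pick_entry(c("a.csv", "b.csv"), 2)
```

An archive that lists no entries at all, which is one way an unsupported multi-part file might show up, would then report zero entries rather than an indexing error.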
archive_write_files.cpp:46:45: runtime error: nan is outside the range of representable values of type 'int'
#0 0x7f7007879af7 in archive_write_files_(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, cpp11::r_vector<cpp11::r_string>, int, cpp11::r_vector<int>, cpp11::r_vector<cpp11::r_string>, unsigned long) /data/gannet/ripley/R/packages/tests-clang-SAN/archive/src/archive_write_files.cpp:46:45
#1 0x7f700787d399 in _archive_archive_write_files_ /data/gannet/ripley/R/packages/tests-clang-SAN/archive/src/cpp11.cpp:33:27
#2 0x6dcbe1 in R_doDotCall /data/gannet/ripley/R/svn/R-devel/src/main/dotcode.c:617:17
#3 0x8397be in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7684:21
#4 0x81d3ae in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:740:8
#5 0x8861b7 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c
#6 0x881b1f in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1836:16
#7 0x841dff in bcEval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:7096:12
#8 0x81d3ae in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:740:8
#9 0x8861b7 in R_execClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c
#10 0x881b1f in Rf_applyClosure /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:1836:16
#11 0x81dde8 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:863:12
#12 0x892236 in do_set /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2982:8
#13 0x81d798 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:815:12
#14 0x8910d2 in do_begin /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:2530:10
#15 0x81d798 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:815:12
#16 0x81d798 in Rf_eval /data/gannet/ripley/R/svn/R-devel/src/main/eval.c:815:12
#17 0x94d266 in Rf_ReplIteration /data/gannet/ripley/R/svn/R-devel/src/main/main.c:264:2
#18 0x9507b0 in R_ReplConsole /data/gannet/ripley/R/svn/R-devel/src/main/main.c:316:11
#19 0x9505b9 in run_Rmainloop /data/gannet/ripley/R/svn/R-devel/src/main/main.c:1129:5
#20 0x4e247a in main /data/gannet/ripley/R/svn/R-devel/src/main/Rmain.c:29:5
#21 0x7f7016993081 in __libc_start_main (/lib64/libc.so.6+0x27081)
#22 0x43129d in _start (/data/gannet/ripley/R/R-clang-SAN/bin/exec/R+0x43129d)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior archive_write_files.cpp:46:45 in
Please correct before 2021-10-29 to safely retain your package on CRAN.
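The report says a NaN reached a conversion to int in compiled code. The offending source line is not shown here, so the following is only a sketch of an R-side guard that rejects such values before they are handed to C++. to_int_checked() is a hypothetical name.

```r
# Hypothetical guard: convert doubles to integers only when every value
# is finite and fits in a 32-bit int. NA, NaN and Inf are rejected in R,
# where the failure is well defined, rather than in C++ where it is not.
to_int_checked <- function(x, what = "value") {
  bad <- !is.finite(x) | abs(x) > .Machine$integer.max
  if (any(bad)) {
    stop(sprintf("%s contains %d element(s) that cannot be stored as int",
                 what, sum(bad)), call. = FALSE)
  }
  as.integer(x)
}

to_int_checked(c(1, 2, 3))
```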
hi, here's a minimal reproducible example. it failed on two different unix machines. not sure if this is unsupported? thank you
tf <- tempfile()
download.file( 'https://nlsinfo.org/cohort-data/nlsy97_all_1997-2013.zip' , tf , mode = 'wb' )
archive::archive_extract( tf , dir = tempdir() )
> archive::archive_extract( tf , dir = tempdir() )
Error in archive_extract_(attr(archive, "path"), file) :
archive_write_data_block(): Write failed
>
> traceback()
3: stop(list(message = "archive_write_data_block(): Write failed",
call = archive_extract_(attr(archive, "path"), file), cppstack = list(
file = "", line = -1L, stack = c("/export/scratch1/home/damico/R/x86_64-redhat-linux-gnu-library/3.3/archive/libs/archive.so(Rcpp::exception::exception(char const*, bool)+0x84) [0x7f6a10b583b4]",
"/export/scratch1/home/damico/R/x86_64-redhat-linux-gnu-library/3.3/archive/libs/archive.so(void Rcpp::stop<char const*>(char const*, char const*&&)+0x4f) [0x7f6a10b662cf]",
"/export/scratch1/home/damico/R/x86_64-redhat-linux-gnu-library/3.3/archive/libs/archive.so(archive_extract_(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, Rcpp::Vector<16, Rcpp::PreserveStorage>, unsigned long)+0x2c7) [0x7f6a10b661e7]",
"/export/scratch1/home/damico/R/x86_64-redhat-linux-gnu-library/3.3/archive/libs/archive.so(_archive_archive_extract_+0x191) [0x7f6a10b560f1]",
"/usr/lib64/R/lib/libR.so(+0x10406a) [0x7f6a1ff7106a]",
"/usr/lib64/R/lib/libR.so(Rf_eval+0x180) [0x7f6a1ff788b0]",
"/usr/lib64/R/lib/libR.so(Rf_applyClosure+0x51d) [0x7f6a1ff7a51d]",
"/usr/lib64/R/lib/libR.so(+0x103ad6) [0x7f6a1ff70ad6]",
"/usr/lib64/R/lib/libR.so(Rf_eval+0x180) [0x7f6a1ff788b0]",
"/usr/lib64/R/lib/libR.so(Rf_applyClosure+0x51d) [0x7f6a1ff7a51d]",
"/usr/lib64/R/lib/libR.so(Rf_eval+0x30d) [0x7f6a1ff78a3d]",
"/usr/lib64/R/lib/libR.so(Rf_ReplIteration+0x1ba) [0x7f6a1ffa03aa]",
"/usr/lib64/R/lib/libR.so(+0x1337b1) [0x7f6a1ffa07b1]",
"/usr/lib64/R/lib/libR.so(run_Rmainloop+0x48) [0x7f6a1ffa0868]",
"/usr/lib64/R/bin/exec/R(main+0x1b) [0x55bcf9ac78cb]",
"/lib64/libc.so.6(__libc_start_main+0xf1) [0x7f6a1d4eb731]",
"/usr/lib64/R/bin/exec/R(_start+0x29) [0x55bcf9ac7909]"
))))
2: archive_extract_(attr(archive, "path"), file)
1: archive::archive_extract(tf, dir = tempdir())
> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Fedora 24 (Twenty Four)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] httr_1.2.1 R6_2.2.2 tools_3.3.3 withr_1.0.2
[5] tibble_1.3.3 curl_2.3 Rcpp_0.12.12 memoise_1.0.0
[9] git2r_0.18.0 digest_0.6.12 rlang_0.1.1 devtools_1.13.2
[13] archive_1.0.0
>
Hi, I know this is likely an issue with unix libarchive, but if you have any insights that would be great. Here is a minimal reproducible example. Any thoughts? Thanks! Jeff
# debug extract
archive:::libarchive_version()
tf <- tempfile()
download.file( 'https://www2.census.gov/programs-surveys/acs/data/pums/2014/5-Year/csv_pus.zip' , tf , mode = 'wb' )
archive::archive_extract( tf , dir = tempdir() )
> # debug extract
> archive:::libarchive_version()
[1] ‘3.1.2’
> tf <- tempfile()
> download.file( 'https://www2.census.gov/programs-surveys/acs/data/pums/2014/5-Year/csv_pus.zip' , tf , mode = 'wb' )
trying URL 'https://www2.census.gov/programs-surveys/acs/data/pums/2014/5-Year/csv_pus.zip'
Content type 'application/zip' length 2512044838 bytes (2395.7 MB)
==================================================
downloaded 2395.7 MB
> archive::archive_extract( tf , dir = tempdir() )
Error in archive_extract_(attr(archive, "path"), file) :
archive_read_next_header(): Invalid central directory signature
I am using lodown (ajdamico). From that identical download, I was able to use jar to extract the files no problem (it is the identical file downloaded).
Error in archive_extract_(attr(archive, "path"), file) :
archive_read_next_header(): Invalid central directory signature
year time_period base_folder db_tablename
23 2014 5-Year https://www2.census.gov/programs-surveys/acs/data/pums/2014/5-Year/ acs2014_5yr
dbfolder output_filename include_puerto_rico case_count
23 /media/jeff/jeff/ACS2015_5yr/MonetDB /media/jeff/jeff/ACS2015_5yr/acs2014_5yr.rds TRUE NA
> tf
[1] "/tmp/RtmprKBUz9/file16976313e25b"
>
jeff@jeff-Precision-7520:/tmp/RtmprKBUz9$ jar xvf file16976313e25b
inflated: ss14pusa.csv
inflated: ss14pusb.csv
inflated: ss14pusc.csv
inflated: ss14pusd.csv
inflated: ACS2010-2014_PUMS_README.pdf
jeff@jeff-Precision-7520:/tmp/RtmprKBUz9$
Here is Ubuntu info:
jeff@jeff-Precision-7520:/tmp/RtmprKBUz9$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.3 LTS
Release: 16.04
Codename: xenial
jeff@jeff-Precision-7520:/tmp/RtmprKBUz9$ uname -a
Linux jeff-Precision-7520 4.10.0-42-generic #46~16.04.1-Ubuntu SMP Mon Dec 4 15:57:59 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
jeff@jeff-Precision-7520:/tmp/RtmprKBUz9$
I have an appveyor build failing that I don't quite understand: https://ci.appveyor.com/project/raymondben/bowerbird. The failure is related to archive: it appears that glue is being installed OK from CRAN but then because the github version is newer, and because archive's DESCRIPTION file includes Remotes: tidyverse/glue, it then tries to install from github and fails.
Is the Remotes spec needed? (Bearing in mind that I am not sure that this is even the problem here. I've tried to discourage appveyor from building glue from source, but that doesn't seem to help.)
hi, here are two files created with winrar. when decompressed with winrar versus archive_extract, both give the same file.size() but the R.utils::countLines() result for archive_extract is too few lines. for some reason, archive_extract is missing line endings on winrar files and ends up cramming them together. hope the diagnostics below are helpful. thanks!
[1] "currently working on ftp://ftp.cdc.gov/pub/health_statistics/nchs/datasets/dvs/natality/nat2009us.zip"
[1] "archive::archive_extract extracts 1 lines"
[1] "archive::archive_extract file.size 3215098572"
[1] "winrar extracts 4137836 lines"
[1] "winrar file.size 3215098572"
[1] "currently working on http://download.inep.gov.br/microdados/microdados_enem2009.rar"
[1] "archive::archive_extract extracts 80937 lines"
[1] "archive::archive_extract file.size 4078192743"
[1] "winrar extracts 4148721 lines"
[1] "winrar file.size 4078192743"
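Since both extractions report the same file.size(), a byte-level comparison can show where the two outputs first diverge. This is a sketch in base R. first_diff() is a hypothetical helper, and the two small temporary files only stand in for the real extracted files.

```r
# Hypothetical helper: 1-based position of the first differing byte
# between two files, or NA when the files are identical.
first_diff <- function(path_a, path_b, chunk = 1024 * 1024) {
  con_a <- file(path_a, "rb")
  on.exit(close(con_a), add = TRUE)
  con_b <- file(path_b, "rb")
  on.exit(close(con_b), add = TRUE)
  offset <- 0
  repeat {
    a <- readBin(con_a, "raw", n = chunk)
    b <- readBin(con_b, "raw", n = chunk)
    if (length(a) == 0 && length(b) == 0) return(NA_real_)
    n <- min(length(a), length(b))
    mismatch <- which(a[seq_len(n)] != b[seq_len(n)])
    if (length(mismatch) > 0) return(offset + mismatch[1])
    if (length(a) != length(b)) return(offset + n + 1)
    offset <- offset + n
  }
}

# Stand-in files of equal size: the second has a space where the
# first has a line ending.
f1 <- tempfile()
f2 <- tempfile()
writeBin(charToRaw("a\nb\nc\n"), f1)
writeBin(charToRaw("a\nb c\n"), f2)

file.size(f1) == file.size(f2)   # sizes alone do not reveal the difference
unname(tools::md5sum(c(f1, f2))) # checksums do
first_diff(f1, f2)
```

Reading in chunks keeps memory use bounded, which matters for multi-gigabyte files like the ones in this report.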
loop that reproduces the problem:
# install.packages("devtools")
# devtools::install_github("ajdamico/lodown")
# devtools::install_github("jimhester/archive")
# path to winrar on local machine
path_to_winrar <- normalizePath( "C:/Program Files/winrar/winrar.exe" )
tf <- tempfile()
for( this_file in c( 'ftp://ftp.cdc.gov/pub/health_statistics/nchs/datasets/dvs/natality/nat2009us.zip' , 'http://download.inep.gov.br/microdados/microdados_enem2009.rar' ) ){
print( paste( "currently working on" , this_file ) )
# archive fails for both
lodown::cachaca( this_file , tf , mode = 'wb' )
archive::archive_extract( tf , dir = tempdir() )
windows_unzip <- grep( "Nat2009|DADOS_ENEM" , list.files( tempdir() , recursive = TRUE , full.names = TRUE ) , value = TRUE )
print( paste( "archive::archive_extract extracts" , R.utils::countLines( windows_unzip ) , "lines" ) )
print( paste( "archive::archive_extract file.size" , file.size( windows_unzip ) ) )
file.remove( windows_unzip )
# winrar succeeds for both
lodown::cachaca( this_file , tf , mode = 'wb' )
sys.command <- paste0( '"' , path_to_winrar , '" x ' , tf , ' "' , tempdir() , '"' )
system( sys.command )
windows_unzip <- grep( "Nat2009|DADOS_ENEM" , list.files( tempdir() , recursive = TRUE , full.names = TRUE ) , value = TRUE )
print( paste( "winrar extracts" , R.utils::countLines( windows_unzip ) , "lines" ) )
print( paste( "winrar file.size" , file.size( windows_unzip ) ) )
file.remove( windows_unzip )
}
sessionInfo()
R version 3.4.1 (2017-06-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 15063)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.4.1 tibble_1.3.3 Rcpp_0.12.11 R.methodsS3_1.7.1 digest_0.6.12
[6] lodown_0.1.0 R.utils_2.5.0 rlang_0.1.1 R.oo_1.21.0 archive_0.0.0.9000
Prepare for release:
devtools::build_readme()
urlchecker::url_check()
devtools::check(remote = TRUE, manual = TRUE)
devtools::check_win_devel()
rhub::check_for_cran()
rhub::check(platform = 'ubuntu-rchk')
rhub::check_with_sanitizers()
revdepcheck::revdep_check(num_workers = 4)
cran-comments.md
Submit to CRAN:
usethis::use_version('patch')
devtools::submit_cran()
Wait for CRAN...
usethis::use_github_release()
usethis::use_dev_version()
The bindings only support reading one file, so how could many files with the same name across many subdirectories be read?
I'm new to R but made a script that can at least read an archive supplied as a parameter:
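#!/usr/bin/env Rscript
library(archive)
options(max.print=1000000)
# trailingOnly = TRUE keeps only the arguments that follow the script name
args <- commandArgs(trailingOnly = TRUE)
fname <- args[1]
# archive_read() opens a connection to the first file in the archive by default
con <- archive_read(fname)
lines <- readLines(con=con)
close(con)
cat(lines, sep="\n")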
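One possible approach to the question above, sketched under the assumption that archive() lists entries in a path column (as the archive$path references elsewhere on this page suggest) and that archive_read() accepts an entry path through its file argument. read_matching() is a hypothetical helper, not part of the package.

```r
library(archive)

# Hypothetical helper: read every entry whose base name equals `name`,
# regardless of which subdirectory it sits in.
read_matching <- function(archive_path, name) {
  entries <- archive(archive_path)
  hits <- entries$path[basename(entries$path) == name]
  # One connection per matching entry; results are named by the full entry path
  lapply(setNames(hits, hits), function(p) {
    con <- archive_read(archive_path, file = p)
    on.exit(close(con))
    readLines(con)
  })
}
```

Calling read_matching("data.zip", "data.txt"), with made-up file names, would return a list with one character vector per matching entry, so files sharing a name stay distinguishable by their path.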
Do you have any plans for supporting password-protected archives, such as zip or 7zip files?
> pak::pkg_install("jimhester/archive")
ℹ Checking for package metadata updates
✔ All 12 metadata files are current.
✔ Loading session disk cached package metadata
✔ Using cached package metadata
→ Will install 1 packages:
jimhester/archive
→ Will update 1 packages:
tidyverse/glue
→ Will not update 15 packages.
! Package(s) `glue` are already loaded, installing them may cause
problems. Use `pkgload::unload()` to unload them.
→ Will download 2 packages with unknown size.
? Do you want to continue (Y/n) Y
Error: callr subprocess failed: Failed to download archive from `https://api.github.com/repos/jimhester/archive/zipball/09754896e63a96f928aaacf9528589caba7d6128`.
Prepare for release:
devtools::build_readme()
urlchecker::url_check()
devtools::check(remote = TRUE, manual = TRUE)
devtools::check_win_devel()
rhub::check_for_cran()
rhub::check(platform = 'ubuntu-rchk')
rhub::check_with_sanitizers()
revdepcheck::revdep_check(num_workers = 4)
cran-comments.md
Submit to CRAN:
usethis::use_version('minor')
devtools::submit_cran()
Wait for CRAN...
usethis::use_github_release()
usethis::use_dev_version()