
rchk's Introduction

This project consists of several bug-finding tools that look for memory protection errors in C source code that uses the R API, that is, in the source code of R itself and of R packages. The tools perform whole-program static analysis on LLVM bitcode and run on Linux. About 200-300 memory protection bugs have been found using rchk and fixed in R. rchk is now regularly used to check CRAN packages.

To use the tool, one needs to build R from source using a special compiler wrapper, which builds LLVM bitcode in addition to native code (both shared libraries and executables). R packages are then installed using this version of R, providing LLVM bitcode for their shared libraries as well. The core of rchk is implemented in C++ and analyzes the LLVM bitcode of R packages and R itself. Several installation options are provided, including containers.

Installation

The tool is available in pre-built containers, Docker and Singularity, for non-interactive use. The container is invoked as a command to check a particular package:

docker pull kalibera/rchk:latest
docker run kalibera/rchk:latest audio
singularity pull shub://kalibera/rchk:def
singularity run kalibera-rchk-master-def.simg jpeg

For more details, see Docker rchk container and Singularity rchk container. This setup is good for occasional checking of a single package. Docker clients are available for Linux, macOS and Windows; Singularity is available only for Linux.

The tool can also be used interactively in a virtual machine running Ubuntu, which can be automatically installed using Vagrant scripts. This setup is good for Linux, Windows and macOS users and makes it faster to repeatedly check the same package and easier to customize the process. See Automated installation (Docker/Virtualbox) for interactive use.

Finally, the tool can be installed natively on Linux, compiled from source. This setup is good for interactive use and reduces disk space overhead. The setup is not automated, but only requires several steps described for recent Linux distributions. See Native installation on Linux for interactive use.

An alternative Docker image, maintained by a third party, is also available from R-hub (rhub/ubuntu-rchk, source).

Checking the first package (interactive use)

This part applies to an interactive installation of rchk (native, or the automated install in Docker/Virtualbox). One also needs to install subversion and rsync (apt-get install subversion rsync; both are already available in the automated install). More importantly, one also needs any dependencies required by the package being checked.

  1. Build R producing also LLVM bitcode
    • svn checkout https://svn.r-project.org/R/trunk
    • cd trunk
    • . ../scripts/config.inc (in automated install, . /opt/rchk/scripts/config.inc)
    • . ../scripts/cmpconfig.inc (in automated install, . /opt/rchk/scripts/cmpconfig.inc)
    • ../scripts/build_r.sh (in automated install, /opt/rchk/scripts/build_r.sh)
  2. Install and check the package
    • echo 'install.packages("jpeg",repos="http://cloud.r-project.org")' | ./bin/R --no-echo
    • ../scripts/check_package.sh jpeg (in automated install, /opt/rchk/scripts/check_package.sh jpeg)

The output of the checking is in files packages/lib/jpeg/libs/jpeg.so.*check. For version 0.1-8 of the package, jpeg.so.maacheck includes

WARNING Suspicious call (two or more unprotected arguments) to Rf_setAttrib at read_jpeg /rchk/trunk/packages/build/IsnsJjDm/jpeg/src/read.c:131

which is a true error. bcheck does not find any errors; jpeg.so.bcheck only contains something like

Analyzed 15 functions, traversed 1938 states.

Version 0.1-10 of the package no longer has the error. jpeg.so.bcheck currently contains something like

ERROR: too many states (abstraction error?) in function strptime_internal
ERROR: too many states (abstraction error?) in function StringValue
ERROR: too many states (abstraction error?) in function RunGenCollect
ERROR: too many states (abstraction error?) in function tre_tnfa_run_parallel
Analyzed 17 functions, traversed 815 states.

Errors about "too many states" can be ignored; they mean that the tool could not analyze some R functions within the memory limit provided.
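The reports can be viewed without those ignorable lines using a small helper. This is a minimal sketch: the function name show_reports is hypothetical (it is not part of rchk), and it assumes it is run from the R source tree with reports in packages/lib/<package>/libs/, as in the jpeg example.

```shell
# Hypothetical helper, not part of rchk: print every *check report of a
# package, omitting the ignorable "too many states" errors.
# Run from the R source tree (e.g. trunk). Usage: show_reports jpeg
show_reports() {
    pkg=$1
    for report in packages/lib/"$pkg"/libs/*check; do
        [ -e "$report" ] || continue      # glob did not match: no reports
        echo "== $report"
        # grep exits with 1 when nothing is left to print; not an error here
        grep -v '^ERROR: too many states' "$report" || true
    done
}
```

Calling show_reports jpeg prints each report preceded by a line naming the file.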

To check the next package, just follow the same steps, installing it into this customized version of R. When checking a tarball, one would typically first install the CRAN/BIOC version of the package to get all dependencies in, and then use R CMD INSTALL to install the newest version to check from the tarball.
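The tarball workflow could be scripted roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the current directory is the R source tree built for rchk, config.inc has already been sourced in this shell, and the package is also available from CRAN under the same name.

```shell
# Hypothetical helpers, not part of rchk.

# Derive the package name from a source tarball name such as
# jpeg_0.1-10.tar.gz (package names never contain an underscore).
pkg_name_from_tarball() {
    basename "$1" | sed 's/_.*$//'
}

# Install the released version first to pull in dependencies, then install
# the tarball over it, then run the checks.
# Usage (from the R source tree): check_tarball /path/to/pkg_1.0.tar.gz
check_tarball() {
    tarball=$1
    pkg=$(pkg_name_from_tarball "$tarball")
    echo "install.packages(\"$pkg\", repos=\"http://cloud.r-project.org\")" | ./bin/R --no-echo
    ./bin/R CMD INSTALL "$tarball"
    ../scripts/check_package.sh "$pkg"
}
```

In the automated install, the last command would be /opt/rchk/scripts/check_package.sh instead.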

Warnings like objdump: Warning: Unrecognized form: 0x22 should be safe to ignore.

One can reduce the number of required R package dependencies by installing only the LinkingTo dependencies of the package and then installing the package with the --libs-only option (only shared libraries are built and installed). This is enough to build the shared libraries of most, but not all, packages. The Docker and Singularity rchk containers for non-interactive use do this; see scripts/utils.r and the definitions of the containers for more details.
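For interactive use, the LinkingTo dependencies can be read from the package's DESCRIPTION file. The following is only an illustration and not what scripts/utils.r does; the function name linking_to and the tarball name in the comment are hypothetical.

```shell
# Hypothetical helper, not part of rchk: list the packages named in the
# LinkingTo field of a DESCRIPTION file, one per line, without version
# constraints. Usage: linking_to path/to/DESCRIPTION
linking_to() {
    awk '
        /^[^ \t]/ { infield = ($0 ~ /^LinkingTo:/) }   # a new field starts here
        infield   { sub(/^LinkingTo:/, ""); print }    # field line or continuation
    ' "$1" |
        tr ',' '\n' |
        sed -e 's/([^)]*)//g' -e 's/[[:space:]]//g' -e '/^$/d'
}

# After installing the listed packages into the rchk build of R, the package
# itself can be installed from the R source tree with, for example:
#   ./bin/R CMD INSTALL --libs-only mypkg_1.0.tar.gz
```

The output is one package name per line, which can be passed to install.packages() in the customized R.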

Further information:

  • Installation - installation instructions.
  • User documentation - how to use the tools and what they check.
  • Internals - how the tools work internally.
  • Building - how to get the necessary bitcode files for R/packages; this is now encapsulated in scripts, but the background is here.


rchk's People

Contributors

brodieg, bwlewis, gaborcsardi, kalibera, robinlovelace, s-u


rchk's Issues

Run check on a local package

Is it possible to run rchk on a local package? The readme explains how to run it on a package already on CRAN, but I'd like to run it before submitting, so I don't waste anybody's time (Dr. Ligges, etc.)

Not familiar with the tools used, so sorry if this is a simple question.

How to safely get raw pointer

The new rchk produces a lot of these for me. I never knew that RAW() may allocate. Is there an alternative api to get the raw pointer of an object that is safe?

Values protected by assignment

Could rchk consider value protected after being assigned to a protected container?

PROTECT(shelter);

SEXP value = Rf_cons(R_NilValue, R_NilValue);
SET_VECTOR_ELT(shelter, 0, value);

This pattern of create-and-assign helps avoid distracting PROTECT/UNPROTECT boilerplate when initialising objects.

minor typo in readme

For both native and virtual installation, to check GNU-R:

    Get latest version of GNU-R: svn checkout https://svn.r-project.org/R/trunk
    Build it using for rchk (run in R source tree)
        . <rchk_root>/scripts/config.inc (. /opt/rchk/scripts/config.inc)
        <rchkroot>/scripts/build_r.sh (. /opt/rchk/scripts/cmpconfig.inc)

last parenthetical should probably be .../build_r.sh. I'm submitting a PR in a minute to fix.

... as it has address taken, results will be incomplete ...

I'm starting to get more familiar with rchk (through rhub for now) and trying to systematically deal with its findings. This is very useful, and I'm making progress about the rchk findings in dplyr.

I've sent some initial pull requests to Rcpp as well:

I believe it needs much more than that, but results are better with these changes. I am currently checking dplyr against a branch of Rcpp that has the fixes from the PRs above: https://builder.r-hub.io/status/original/dplyr_0.8.0.9008.tar.gz-d2f2d6a9fcc34b5491d3829cc0521fa8 (this will probably disappear in a few days, though).

I'm seeing many results similar to this:

Function SEXPREC* arrange_template<dplyr::NaturalDataFrame>(dplyr::NaturalDataFrame const&, dplyr::QuosureList const&, SEXPREC*)
  [UP] ignoring variable <unnamed var:   %24 = alloca %struct.SEXPREC*, align 8> as it has address taken, results will be incomplete 
  [UP] ignoring variable <unnamed var:   %26 = alloca %struct.SEXPREC*, align 8> as it has address taken, results will be incomplete 

Do I need to pay attention to those? I don't know what they mean.

linking module flags 'PIC Level': IDs have conflicting values

I'm following the excellent instructions to build natively. I've compiled rchk ok. Now when I build R-devel, it runs well until the very end.

 . ~/build/rchk/scripts/build_r.sh
[ ... snip normal good output ... ]
WARNING:Could not find ".llvm_bc" ELF section in "/home/mdowle/build/R-devel/src/extra/blas/libRblas.so", so skipping this entry.
WARNING:Could not find ".llvm_bc" ELF section in "/home/mdowle/build/R-devel/src/modules/lapack/libRlapack.so", so skipping this entry.
WARNING:Could not find ".llvm_bc" ELF section in "/home/mdowle/build/R-devel/lib/libRblas.so", so skipping this entry.
WARNING:Could not find ".llvm_bc" ELF section in "/home/mdowle/build/R-devel/lib/libRlapack.so", so skipping this entry.
ERROR: linking module flags 'PIC Level': IDs have conflicting values
ERROR: linking module flags 'PIC Level': IDs have conflicting values

R-devel appears to be functional, so I proceeded to check the package

~/build/R-devel/packages/lib/data.table$ cat ./libs/datatable.so.maacheck
error: linking module flags 'PIC Level': IDs have conflicting values

Almost there! I'm just looking for quick pointers where to look next. Maybe LLVM Gold linker? But I thought the scripts handled this automatically, if I read correctly.

error building on ubuntu 18.04

I get this error when following the build instructions on a fresh install of Ubuntu 18.04:

patterns.cpp: In function ‘bool isTypeSwitch(llvm::Value*, llvm::AllocaInst*&, llvm::BasicBlock*&, TypeSwitchInfoTy&)’:
patterns.cpp:380:26: error: base operand of ‘->’ has non-pointer type ‘llvm::SwitchInst::CaseIt’
     ConstantInt *val = ci->getCaseValue();
                          ^~
patterns.cpp:381:26: error: base operand of ‘->’ has non-pointer type ‘llvm::SwitchInst::CaseIt’
     BasicBlock *succ = ci->getCaseSuccessor();
                          ^~
<builtin>: recipe for target 'patterns.o' failed
make: *** [patterns.o] Error 1

Any idea what the source of this error might be?

Container always checking CRAN version instead of local tarball

Hi, thank you for the container, it is very useful.

I am trying to debug gert, however I noticed that the container seems to be checking the CRAN version, even when I specify a local path. For example:

git clone https://github.com/r-lib/gert
R CMD build gert
docker run -v "`pwd`:/workspace" kalibera/rchk --install-deb "libgit2-dev" "/workspace/gert_1.0.9000.tar.gz"

Output indicates that gert_1.0.9000.tar.gz was found, yet it proceeds with installing gert_1.0.2.tar.gz from CRAN:

...
Setting up libgit2-dev:amd64 (0.28.4+dfsg.1-2) ...
Processing triggers for libc-bin (2.31-0ubuntu9.1) ...
source("/rchk/scripts/utils.r"); install_package_libs("/workspace/gert_1.0.9000.tar.gz")
trying URL 'https://cran.r-project.org/src/contrib/gert_1.0.2.tar.gz'
Content type 'application/x-gzip' length 61324 bytes (59 KB)
==================================================
downloaded 59 KB

* installing *source* package 'gert' ...
** package 'gert' successfully unpacked and MD5 sums checked
...

And the results also reflect the CRAN version of the package.
Am I doing something wrong?

Incorrect analysis around C++ constructors and destructors

I'm experiencing what I think is a false alarm in rchk from what I guess is a trivial case to analyze and perhaps a common usage of R calls.

In short, I have a C++ class which allocates and protects an R object in its constructor, and then unprotects it in its destructor. The underlying SEXP might be value-assigned to a struct in the middle of the object lifetime without involving any extra protection (e.g. new_struct.sexp_obj = wrapper_class.sexp_obj), and there are some potential calls to Rf_error in case of errors.

This situation triggers [PB] has negative depth in the constructor and [UP] attempt to unprotect more items (1) than protected (0) in the destructor.

This is the code for the class:
https://github.com/david-cortes/LightGBM/blob/4c757534905a6029288767ba25de7751c4860a29/R-package/src/lightgbm_R.cpp#L51

And this is how it is used:
https://github.com/david-cortes/LightGBM/blob/4c757534905a6029288767ba25de7751c4860a29/R-package/src/lightgbm_R.cpp#L228

Usage of C++ objects is wrapped in a #define'd try-catch block:
https://github.com/david-cortes/LightGBM/blob/4c757534905a6029288767ba25de7751c4860a29/R-package/src/lightgbm_R.cpp#L34

I've tried different variations, such as:

  • Setting a boolean telling whether to call UNPROTECT and leaving it to false in the destructor, in case the destructor somehow ends up being called more than once.
  • Using more explicit object construction - e.g. MyObj obj = MyObj() vs. just MyObj obj (although I'm quite certain it should be the same in C++>=11).
  • Moving the C++ object which has PROTECT/UNPROTECT in its constructor/destructor to inside a try-catch block.

But all of them result in a warning.

Problems building/running on Gentoo

rchk currently builds on Gentoo Linux using LLVM 4 or 6, but it will segfault when running the check_package.sh script. Here's what happens when built with LLVM 6:

../scripts/check_package.sh: line 77: 19500 Aborted                 $RCHK/src/$T ./src/main/R.bin.bc $F > $FOUT 2>&1

If I rebuild with debugging flags and modify the check_package.sh script to run bcheck with gdb, here's what happens:

$ ../scripts/check_package.sh yaml
GNU gdb (Gentoo 8.1 p1) 8.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/viking/Packages/rchk/src/bcheck...done.
(gdb) run ./src/main/R.bin.bc packages/lib/yaml/libs/yaml.so.bc
Starting program: /home/viking/Packages/rchk/src/bcheck ./src/main/R.bin.bc packages/lib/yaml/libs/yaml.so.bc
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
bcheck: /usr/lib64/llvm/6/include/llvm/IR/CallSite.h:187: ValTy* llvm::CallSiteBase<FunTy, BBTy, ValTy, UserTy, UseTy, InstrTy, CallTy, InvokeTy, IterTy>::getArgument(unsigned int) const [with FunTy = llvm::Function; BBTy = llvm::BasicBlock; ValTy = llvm::Value; UserTy = llvm::User; UseTy = llvm::Use; InstrTy = llvm::Instruction; CallTy = llvm::CallInst; InvokeTy = llvm::InvokeInst; IterTy = llvm::Use*]: Assertion `arg_begin() + ArgNo < arg_end() && "Argument # out of range!"' failed.

Program received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51      }
(gdb) where
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007fffeac82ae7 in __GI_abort () at abort.c:90
#2  0x00007fffeac788ea in __assert_fail_base (
    fmt=0x7fffeadd8d20 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=assertion@entry=0x5555555f8be0 "arg_begin() + ArgNo < arg_end() && \"Argument # out of range!\"", file=file@entry=0x5555555f8ba8 "/usr/lib64/llvm/6/include/llvm/IR/CallSite.h", 
    line=line@entry=187, 
    function=function@entry=0x5555555f9a60 <llvm::CallSiteBase<llvm::Function, llvm::BasicBlock, llvm::Value, llvm::User, llvm::Use, llvm::Instruction, llvm::CallInst, llvm::InvokeInst, llvm::Use*>::getArgument(unsigned int) const::__PRETTY_FUNCTION__> "ValTy* llvm::CallSiteBase<FunTy, BBTy, ValTy, UserTy, UseTy, InstrTy, CallTy, InvokeTy, IterTy>::getArgument(unsigned int) const [with FunTy = llvm::Function; BBTy = llvm::BasicBlock; ValTy = llvm::Va"...) at assert.c:92
#3  0x00007fffeac78972 in __GI___assert_fail (
    assertion=0x5555555f8be0 "arg_begin() + ArgNo < arg_end() && \"Argument # out of range!\"", 
    file=0x5555555f8ba8 "/usr/lib64/llvm/6/include/llvm/IR/CallSite.h", line=187, 
    function=0x5555555f9a60 <llvm::CallSiteBase<llvm::Function, llvm::BasicBlock, llvm::Value, llvm::User, llvm::Use, llvm::Instruction, llvm::CallInst, llvm::InvokeInst, llvm::Use*>::getArgument(unsigned int) const::__PRETTY_FUNCTION__> "ValTy* llvm::CallSiteBase<FunTy, BBTy, ValTy, UserTy, UseTy, InstrTy, CallTy, InvokeTy, IterTy>::getArgument(unsigned int) const [with FunTy = llvm::Function; BBTy = llvm::BasicBlock; ValTy = llvm::Va"...) at assert.c:101
#4  0x0000555555561307 in llvm::CallSiteBase<llvm::Function, llvm::BasicBlock, llvm::Value, llvm::User, llvm::Use, llvm::Instruction, llvm::CallInst, llvm::InvokeInst, llvm::Use*>::getArgument (
    this=0x7fffffffc6e8, ArgNo=0) at /usr/lib64/llvm/6/include/llvm/IR/CallSite.h:187
#5  0x000055555558f4cf in SEXPGuardsChecker::handleForTerminator (this=0x55555e4be710, 
    t=0x555556b8fb88, s=...) at guards.cpp:851
#6  0x000055555559f4e4 in getCalledAndWrappedFunctions (f=0x55555cdd6a08, msg=..., 
    called=std::set with 4 elements = {...}, wrapped=std::set with 0 elements)
    at callocators.cpp:738
#7  0x00005555555a0307 in CalledModuleTy::computeCalledAllocators (this=0x7fffffffd1e0)
    at callocators.cpp:855
#8  0x000055555555ca28 in CalledModuleTy::getContextSensitiveAllocatingFunctions (
    this=0x7fffffffd1e0) at callocators.h:176
#9  0x000055555555aa5f in main (argc=3, argv=0x7fffffffd748) at bcheck.cpp:535

Let me know what further information I can provide to help.

Could "// norchk" be a recognized marker for false positives?

i.e. just like the # nocov marker in the covr project here: https://github.com/r-lib/covr#exclusion-comments

I managed to reduce 189 rchk messages down to 3 for package data.table.

Package data.table version 1.11.2
Package built using 74708/R 3.6.0; x86_64-pc-linux-gnu; 2018-05-11 11:48:39 UTC; unix   
Checked with rchk version 63f79d910f5835174fcaa5a0a7d2409348f7d2ac
More information at https://github.com/kalibera/cran-checks/blob/master/rchk/PROTECT.md

Function allocateDT
  [PB] has possible protection stack imbalance data.table/src/freadR.c:378

Function preprocess
  [PB] has possible protection stack imbalance data.table/src/fmelt.c:352

Function userOverride
  [PB] has possible protection stack imbalance data.table/src/freadR.c:310

This was a very helpful exercise and I restructured wherever I could. The three remaining are all because of an allocVector() in one C function, which is passed to another C function which does the UNPROTECT() before returning to R level, as described in the excellent rchk documentation here. The fmelt.c one could possibly be restructured (with difficulty). But the two in freadR.c are because that file contains the R-API preamble which then calls fread.c which is R-API independent. fread.c is pure C code and is shared with the pydatatable project for Python. When fread.c is done populating the R vectors that freadR.c created, it returns to freadR.c which UNPROTECTs and then returns to R level. I can't see any way of restructuring that to pass rchk.

If I could add a comment on those lines, something like "// norchk", then I could mark those false positives as investigated and checked. This would then resolve and remove the rchk additional issues on the CRAN check page so that both CRAN maintainers and users can see that all issues have been dealt with.

If this isn't already possible, I'm not asking you to implement it, necessarily. I'm just asking if this approach is acceptable in principle. Then I can perhaps submit a PR to achieve this.

Best, Matt
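The proposal above can be illustrated as a post-processing step over an rchk report. This is only a sketch of the idea, not an existing rchk feature: the function name drop_norchk is hypothetical, and it assumes each finding ends in a path:line location relative to a source root passed as the second argument.

```shell
# Hypothetical post-filter, not part of rchk: drop findings whose source
# line carries a "// norchk" comment. Usage: drop_norchk REPORT SRCROOT
drop_norchk() {
    report=$1
    root=$2
    while IFS= read -r msg; do
        loc=${msg##* }        # last word, e.g. data.table/src/freadR.c:378
        file=${loc%:*}
        line=${loc##*:}
        case $line in
            ''|*[!0-9]*) ;;   # no file:line location, keep the line as is
            *)
                if [ -f "$root/$file" ] &&
                   sed -n "${line}p" "$root/$file" 2>/dev/null | grep -q '// norchk'; then
                    continue  # finding was marked as reviewed, drop it
                fi
                ;;
        esac
        printf '%s\n' "$msg"
    done < "$report"
}
```

The "Function ..." header lines are kept even when all findings under them are dropped, which keeps the sketch short.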

Explicit protection cancels implicit protection

This is a minimal reprex. The pattern might occur in realistic cases via macros:

// Function foo
//   [UP] unprotected variable x while calling allocating function Rf_allocVector

SEXP foo(SEXP x) {
  PROTECT(x);
  UNPROTECT(1);

  Rf_allocVector(INTSXP, 1);

  return x;
}

Can we let rchk know that a function preserves its argument?

The rlang C library implements an alternative to R_PreserveObject(): https://github.com/r-lib/rlang/blob/64c02187/src/rlang/obj.c#L11.

Background: r_preserve() stores its argument in a hash table that is protected by the namespace of the embedding package. This makes it easier to detect what package is responsible for a memory leak (e.g. using https://github.com/r-lib/memtools) because leaked nodes are dominated by their package namespace. This is also convenient during development with devtools where namespaces are unloaded and reloaded multiple times in a given session because GC of the namespace automatically frees all preserved objects.

Unfortunately it seems that rchk can't figure out that the argument of r_preserve() becomes protected. Could we add some sort of marker or attribute to preserving functions to let rchk know about the semantics of that function? Possibly related to #28.

Installation (Fedora 34): missing `llvm/Support/StringPool.h`

I've followed the installation instructions for Fedora 32
https://github.com/kalibera/rchk/blob/master/doc/INSTALLATION.md#fedora-32
but have F 34.

  • All '0-level' dependencies were already present (i.e. gave Nothing to do.)
  • Step 1 ended successfully in
Installed:
  llvm-devel-12.0.1-1.fc34.x86_64        llvm-static-12.0.1-1.fc34.x86_64       
  llvm-test-12.0.1-1.fc34.x86_64        

Complete!
  • Step 2 (python pip ..) success ending in
Installing collected packages: wllvm
Successfully installed wllvm-1.3.1
  • Now, step 3, the installation of rchk failed during make, specifically
g++ -I/usr/include -std=c++14 -fno-exceptions -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -O3 -g3 -MMD -DBCHECK_MAX_STATES=3000000 -DCALLOCATORS_MAX_STATES=1000000 -I/usr/include/llvm-3.8/ -I/usr/include -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS  -c -o allocators.o allocators.cpp
In file included from guards.h:17,
                 from callocators.h:9,
                 from exceptions.h:5,
                 from allocators.cpp:3:
linemsg.h:14:10: fatal error: llvm/Support/StringPool.h: No such file or directory
   14 | #include <llvm/Support/StringPool.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make: *** [<builtin>: allocators.o] Error 1

and searching for the missing header in the installed llvm-devel package indeed showed it was not installed:

root@lynne rchk#  dnf repoquery -l llvm-devel | sed -n '1,3p; /StringPool.h/p'
Updating Subscription Management repositories.
Last metadata expiration check: 0:07:39 ago on Thu 09 Sep 2021 09:47:06 AM CEST.
/usr/bin/llvm-config
/usr/bin/llvm-config-32
/usr/include/llvm/ExecutionEngine/Orc/SymbolStringPool.h
/usr/include/llvm/ExecutionEngine/Orc/SymbolStringPool.h
/usr/include/llvm/ExecutionEngine/Orc/SymbolStringPool.h
/usr/include/llvm/ExecutionEngine/Orc/SymbolStringPool.h

and yes, it is rchk itself asking for that header:

root@lynne rchk# grep -rn -F StringPool.h
src/linemsg.h:14:#include <llvm/Support/StringPool.h>

As I'm really not knowledgeable in versions of llvm
and as this is not really urgent, I'm giving up for now, but of course will try things if instructed
... for a time after the new RWin toolchain installer is released

objcopy: stai2D6Q: Failed to find link section for section

Hi,
Should I be getting these messages about objcopy? I haven't seen them before from rchk but it has been several months since I last ran it, and I've just done a fresh install of latest rchk with latest R-devel trunk.
I see these messages about objcopy when building R itself within the rchk environment, so I don't think it's specific to data.table.

$ echo 'install.packages("~/GitHub/data.table/data.table_1.12.3.tar.gz",repos=NULL)' | ./bin/R --slave
Installing package into ‘/home/mdowle/build/rchk/trunk/packages/lib’
(as ‘lib’ is unspecified)
* installing *source* package ‘data.table’ ...
** using staged installation
** libs
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c assign.c -o assign.o
objcopy: st61vpcx: Failed to find link section for section 22
objcopy: st61vpcx: Failed to find link section for section 22
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c between.c -o between.o
objcopy: stzLUFkA: Failed to find link section for section 20
objcopy: stzLUFkA: Failed to find link section for section 20
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c bmerge.c -o bmerge.o
objcopy: stVLJvHK: Failed to find link section for section 24
objcopy: stVLJvHK: Failed to find link section for section 24
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c chmatch.c -o chmatch.o
objcopy: stA4GfiY: Failed to find link section for section 19
objcopy: stA4GfiY: Failed to find link section for section 19
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c dogroups.c -o dogroups.o
objcopy: stcA89La: Failed to find link section for section 22
objcopy: stcA89La: Failed to find link section for section 22
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c fastmean.c -o fastmean.o
objcopy: stPVZxGj: Failed to find link section for section 19
objcopy: stPVZxGj: Failed to find link section for section 19
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c fcast.c -o fcast.o
objcopy: stF0zN4v: Failed to find link section for section 20
objcopy: stF0zN4v: Failed to find link section for section 20
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c fmelt.c -o fmelt.o
objcopy: stIlazRz: Failed to find link section for section 21
objcopy: stIlazRz: Failed to find link section for section 21
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c forder.c -o forder.o
objcopy: st48LwNR: Failed to find link section for section 24
objcopy: st48LwNR: Failed to find link section for section 24
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c frank.c -o frank.o
objcopy: stAFjrj0: Failed to find link section for section 22
objcopy: stAFjrj0: Failed to find link section for section 22
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c fread.c -o fread.o
objcopy: stUgG5dj: Failed to find link section for section 25
objcopy: stUgG5dj: Failed to find link section for section 25
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c freadR.c -o freadR.o
objcopy: stwYx46y: Failed to find link section for section 21
objcopy: stwYx46y: Failed to find link section for section 21
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c froll.c -o froll.o
objcopy: stIsLAoI: Failed to find link section for section 20
objcopy: stIsLAoI: Failed to find link section for section 20
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c frollR.c -o frollR.o
objcopy: stArHsAJ: Failed to find link section for section 21
objcopy: stArHsAJ: Failed to find link section for section 21
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c frolladaptive.c -o frolladaptive.o
objcopy: stBw7v6V: Failed to find link section for section 20
objcopy: stBw7v6V: Failed to find link section for section 20
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c fsort.c -o fsort.o
objcopy: stU3y7h9: Failed to find link section for section 23
objcopy: stU3y7h9: Failed to find link section for section 23
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c fwrite.c -o fwrite.o
objcopy: stQlNSdn: Failed to find link section for section 25
objcopy: stQlNSdn: Failed to find link section for section 25
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c fwriteR.c -o fwriteR.o
objcopy: strOEvnw: Failed to find link section for section 24
objcopy: strOEvnw: Failed to find link section for section 24
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c gsumm.c -o gsumm.o
objcopy: stivhUdG: Failed to find link section for section 24
objcopy: stivhUdG: Failed to find link section for section 24
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c ijoin.c -o ijoin.o
objcopy: stF9tgLQ: Failed to find link section for section 22
objcopy: stF9tgLQ: Failed to find link section for section 22
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c init.c -o init.o
objcopy: stNOfkX3: Failed to find link section for section 22
objcopy: stNOfkX3: Failed to find link section for section 22
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c inrange.c -o inrange.o
objcopy: stPBndvd: Failed to find link section for section 18
objcopy: stPBndvd: Failed to find link section for section 18
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c nafill.c -o nafill.o
objcopy: stXKZlam: Failed to find link section for section 21
objcopy: stXKZlam: Failed to find link section for section 21
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c nqrecreateindices.c -o nqrecreateindices.o
objcopy: stUh5wVv: Failed to find link section for section 18
objcopy: stUh5wVv: Failed to find link section for section 18
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c openmp-utils.c -o openmp-utils.o
objcopy: stxAPDrJ: Failed to find link section for section 21
objcopy: stxAPDrJ: Failed to find link section for section 21
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c quickselect.c -o quickselect.o
objcopy: stqLYn2K: Failed to find link section for section 19
objcopy: stqLYn2K: Failed to find link section for section 19
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c rbindlist.c -o rbindlist.o
objcopy: stWrhNMX: Failed to find link section for section 19
objcopy: stWrhNMX: Failed to find link section for section 19
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c reorder.c -o reorder.o
objcopy: stMgWd56: Failed to find link section for section 19
objcopy: stMgWd56: Failed to find link section for section 19
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c shift.c -o shift.o
objcopy: stci9UFf: Failed to find link section for section 21
objcopy: stci9UFf: Failed to find link section for section 21
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c subset.c -o subset.o
objcopy: staolnlr: Failed to find link section for section 21
objcopy: staolnlr: Failed to find link section for section 21
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c transpose.c -o transpose.o
objcopy: stW4i1YE: Failed to find link section for section 21
objcopy: stW4i1YE: Failed to find link section for section 21
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c uniqlist.c -o uniqlist.o
objcopy: stai2D6Q: Failed to find link section for section 22
objcopy: stai2D6Q: Failed to find link section for section 22
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c vecseq.c -o vecseq.o
objcopy: stRhVUNO: Failed to find link section for section 19
objcopy: stRhVUNO: Failed to find link section for section 19
/home/mdowle/.local/bin/wllvm -I"/home/mdowle/build/rchk/trunk/include" -DNDEBUG   -I/usr/local/include   -fPIC  -O0 -g -c wrappers.c -o wrappers.o
objcopy: stJdzwY2: Failed to find link section for section 19
objcopy: stJdzwY2: Failed to find link section for section 19
/home/mdowle/.local/bin/wllvm -shared -L/usr/local/lib -o data.table.so assign.o between.o bmerge.o chmatch.o dogroups.o fastmean.o fcast.o fmelt.o forder.o frank.o fread.o freadR.o froll.o frollR.o frolladaptive.o fsort.o fwrite.o fwriteR.o gsumm.o ijoin.o init.o inrange.o nafill.o nqrecreateindices.o openmp-utils.o quickselect.o rbindlist.o reorder.o shift.o subset.o transpose.o uniqlist.o vecseq.o wrappers.o -lz
mv data.table.so datatable.so
if [ "" != "Windows_NT" ] && [ `uname -s` = 'Darwin' ]; then install_name_tool -id datatable.so datatable.so; fi
installing to /home/mdowle/build/rchk/trunk/packages/lib/00LOCK-data.table/00new/data.table/libs
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (data.table)

ERROR: too many states (abstraction error?) in function strptime_internal

I am seeing "ERROR: too many states (abstraction error?) in function strptime_internal" when running rchk on several packages on rhub's ubuntu-rchk machine.

I can't find any clues in the documentation to interpret what this message means. Is it a false alarm? Could you point me in the right direction?

Thanks!

Warnings compiling rchk

guards.cpp: In function ‘std::__cxx11::string igs_name(IntGuardState)’:
guards.cpp:101:1: warning: control reaches end of non-void function [-Wreturn-type]
 }
 ^
guards.cpp: In function ‘std::__cxx11::string sgs_name(SEXPGuardTy&)’:
guards.cpp:431:1: warning: control reaches end of non-void function [-Wreturn-type]
 }
$ g++ -v
gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04)

protect stack is too deep/unsupported form of unprotect

Running rchk on my package pomp, I get several messages like the following.

Function add_args
  [UP] protect stack is too deep, unprotecting all variables, results will be incomplete
  [UP] unsupported form of unprotect, unprotecting all variables, results will be incomplete /home/kingaa/projects/Rpkg/rchk/build/IAv0e71i/pomp/src/dprior.c:27

These arise in constructions such as the following (this one is the relevant code chunk from dprior.c mentioned above).

static R_INLINE SEXP add_args (SEXP names, SEXP log, SEXP args)
{

  int nprotect = 0;
  SEXP var;
  int v;

  PROTECT(args = LCONS(AS_LOGICAL(log),args)); nprotect++;
  SET_TAG(args,install("log"));

  for (v = LENGTH(names)-1; v >= 0; v--) {
    PROTECT(var = NEW_NUMERIC(1)); nprotect++;
    PROTECT(args = LCONS(var,args)); nprotect++;
    SET_TAG(args,install(CHAR(STRING_ELT(names,v))));
  }

  UNPROTECT(nprotect);
  return args;

}

As you can see, this function is designed to take an arbitrary-length vector of variable names and construct a call from it. This is extremely useful and, I believe, totally legitimate.

My question is how I should interpret the messages reported by rchk. I suspect they reflect the inability of the current rchk code to follow a loop of arbitrary length. Is this correct? If so, do you have any plans or see any way to extend rchk's capacity in this way?
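One way to picture the situation described above is a toy model that tracks only the depth of the protect stack. The sketch below is plain C, does not use the R API, and is not rchk's algorithm; `ProtectStack`, `toy_protect`, `toy_unprotect`, and the two `max_depth_*` functions are hypothetical names. It compares the shape of `add_args` (two protects per iteration, so the depth grows with `LENGTH(names)`) with an alternative in which `args` lives in a single slot that is re-protected in place, as R's `PROTECT_WITH_INDEX` and `REPROTECT` allow, and `var` is unprotected once it is reachable from `args`. Whether that restructuring silences these particular rchk messages is an assumption, not something confirmed here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a protect stack: only the depth is tracked (hypothetical). */
typedef struct {
    size_t depth;      /* current number of protected entries */
    size_t max_depth;  /* highest depth reached so far */
} ProtectStack;

static void toy_protect(ProtectStack *s) {
    s->depth++;
    if (s->depth > s->max_depth) s->max_depth = s->depth;
}

static void toy_unprotect(ProtectStack *s, size_t n) {
    assert(n <= s->depth);  /* never pop more than was pushed */
    s->depth -= n;
}

/* Same shape as add_args: one protect before the loop, two per iteration,
   everything released at the end with a counter. */
size_t max_depth_protect_in_loop(size_t n_names) {
    ProtectStack s = {0, 0};
    size_t nprotect = 0;
    toy_protect(&s); nprotect++;          /* args */
    for (size_t v = 0; v < n_names; v++) {
        toy_protect(&s); nprotect++;      /* var */
        toy_protect(&s); nprotect++;      /* new head of args */
    }
    toy_unprotect(&s, nprotect);
    assert(s.depth == 0);                 /* balanced on exit */
    return s.max_depth;
}

/* Alternative shape: args keeps one slot that is overwritten in place
   (the role REPROTECT plays in the R API), var is protected only
   until it has been linked into args. */
size_t max_depth_reprotect(size_t n_names) {
    ProtectStack s = {0, 0};
    toy_protect(&s);                      /* slot for args */
    for (size_t v = 0; v < n_names; v++) {
        toy_protect(&s);                  /* var */
        /* the args slot is overwritten here; depth does not change */
        toy_unprotect(&s, 1);             /* var is now reachable from args */
    }
    toy_unprotect(&s, 1);
    assert(s.depth == 0);                 /* balanced on exit */
    return s.max_depth;
}
```

In this model the first shape reaches a depth of 1 + 2n for n names, while the second never exceeds 2 regardless of n, so the depth no longer depends on the loop's trip count.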

Local results from run with singularity differ from CRAN results

for the duckdb package.

How can I replicate what CRAN is seeing?

Would it be useful to emit more diagnostic information (LLVM version, ...) for the CRAN runs and perhaps also for local runs?

CC @hannesmuehleisen.

Locally:

$ singularity run kalibera-rchk-master-def.simg duckdb
...

Library name (usually package name): duckdb
Initialization function: R_init_duckdb
Functions: 12
Checked call to R_registerRoutines: 1

Function DataFrameScanFunction::dataframe_scan_bind(duckdb::ClientContext&, std::__1::vector<duckdb::Value, std::__1::allocator<duckdb::Value> >&, std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, duckdb::Value, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, duckdb::Value> > >&, std::__1::vector<duckdb::LogicalType, std::__1::allocator<duckdb::LogicalType> >&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >&)
  [UP] ignoring variable df as it has address taken, results will be incomplete 
Analyzed 24673 functions, traversed 226939 states.

On CRAN

Package duckdb version 0.2.1-2
Package built using 79279/R 4.1.0; x86_64-pc-linux-gnu; 2020-09-30 14:05:37 UTC; unix   
Checked with rchk version 36c7ad2294619ba0a81109c9acb675eea2c96e6d
More information at https://github.com/kalibera/cran-checks/blob/master/rchk/PROTECT.md

Function duckdb_execute_R
  [UP] unprotected variable <unnamed var:   %2 = alloca %struct.SEXPREC*, align 8> while calling allocating function std::__1::unique_ptr<duckdb::QueryResult, std::__1::default_delete<duckdb::QueryResult> >::~unique_ptr() duckdb/src/duckdbr.cpp:611

Puzzled about "results will be incomplete"

I got the message: ignoring variable ret as it has address taken, results will be incomplete.
But I don't understand why taking the address of a SEXP variable ("has address taken") leads to incomplete results. Is there a reason for that?

The error:

Function impl_readTabixHeader
[UP] ignoring variable ret as it has address taken, results will be incomplete

Relevant code:

SEXP impl_readTabixHeader(SEXP arg_tabixFile) {
  SEXP ret = R_NilValue;

  std::vector<std::string> FLAG_tabixFile;
  extractStringArray(arg_tabixFile, &FLAG_tabixFile);
  TabixReader tr(FLAG_tabixFile[0]);
  if (!tr.good()) {
    REprintf("Cannot open specified tabix file: %s\n", FLAG_tabixFile[0].c_str());
    return ret;
  }

  std::vector<std::string> headers;
  stringTokenize(stringStrip(tr.getHeader()), "\n", &headers);

  storeResult(headers, &ret);
  return ret;
}
void storeResult(const std::vector<std::string>& in, SEXP* ret);

Environment:

docker image rhub/ubuntu-rchk (LLVM-4.0)
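The function in the question passes `&ret` to `storeResult`, which can then change `ret` without any assignment being visible in the caller. The sketch below shows that effect in plain C with hypothetical stand-ins (`Obj`, `store_result`, `caller_tag`); it does not use the R API. That this is the reason rchk stops tracking such variables is an assumption based on the wording of the message.

```c
#include <assert.h>

/* Hypothetical stand-in for an R object; only a tag is stored. */
typedef struct Obj { int tag; } Obj;

static Obj nil_value   = {0};  /* plays the role of R_NilValue */
static Obj fresh_value = {1};  /* plays the role of a newly allocated object */

/* Stand-in for storeResult(headers, &ret): writes through the pointer. */
void store_result(Obj **out) {
    *out = &fresh_value;
}

int caller_tag(void) {
    Obj *ret = &nil_value;
    assert(ret == &nil_value);  /* before the call, ret is the nil stand-in */
    store_result(&ret);         /* ret changes, yet nothing is assigned here */
    return ret->tag;            /* now refers to fresh_value */
}
```

A tool that reasons about `ret` by looking only at assignments inside `caller_tag` would still believe it holds the initial value after the call, which is one plausible reason for a checker to skip such variables and report its results as incomplete.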

rchk checks internal C functions for protect stack imbalances

For the yaml package, rchk lists the following problems:

https://raw.githubusercontent.com/kalibera/cran-checks/master/rchk/results/yaml.out

It looks like rchk is checking that internal C functions have a balanced stack before and after execution, even when these functions are not being called from R. The yaml package code handles its own protection calls, since it needs to collect R objects onto a managed stack and pop them off as needed. The objects are protected and unprotected carefully so that the end result is a balanced protection stack. Does rchk consider this type of programming style or will I need to rewrite my package so that CRAN checks will pass?

Thanks in advance.

False positive for conditional protection

E.g. this code

#include <R.h>
#include <Rinternals.h>

SEXP foobar(SEXP arg) {
  SEXP xx;
  int carg = INTEGER(arg)[0];

  if (carg > 100 || carg < 0) {
    xx = PROTECT(allocVector(INTSXP, 100));
  }

  if (carg > 100 || carg < 0) {
    UNPROTECT(1);
  }

  return R_NilValue;
}

generates

Function foobar
  [PB] has negative depth /mnt/works/dummy/src/test.c:14
  [UP] attempt to unprotect more items (1) than protected (0), results will be incomplete /mnt/works/dummy/src/test.c:14
  [PB] has possible protection stack imbalance /mnt/works/dummy/src/test.c:17
Analyzed 1 functions, traversed 27 states.

I can easily ignore these false positives, but they also show up in the CRAN checks and CRAN keep telling me to "fix" them.

It would be great to have a way to deal with false positives in general, actually.
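The three messages above are consistent with a checker that explores the two `if` statements as if their conditions were independent. The sketch below is plain C, not rchk's implementation; `final_depth`, `DepthRange`, and `explore` are hypothetical names, and whether rchk actually works this way is an assumption. It enumerates the branch combinations of `foobar` and records the range of final protect depths, once with the two conditions tied together and once without.

```c
#include <assert.h>
#include <stdbool.h>

/* Final protect depth of foobar when the first `if` takes the PROTECT
   branch (protect_taken) and the second takes the UNPROTECT branch
   (unprotect_taken). */
int final_depth(bool protect_taken, bool unprotect_taken) {
    int depth = 0;
    if (protect_taken)   depth += 1;  /* PROTECT(allocVector(...)) */
    if (unprotect_taken) depth -= 1;  /* UNPROTECT(1) */
    return depth;
}

typedef struct {
    int min_depth;
    int max_depth;
} DepthRange;

/* correlated == true: both conditions are the same expression, so only
   the paths where they agree are feasible.
   correlated == false: all four branch combinations are explored. */
DepthRange explore(bool correlated) {
    DepthRange r = {0, 0};  /* the path taking neither branch is always feasible */
    for (int a = 0; a <= 1; a++) {
        for (int b = 0; b <= 1; b++) {
            if (correlated && a != b) continue;  /* infeasible path */
            int d = final_depth(a != 0, b != 0);
            if (d < r.min_depth) r.min_depth = d;
            if (d > r.max_depth) r.max_depth = d;
        }
    }
    assert(r.min_depth <= r.max_depth);
    return r;
}
```

With the conditions tied together the depth is always 0. Without that correlation the range is -1 to 1, which corresponds to the "negative depth" and "possible protection stack imbalance" reports.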

rchk seems to flag allocation functions incorrectly sometimes

In the yaml package, rchk lists the following problems:

https://raw.githubusercontent.com/kalibera/cran-checks/master/rchk/results/yaml.out

It notes that yaml_emitter_emit is an allocation function that could cause R to run garbage collection and therefore destroy an unprotected object. However, this function does not use any R functions whatsoever. Does rchk look for any malloc call or just R-specific allocation functions?

Thanks in advance.

Possible Spurious "unprotected" Error

I'm not 100% certain this is a spurious error, but I'm guessing it is. In:

SEXP TEST_test_fun(SEXP x) {
  if(TYPEOF(x) != STRSXP) error("Internal Error: type mismatch");

  SEXP x_prev = STRING_ELT(x, 0);
  SEXP x_cur = STRING_ELT(x, 1);
  const char * prev = CHAR(x_prev);

  return x;
}

I get:

ERROR: too many states (abstraction error?) in function RunGenCollect
ERROR: too many states (abstraction error?) in function tre_tnfa_run_backtrack
ERROR: too many states (abstraction error?) in function tre_tnfa_run_parallel

Function TEST_test_fun
  [UP] unprotected variable x_prev while calling allocating function STRING_ELT /vagrant/test/src/test.c:7
Analyzed 2 functions, traversed 10 states.

presumably because rchk does not recognize that x_prev points to a CHARSXP that is protected by virtue of being an element of the function's input (in my case x is provided via .Call).

This is a simplified example. In my actual use case I'm keeping track of the previous CHARSXP in a loop.

I'm not sure it's worth the trouble to fix this, but figured having the info could still be useful to you. Of course, there is the possibility I'm wrong about this.

Sample package on github
