fabriziosandri / rcppdeepstate

RcppDeepState, a simple way to fuzz test code in Rcpp packages

Home Page: https://fabriziosandri.github.io/gsoc-2022-blog/

C++ 48.59% C 10.00% R 41.41%

rcppdeepstate's Introduction

RcppDeepState

RcppDeepState, a simple way to fuzz test compiled code in Rcpp packages. This package extends the DeepState framework to fully support Rcpp based packages.

Note: RcppDeepState is currently supported on Linux and macOS, with Windows support in progress.

See Also

To learn more about how RcppDeepState works see:

Dependencies

First, make sure to install the following dependencies on your local machine.

  • CMake
  • GCC and G++ with multilib support
  • Python 3.6 (or newer)
  • Setuptools

Use this command line to install the dependencies.

sudo apt-get install build-essential gcc-multilib g++-multilib cmake python3-setuptools libffi-dev z3

Installation

The RcppDeepState package can be installed from GitHub as follows:

install.packages("devtools")
devtools::install_github("FabrizioSandri/RcppDeepState")

Functionalities

The main purpose of RcppDeepState is to analyze a package and find subtler bugs such as memory issues.

Automatic package analysis

To test your package using RcppDeepState follow the steps below:

(a) deepstate_harness_compile_run: This function creates the TestHarnesses for all the functions in the test package, along with the corresponding makefiles. It then compiles and executes each TestHarness, generating random inputs for your code. It returns a list of the functions in the package whose test harness was successfully compiled; a warning message is displayed for every function whose test harness cannot be generated automatically.

> RcppDeepState::deepstate_harness_compile_run("~/testSAN")
We can't test the function - unsupported_datatype - due to the following datatypes falling out of the allowed ones: LogicalVector

Failed to create testharness for 1 functions in the package - unsupported_datatype
Testharness created for 6 functions in the package

[1] "rcpp_read_out_of_bound"      "rcpp_use_after_deallocate"  
[3] "rcpp_use_after_free"         "rcpp_use_uninitialized"     
[5] "rcpp_write_index_outofbound" "rcpp_zero_sized_array"   

(b) deepstate_harness_analyze_pkg: This method examines each test file created in the previous step and produces a table with information on the error messages and inputs supplied to each tested function. The test run log files are saved in the same location as the inputs, i.e. /inst/function/log_*/valgrind_log

result <- RcppDeepState::deepstate_harness_analyze_pkg("~/testSAN")

The result contains a data table with three columns: binaryfile, inputs, logtable. Each row of this table corresponds to a single test.

> head(result,2)
                                          binaryfile
1: ~/testSAN/inst/testfiles/rcpp_read_out_of_bound/rcpp_read_out_of_bound_output/00004669c554b565471956e17bf36a67a67ecd78.pass
2: ~/testSAN/inst/testfiles/rcpp_read_out_of_bound/rcpp_read_out_of_bound_output/0001a4df441415b38d97b918f6b1e26e26fdadce.pass
      inputs          logtable
1: <list[1]> <data.table[1x5]>
2: <list[1]> <data.table[1x5]>

The inputs column contains all the inputs that are passed:

> result$inputs[[1]]
$rbound
[1] -53789918

The logtable is a table containing all of the errors identified by RcppDeepState for a single test.

> result$logtable[[1]]
      err.kind                message                 file.line
1: InvalidRead Invalid read of size 4 read_out_of_bound.cpp : 7
                                                                address.msg
1: Address 0xfffffffffe099498 is not stack'd, malloc'd or (recently) free'd
            address.trace
1: No Address Trace found

Manual package analysis

As described in the previous section, RcppDeepState automatically produces a test harness for each function. In some cases, however, it cannot generate the test harness, as with the aforementioned unsupported_datatype function, and it displays the following error message:

We can't test the function - unsupported_datatype - due to the following datatypes falling out of the allowed ones: LogicalVector

Failed to create testharness for 1 functions in the package - unsupported_datatype

In this case there are two possible solutions:

Use RcppDeepState inside a GitHub repository

RcppDeepState can also be run easily through GitHub Actions.

ci_setup: this function can be used to automatically initialize a GitHub workflow file inside your repository for the RcppDeepState analysis. This workflow uses RcppDeepState-action.

> RcppDeepState::ci_setup(pathtotestpackage)

rcppdeepstate's People

Contributors

akhikolla, fabriziosandri, jbez27, jestdang, ms609, tdhock

rcppdeepstate's Issues

Segmentation fault not caught

I was trying to run RcppDeepState on the test package provided in the /inst/testSAN on a different machine.

First of all I ran the test harness compilation procedure deepstate_harness_compile_run and it successfully generated the compiled test harness. After that I ran the deepstate_harness_analyze_pkg function and, no matter how many times I ran it, no bug was reported for the testSAN package.

After some investigation, I discovered that no output file was generated for any Test Harness. For example, the output folder for the rcpp_use_after_deallocate function, located at testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_output, was empty. So I attempted to run the Test Harness manually. What I discovered is that the Test Harness fails with a segmentation fault before the output file is generated.

./rcpp_use_after_deallocate_DeepState_TestHarness --seed=5 --timeout=2 --fuzz --fuzz_save_passing --output_test_dir rcpp_use_after_deallocate_output 

[1]    167004 segmentation fault (core dumped)  ./rcpp_use_after_deallocate_DeepState_TestHarness --seed=5 --timeout=2 --fuzz -fuzz_save_passing --output_test_dir rcpp_use_after_deallocate_output 

Is there something I'm overlooking? The program appears to crash in the try-catch block, without actually catching the error.

TEST(,){
  RInside R;
  std::cout << "input starts" << std::endl;
  IntegerVector array_size(1);
  array_size[0]  = RcppDeepState_int();
  qs::c_qsave(array_size,"/home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/inputs/array_size.qs",
		"high", "zstd", 1, 15, true, 1);
  std::cout << "array_size values: "<< array_size << std::endl;
  std::cout << "input ends" << std::endl;
  try{
    rcpp_use_after_deallocate(array_size[0]);
  }
  catch(Rcpp::exception& e){
    std::cout<<"Exception Handled"<<std::endl;
  }
}

Executing the test on a different machine works perfectly: it catches the errors and saves the test case in the output directory.
@Anirban166 have you ever dealt with a situation like this with DeepState?

Fuzzing functions with Rcpp parameters

@tdhock I found that it is not possible to fuzz test C++ functions that take an Rcpp datatype as a parameter. In general, Rcpp constructs cannot be used directly from a standalone C++ program.
I discussed this topic with a contributor of the Rcpp package, and his response was as follows:

"All" that Rcpp does is to provide R with callable code via the .Call() interface which is meant to extend a running R session. Nowhere in the R (or Rcpp) documentation is it hinted that you can run code separately. Which is why we run all tests etc from R.

Rcpp datatypes can only be used in an R environment. They cannot be used outside of R. Executing a function that uses Rcpp-based types from a plain C++ environment will almost certainly result in a segmentation fault, as demonstrated in the issue #1221.

Proof of concept

Let's use a simple C++ function called bootstrap that initializes an Rcpp::NumericVector and returns its first element: 10.

#include <iostream>
#include <Rcpp.h>

// [[Rcpp::export]]
int bootstrap() {
    // Allocate a sample NumericVector
    Rcpp::NumericVector sample {10,20,30,40,50};
    
    return sample[0];
}

int main(int argc, char* argv[]){
    bootstrap();
    return 0;
}

If you manage to compile this function including all the headers, no error is reported. It seems everything is working well.

g++ -lR -I"/usr/include/R/" -I/usr/local/include  -I"/home/fabri/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include" -L/usr/lib64/R/lib -o bootstrap bootstrap.cpp

However if you try to run the compiled function, a segmentation fault occurs:

$ ./bootstrap
[1]    23848 segmentation fault (core dumped)  ./bootstrap

Possible solutions

I considered how to address this problem in the most effective manner and finally came up with some potential answers.

First solution

The first solution is to create a mock Rcpp header file that maps some of the available Rcpp data types, e.g. NumericVector, IntegerVector, etc., to their standard C++ library counterparts. In this way, when creating the harness, we include the mock header file instead of the original Rcpp header file. The result is that the DeepState fuzz test is performed on STL data types instead of the Rcpp data types.

For example, we can do this by defining the following typedefs in a mock Rcpp.h file:

#ifndef __RCPP_H__
#define __RCPP_H__
#include <vector>

namespace Rcpp{
    typedef std::vector<bool> LogicalVector;
    typedef std::vector<int> IntegerVector;
    typedef std::vector<double> NumericVector;
}

#endif

So when the compiler finds an IntegerVector, it treats it as a std::vector<int>.

This is accomplished by replacing every #include <Rcpp.h> in the original library with #include "Rcpp.h". In this way we ensure that the parameters are plain STL types. For the example above, the result is that compilation succeeds and no segmentation fault occurs at run time.
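As a minimal sketch of this first solution, the mock typedefs can be inlined next to a function that mirrors the bootstrap example from the proof of concept. This is an illustration under the assumption that the function only uses operations that std::vector also provides; it is not part of RcppDeepState.

```cpp
#include <cassert>
#include <vector>

// Stand-in for the mock "Rcpp.h" above, inlined so the sketch is one file.
namespace Rcpp {
    typedef std::vector<int> IntegerVector;
    typedef std::vector<double> NumericVector;
}

// Mirrors the bootstrap function from the proof of concept; with the mock
// typedefs, `sample` is an ordinary std::vector<double>.
int bootstrap() {
    Rcpp::NumericVector sample {10, 20, 30, 40, 50};
    return static_cast<int>(sample[0]);
}
```

Compiled this way, bootstrap() needs neither libR nor a running R session. The trade-off is that the harness exercises std::vector rather than the real Rcpp containers, so code relying on Rcpp-only behavior would need extra shims in the mock header.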

Second solution

Avoid fuzz testing functions that contain Rcpp parameters. This seems to be the easiest solution; however, it means giving up support for packages that take Rcpp types as input, so only standard C++ functions not involving Rcpp parameters could be fuzz tested. Table 1 of the analysis performed by Akhila shows that a huge number of packages use this type of parameter. Link to the paper.

Valgrind and Clang-14 dwarf support

Description

Running RcppDeepState with Valgrind and Clang version 14 gives this strange error message:

Error in read_xml.raw(charToRaw(enc2utf8(x)), "UTF-8", ..., as_html = as_html,  : 
  Premature end of data in tag valgrindoutput line 3 [77]

This is the valgrind execution log over the rcpp_read_out_of_bound function of testSAN.

### unhandled dwarf2 abbrev form code 0x25
### unhandled dwarf2 abbrev form code 0x25
### unhandled dwarf2 abbrev form code 0x25
### unhandled dwarf2 abbrev form code 0x23
==152063== Valgrind: debuginfo reader: ensure_valid failed:
==152063== Valgrind:   during call to ML_(img_get)
==152063== Valgrind:   request for range [24992306, +4) exceeds
==152063== Valgrind:   valid image size of 731184 for image:
==152063== Valgrind:   "/home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_read_out_of_bound/rcpp_read_out_of_bound_DeepState_TestHarness"
==152063== 
==152063== Valgrind: debuginfo reader: Possibly corrupted debuginfo file.
==152063== Valgrind: I can't recover.  Giving up.  Sorry.
==152063== 

Solution

Searching online, I found in this thread that the solution to this problem is to force clang to generate source-level debug information in DWARF version 4 by passing the -gdwarf-4 flag to clang.

Wrong inputs column

Description

I noticed that the analysis result table contains the same input values for every test case of a given function.

How to reproduce

This problem can be reproduced by simply running RcppDeepState on the testSAN package: in the inputs column of the result, all the inputs are the same. The following snippet shows the inputs saved in the resulting table for the rcpp_read_out_of_bound function of the testSAN package.

> result$inputs
[[1]]
[[1]]$rbound
[1] -717690435


[[2]]
[[2]]$rbound
[1] -717690435


[[3]]
[[3]]$rbound
[1] -717690435
...

Possible solution

By looking at the rcpp_read_out_of_bound_log_text log file, it is possible to confirm that the inputs are, in fact, successfully given to the test harness. Thus the inputs saved by the qs library appear to be the cause of this issue.

INFO: Starting fuzzing
input starts
rbound values: -1235514978
input ends
input starts
rbound values: 1545031884
input ends
input starts
rbound values: -635447766
input ends
...

The issue is that the inputs are saved using the qs::c_qsave function when the generator is invoked, so each time a new input is generated the existing ./inputs/rbound.qs file is overwritten.
To fix this issue, the section that saves the inputs has to be moved inside the runner. The .qs file is then still overwritten each time a new test case is run with valgrind, but this is not a problem because RcppDeepState reads the input file before starting the subsequent test.
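The overwrite described above can be reproduced in isolation. The following is a minimal sketch, not RcppDeepState's code: save_input and load_input are hypothetical stand-ins for qs::c_qsave and for the step that reads the .qs file back, plain text is used instead of the qs format, and the file name passed in is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical stand-in for qs::c_qsave: truncates the file on every call.
void save_input(const std::string& path, int value) {
    std::ofstream out(path, std::ios::trunc);
    out << value;
}

// Hypothetical stand-in for reading the saved input back during analysis.
int load_input(const std::string& path) {
    std::ifstream in(path);
    int value = 0;
    in >> value;
    return value;
}

// Saving in the generator: every input is written during fuzzing and the
// file is only read afterwards, so each test case sees the last value.
std::vector<int> recorded_when_saved_in_generator(const std::vector<int>& inputs,
                                                  const std::string& path) {
    for (int value : inputs) save_input(path, value);
    std::vector<int> recorded;
    for (std::size_t i = 0; i < inputs.size(); ++i) {
        recorded.push_back(load_input(path));
    }
    std::remove(path.c_str());
    return recorded;
}

// Saving in the runner: each test case writes its input right before the
// file is read, so the overwrite is harmless.
std::vector<int> recorded_when_saved_in_runner(const std::vector<int>& inputs,
                                               const std::string& path) {
    std::vector<int> recorded;
    for (int value : inputs) {
        save_input(path, value);
        recorded.push_back(load_input(path));
    }
    std::remove(path.c_str());
    return recorded;
}
```

Fed with the three rbound values from the log above, the first function records the last value three times, which matches the repeated inputs in the result table, while the second records each value once.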

Valgrind for initial pass

While writing the GitHub Action for pull request #6 I ran again into the strange segmentation fault mentioned in issue #2. I had assumed that the lack of debug symbols was the cause of the issue, but this does not appear to be the case. The segmentation fault I am referring to occurs for the rcpp_use_after_deallocate function in the testSAN package.

Steps to reproduce

First of all I ran the test harness compilation procedure deepstate_harness_compile_run and it successfully generated the compiled test harness. The first execution, however, leaves the rcpp_use_after_deallocate_output directory empty. So I decided to manually run the harness with the same seed that generated the segmentation fault (5) to understand the problem.

$ ../rcpp_use_after_deallocate_DeepState_TestHarness --fuzz --timeout=1 --seed=5 --fuzz_save_passing --output_test_dir  rcpp_use_after_deallocate_output
INFO: Starting fuzzing
WARNING: No test specified, defaulting to first test defined (_)
input starts
EXTERNAL: qs v0.25.3.

array_size values: 467495432
input ends
[1]    52869 segmentation fault (core dumped)  ./rcpp_use_after_deallocate_DeepState_TestHarness --fuzz --timeout=1 --seed=5

It appears that the segmentation fault occurs before DeepState can generate the output file. If I run the same program under valgrind, the output folder is instead filled with a test file.

$ valgrind ./rcpp_use_after_deallocate_DeepState_TestHarness --fuzz --timeout=1 --seed=5 --fuzz_save_passing --output_test_dir  rcpp_use_after_deallocate_output
==52956== Memcheck, a memory error detector
==52956== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==52956== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==52956== Command: ./rcpp_use_after_deallocate_DeepState_TestHarness --fuzz --timeout=1 --seed=5 --fuzz_save_passing --output_test_dir rcpp_use_after_deallocate_output
==52956== 
INFO: Starting fuzzing
WARNING: No test specified, defaulting to first test defined (_)
input starts
EXTERNAL: qs v0.25.3.

array_size values: 467495432
input ends
==52956== Warning: set address range perms: large range [0xd65c040, 0x29432a48) (undefined)
==52956== Warning: set address range perms: large range [0xd65c028, 0x29432a60) (noaccess)
==52956== Invalid read of size 1
==52956==    at 0x4D539AA: rcpp_use_after_deallocate(int) (use_after_deallocate.cpp:8)
==52956==    by 0x114FB6: DeepState_Test__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:23)
==52956==    by 0x114D58: DeepState_Run__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:14)
==52956==    by 0x12B124: DeepState_RunTestNoFork (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==    by 0x12B668: DeepState_FuzzOneTestCase (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==    by 0x12B8FF: DeepState_Fuzz (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==    by 0x11074A: main (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==  Address 0xd65c045 is 5 bytes inside a block of size 467,495,432 free'd
==52956==    at 0x4849A7F: operator delete[](void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==52956==    by 0x4D539A9: rcpp_use_after_deallocate(int) (use_after_deallocate.cpp:7)
==52956==    by 0x114FB6: DeepState_Test__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:23)
==52956==    by 0x114D58: DeepState_Run__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:14)
==52956==    by 0x12B124: DeepState_RunTestNoFork (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==    by 0x12B668: DeepState_FuzzOneTestCase (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==    by 0x12B8FF: DeepState_Fuzz (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==    by 0x11074A: main (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==  Block was alloc'd at
==52956==    at 0x48472E3: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==52956==    by 0x4D5399D: rcpp_use_after_deallocate(int) (use_after_deallocate.cpp:6)
==52956==    by 0x114FB6: DeepState_Test__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:23)
==52956==    by 0x114D58: DeepState_Run__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:14)
==52956==    by 0x12B124: DeepState_RunTestNoFork (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==    by 0x12B668: DeepState_FuzzOneTestCase (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==    by 0x12B8FF: DeepState_Fuzz (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956==    by 0x11074A: main (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==52956== 
INFO: Done fuzzing! Ran 1 tests (1 tests/second) with 0 failed/1 passed/0 abandoned tests
==52956== 
==52956== HEAP SUMMARY:
==52956==     in use at exit: 51,641,784 bytes in 10,583 blocks
==52956==   total heap usage: 33,306 allocs, 22,723 frees, 560,059,861 bytes allocated
==52956== 
==52956== LEAK SUMMARY:
==52956==    definitely lost: 0 bytes in 0 blocks
==52956==    indirectly lost: 0 bytes in 0 blocks
==52956==      possibly lost: 0 bytes in 0 blocks
==52956==    still reachable: 51,641,784 bytes in 10,583 blocks
==52956==                       of which reachable via heuristic:
==52956==                         newarray           : 4,264 bytes in 1 blocks
==52956==         suppressed: 0 bytes in 0 blocks
==52956== Rerun with --leak-check=full to see details of leaked memory
==52956== 
==52956== For lists of detected and suppressed errors, rerun with: -s
==52956== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

The inverse problem

@tdhock, I found this old conversation in the issue akhikolla/RcppDeepState#62 about using Valgrind in the first steps, after the harness compilation. Based on this discussion, I discovered that when the seed is set to 2 and Valgrind is used to run the test, no output file is produced.

As you can see, running this test standalone generates an output file.

$ ./rcpp_use_after_deallocate_DeepState_TestHarness --fuzz --timeout=1 --seed=2 --fuzz_save_passing --output_test_dir  rcpp_use_after_deallocate_output
INFO: Starting fuzzing
WARNING: No test specified, defaulting to first test defined (_)
input starts
EXTERNAL: qs v0.25.3.

array_size values: -92322737
input ends
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
ERROR: Failed: _
INFO: Saved test case in file `rcpp_use_after_deallocate_output/520d9b63d5d9e5fa249b7fae87d2621c419ddf0a.fail`
input starts
array_size values: 1957496050
input ends
[1]    53624 segmentation fault (core dumped)  ./rcpp_use_after_deallocate_DeepState_TestHarness --fuzz --timeout=1 --seed=2

Instead, if Valgrind is used, the program is aborted, as stated in the message: cannot throw exceptions and so is aborting instead. Sorry.

$ valgrind  ./rcpp_use_after_deallocate_DeepState_TestHarness --fuzz --timeout=1 --seed=2 --fuzz_save_passing --output_test_dir  rcpp_use_after_deallocate_output
==53705== Memcheck, a memory error detector
==53705== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==53705== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==53705== Command: ./rcpp_use_after_deallocate_DeepState_TestHarness --fuzz --timeout=1 --seed=2 --fuzz_save_passing --output_test_dir rcpp_use_after_deallocate_output
==53705== 
INFO: Starting fuzzing
WARNING: No test specified, defaulting to first test defined (_)
input starts
EXTERNAL: qs v0.25.3.

array_size values: -92322737
input ends
==53705== Argument 'size' of function __builtin_vec_new has a fishy (possibly negative) value: -92322737
==53705==    at 0x48472E3: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==53705==    by 0x4D5399D: rcpp_use_after_deallocate(int) (use_after_deallocate.cpp:6)
==53705==    by 0x114FB6: DeepState_Test__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:23)
==53705==    by 0x114D58: DeepState_Run__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:14)
==53705==    by 0x12B124: DeepState_RunTestNoFork (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==53705==    by 0x12B668: DeepState_FuzzOneTestCase (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==53705==    by 0x12B8FF: DeepState_Fuzz (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==53705==    by 0x11074A: main (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==53705== 
**53705** new/new[] failed and should throw an exception, but Valgrind
**53705**    cannot throw exceptions and so is aborting instead.  Sorry.
==53705==    at 0x484551C: ??? (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==53705==    by 0x4847355: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==53705==    by 0x4D5399D: rcpp_use_after_deallocate(int) (use_after_deallocate.cpp:6)
==53705==    by 0x114FB6: DeepState_Test__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:23)
==53705==    by 0x114D58: DeepState_Run__() (rcpp_use_after_deallocate_DeepState_TestHarness.cpp:14)
==53705==    by 0x12B124: DeepState_RunTestNoFork (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==53705==    by 0x12B668: DeepState_FuzzOneTestCase (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==53705==    by 0x12B8FF: DeepState_Fuzz (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==53705==    by 0x11074A: main (in /home/fabri/test/testHarness/RcppDeepState/inst/testpkgs/testSAN/inst/testfiles/rcpp_use_after_deallocate/rcpp_use_after_deallocate_DeepState_TestHarness)
==53705== 
==53705== HEAP SUMMARY:
==53705==     in use at exit: 51,656,156 bytes in 10,684 blocks
==53705==   total heap usage: 33,302 allocs, 22,618 frees, 92,559,782 bytes allocated
==53705== 
==53705== LEAK SUMMARY:
==53705==    definitely lost: 0 bytes in 0 blocks
==53705==    indirectly lost: 0 bytes in 0 blocks
==53705==      possibly lost: 0 bytes in 0 blocks
==53705==    still reachable: 51,656,156 bytes in 10,684 blocks
==53705==                       of which reachable via heuristic:
==53705==                         newarray           : 4,264 bytes in 1 blocks
==53705==         suppressed: 0 bytes in 0 blocks
==53705== Rerun with --leak-check=full to see details of leaked memory
==53705== 
==53705== For lists of detected and suppressed errors, rerun with: -s
==53705== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
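Both seed 2 runs above fail at the same place: Valgrind flags the negative array_size passed to operator new[], and the standalone run terminates with std::bad_alloc. The following is a minimal sketch, not the harness itself, of why such an exception would bypass a handler that only catches Rcpp::exception, assuming the handler looks like the one in the generated harness shown in the first issue. MockRcppException, requested_size and classify are hypothetical names, and the sketch throws std::bad_alloc itself instead of really requesting the memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <exception>
#include <new>
#include <string>

// Hypothetical stand-in for Rcpp::exception so the sketch builds without Rcpp.
struct MockRcppException : std::exception {};

// Models the size conversion behind `new T[array_size]`: the int becomes an
// unsigned size, so a negative value turns into an enormous request. The
// sketch throws std::bad_alloc itself rather than asking for that memory.
std::size_t requested_size(int array_size) {
    const std::size_t n = static_cast<std::size_t>(array_size);
    if (n > static_cast<std::size_t>(PTRDIFF_MAX)) {
        throw std::bad_alloc();
    }
    return n;
}

// Reports which handler, if any, sees the failure.
std::string classify(int array_size) {
    try {
        try {
            requested_size(array_size);
            return "no exception";
        } catch (const MockRcppException&) {
            return "caught by the Rcpp-style handler";
        }
    } catch (const std::bad_alloc&) {
        return "std::bad_alloc escaped the Rcpp-style handler";
    }
}
```

Calling classify(-92322737), the array_size generated with seed 2, takes the outer branch. Under this assumption, a harness that wants to record such inputs as ordinary failures would need a broader handler, for example one catching std::exception.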

Missing Rcpp Strings support

Description

RcppDeepState supports the analysis of functions whose arguments have a type in the following list:

  • Rcpp::NumericVector
  • Rcpp::NumericMatrix
  • Rcpp::CharacterVector
  • Rcpp::IntegerVector
  • arma::mat
  • double
  • std::string
  • int

However, I found that if a function contains an Rcpp::String argument, the function will not be analyzed: RcppDeepState currently only accepts std::string arguments, not Rcpp::String. Filling this gap by adding support for Rcpp::String arguments would be a good step.

@tdhock I plan to solve this in a new pull request.

Steps to reproduce

I created a simple Rcpp-based package containing the getLen function, which calculates the length of an Rcpp::String.

#include <Rcpp.h>
#include <cstring>  // std::strlen

// [[Rcpp::export]]
int getLen(Rcpp::String arg1){
    // Length in bytes of the underlying C string
    return(std::strlen(arg1.get_cstr()));
}

However, if I attempt to run RcppDeepState on this package, an error is thrown because the Rcpp::String argument is not supported.

> require("RcppDeepState")
> deepstate_harness_compile_run("/home/fabri/test/testHarness/Rcpp/getLen", verbose=TRUE)
...

We can't test the function - getLen - due to the following datatypes falling out of the allowed ones: String

Error in deepstate_pkg_create(package_path, verbose) : 
  Testharnesses cannot be created for the package - datatypes fall out of specified list!!
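The rejection above amounts to checking each parameter type against the list of allowed types. The following is a minimal sketch of such a check, written in C++ purely for illustration; it is not how RcppDeepState implements it. allowed_types and unsupported_types are hypothetical names, and the type names are the ones from the list in this issue with their namespaces dropped, matching the unqualified names printed in the error message.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Allowed parameter types, taken from the list in this issue with the
// namespace prefixes dropped.
const std::set<std::string>& allowed_types() {
    static const std::set<std::string> types = {
        "NumericVector", "NumericMatrix", "CharacterVector", "IntegerVector",
        "mat", "double", "string", "int"};
    return types;
}

// Hypothetical helper: returns the parameter types that are not allowed.
std::vector<std::string> unsupported_types(const std::vector<std::string>& params) {
    std::vector<std::string> rejected;
    for (const std::string& type : params) {
        if (allowed_types().count(type) == 0) {
            rejected.push_back(type);
        }
    }
    return rejected;
}
```

With this model, a function whose only parameter is String is rejected, as getLen is, and supporting Rcpp::String would start with adding an entry to the allowed set and a matching input generator.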

RcppDeepState optimization options

I noticed that RcppDeepState wasn't able to find issues in some functions of the testSAN package, so I started to manually run and analyze each function one by one. I first tried to run RcppDeepState on the rcpp_write_index_outofbound function and, no matter the seed, the result table was always empty. By manually running the binary file for the rcpp_write_index_outofbound function, I found that no error is thrown during the execution. After some tests I discovered that this problem is due to the optimization options in the compiler flags used when the command R CMD INSTALL is run for the package.

Proof of concept

To understand this problem and have a practical example, I copied the rcpp_write_index_outofbound function and added a main function that calls rcpp_write_index_outofbound with a parameter that will certainly write far past the end of the allocated array and cause a heap overflow. This is the result.

#include <iostream>
using namespace std;

// [[Rcpp::export]]
int rcpp_write_index_outofbound(int wbound){
  int x = wbound;
  int *stack_array = new int[100];
  stack_array[x+100] = 50;
  return 0; 
} 

int main(int argc, char** argv){
    int bigNumber = 1353252;
    rcpp_write_index_outofbound(bigNumber);
    cout << "Finished" << endl;
    return 0;
}

I tried to compile this program in two different ways: the first with optimization options and the second with the optimizer disabled. What I expect from each execution is a segmentation fault; however, the results are quite different:

$ g++ -o test -O2 test.cpp
$ ./test
Finished

$ g++ -o test test.cpp
$ ./test
[1]    9815 segmentation fault (core dumped)  ./test

Result and future work

As a result, we can conclude that RcppDeepState requires compiler optimizations to be disabled in order to be fully effective and detect subtler bugs. Since changing the optimization options means changing CXXFLAGS in the ~/.R/Makevars file, this should be a prerequisite for using RcppDeepState.

Future work: I have to keep in mind that optimization options must be turned off in the Action's Docker-based system that I will create for RcppDeepState.
