
rmongodb's Introduction

Project status

Dear R and rmongodb users, the rmongodb project is based on legacy C drivers. For the moment, I (@dselivanov) don't have time for rmongodb development, and unless someone ports the new drivers, the package will remain with outdated functionality (see issues). If any of you want to undertake package maintenance, let me know.

For new R / MongoDB users, I recommend starting with the mongolite package, which is much more actively maintained.

rmongodb

This is an R (www.r-project.org) extension supporting access to MongoDB (www.mongodb.org) using the mongodb-c-driver (http://docs.mongodb.org/ecosystem/drivers/c/).

The latest stable version is available on CRAN: http://cran.r-project.org/package=rmongodb

Thanks to Gerald Lindsly and MongoDB, Inc. (formerly 10gen) for the initial work.

In October 2013, MongoSoup and Markus Schmidberger took over the development and maintenance of the R package.

Since October 2014, the package has been maintained by Dmitriy Selivanov. Please feel free to send us issues or pull requests via GitHub: https://github.com/mongosoup/rmongodb

Furthermore, I'm happy to get your feedback personally via email: selivanov.dmitriy (at) gmail.com.

Usage

Once you have installed the package, it may be loaded from within R like any other package:

library("rmongodb")

# connect to your local mongodb
mongo <- mongo.create()

# create query object 
query <- mongo.bson.from.JSON('{"age": 27}')

# Find the first 100 records
#    in collection people of database test where age == 27
cursor <- mongo.find(mongo, "test.people", query, limit=100L)
# Step through the matching records and display them
while (mongo.cursor.next(cursor))
    print(mongo.cursor.value(cursor))
mongo.cursor.destroy(cursor)

res <- mongo.find.batch(mongo, "test.people", query, limit=100L)

mongo.disconnect(mongo)
mongo.destroy(mongo)

There is also one demo available:

library("rmongodb")
demo(teachers_aid)

Supported Functionality by rmongodb

  • Connecting to and disconnecting from MongoDB
  • Querying, inserting into and updating MongoDB, including with JSON and BSON
  • Creating and handling BSON objects
  • Dropping collections and databases on MongoDB
  • Creating indices on MongoDB collections
  • Error handling
  • Executing commands on MongoDB
  • Adding, removing, handling files on a "Grid File System" (GridFS) on a MongoDB server
  • High-level functionality such as mongo.apply, mongo.summary, mongo.get.keys, ...
  • Aggregation pipeline

Good resources to Get Started with rmongodb

Good resources to Install and Get Started with MongoDB

Good resources for working with JSON-Data in R:

Development

To install the development version of rmongodb, it's easiest to use the devtools package:

# install.packages("devtools")
library(devtools)
install_github("mongosoup/rmongodb")

We advise using RStudio (www.rstudio.org) for package development. The RStudio .Rproj file is included in the repository.

Useful links

Versioning

We use a three step version number system, e.g. v1.2.1:

  • first: major changes such as new C libraries
  • second: for each new stable CRAN release
  • third: for each new github version ready for testing

General Development Rules

  • we use roxygen2
  • we write RUnit tests for all new functionality in tests/test_XXX.R
  • for bigger changes we use branches
  • run valgrind to check for memory leaks: R -d "valgrind --tool=memcheck --leak-check=full" --vanilla < test_XXX.R > log.txt 2>&1
  • CRAN submission:
      • http://cran.r-project.org/submit.html
      • create the package tar.gz via RStudio "Build Source Package"
      • run the CRAN checks via: R CMD check --as-cran package.tar.gz
      • run the CRAN checks without a running MongoDB installation
      • create a tag / release on GitHub for every CRAN submission

rmongodb's People

Contributors

dselivanov, dtenenba, fred777, gerald-lindsly, jeroen, kbroman, marians, mariokoppen, monkey101, musically-ut, schmidb, stanstrup

rmongodb's Issues

`mongo.bson.from.list` returns `NULL` occasionally

I could reproduce this bug on my desktop (Ubuntu) and laptop (Mac OS X).

Here is my reproducible example.

library(rmongodb)
sessionInfo()
## R version 3.0.2 (2013-09-25)
## Platform: x86_64-pc-linux-gnu (64-bit)
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C         LC_TIME=C           
##  [4] LC_COLLATE=C         LC_MONETARY=C        LC_MESSAGES=C       
##  [7] LC_PAPER=C           LC_NAME=C            LC_ADDRESS=C        
## [10] LC_TELEPHONE=C       LC_MEASUREMENT=C     LC_IDENTIFICATION=C 
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] rmongodb_1.1.3 knitr_1.5     
## 
## loaded via a namespace (and not attached):
## [1] evaluate_0.5.1 formatR_0.10   stringr_0.6.2  tools_3.0.2
# 
a <- memCompress(serialize(letters, NULL), "gzip")
for (i in 1:100) {
    print(mongo.bson.from.list(list(a = a)))
}
## NULL
## NULL
## NULL
## NULL
## NULL
##  a : 5    BSON_BINDATA
##  a : 5    BSON_BINDATA
## NULL
## NULL
## NULL
## NULL
## NULL
## NULL
## NULL
## NULL
##  a : 5    BSON_BINDATA
## NULL
## NULL
## NULL
## NULL
##  a : 5    BSON_BINDATA
##  a : 5    BSON_BINDATA
## NULL
##  a : 5    BSON_BINDATA
## NULL
## NULL
## NULL
## NULL
## NULL
## NULL
## NULL
## NULL
##  a : 5    BSON_BINDATA
## NULL
## NULL
## NULL
## NULL
## NULL
##  a : 5    BSON_BINDATA
## NULL
## NULL
## NULL
## NULL
##  a : 5    BSON_BINDATA
##  a : 5    BSON_BINDATA
## NULL
## NULL
## NULL
## NULL
## NULL
## NULL
## NULL
## NULL
##  a : 5    BSON_BINDATA
##  a : 5    BSON_BINDATA
##  a : 5    BSON_BINDATA
## NULL
## NULL
##  a : 5    BSON_BINDATA
## NULL
## NULL
##  a : 5    BSON_BINDATA
## NULL
##  a : 5    BSON_BINDATA
## NULL
## NULL
## NULL
## NULL
##  a : 5    BSON_BINDATA
##  a : 5    BSON_BINDATA
## NULL
## NULL
## NULL
## NULL
## NULL
## NULL
##  a : 5    BSON_BINDATA
## NULL
##  a : 5    BSON_BINDATA
## NULL
## NULL
##  a : 5    BSON_BINDATA
## NULL
##  a : 5    BSON_BINDATA
##  a : 5    BSON_BINDATA
## NULL
##  a : 5    BSON_BINDATA
##  a : 5    BSON_BINDATA
## NULL
## NULL
## NULL
##  a : 5    BSON_BINDATA
## NULL
## NULL
##  a : 5    BSON_BINDATA
## NULL
## NULL
##  a : 5    BSON_BINDATA
## NULL
## NULL

Cannot retrieve data when $in is used

I am trying to run a query that looks like this in JSON format:

{
    "projectId": "d84b99a1123ccc6052807f81c019f17cb61151bf",
    "it": {
        "$in": [
            "ea",
            "eo"
        ]
    }
}

In the MongoDB command line, this returns some rows:

> use test
> db.intervals.find({"projectId" : "d84b99a1123ccc6052807f81c019f17cb61151bf", "it":{"$in":["ea", "eo"]}}).count()
1825

However, when I try exactly the same query in R with rmongodb, I get -1 as a result:

> mongo.count(conn, "test.intervals", mongo.bson.from.JSON('{"projectId" : "d84b99a1123ccc6052807f81c019f17cb61151bf", "it":{"$in":["ea", "eo"]}}'))
[1] -1

This behavior is the same in the version of rmongodb that is currently on CRAN (1.6.5) and also the latest master ea900cc.

Unnamed lists as BSON arrays

It seems both mongo_bson_from_list and mongo_bson_buffer_append_list have been explicitly designed to encode unnamed R lists as bson objects, using mkChar(numstr(i+1)) as keys. This is quite weird and leads to unexpected output and bug reports.

It seems much more intuitive to convert unnamed lists to a bson array. Hence when getAttrib(value, R_NamesSymbol) == R_NilValue, we should initiate the buffer with bson_append_start_array instead of bson_append_start_object.

Simple example:

> mongo.bson.to.list(mongo.bson.from.list(list(123, 456)))
$`1`
[1] 123

$`2`
[1] 456
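
The proposed rule can be sketched in plain R, without touching the C code. The helper name bson_container_kind is hypothetical and only mirrors the decision that the check getAttrib(value, R_NamesSymbol) == R_NilValue would make on the C side; it is not part of rmongodb.

```r
# Hypothetical helper (not part of rmongodb): decide how an R list
# would be encoded under the proposal above.
# An unnamed list maps to a BSON array, a named list to a BSON object.
bson_container_kind <- function(value) {
  stopifnot(is.list(value))
  if (is.null(names(value))) "array" else "object"
}

bson_container_kind(list(123, 456))          # unnamed list
bson_container_kind(list(a = 123, b = 456))  # named list
```

A partially named list still has a names attribute, so under this sketch it would be treated as an object.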

mongo.insert results in wrong bson format in mongodb

Hi. First of all, thanks for a great package! I have an issue when inserting nested JSON. Here is reproducible code:

library(RJSONIO)
library(RCurl)
library(rmongodb)
sessionInfo()
# R version 3.0.2 (2013-09-25)
# Platform: x86_64-pc-linux-gnu (64-bit)
# 
# locale:
#   [1] LC_CTYPE=ru_RU.UTF-8       LC_NUMERIC=C               LC_TIME=ru_RU.UTF-8        LC_COLLATE=ru_RU.UTF-8    
# [5] LC_MONETARY=ru_RU.UTF-8    LC_MESSAGES=ru_RU.UTF-8    LC_PAPER=ru_RU.UTF-8       LC_NAME=C                 
# [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=ru_RU.UTF-8 LC_IDENTIFICATION=C       
# 
# attached base packages:
#   [1] parallel  stats     graphics  grDevices utils     datasets  methods   base     
# 
# other attached packages:
#   [1] rmongodb_1.4.2    RCurl_1.95-4.1    bitops_1.0-6      RJSONIO_1.0-3     data.table_1.8.10
# 
# loaded via a namespace (and not attached):
#   [1] jsonlite_0.9.3 tools_3.0.2   
url <- "https://api.vk.com/method/wall.get?owner_id=-44067087&offset=0&count=1&v=5.5"
json = getURL(url, .opts=list(ssl.verifypeer = FALSE))
# change connection to your mongo server
mongo = mongo.create(host = "localhost")
mongo.is.connected(mongo)
# TRUE
mongo.insert(mongo, 'test.test_insert', mongo.bson.from.JSON(json))

This inserts an incorrect BSON document like this (note the "items" and "attachments" fields):

{
    "_id" : ObjectId("52f479ccfb5b1584975727d8"),
    "response" : {
        "count" : 79,
        "items" : {
            "1" : {
                "id" : 486,
                "from_id" : -44067087,
                "to_id" : -44067087,
                "date" : 1391628365,
                "post_type" : "post",
                "text" : "",
                "attachments" : {
                    "1" : {
                        "type" : "photo",
                        "photo" : {
                            "id" : 320592184,
                            "album_id" : -7,
                            "owner_id" : -44067087,
                            "user_id" : 100,
                            "photo_75" : "http://cs616327.vk.me/v616327626/4391/cUoVkmOTJas.jpg",
                            "photo_130" : "http://cs616327.vk.me/v616327626/4392/tyoSH8PgNPg.jpg",
                            "photo_604" : "http://cs616327.vk.me/v616327626/4393/dd6QdD_lqEU.jpg",
                            "photo_807" : "http://cs616327.vk.me/v616327626/4394/gsYHzQsN5X8.jpg",
                            "photo_1280" : "http://cs616327.vk.me/v616327626/4395/VRma-ROTfWY.jpg",
                            "photo_2560" : "http://cs616327.vk.me/v616327626/4396/2IVbKsa4t7c.jpg",
                            "width" : 1275,
                            "height" : 1875,
                            "text" : "",
                            "date" : 1391628366,
                            "post_id" : 486,
                            "access_key" : "9710c7bfdc5544bbd0"
                        }
                    }
                },
                "comments" : {
                    "count" : 0
                },
                "likes" : {
                    "count" : 0
                },
                "reposts" : {
                    "count" : 0
                }
            }
        }
    }
}

But if I save the same JSON to disk and then use mongoimport, I get the correct BSON record.

writeChar(json,con='test_insert.json')
mongoimport --db test --collection test_insert --drop test_insert.json

The correct object, with "items" and "attachments" stored as arrays, looks like this:

{
    "_id" : ObjectId("52f47ce0fb5b15849757284d"),
    "response" : {
        "count" : 79,
        "items" : [
            {
                "id" : 486,
                "from_id" : -44067087,
                "to_id" : -44067087,
                "date" : 1391628365,
                "post_type" : "post",
                "text" : "",
                "attachments" : [
                    {
                        "type" : "photo",
                        "photo" : {
                            "id" : 320592184,
                            "album_id" : -7,
                            "owner_id" : -44067087,
                            "user_id" : 100,
                            "photo_75" : "http://cs616327.vk.me/v616327626/4391/cUoVkmOTJas.jpg",
                            "photo_130" : "http://cs616327.vk.me/v616327626/4392/tyoSH8PgNPg.jpg",
                            "photo_604" : "http://cs616327.vk.me/v616327626/4393/dd6QdD_lqEU.jpg",
                            "photo_807" : "http://cs616327.vk.me/v616327626/4394/gsYHzQsN5X8.jpg",
                            "photo_1280" : "http://cs616327.vk.me/v616327626/4395/VRma-ROTfWY.jpg",
                            "photo_2560" : "http://cs616327.vk.me/v616327626/4396/2IVbKsa4t7c.jpg",
                            "width" : 1275,
                            "height" : 1875,
                            "text" : "",
                            "date" : 1391628366,
                            "post_id" : 486,
                            "access_key" : "9710c7bfdc5544bbd0"
                        }
                    }
                ],
                "comments" : {
                    "count" : 0
                },
                "likes" : {
                    "count" : 0
                },
                "reposts" : {
                    "count" : 0
                }
            }
        ]
    }
}

It seems mongo.bson.from.list() should encode a list as an array of objects when its components have no names.

mongo.bson.to.list converting issues

Hi. There is one issue with mongo.bson.to.list. It tries to simplify results and turns them into a named vector instead of a list. So if I have {id:1, value:1}, it converts it into a named vector. But if I have {id:1, value:1, string: 'test'}, mongo.bson.to.list returns a list.
I think this function should have an option like 'simplify' in the sapply function.
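
A minimal sketch of what such an option could look like, written against plain R lists rather than rmongodb. The function name simplify_fields and its argument are hypothetical, and the rule it applies (collapse only same-typed scalar fields) is an assumption meant to mirror the behaviour described above.

```r
# Hypothetical post-processing step, not part of rmongodb:
# collapse a list of fields into a named vector only when asked to,
# and only when every field is a length-one atomic value of the same type.
simplify_fields <- function(fields, simplify = TRUE) {
  scalar <- vapply(fields, function(f) is.atomic(f) && length(f) == 1L, logical(1))
  same_type <- length(unique(vapply(fields, typeof, character(1)))) == 1L
  if (simplify && all(scalar) && same_type) unlist(fields) else fields
}

doc <- list(id = 1, value = 1)
simplify_fields(doc)                    # collapsed to a named numeric vector
simplify_fields(doc, simplify = FALSE)  # stays a list
simplify_fields(list(id = 1, value = 1, string = "test"))  # mixed types stay a list
```

With simplify = FALSE the caller always gets a list back, whatever the field types are.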

Not catching find errors

I was playing with a 250,000-row dataset and kept having problems with mongo.find not returning any results when adding a sort. After running the same command in the shell, it is now clear what the problem is:

> db.flights.find().sort({row:1}).limit(2)
error: {
    "$err" : "Runner error: Overflow sort stage buffered data usage of 33554441 bytes exceeds internal limit of 33554432 bytes",
    "code" : 17144
}

It would be nice if mongo.find was able to catch such an error in R, rather than just returning a cursor with 0 records.

data size limits

As a continuation of the discussion I started here, I am wondering if you can shed some light on data size limits.
Looking at the rmongodb documentation, it seems MongoDB should be able to handle large data sizes, but rmongodb seems to fail.

I made the script below to test whether my troubles were a size limit issue, and it seems so. It consistently fails when counter = 512.

start = 10000
counter = 1
success = TRUE


mongo <- mongo.create()


while(success==TRUE){
  testdata = 1:(start*counter)

  buf <- mongo.bson.buffer.create()
  mongo.bson.buffer.append(buf, "testdata", testdata)
  mongo.bson.buffer.append(buf, "size", object.size(testdata))
  buf <- mongo.bson.from.buffer(buf)
  success <- mongo.insert(mongo, "limittest_db.limit", buf)

  counter = counter*2
}

del <- mongo.disconnect(mongo)
del <- mongo.destroy(mongo)

When it fails, length(testdata) is 2560000 and object.size(testdata) is 10240040 bytes. Is this a bug or a limitation of MongoDB?
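
One plausible explanation, offered as an assumption rather than a confirmed diagnosis, is MongoDB's 16 MiB limit on a single BSON document. A BSON array is stored as a document whose keys are the decimal strings "0", "1", ..., so each int32 element costs a type byte, its key with a terminating NUL, and 4 bytes of payload, which is far more than the 4 bytes per element that object.size reports. The sketch below estimates the encoded size of the array alone; the function name is hypothetical and the small overhead of the surrounding document is ignored.

```r
# Estimate the encoded size in bytes of a BSON array of n int32 values.
# A BSON array is a document whose keys are "0", "1", ..., "n-1".
bson_int32_array_bytes <- function(n) {
  idx <- seq_len(n) - 1L                          # integer keys 0 .. n-1
  key_bytes <- sum(nchar(as.character(idx))) + n  # digits plus one NUL per key
  key_bytes + n * (1 + 4) +                       # type byte and int32 payload per element
    4 + 1                                         # document length prefix and trailing NUL
}

limit <- 16 * 1024^2   # assumed server limit: 16 MiB per document

# Vector lengths of the last successful and the first failing iteration.
sizes <- c(ok      = bson_int32_array_bytes(10000 * 128),
           failing = bson_int32_array_bytes(10000 * 256))
print(sizes)
print(sizes > limit)
```

Under these assumptions the 1,280,000-element array stays below the limit and the 2,560,000-element array is roughly twice as large as the limit allows, which matches the point at which the script stops.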

request for further specification of available functionality

First of all, thanks for putting this together. Would it be helpful to clarify some of the available functionality? For example, the mongo.cursor.to.list() and .to.data.frame() functions return the following error for me:

"This fails for most NoSQL data structures. I am working on a new solution"

It's a kind enough error message, and I realize there's a lot of work to do to make this kind of thing work, but it's pretty disappointing to go to the trouble of understanding the package only to see that error.

ability to turn lists into complex BSON objects

It would be great if there was a way to avoid creating BSON buffers. Like if you could do this:

l <- list(name="Dan", age=list("$lte"=30L), occupation=list("$in"=c("Tailor", "Sailor")))
mongo.find.one(mongo, ns, query=list2bson(l))

That doesn't seem hard to do. I might work on it if I have time. It's nice that you can use lists for simple queries, but it's tedious to make a BSON buffer for any non-trivial query, for example one involving a $ operator. Since you can express any JSON-like object in R using nested lists and vectors, we should be able to convert that to BSON.

citation of rmongodb

What would be the most proper way to cite rmongodb? citation("rmongodb") is auto-generated from the description so perhaps there is a better way?

mongo.insert.batch fails in introduction.Rmd & script suggestion to speed up

This issue concerns the introduction.Rmd script in the vignettes directory.

The zips dataset has several documents with identical _id values. The
mongo.insert.batch(mongo, "rmongodb.zips", res) statement fails with an error message in the server window regarding a duplicate _id. I got it to run by inserting the following line:

myzips <- zips[ !duplicated( zips[,"_id"]), ]

and substituting myzips for zips.

Also, using a for loop is a relatively slow way to create all the bson values stored in the variable res. It takes 5.55 seconds on my laptop (including the mongo.insert.batch function call). The process can be sped up by first using an apply to convert the list matrix into a list:

myziplist <- list()
myziplist <- apply( myzips, 1, function(x) c( myziplist, x ) )

and then using lapply to create res

res <- lapply( myziplist, function(x) mongo.bson.from.list(x) )

This takes 1.28 seconds on my laptop (including the mongo.insert.batch function call).

One more small point. It is unnecessary to check for the MongoDB connection using

if(mongo.is.connected(mongo) == TRUE)

since mongo.is.connected(mongo) returns the value TRUE if there is a connection

if(mongo.is.connected(mongo))

is sufficient.

Thanks much for working on this library. I think it may offer a database solution for an R application I'm developing. I'll be testing the potential over the next few weeks.

add examples on how to query by date with from.JSON

I need to query by date, both in a standard query and in an aggregation pipeline. Since the queries can get complex, I'd like to use mongo.bson.from.JSON to simplify creating the query, but I can't find any examples of creating a query with a date.

Here is one of my attempts:

query <- mongo.bson.from.JSON('{
"ts": {
"$gt": { "$date": "2014-03-01T00:00:00Z" }
}
}')

How should this query be created?
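
One possible route, sketched here under stated assumptions, is to skip the JSON wrapper and append an R date-time through the buffer functions. BSON stores a date as milliseconds since the Unix epoch. The sketch assumes that mongo.bson.buffer.append() stores a POSIXct value as a BSON date; that behaviour is not verified here, and the rmongodb calls are guarded so the block still runs where the package is not installed.

```r
# Parse the ISO 8601 timestamp from the attempt above as UTC.
since <- as.POSIXct("2014-03-01T00:00:00", format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

# BSON dates hold milliseconds since the Unix epoch.
since_ms <- as.numeric(since) * 1000

# Assumption: mongo.bson.buffer.append() stores a POSIXct value as a BSON date.
# The guard keeps the sketch runnable where rmongodb is not installed.
if (requireNamespace("rmongodb", quietly = TRUE)) {
  buf <- rmongodb::mongo.bson.buffer.create()
  rmongodb::mongo.bson.buffer.start.object(buf, "ts")   # opens { "ts": {
  rmongodb::mongo.bson.buffer.append(buf, "$gt", since) #   "$gt": <date>
  rmongodb::mongo.bson.buffer.finish.object(buf)        # closes the inner object
  query <- rmongodb::mongo.bson.from.buffer(buf)
}
```

The resulting query object would then be passed to mongo.find in the same way as a query built from JSON.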

error in test_find

submitted by Brian Ripley:

Everything was OK until

Running ‘test_find.R’
ERROR
Running the tests in ‘tests/test_find.R’ failed.
Last 13 lines of output:

+   res <- mongo.find.all(mongo, ns)
+   checkEquals( dim(res), c(4,4) )
+   checkTrue( is.list(res) )
+   # cleanup db and close connection
+
+   mongo.drop.database(mongo, db)
+   mongo.destroy(mongo)
+ }
[1] 16810
[1] "bad query: BadValue unknown operator: $bad"
Error in checkEqualsNumeric(err, 10068) :
  Mean relative difference: 0.4010708

Unable to query a 2dsphere index

I am trying to run find on a 2dsphere index (http://docs.mongodb.org/manual/tutorial/query-a-2dsphere-index/) using rmongodb 1.4.2

db.places.find( { loc :
                  { $geoWithin :
                    { $geometry :
                      { type : "Polygon" ,
                        coordinates : 
                         [ [ [ 2 , 50 ] ,  [ 3 , 50 ] ,[ 3 , 51 ] , [ 2 , 51 ], [ 2 , 50 ] ] ]
                } } } } )

Although this works perfectly in the MongoDB command line, I am unable to get it to work using:

string <- paste0('{ "loc" : { "$geoWithin" : { "$geometry" : { type : "Polygon" ,"coordinates": [ [ [',longitude,', ',latitude,'], [',longitude+1,', ',latitude,'], [',longitude+1,', ',latitude+1,'], [',longitude,', ',latitude+1,'], [',longitude,', ',latitude,'] ] ] } } } }')
query <- mongo.bson.from.JSON (string)

The resulting string can be fed into a command-line find and works fine:

db.SR_Data_Collection_1.count( { "loc" : { "$geoWithin" : { "$geometry" : { "type" : "Polygon" ,"coordinates": [ [ [2, 50], [3, 50], [3, 51], [2, 51], [2, 50] ] ] } } } })

I am able to run find on a 2d index, like so:

string <- paste0('{ "loc" : { "$within" : { "$box" : [[',longitude,', ',latitude,'], [',longitude+1,', ',latitude+1,']] } } }')    
query <- mongo.bson.from.JSON (string)

The only major difference seems to be the "nestedness" of the JSON in the 2dsphere find - maybe the [[[]]] array is the problem?
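
One detail worth checking, offered as an observation rather than a confirmed diagnosis: in the paste0() call above the key type is not quoted, whereas the string shown as working in the shell has "type" quoted. Strict JSON requires quoted keys, while the mongo shell also accepts unquoted ones. The sketch below builds the same query with every key quoted; the helper name geo_within_json is hypothetical.

```r
# Hypothetical helper: build a strict-JSON $geoWithin query for a
# 1 x 1 degree box whose lower-left corner is (longitude, latitude).
geo_within_json <- function(longitude, latitude) {
  ring <- rbind(
    c(longitude,     latitude),
    c(longitude + 1, latitude),
    c(longitude + 1, latitude + 1),
    c(longitude,     latitude + 1),
    c(longitude,     latitude)      # the ring is closed by repeating the first point
  )
  points <- paste(sprintf("[%s, %s]", ring[, 1], ring[, 2]), collapse = ", ")
  sprintf('{ "loc" : { "$geoWithin" : { "$geometry" : { "type" : "Polygon", "coordinates" : [ [ %s ] ] } } } }',
          points)
}

string <- geo_within_json(2, 50)
cat(string, "\n")
```

The string can then be passed to mongo.bson.from.JSON as before. Whether the nested [[[ ]]] array is converted correctly by that function is a separate question that this sketch does not settle.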

How to getIndexes from rmongodb

Very minor, but I couldn't figure out if it is possible to do something similar to db.collection.getIndexes() from inside of R.

mongo.distinct()

mongo.distinct() appears to be missing from latest release- is there a plan to bring it back, or a workaround?

Problem with raw values

Looks like a bug in mongo_bson_buffer_append_raw:

 > mongo.bson.from.list(list(x=charToRaw("foo")))
 NULL

mongo.is.connected(mongo) Generates error when no connection exists

When testing to see if Mongo is connected, the call produces an error which terminates processing in knitr. It would seem that a query for a connection should not return an error. It should just return TRUE or FALSE.

> if (mongo.is.connected(mongo) == FALSE) {
+   print("Mongo connection is destroyed!")
+ }
Error in mongo.is.connected(mongo) :
  mongo connection object appears to have been destroyed.

    print("Mongo connection is destroyed!")

BTW, I really like the work you are doing with the rmongodb driver! You are doing an awesome job.
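
Until the function itself returns FALSE in this case, a caller can wrap the check so that the error above is treated as "not connected". This is a minimal sketch in plain R; the wrapper name is hypothetical, and a stand-in function replaces the real connection so the block runs without rmongodb.

```r
# Hypothetical wrapper: treat any error raised by the check as "not connected".
# `check` is any zero-argument function, e.g. function() mongo.is.connected(mongo).
is_connected_safely <- function(check) {
  tryCatch(isTRUE(check()), error = function(e) FALSE)
}

# Stand-in for a destroyed connection, so the sketch runs without rmongodb.
destroyed <- function() stop("mongo connection object appears to have been destroyed.")

if (!is_connected_safely(destroyed)) {
  print("Mongo connection is destroyed!")
}
```

In a knitr document this keeps the chunk from aborting, at the cost of hiding the reason why the check failed.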

Missing values as NULL

Not all missing values are handled appropriately. It might be better to convert them to bson NULL values.

> mongo.bson.from.list(list(num=c(pi, NA, NaN, Inf, -Inf), bool=c(T,F,NA), int=c(7L, NA), string=c("foo", NA)))
    num : 4      
        0 : 1    3.141593
        1 : 1    nan
        2 : 1    nan
        3 : 1    inf
        4 : 1    -inf

    bool : 4     
        0 : 8    true
        1 : 8    false
        2 : 8    true

    int : 4      
        0 : 16   7
        1 : 16   -2147483648

    string : 4   
        0 : 2    foo
        1 : 2    NA
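
The value -2147483648 in the output above is how R stores NA_integer_ internally. A caller-side workaround can be sketched in plain R: replace missing entries with NULL before building the BSON object. The helper name na_to_null is hypothetical, and whether mongo.bson.from.list then writes a BSON null for each NULL entry, and whether the resulting list is stored as an array, are assumptions that would need checking against the installed version.

```r
# Hypothetical pre-processing step: turn a vector into a list whose
# missing entries are NULL. Note that is.na() is TRUE for both NA and NaN,
# while Inf and -Inf are left untouched.
na_to_null <- function(v) {
  lapply(as.list(v), function(el) if (is.na(el)) NULL else el)
}

str(na_to_null(c(7L, NA)))
str(na_to_null(c("foo", NA)))
str(na_to_null(c(pi, NA, NaN, Inf, -Inf)))
```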

rmongodb fails to load/install on Windows

Hello,

I'm trying to install rmongodb into my environment, but for some reason it will not work with my configuration. I've tried installing both the stable package from CRAN and the devtools version.

The CRAN package shows that it installs, but it will not load. It fails with an error. First, my environment:

sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252

[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C

[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices datasets utils methods base

loaded via a namespace (and not attached):
[1] packrat_0.4.0 tools_3.1.1

Here is the install and load error:

install.packages("rmongodb")
Installing package into ‘C:/Users/David/Documents/R/win-library/3.1’
(as ‘lib’ is unspecified)
trying URL 'http://cran.revolutionanalytics.com/bin/windows/contrib/3.1/rmongodb_1.6.5.zip'
Content type 'application/zip' length 1156163 bytes (1.1 Mb)
opened URL
downloaded 1.1 Mb
package ‘rmongodb’ successfully unpacked and MD5 sums checked

The downloaded binary packages are in
C:\Users\David\AppData\Local\Temp\Rtmp6xkRHl\downloaded_packages

library(rmongodb)
Error in get(".packageName", where) : lazy-load database 'P' is corrupt
In addition: Warning message:
In get(".packageName", where) : internal error -3 in R_decompress1
Error: package or namespace load failed for ‘rmongodb’
Lastly, the error on attempting to install from devtools:

library(devtools)
install_github("rmongodb", "mongosoup")
Downloading github repo mongosoup/rmongodb@master
Installing rmongodb
"C:/R/R-311.1/bin/x64/R" --vanilla CMD INSTALL
"C:\Users\David\AppData\Local\Temp\Rtmp6xkRHl\devtools280c17ba7815\mongosoup-rmongodb-a40c94c"
--library="C:/Users/David/Documents/R/win-library/3.1" --install-tests
installing source package 'rmongodb' ...
** libs
*** arch - i386
gcc -m32 -I"C:/R/R-311.1/include" -DNDEBUG -I"d:/RCompile/CRANpkg/extralibs64/local/include" -DMONGO_STATIC_BUILD -DR_SAFETY_NET -O3 -Wall -std=gnu99 -mtune=core2 -c libmongo/bson.c -o libmongo/bson.o
gcc -m32 -I"C:/R/R-311.1/include" -DNDEBUG -I"d:/RCompile/CRANpkg/extralibs64/local/include" -DMONGO_STATIC_BUILD -DR_SAFETY_NET -O3 -Wall -std=gnu99 -mtune=core2 -c libmongo/encoding.c -o libmongo/encoding.o
gcc -m32 -I"C:/R/R-311.1/include" -DNDEBUG -I"d:/RCompile/CRANpkg/extralibs64/local/include" -DMONGO_STATIC_BUILD -DR_SAFETY_NET -O3 -Wall -std=gnu99 -mtune=core2 -c libmongo/env.c -o libmongo/env.o
gcc -m32 -I"C:/R/R-311.1/include" -DNDEBUG -I"d:/RCompile/CRANpkg/extralibs64/local/include" -DMONGO_STATIC_BUILD -DR_SAFETY_NET -O3 -Wall -std=gnu99 -mtune=core2 -c libmongo/gridfs.c -o libmongo/gridfs.o
gcc -m32 -I"C:/R/R-311.1/include" -DNDEBUG -I"d:/RCompile/CRANpkg/extralibs64/local/include" -DMONGO_STATIC_BUILD -DR_SAFETY_NET -O3 -Wall -std=gnu99 -mtune=core2 -c libmongo/md5.c -o libmongo/md5.o
gcc -m32 -I"C:/R/R-311.1/include" -DNDEBUG -I"d:/RCompile/CRANpkg/extralibs64/local/include" -DMONGO_STATIC_BUILD -DR_SAFETY_NET -O3 -Wall -std=gnu99 -mtune=core2 -c libmongo/mongo.c -o libmongo/mongo.o
gcc -m32 -I"C:/R/R-311.1/include" -DNDEBUG -I"d:/RCompile/CRANpkg/extralibs64/local/include" -DMONGO_STATIC_BUILD -DR_SAFETY_NET -O3 -Wall -std=gnu99 -mtune=core2 -c libmongo/numbers.c -o libmongo/numbers.o
gcc -m32 -I"C:/R/R-311.1/include" -DNDEBUG -I"d:/RCompile/CRANpkg/extralibs64/local/include" -DMONGO_STATIC_BUILD -DR_SAFETY_NET -O3 -Wall -std=gnu99 -mtune=core2 -c api.c -o api.o
gcc -m32 -I"C:/R/R-311.1/include" -DNDEBUG -I"d:/RCompile/CRANpkg/extralibs64/local/include" -DMONGO_STATIC_BUILD -DR_SAFETY_NET -O3 -Wall -std=gnu99 -mtune=core2 -c api_bson.c -o api_bson.o
gcc -m32 -I"C:/R/R-311.1/include" -DNDEBUG -I"d:/RCompile/CRANpkg/extralibs64/local/include" -DMONGO_STATIC_BUILD -DR_SAFETY_NET -O3 -Wall -std=gnu99 -mtune=core2 -c api_convert.c -o api_convert.o
gcc -m32 -I"C:/R/R-311.1/include" -DNDEBUG -I"d:/RCompile/CRANpkg/extralibs64/local/include" -DMONGO_STATIC_BUILD -DR_SAFETY_NET -O3 -Wall -std=gnu99 -mtune=core2 -c api_gridfs.c -o api_gridfs.o
gcc -m32 -I"C:/R/R-311.1/include" -DNDEBUG -I"d:/RCompile/CRANpkg/extralibs64/local/include" -DMONGO_STATIC_BUILD -DR_SAFETY_NET -O3 -Wall -std=gnu99 -mtune=core2 -c api_mongo.c -o api_mongo.o
gcc -m32 -I"C:/R/R-311.1/include" -DNDEBUG -I"d:/RCompile/CRANpkg/extralibs64/local/include" -DMONGO_STATIC_BUILD -DR_SAFETY_NET -O3 -Wall -std=gnu99 -mtune=core2 -c symbols.c -o symbols.o
gcc -m32 -I"C:/R/R-311.1/include" -DNDEBUG -I"d:/RCompile/CRANpkg/extralibs64/local/include" -DMONGO_STATIC_BUILD -DR_SAFETY_NET -O3 -Wall -std=gnu99 -mtune=core2 -c utility.c -o utility.o
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/yajl/yajl.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/yajl/yajl_alloc.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/yajl/yajl_buf.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/yajl/yajl_encode.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/yajl/yajl_gen.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/yajl/yajl_lex.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/yajl/yajl_parser.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/yajl/yajl_tree.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/yajl/yajl_version.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/bson/bcon.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/bson/bson-clock.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/bson/bson-context.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/bson/bson-error.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/bson/bson-iter.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/bson/bson-json.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/bson/bson-keys.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/bson/bson-md5.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/bson/bson-memory.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/bson/bson-oid.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/bson/bson-reader.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/bson/bson-string.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/bson/bson-utf8.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/bson/bson-writer.o: File format not recognized
C:\Rtools\gcc-4.6.3\bin\nm.exe: libbson/bson/bson.o: File format not recognized
gcc -m32 -shared -s -static-libgcc -o rmongodb.dll tmp.def libbson/yajl/yajl.o libbson/yajl/yajl_alloc.o libbson/yajl/yajl_buf.o libbson/yajl/yajl_encode.o libbson/yajl/yajl_gen.o libbson/yajl/yajl_lex.o libbson/yajl/yajl_parser.o libbson/yajl/yajl_tree.o libbson/yajl/yajl_version.o libbson/bson/bcon.o libbson/bson/bson-clock.o libbson/bson/bson-context.o libbson/bson/bson-error.o libbson/bson/bson-iter.o libbson/bson/bson-json.o libbson/bson/bson-keys.o libbson/bson/bson-md5.o libbson/bson/bson-memory.o libbson/bson/bson-oid.o libbson/bson/bson-reader.o libbson/bson/bson-string.o libbson/bson/bson-utf8.o libbson/bson/bson-writer.o libbson/bson/bson.o libmongo/bson.o libmongo/encoding.o libmongo/env.o libmongo/gridfs.o libmongo/md5.o libmongo/mongo.o libmongo/numbers.o api.o api_bson.o api_convert.o api_gridfs.o api_mongo.o symbols.o utility.o -lws2_32 -Ld:/RCompile/CRANpkg/extralibs64/local/lib/i386 -Ld:/RCompile/CRANpkg/extralibs64/local/lib -LC:/R/R-311.1/bin/i386 -lR
libbson/yajl/yajl.o: file not recognized: File format not recognized
collect2: ld returned 1 exit status
no DLL was created
ERROR: compilation failed for package 'rmongodb'

removing 'C:/Users/David/Documents/R/win-library/3.1/rmongodb'
restoring previous 'C:/Users/David/Documents/R/win-library/3.1/rmongodb'
Error: Command failed (1)
In addition: Warning message:
Username parameter is deprecated. Please use mongosoup/rmongodb

Any help would be greatly appreciated.

Thank you,
David Parker

Batch update & insert: loop in C instead of R

I need to update a lot of documents at the same time.
I know mongo doesn't support that.
But could rmongodb provide a C wrapper so that the loop runs in C instead of R? For instance, I would like to pass a big list of BSON objects and update all of them using a loop in C. Looping in R is too time consuming.

mongo.bson.from.list returns NULL

I am getting errors when inserting an empty record because:

> mongo.bson.from.list(list())
NULL

I think it should be an empty bson instead.

Can't create 'or' query

The following two queries return records from my test dataset:

q1=mongo.bson.from.list(list("x1"=list("$gt"=9)))
q2=mongo.bson.from.list(list("x2"=list("$gt"=3)))

but trying to construct an 'or' query that returns the union of both returns nothing:

q12 = mongo.bson.from.list(list("$or"=list("x1"=list("$gt"=9),"x2"=list("$gt"=3))))

Maybe this is the wrong way to put an 'or' query together?

Trying to construct it from JSON doesn't work either:

j12 = '{"$or": [{"x1": {"$gt": 9}}, {"x2": {"$gt": 3}} ] }'
qj12=mongo.bson.from.JSON(j12)

creates a qj12 that isn't the same as q12 - the structure is different, as is apparent when you print them out.

This might well be user error, but I don't see a lot of documentation for constructing queries for rmongodb.
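For what it's worth, MongoDB's $or operator takes an array of clauses, each of which is a document of its own. In R the closest analogue is an unnamed list whose elements are named lists, whereas the q12 attempt above passes a single named list (one document with the keys x1 and x2). The sketch below uses base R only to show the two shapes. Whether mongo.bson.from.list turns the unnamed list into a BSON array is an assumption here and should be checked by printing the resulting BSON next to qj12.

```r
# Shape used in q12 above: "$or" holds ONE named list (a single document).
or_as_document <- list("$or" = list("x1" = list("$gt" = 9),
                                    "x2" = list("$gt" = 3)))

# Shape that mirrors the JSON in j12: "$or" holds an UNNAMED list (array-like)
# whose elements are each a one-clause named list.
or_as_array <- list("$or" = list(list("x1" = list("$gt" = 9)),
                                 list("x2" = list("$gt" = 3))))

# The first shape carries clause names at the "$or" level, the second does not.
names(or_as_document[["$or"]])
names(or_as_array[["$or"]])

# Inspect the nesting of the array-like shape.
str(or_as_array)
```

If the array-like shape serializes the same way as the JSON version, passing it to mongo.bson.from.list would be the list-based equivalent of qj12.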

SEGV in mongo_cursor_destroy

Thank you all for the great work.

Recently our program crashed while retrieving a very large amount of data from a MongoDB server. Here is the backtrace:
(gdb) where
0 0x00002aaab24c3385 in mongo_cursor_destroy (cursor=0x10184970) at libmongo/mongo.c:1390
1 0x00002aaab24c3620 in mongo_cursor_get_more (cursor=0x10184970) at libmongo/mongo.c:1234
2 0x00002aaab24c3b38 in mongo_cursor_next (cursor=0x10184970) at libmongo/mongo.c:1362
3 0x00002aaab24cd84d in rmongo_cursor_next (cursor=) at api_mongo.c:309
4 0x00002ba38d25f4ec in do_dotcall (call=0x10f074f8, op=, args=, env=0x1) at dotcode.c:581
5 0x00002ba38d294cae in Rf_eval (e=0x10f074f8, rho=0x11423440) at eval.c:656
6 0x00002ba38d29a53f in Rf_applyClosure (call=0x105410f0, op=0x10f079c8, arglist=0x114233d0, rho=0x10535980, suppliedenv=0xf668118) at eval.c:1043
7 0x00002ba38d29481f in Rf_eval (e=0x105410f0, rho=0x10535980) at eval.c:675
8 0x00002ba38d2985d7 in do_while (call=0x10542be8, op=0xf6345f8, args=0x105410b8, rho=0x10535980) at eval.c:1556
9 0x00002ba38d294a74 in Rf_eval (e=0x10542be8, rho=0x10535980) at eval.c:628
10 0x00002ba38d296924 in do_begin (call=0x10542750, op=0xf636e20, args=0x10542bb0, rho=0x10535980) at eval.c:1632
11 0x00002ba38d294a74 in Rf_eval (e=0x10542750, rho=0x10535980) at eval.c:628
12 0x00002ba38d294a74 in Rf_eval (e=0x10543600, rho=0x10535980) at eval.c:628
13 0x00002ba38d296924 in do_begin (call=0x10554e78, op=0xf636e20, args=0x105435c8, rho=0x10535980) at eval.c:1632
14 0x00002ba38d294a74 in Rf_eval (e=0x10554e78, rho=0x10535980) at eval.c:628
15 0x00002ba38d29a53f in Rf_applyClosure (call=0x1055d6a0, op=0x1055c960, arglist=0x105382f8, rho=0xf6680e0, suppliedenv=0xf668118) at eval.c:1043
16 0x00002ba38d29481f in Rf_eval (e=0x1055d6a0, rho=0xf6680e0) at eval.c:675
17 0x00002ba38d297375 in do_set (call=0x1055d748, op=0xf637018, args=0x1055d710, rho=0xf6680e0) at eval.c:2029
18 0x00002ba38d294a74 in Rf_eval (e=0x1055d748, rho=0xf6680e0) at eval.c:628
19 0x00002ba38d2bbb49 in Rf_ReplIteration (rho=0xf6680e0, savestack=0, browselevel=, state=0x7fffe5df2970) at main.c:257
20 0x00002ba38d2bbf68 in R_ReplConsole (rho=0xf6680e0, savestack=0, browselevel=0) at main.c:306
21 0x00002ba38d2bc434 in run_Rmainloop () at main.c:998
22 0x0000000000bbe91b in r::session::runEmbeddedR(core::FilePath const&, core::FilePath const&, bool, bool, SA_TYPE, r::session::Callbacks const&, r::session::InternalCallbacks*) ()
23 0x0000000000b96641 in r::session::run(r::session::ROptions const&, r::session::RCallbacks const&) ()
24 0x00000000006655d8 in main ()
(gdb) p *cursor
$4 = {reply = 0x2aaab26d7010, conn = 0x10060e90, ns = 0x10fa42f0 "PRV.HDS_PID_HDS_SID__auto_k1mEvent_Rel", flags = 3, seen = 12772, current = {
data = 0x2aaab2ad6f33 <Address 0x2aaab2ad6f33 out of bounds>, cur = 0x0, dataSize = 0, finished = 1, stack = {0 <repeats 32 times>}, stackPos = 0, err = 0,
errstr = 0x0}, err = MONGO_CURSOR_EXHAUSTED, query = 0x7fffe5df0990, fields = 0x10b95690, options = 0, limit = 0, skip = 0}
(gdb) p *cursor->reply
Cannot access memory at address 0x2aaab26d7010
(gdb)

It reproduces with high probability.
I noticed that this was a mongo-c-driver bug in v0.7.1 that was fixed in v0.8 (https://github.com/mongodb/mongo-c-driver/commits/v0.8)
I tried the v0.8 code and created a patch to fix it, but I think it would be better to use the latest mongo-c-driver.
Thank you again.

10/6 edit to remove "# number" to avoid search engine's problem:)

mongo.distinct returns dates in different time zone than mongo shell

Hi,

If you load this database:

https://s3.amazonaws.com/rmongodb-problem/dump.tar.gz

As follows:

tar zxf dump.tar.gz
cd dump
mongorestore .

Then query the database in the mongo shell as follows:

mongo AnnotationHub

db.metadata.distinct("RDataDateAdded")

It returns:

[
    ISODate("2013-03-20T00:00:00Z"),
    ISODate("2013-03-27T00:00:00Z"),
    ISODate("2013-03-21T00:00:00Z"),
    ISODate("2013-03-22T00:00:00Z"),
    ISODate("2013-04-05T00:00:00Z"),
    ISODate("2013-04-30T00:00:00Z"),
    ISODate("2013-06-24T00:00:00Z"),
    ISODate("2013-06-25T00:00:00Z"),
    ISODate("2013-06-26T00:00:00Z"),
    ISODate("2013-06-29T00:00:00Z"),
    ISODate("2013-06-27T00:00:00Z"),
    ISODate("2013-06-28T00:00:00Z"),
    ISODate("2013-10-30T00:00:00Z"),
    ISODate("2013-11-21T00:00:00Z"),
    ISODate("2013-12-20T00:00:00Z"),
    ISODate("2013-12-27T00:00:00Z")
]

If you do the same in R:

library(rmongodb)
mongo <- mongo.create()
mongo.distinct(mongo, "AnnotationHub.metadata", "RDataDateAdded")

it returns:

 [1] "2013-03-19 17:00:00 PDT" "2013-03-26 17:00:00 PDT"
 [3] "2013-03-20 17:00:00 PDT" "2013-03-21 17:00:00 PDT"
 [5] "2013-04-04 17:00:00 PDT" "2013-04-29 17:00:00 PDT"
 [7] "2013-06-23 17:00:00 PDT" "2013-06-24 17:00:00 PDT"
 [9] "2013-06-25 17:00:00 PDT" "2013-06-28 17:00:00 PDT"
[11] "2013-06-26 17:00:00 PDT" "2013-06-27 17:00:00 PDT"
[13] "2013-10-29 17:00:00 PDT" "2013-11-20 16:00:00 PST"
[15] "2013-12-19 16:00:00 PST" "2013-12-26 16:00:00 PST"

The mongo shell returns the dates in GMT ("Z") and rmongodb returns them in PDT/PST. Is this a bug or a feature? What should I do if I want the results to be consistent between the two?

Thanks,
Dan
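The two outputs above appear to describe the same instants: 2013-03-20T00:00:00Z and 2013-03-19 17:00:00 PDT are seven hours apart on the clock but identical in absolute time. R's POSIXct class stores an absolute time and applies a time zone only when the value is printed. The sketch below uses base R only and assumes the values returned by mongo.distinct are POSIXct (the printed output suggests this, but it is not confirmed here) and that the system's time zone database knows "America/Los_Angeles".

```r
# One of the instants from the mongo shell output, parsed as UTC.
added <- as.POSIXct("2013-03-20 00:00:00", tz = "UTC")

# The same instant rendered in a US Pacific zone, as a PDT/PST session shows it.
format(added, "%Y-%m-%d %H:%M:%S", tz = "America/Los_Angeles", usetz = TRUE)

# The same instant rendered in UTC, matching the mongo shell.
format(added, "%Y-%m-%d %H:%M:%S", tz = "UTC", usetz = TRUE)

# Alternatively, change only the display zone of an existing value;
# the underlying number of seconds since the epoch is untouched.
pacific <- as.POSIXct("2013-03-19 17:00:00", tz = "America/Los_Angeles")
attr(pacific, "tzone") <- "UTC"
pacific
```

Setting the "tzone" attribute on the returned vector, or formatting it with tz = "UTC", would make the R output line up with the shell without changing the stored dates.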

convenience function to write data.frames

Hi.
I started playing around with mongo today. As this was all new to me I was quite annoyed that there was no easy way to write a data.frame to the database as far as I could find.

To make it easier I wrote the function below, and I thought I would post it here in case you find it an interesting addition.

It converts a data.frame to a list of mongo.bson's that can be fed to mongo.insert.batch directly. The original data.frame can be re-produced with mongo.find.all.

dataframe2bson <- function(dataframe) {
  # Put each row into a separate list item.
  # Note: apply() coerces the data.frame to a character matrix first.
  data_list <- apply(dataframe, 1, as.list)

  # Convert any numbers stored as strings back to numeric data
  data_list <- lapply(data_list, function(row) {
    lapply(row, function(y) {
      num <- suppressWarnings(as.numeric(y))
      if (!is.na(num)) num else y
    })
  })

  # Build one BSON object per row, appending each field by name
  bson_data <- lapply(data_list, function(row) {
    buf <- mongo.bson.buffer.create()
    for (field in names(row)) {
      mongo.bson.buffer.append(buf, field, row[[field]])
    }
    mongo.bson.from.buffer(buf)
  })

  return(bson_data)
}
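One caveat with apply(dataframe, 1, as.list): apply() converts the data.frame to a matrix before iterating, so a frame with mixed column types becomes all character, which is why the numeric values have to be converted back afterwards. A possible alternative is to split the rows without going through a matrix. The sketch below uses base R only; rows_as_lists is a hypothetical helper name, and handing the result to the BSON buffer functions is not shown.

```r
# Hypothetical helper: split a data.frame into one named list per row,
# keeping each column's original type.
rows_as_lists <- function(df) {
  lapply(seq_len(nrow(df)), function(i) {
    # lapply over the columns keeps the column names on the result
    lapply(df, function(col) col[[i]])
  })
}

people <- data.frame(name  = c("ann", "bob"),
                     age   = c(27, 31),
                     admin = c(TRUE, FALSE),
                     stringsAsFactors = FALSE)

rows <- rows_as_lists(people)
str(rows[[1]])
```

With this shape no string-to-number round trip is needed, and logical columns stay logical instead of turning into strings.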

Avoid DEPENDS

The current package has a

Depends: RJSONIO

It would be better to use Imports instead of Depends, and add an import(RJSONIO) entry to your NAMESPACE file. The problem with Depends is that it attaches RJSONIO to the user's search path, potentially creating namespace conflicts.

You could also use Suggests and then in your code use:

RJSONIO::fromJSON

That way the RJSONIO package is only loaded when it is actually needed, instead of every time the user loads rmongodb.
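The Suggests route could look roughly like the sketch below. requireNamespace() is base R; parse_json_lazily is a hypothetical helper name used only for illustration and is not part of rmongodb.

```r
# Hypothetical helper: loads RJSONIO only when JSON parsing is requested.
parse_json_lazily <- function(txt) {
  if (!requireNamespace("RJSONIO", quietly = TRUE)) {
    stop("Package 'RJSONIO' is needed for JSON parsing; please install it.",
         call. = FALSE)
  }
  # Namespace-qualified call: nothing is attached to the search path.
  RJSONIO::fromJSON(txt)
}
```

A call such as parse_json_lazily('{"age": 27}') then fails with a clear message when RJSONIO is missing instead of breaking at package load time.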

Of course you can also use jsonlite instead of RJSONIO :-)

mongo.find.all not returning correct or consistent _id

When I use mongo.find.all, the "_id" value keeps changing; it is neither consistent nor the real "_id". mongo.find.one works.

mongo.find.one(mongo, ns=ns, query=criteria)

    _id : 7      536b461740ec2e2e2f7b2a56
    system_name : 2      LIFE_new
    name : 2     1-methyluric acid
    rt : 1   3.000000
    inchi : 2    InChI=1S/C6H6N4O3/c1-10-4(11)2-3(9-6(10)13)8-5(12)7-2/h1H3,(H,9,13)(H2,7,8,12)

mongo.find.all(mongo, ns=ns, query=criteria)

    _id      system_name name                rt
val 38305736 "LIFE_new"  "1-methyluric acid" 3 
val 1        "LIFE_old"  "1-methyluric acid" 3 
    inchi                                                                           
val "InChI=1S/C6H6N4O3/c1-10-4(11)2-3(9-6(10)13)8-5(12)7-2/h1H3,(H,9,13)(H2,7,8,12)"
val "InChI=1S/C6H6N4O3/c1-10-4(11)2-3(9-6(10)13)8-5(12)7-2/h1H3,(H,9,13)(H2,7,8,12)"

The problem is the rbind of the mongo.oid object in "_id".

temp = mongo.find.one(mongo, ns=ns, query=criteria)
mongo.bson.to.list(temp)

$`_id`
{ $oid : "536b461740ec2e2e2f7b2a56" }

$system_name
[1] "LIFE_new"

$name
[1] "1-methyluric acid"

$rt
[1] 3

$inchi
[1] "InChI=1S/C6H6N4O3/c1-10-4(11)2-3(9-6(10)13)8-5(12)7-2/h1H3,(H,9,13)(H2,7,8,12)"

class(mongo.bson.to.list(temp)$`_id`)

"mongo.oid"

The result of this command keeps changing:
rbind(mongo.bson.to.list(temp)$`_id`, mongo.bson.to.list(temp)$`_id`)

So "_id" needs special treatment.
For example, if "_id" exists and is a mongo.oid, the following should be applied:

as.character.mongo.oid((mongo.bson.to.list(temp)$`_id`))

mongo.bson.to.list simplify not working

When we set simplify=FALSE for mongo.bson.to.list, it still simplifies arrays into vectors rather than lists:

> mongo.bson.to.list(mongo.bson.from.list(list(foo=list("foo", "bar", "baz"))))
$foo
[1] "foo" "bar" "baz"

This is undesired. When simplify=FALSE, mongo.bson.to.list should convert all BSON objects to named lists, and BSON arrays to unnamed lists.
