james75 / solrmongoimporter
MongoDb plugin for Solr Data Import
Hi,
I have tried your SolrMongoImporter on Solr 4.4 with mongo-java-driver 2.10.1.
After following your docs, I restarted Tomcat (where Solr resides) and started the first import (using the Solr admin GUI).
But there was an error before the import started:
ERROR org.apache.solr.servlet.SolrDispatchFilter ? null:org.apache.solr.handler.dataimport.DataImportHandlerException:
Data Config problem:
The processing instruction target matching "[xX][mM][lL]" is not allowed
Does anyone have an idea how to fix that?
Howdy,
I have played around with the configuration file and I can't seem to get Solr to send to MongoDB a query that includes an ISODate or a Date parameter. For example,
as a test this gets no errors but imports nothing:
query="{'updated_at':{'$gt': 'anything' }}"
this gives me an "Indexing failed. Rolled back all changes" error in Solr:
query="{'updated_at':{'$gt': ISODate('2013-04-17 12:11') }}"
I can't figure out how to actually send a query with a valid date comparison to MongoDB. If you could help me that would be great.
Thanks!
B.
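A minimal sketch of one way to phrase the date comparison, under two assumptions that the post does not confirm. First, that the importer hands the `query` string to the legacy driver's `com.mongodb.util.JSON.parse` (a stack trace elsewhere on this page points that way), which is a JSON parser rather than a shell, so `ISODate(...)` would not be understood. Second, that this parser accepts the extended-JSON `$date` form with an ISO-8601 UTC timestamp. `MongoDateQuery` and `greaterThan` are hypothetical names used only for illustration.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of SolrMongoImporter.
final class MongoDateQuery {
    // ISO-8601 with milliseconds and a literal 'Z', always rendered in UTC.
    private static final DateTimeFormatter ISO_MILLIS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'").withZone(ZoneOffset.UTC);

    // Builds {'field':{'$gt':{'$date':'...'}}} as a string for the query attribute.
    // Assumption: the receiving parser understands the extended-JSON $date key.
    static String greaterThan(String field, Instant instant) {
        return "{'" + field + "':{'$gt':{'$date':'" + ISO_MILLIS.format(instant) + "'}}}";
    }

    public static void main(String[] args) {
        System.out.println(greaterThan("updated_at", Instant.parse("2013-04-17T12:11:00Z")));
    }
}
```

The resulting string would go into the `query` attribute in place of the `ISODate(...)` call. Note that the timestamp carries seconds and a time zone, which the original `'2013-04-17 12:11'` did not.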
I tried to connect to my remote Mongo server and opened the inbound rules for port 27017,
but I couldn't import data from my Mongo cluster.
When I checked my primary DB server, I found the following error:
2018-11-20 06:08:32.915 INFO (Thread-27) [ x:businesses] o.m.d.connection Closed connection [connectionId{localValue:136, serverValue:72}] to xx.xxx.xx.xxx:27017 because the pool has been closed.
My data-config.xml
<dataConfig>
  <dataSource name="MongoSource" type="MongoDataSource" host="xx.xxx.xx.xxx" database="mydbname" username="xxxxxx" password="xxxxxx" />
  <document name="import">
    <entity processor="MongoEntityProcessor" datasource="MongoSource" transformer="MongoMapperTransformer" collection="users" name="users" query="">
      <field column="_id" name="_id" mongoField="_id" />
      <field column="id" name="id" mongoField="_id" />
      <field column="name" name="j_name" mongoField="name" />
      ............
Is there any option to use a Mongo connection URI like mongodb://localhost1:27017,localhost2:27017,localhost3:27017/mydbname?readPreference=nearest&replicaSet=rs01?
Can anybody help me, please? Thanks in advance.
Any idea how $deleteDocById & $skipDoc can be done?
http://wiki.apache.org/solr/DataImportHandler#Special_Commands
Which mongo-java-driver version does solr-mongo-importer-1.1.0.jar need?
We use request parameters to set the host, port, username, password, and database, so that they do not have to be written in data-config.xml.
For that, we declare the following in solrconfig.xml:
data-config.xml
${dihMongoDBHost:defaultHost}
${dihMongoDBPort:defaultPort}
${dihMongoUsername:defaultUsername}
${dihMongoDBPassword:defaultPwd}
${dihMongoDBDatabase:defaultDB}
Unfortunately, these parameters do not work when used in the dataSource, and we get the following error:
<dataSource name="mongo" type="MongoDataSource" host="${dataimporter.request.dihMongoDBHost}" port="${dataimporter.request.dihMongoDBPort}" username="${dataimporter.request.dihMongoDBUsername}" password="${dataimporter.request.dihMongoDBPassword}" database="${dataimporter.request.dihMongoDBDatabase}" />
Caused by: java.lang.NumberFormatException: For input string: "${dataimporter.request.dihMongoDBPort}"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:569)
at java.lang.Integer.parseInt(Integer.java:615)
at org.apache.solr.handler.dataimport.MongoDataSource.init(MongoDataSource.java:49)
at org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(DataImporter.java:389)
If I hard-code the port, it still does not work:
<dataSource name="mongo" type="MongoDataSource" host="${dataimporter.request.dihMongoDBHost}" port="27017" username="${dataimporter.request.dihMongoDBUsername}" password="${dataimporter.request.dihMongoDBPassword}" database="${dataimporter.request.dihMongoDBDatabase}" />
Caused by: com.mongodb.MongoTimeoutException: Timed out after 10000 ms while waiting for a server that matches AnyServerSelector{}. Client view of cluster state is {type=Unknown, servers=[{address=${dataimporter.request.dihmongodbhost}:27017, type=Unknown, state=Connecting, exception={com.mongodb.MongoException$Network: Exception opening the socket}, caused by {java.net.UnknownHostException: ${dataimporter.request.dihmongodbhost}}}]
at com.mongodb.BaseCluster.getServer(BaseCluster.java:82)
at com.mongodb.DBTCPConnector.getServer(DBTCPConnector.java:669)
at com.mongodb.DBTCPConnector.access$500(DBTCPConnector.java:40)
at com.mongodb.DBTCPConnector$MyPort.getConnection(DBTCPConnector.java:518)
at com.mongodb.DBTCPConnector$MyPort.get(DBTCPConnector.java:461)
at com.mongodb.DBTCPConnector.authenticate(DBTCPConnector.java:639)
at com.mongodb.DBApiLayer.doAuthenticate(DBApiLayer.java:247)
at com.mongodb.DB.authenticateCommandHelper(DB.java:745)
at com.mongodb.DB.authenticate(DB.java:701)
at org.apache.solr.handler.dataimport.MongoDataSource.init(MongoDataSource.java:53)
at org.apache.solr.handler.dataimport.DataImporter.getDataSourceInstance(DataImporter.java:389)
It seems that we cannot use request parameters here. It would be great if we could, as with other data sources.
I hope you are still working on this project and will be able to add that fix!
Thanks again for your work.
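Both traces above show the literal `${dataimporter.request...}` text reaching `Integer.parseInt` and the host lookup, which suggests the attributes are read without any token resolution. The sketch below illustrates the resolve-then-parse order using only the standard library. `PlaceholderResolver` is a hypothetical stand-in, not the plugin's code and not Solr's resolver. As far as I know, stock DIH data sources resolve such tokens through the DIH `Context` before using them, but treat that as an assumption to verify against your Solr version.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for token resolution; not the plugin's or Solr's code.
final class PlaceholderResolver {
    // Matches ${name} or ${name:default}.
    private static final Pattern TOKEN = Pattern.compile("\\$\\{([^}:]+)(?::([^}]*))?\\}");

    static String resolve(String template, Map<String, String> values) {
        Matcher m = TOKEN.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            if (value == null) {
                value = m.group(2);   // fall back to the inline default, if any
            }
            if (value == null) {
                value = m.group();    // leave unknown tokens untouched
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> request = new HashMap<>();
        request.put("dataimporter.request.dihMongoDBPort", "27017");
        // Resolve first, then parse; parsing the raw token is what throws NumberFormatException.
        int port = Integer.parseInt(resolve("${dataimporter.request.dihMongoDBPort}", request));
        System.out.println(port);
    }
}
```

A fix inside the data source would presumably apply the same order to every attribute (host, port, username, password, database) before opening the connection.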
com.mongodb.util.JSONParseException:
{'UpdateDate':{$ gt:{$date:'2017-12-12 07:14:38'}}}
^
at com.mongodb.util.JSONParser.read(JSON.java:302)
at com.mongodb.util.JSONParser.parseObject(JSON.java:263)
at com.mongodb.util.JSONParser.parse(JSON.java:228)
at com.mongodb.util.JSONParser.parseObject(JSON.java:264)
at com.mongodb.util.JSONParser.parse(JSON.java:228)
at com.mongodb.util.JSONParser.parse(JSON.java:156)
at com.mongodb.util.JSON.parse(JSON.java:98)
at com.mongodb.util.JSON.parse(JSON.java:79)
at org.apache.solr.handler.dataimport.MongoDataSource.getData(MongoDataSource.java:72)
at org.apache.solr.handler.dataimport.MongoDataSource.getData(MongoDataSource.java:86)
at org.apache.solr.handler.dataimport.MongoEntityProcessor.initQuery(MongoEntityProcessor.java:39)
at org.apache.solr.handler.dataimport.MongoEntityProcessor.nextModifiedRowKey(MongoEntityProcessor.java:65)
at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextModifiedRowKey(EntityProcessorWrapper.java:267)
at org.apache.solr.handler.dataimport.DocBuilder.collectDelta(DocBuilder.java:801)
at org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:344)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:224)
at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:444)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:482)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
Hi James,
This project looks just like what we need, except we are struggling to authenticate using mongo-java-driver 3.3 against a MongoDB server that uses SSL and has an authentication mechanism of SCRAM-SHA-1.
Are you able to help us out with a patch please?
Many thanks,
John
I'm confused between https://github.com/5missions/mongoSolrImporter and this project. The former is quite old, but it says it has multi-threading support, and someone here commented that they used it and achieved 10x performance.
Could you please shed some light on the performance of this project for large data imports, assuming that the Mongo collections have a flat data structure with at most 10 fields?
It does not work with a replica set, nor with slaveOk.
I am trying to point the ivy.xml file to Solr 4.4.0, but when I run ant jar it still pulls down the Solr 3.6 jars.
ant jar also gives me the following error:
srcdir "/space/git_space/SolrMongoImporter/${src.dir}" does not exist!
I'm trying to index two collections with a matching common field. I'm using sub entities to do it, where the FieldA4 in EntityA is the string version of the _id (ObjectId type) on EntityB.
When I try to import data, I get a JSONParseException.
This is my data-config.xml
How can I compare these two fields, which have different types in the two collections, without getting an error?
Thanks.
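One possible direction, sketched under assumptions: if the query string is parsed as MongoDB extended JSON (as the `JSONParseException` suggests), a string holding the hex form of an ObjectId can be wrapped in the `$oid` key so that it is compared as an ObjectId rather than as a plain string. `ObjectIdQuery` and `byId` are hypothetical names. Whether the importer substitutes parent-entity values into the sub-entity query is not confirmed by the post.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of SolrMongoImporter.
final class ObjectIdQuery {
    // An ObjectId is written as exactly 24 hexadecimal characters.
    private static final Pattern HEX_24 = Pattern.compile("[0-9a-fA-F]{24}");

    // Wraps a hex string in the extended-JSON $oid form: {'_id':{'$oid':'...'}}.
    static String byId(String hexId) {
        if (hexId == null || !HEX_24.matcher(hexId).matches()) {
            throw new IllegalArgumentException("not a 24-character hex ObjectId: " + hexId);
        }
        return "{'_id':{'$oid':'" + hexId + "'}}";
    }

    public static void main(String[] args) {
        System.out.println(byId("507f1f77bcf86cd799439011"));
    }
}
```

In a data-config.xml, the equivalent would presumably be a sub-entity query that places the parent field's value inside the `$oid` wrapper. The validation step matters because a malformed id would otherwise surface only as a parse or lookup failure at import time.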
I followed the steps described above, but it does not work at all!
java.lang.IllegalAccessError: tried to access field org.apache.solr.handler.dataimport.DataImporter.QUERY_COUNT from class org.apache.solr.handler.dataimport.MongoEntityProcessor
I am getting this exception. I am using the following jars:
solr-dataimporthandler-5.3.0
solr-mongo-importer-1.0.0
mongo-java-driver-3.0.2
Hello,
I'm trying to use MongoConnector to replicate a collection from Mongo to Solr. The initial data import (~1.8 million docs) works without any errors, and after that, manual edits to the collection are replicated. But when I run a tool on the Mongo collection that changes around 9k documents, replication stops and does not resume, even if I change a document manually. Does anyone have any idea what the problem could be? The MongoConnector log is empty, and eventually the process stops because the timestamp gets overwritten in the oplog. Thank you in advance for the help.
With regards,
Mitereiter Balázs
Hi,
If I need a delta command/query to pull only some of the data, could you guide me on how to set it up?
Hi!
Most MongoDB documents are not flat: they contain embedded documents, arrays, and arrays of documents. It would be nice if SolrMongoImporter allowed you to select fields from subdocuments and to import arrays.
I will gladly write this functionality if you are ok with it!
Regards,
Cristi
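To make the request concrete, here is a minimal sketch of what selecting a field from a subdocument could look like, with the document modelled as nested `java.util.Map` values and the field addressed by a dotted path. This is an illustration only. `DottedPath` is a hypothetical name, the dotted-path syntax is an assumption, and none of this is taken from the importer.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of subdocument field selection; not the importer's code.
final class DottedPath {
    // Walks nested maps along a path such as "address.city".
    // Returns null if any step is missing or is not a map.
    static Object get(Map<String, Object> doc, String path) {
        Object current = doc;
        for (String key : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Berlin");
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("name", "Alice");
        doc.put("address", address);
        doc.put("tags", Arrays.asList("a", "b"));

        System.out.println(get(doc, "address.city")); // embedded document field
        System.out.println(get(doc, "tags"));         // array, returned as a List
    }
}
```

An array value comes back as a `List`, which a multi-valued Solr field could take as is. Arrays of documents would need an extra step that this sketch does not cover.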
Hi there,
I am using SolrMongoImporter as a way to populate a sub-entity in Solr. The problem is in the query section of the MongoEntityProcessor. In order to refer to the main entity I have to do (for example)
Thanks.
Zul