connectors-sdk-resources's Issues

Cannot seem to get JAX-WS web service to work in Fusion 4.2.2

Hi,

We're trying to connect to a data source using a custom connector that still uses SOAP web services. Locally, we can connect to the web service without any problems. When we deploy it to Fusion, we get a NullPointerException when we try to connect to the service. Can you let me know if there's anything specific that we should be doing to integrate JAX-WS? We're using 2.3.0 of com.sun.xml.ws.*.

Thanks,
Eric

Here's the stack trace:
2020-08-24T10:59:30,743 - ERROR [plugin-iqvia.fusion.contentserver.content-9:connector.client.ClientMessageHandler@425] - {collectionId=EricTest, connectorType=iqvia.fusion.contentserver.content, datasourceId=cs-test, jobRunId=j9TtXqJSoU, requestId=7wNuRd0Hck} - Problem while calling client fetch()
java.lang.NullPointerException: null
at javax.xml.ws.Service.<init>(Service.java:112) ~[?:1.8.0_151]
at com.opentext.livelink.service.core.Authentication_Service.<init>(Authentication_Service.java:42) ~[?:?]
at com.iqvia.fusion.connector.plugin.impl.ContentServerGenerator.authorize(ContentServerGenerator.java:35) ~[?:?]
at com.iqvia.fusion.connector.plugin.ContentServerContentFetcher.emitCandidates(ContentServerContentFetcher.java:165) ~[?:?]
at com.iqvia.fusion.connector.plugin.ContentServerContentFetcher.fetch(ContentServerContentFetcher.java:89) ~[?:?]
at com.iqvia.fusion.connector.plugin.ContentServerContentFetcher.fetch(ContentServerContentFetcher.java:35) ~[?:?]
at com.lucidworks.fusion.connector.plugin.ext.controller.ContentFetcherController.lambda$fetch$5(ContentFetcherController.java:127) ~[lucidworks-connector-plugin-ext-4.2.2.jar:?]
at com.lucidworks.fusion.connector.plugin.ext.controller.AbstractFetcherController.doCall(AbstractFetcherController.java:97) ~[lucidworks-connector-plugin-ext-4.2.2.jar:?]
at com.lucidworks.fusion.connector.plugin.ext.controller.ContentFetcherController.fetch(ContentFetcherController.java:118) ~[lucidworks-connector-plugin-ext-4.2.2.jar:?]
at com.lucidworks.fusion.connector.plugin.ext.controller.FetchManager.lambda$doFetch$7(FetchManager.java:209) ~[lucidworks-connector-plugin-ext-4.2.2.jar:?]
at io.grpc.Context.call(Context.java:580) ~[grpc-context-1.15.1.jar:1.15.1]
at com.lucidworks.fusion.connector.plugin.Logging.call(Logging.java:141) ~[connector-plugin-sdk-1.4.0-dev.12+204023d.jar:1.4.0-dev.12+204023d]
at com.lucidworks.fusion.connector.plugin.ext.controller.FetchManager.doFetch(FetchManager.java:203) ~[lucidworks-connector-plugin-ext-4.2.2.jar:?]
at com.lucidworks.fusion.connector.client.ClientMessageHandler.handleFetchReq(ClientMessageHandler.java:417) ~[lucidworks-connector-plugin-client-4.2.2.jar:?]
at com.lucidworks.fusion.connector.client.ClientMessageHandler.handleServerCall(ClientMessageHandler.java:166) ~[lucidworks-connector-plugin-client-4.2.2.jar:?]
at com.lucidworks.fusion.connector.client.ConnectorClientResponseObserver._onNext(ConnectorClientResponseObserver.java:96) ~[lucidworks-connector-plugin-client-4.2.2.jar:?]
at com.lucidworks.fusion.connector.client.ConnectorClientResponseObserver.lambda$onNext$2(ConnectorClientResponseObserver.java:90) ~[lucidworks-connector-plugin-client-4.2.2.jar:?]
at io.grpc.Context.run(Context.java:565) [grpc-context-1.15.1.jar:1.15.1]
at com.lucidworks.fusion.connector.plugin.Logging.run(Logging.java:118) [connector-plugin-sdk-1.4.0-dev.12+204023d.jar:1.4.0-dev.12+204023d]
at com.lucidworks.fusion.connector.client.ConnectorClientResponseObserver.onNext(ConnectorClientResponseObserver.java:88) [lucidworks-connector-plugin-client-4.2.2.jar:?]
at com.lucidworks.fusion.connector.client.ConnectorClientResponseObserver.onNext(ConnectorClientResponseObserver.java:27) [lucidworks-connector-plugin-client-4.2.2.jar:?]
at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:421) [grpc-stub-1.15.1.jar:1.15.1]
at io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33) [grpc-core-1.15.1.jar:1.15.1]
at io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33) [grpc-core-1.15.1.jar:1.15.1]
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:519) [grpc-core-1.15.1.jar:1.15.1]
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) [grpc-core-1.15.1.jar:1.15.1]
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) [grpc-core-1.15.1.jar:1.15.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_151]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_151]
at io.grpc.Context.run(Context.java:565) [grpc-context-1.15.1.jar:1.15.1]
at com.lucidworks.fusion.connector.plugin.Logging.run(Logging.java:118) [connector-plugin-sdk-1.4.0-dev.12+204023d.jar:1.4.0-dev.12+204023d]
at com.lucidworks.fusion.connector.plugin.ext.logging.wrappers.RunnableMDC.run(RunnableMDC.java:32) [lucidworks-connector-plugin-ext-4.2.2.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_151]
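For what it's worth, a frequent cause of failures when JAX-WS code runs inside an isolated plugin classloader is that javax.xml.ws.Service locates its provider through the thread context classloader, which the host process sets to its own loader. A hedged workaround (an assumption, not a confirmed fix for this SDK) is to swap the context classloader around the service call and restore it afterwards:

```java
import java.util.function.Supplier;

// Sketch: run a block (e.g. construction of a generated JAX-WS service stub
// such as Authentication_Service) with a chosen classloader as the thread
// context classloader, then restore the original loader no matter what.
public final class TcclHelper {
    public static <T> T withClassLoader(ClassLoader cl, Supplier<T> body) {
        Thread current = Thread.currentThread();
        ClassLoader original = current.getContextClassLoader();
        current.setContextClassLoader(cl);
        try {
            return body.get();
        } finally {
            current.setContextClassLoader(original);  // always restore
        }
    }
}
```

In a connector, this would wrap the service construction, e.g. passing the plugin class's own classloader (`getClass().getClassLoader()`); whether that resolves this particular NPE depends on how Fusion isolates the plugin.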

Getting Timeout exception in custom connector

Hello,

I'm trying to create a custom connector and it seems that I'm reaching a timeout before the connector has time to complete. Can someone tell me where I can find the setting to control the timeout?

2019-03-11T17:11:00,841 - ERROR [rpc-server-10:connectors.job.GrpcFetcherController@370] - {collectionId=MyTest, connectorType=corp.fusion.ldap.content, datasourceId=myds, jobRunId=FbbwENNR2o, requestId=cFpPxefMSc} - Timed out waiting for fetch call to complete; aborting call.

Thanks,
Eric

How do you purge stray items?

In Fusion 4, there was an option to purge stray items at the end of the fetch (post-fetch). Since there's no post-fetch anymore, how is this handled in Fusion 5?

Issue with number of documents created

Hi,

I've created my own connector, but something is wrong with the number of items that get created. The source system has around 1,700 items but, as you can see, over 6,000 documents are getting created. I'm basically just creating a document emitter and emitting each document during fetch.

com.lucidworks.fusion.connector.plugin.api.fetcher.type.content.Content.Emitter newEmittedDoc = fetchContext.newContent(asset.getId(), streamSupplier);
newEmittedDoc.withFields(fields);
newEmittedDoc.emit();

Any thoughts on why I would get so many?

Thanks!
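One possible cause, assumed here since the fetch loop isn't shown, is that the same asset gets emitted once per page or per parent reference, multiplying the document count across the traversal. A minimal plain-Java guard that tracks IDs already emitted in the current pass:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: guard against emitting the same document ID more than once in a
// single fetch pass. The Runnable stands in for the real
// fetchContext.newContent(...).withFields(...).emit() call.
public final class EmitGuard {
    private final Set<String> emitted = new HashSet<>();

    /** Runs emitDoc and returns true only the first time an ID is seen. */
    public boolean emitOnce(String assetId, Runnable emitDoc) {
        if (!emitted.add(assetId)) {
            return false;            // already emitted in this pass; skip
        }
        emitDoc.run();
        return true;
    }

    public int count() {
        return emitted.size();
    }
}
```

Comparing `count()` against the source system's item count at the end of fetch would show whether duplicates are coming from the connector loop or from something downstream.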

Question about installing connectors built using this SDK

Hello there,
Sorry for filing another issue here, but I'm very curious about the process for deploying connectors built using this SDK. Currently, I can run the sample connector and my modified version via the command:
java -jar ${fusionHome}/apps/connectors/connectors-rpc/client/connector-plugin-client-${fusionVersion}-uberjar.jar build/plugins/phab-wiki-connector-0.1.1.zip

However, I'm wondering if it's possible to install this connector into Fusion using the blob store method mentioned here:
https://doc.lucidworks.com/fusion-server/4.0/reference-guides/connectors/index.html#installing-a-connector-using-the-fusion-ui

Thanks,
Erik

Examples of using the persistent store?

Hello,
I am trying to migrate my connectors for Phabricator, written previously for Fusion 3.0, which use the crawlDB. Is it possible to add an example of how to use the new distributed data store or crawlDB?
Thanks!

Schema Annotation for Secrets in Plugin Configuration

Hi, Tsuyoshi from Raytion GmbH here.
Is there a schema annotation provided by your Connector SDK v3.0.0 that marks a property as a secret value?
I expect the annotated property to be shown hidden in the datasource configuration of the Admin UI.

Thank you in advance.
Tsuyoshi

Unable to make Apache CXF and Tika work within a custom connector

Hi,
we are unable to make Apache CXF and Tika work within a custom connector.
We are launching the connector remotely and see that it never reaches the fetch or preFetch methods. It dies during initialization while trying to initialize Apache CXF.

The source of the problem is probably:
Caused by: javax.xml.bind.JAXBException: ClassCastException: attempting to cast jar:file:/Library/Java/JavaVirtualMachines/jdk1.8.0_45.jdk/Contents/Home/jre/lib/rt.jar!/javax/xml/bind/JAXBContext.class to jar:file:/Users/matteogrolla/Development/alf_sourcesense/bancaditalia/connectors-sdk-resources/java-sdk/connectors/build/plugins/bdi-enterprise-search/lib/jakarta.xml.bind-api-2.3.2.jar!/javax/xml/bind/JAXBContext.class. Please make sure that you are specifying the proper ClassLoader.

Here are the relevant excerpts from the logs:

 https://docs.google.com/document/d/1g-UA8Ssd6wMU5Xy8tQNfzEQJ5SaJqLWIVBB5ot_k6mA/edit?usp=sharing

ContentFetcher postFetch() is never called

Hello, I am writing a connector and I am having some issues with the ContentFetcher.
The postFetch method is never called and the fetched objects never seem to be marked as complete.
When the job runs the first time, the preFetch() emits the items and they are fetched as expected.
When the job runs the second time, the preFetch() emits no items (as expected as there is no new content in my case), however, all of the previous items are fetched again as they appear to be incomplete.

So my questions are:

  1. Am I missing something to mark these items as complete so they won't be indexed again until the preFetch() emits them again?
  2. Is there something specific I need to do to get the postFetch() function to be called?
  3. Is the fact that postFetch() is never called the reason these objects are never marked as complete and are indexed every time the job runs?

Thanks.

Compatible APIs

Hi everyone,

For a connector, I would like to use the Spring Framework libraries (e.g. RestTemplate). It seems that some versions are not compatible. Is there any list or any guidance on compatibility with other libraries?

Thanks!

Any ideas what this error means: Waited the max PT1M waiting on indexing results to come back documentReceivedCount=xxxxxxx, deleteReceivedCount=0, indexedResultCount=xxxxxxx. Everything is done so crawl being marked as completed now. But this may result in documents being missed.

At the end of a crawl, I get this error in the logs:

Waited the max PT1M waiting on indexing results to come back documentReceivedCount=375532, deleteReceivedCount=0, indexedResultCount=375518. Everything is done so crawl being marked as completed now. But this may result in documents being missed.

Any ideas?
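For context, PT1M in that message is an ISO-8601 duration: the indexer waited at most one minute for indexing results to come back before marking the crawl complete anyway (hence the warning that documents may be missed when documentReceivedCount exceeds indexedResultCount). A quick java.time check of what that token means:

```java
import java.time.Duration;

// PT1M is ISO-8601 for a one-minute duration — the maximum wait the log
// message refers to before the crawl is marked completed regardless.
public final class MaxWait {
    public static long maxWaitSeconds(String iso) {
        return Duration.parse(iso).getSeconds();
    }
}
```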

When calling newContent with a binary file, you end up with a lot of null-value warnings in the logs

When you use code like the following:
context.newContent(input.getId(), new FileInputStream(workFile))
.fields(f -> f.merge(candidateFields))
.emit();

where workFile is a binary file, you get a lot of warnings depending on the file type. In the case of Excel files, you end up with a lot of messages like this:
invalid object type:null for key:clrMapOvr.masterClrMapping

Is there any way to stop these error messages from appearing?

On Windows, trying to remote debug causes java.lang.NoClassDefFoundError: org/apache/commons/configuration/Configuration

Hi, I'm running the sample on Windows and tried to use the connect option to remote debug, but I'm receiving this error:

java.lang.NoClassDefFoundError: org/apache/commons/configuration/Configuration

c:\Users\eborisow\workspaces\fusion\connectors-june-2019>gradlew connect

Task :imap-connector:connect
-- Using client-jar D:/programs/fusion/4.2.2/apps/connectors/connectors-rpc/client/connector-plugin-client-4.2.2-uberjar.jar
-- Connecting plugin C:\Users\eborisow\workspaces\fusion\connectors-june-2019\imap-connector\build/libs/imap-connector.zip to localhost:8771
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/configuration/Configuration
at com.lucidworks.fusion.plugin.handler.ConnectorPluginModuleHandler.<init>(ConnectorPluginModuleHandler.java:185)
at com.lucidworks.fusion.connector.client.ConnectorClientMain$1.configure(ConnectorClientMain.java:89)
at com.google.inject.AbstractModule.configure(AbstractModule.java:62)
at com.google.inject.spi.Elements$RecordingBinder.install(Elements.java:340)
at com.google.inject.spi.Elements.getElements(Elements.java:110)
at com.google.inject.internal.InjectorShell$Builder.build(InjectorShell.java:138)
at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:104)
at com.google.inject.Guice.createInjector(Guice.java:99)
at com.google.inject.Guice.createInjector(Guice.java:73)
at com.google.inject.Guice.createInjector(Guice.java:62)
at com.lucidworks.fusion.connector.client.ConnectorClientMain.<init>(ConnectorClientMain.java:109)
at com.lucidworks.fusion.connector.client.ConnectorClientMain.loadPlugins(ConnectorClientMain.java:155)
at com.lucidworks.fusion.connector.client.ConnectorClientMain.main(ConnectorClientMain.java:129)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.configuration.Configuration
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 13 more
After this, the compile just gets stuck in a loop.

Any thoughts?

Different behavior when running with connectorPluginClient

Hi,

My custom connector runs fine when deployed on the Fusion instance; however, when I run it using the connector plugin client, it throws the following errors:

Caused by: io.grpc.StatusRuntimeException: UNKNOWN: java.lang.IllegalAccessException: Unable to create injector, see the following errors:

1) An exception was caught and reported. Message: java.lang.RuntimeException: java.lang.IllegalAccessException: access to public member failed: com.lucidworks.fusion.connector.plugin.api.config.ConnectorConfig.schematize[Ljava.lang.Object;@4ca83faa/invokeSpecial, from com.lucidworks.fusion.connector.plugin.api.config.ConnectorConfig/2 (unnamed module @36ef1d65)
at com.google.inject.internal.InjectorShell$Builder.build(InjectorShell.java:138)

2) No implementation for com.lucidworks.fusion.connector.plugin.api.config.ConnectorConfig was bound.
at com.lucidworks.fusion.connector.plugin.ext.controller.FetchManagerRegistry$1.configure(FetchManagerRegistry.java:71)

I am not sure where the issue is or how to debug it. Can you give me some pointers, please?

Thanks,
Roxana

Formatting Error - gradlew command

In README.asciidoc, one of the commands has a couple of formatting errors.

../gradlew clean build :assemblePlugin

SHOULD BE

./gradlew clean build :assemblePlugins

How to use JSON Parser Stage with custom connector?

I'm getting a JSON string from a site, and I'd prefer to use the JSON Parser stage to parse the JSON and create documents.
Is that possible, and how do I do it?

I thought I could use newContent, but this code doesn't work for me:

String pageBody = "{\"cards\":[{\"id\":1, \"field\":\"somefield 1\"}, {\"id\":2, \"field\":\"somefield 2\"}, {\"id\":3, \"field\":\"somefield 3\"}]}";
InputStream targetStream = new ByteArrayInputStream(pageBody.getBytes(StandardCharsets.UTF_8));
fetchContext.newContent(request.getUrl(), targetStream)
      .fields(f -> {
            f.setString("body", pageBody);
       })
       .emit();
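Note that in Java the inner quotes of a JSON string literal must be escaped (\") or the snippet won't compile at all. Below is a minimal, self-contained sketch of wrapping a JSON string as the InputStream that newContent consumes; JsonBody is a hypothetical helper, and whether the JSON Parser index stage then splits the cards array into separate documents depends on that stage's configuration, not on this code:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Sketch: build the content stream for newContent from an in-memory JSON
// string, and read it back to verify the bytes survive the round trip.
public final class JsonBody {
    public static InputStream toStream(String json) {
        return new ByteArrayInputStream(json.getBytes(StandardCharsets.UTF_8));
    }

    public static String readBack(InputStream in) throws IOException {
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }
}
```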

Error management on random-connector-incremental

Hi,
I'm Matteo Grolla from Sourcesense, Lucidwork's partner in Italy.
I'm developing a custom connector for a customer but I have questions about error management, Robert Lucarini suggested to post my questions here.
Let's use random-content-incremental for our discussion and focus on the fetch method.
What I've noticed is:

  • If an exception is thrown inside generateRandom, the framework restarts the crawl from the previous checkpoint (or from the beginning if it was the first). How can I terminate the crawl, marking it as failed? I'd like the next restart of the crawl to proceed from the last saved checkpoint.
  • If an exception is thrown inside emitDocument, the framework logs the error and proceeds with the crawl. Will this document be recrawled? When? Can we control this?

Thanks a lot

How do I handle deleted items from the source system that are orphaned in Fusion?

The first time you execute a fetch, your source system gives you 20 items and you emit 20 documents. By the next index, 2 items have been deleted from the source. How do you handle the deletion of the orphans? In the source system the items are just gone, with no record of the deletion (think of a table in a database). In the connector SDK, I cannot find any method to access the current collection (in order to delete from it, read items from it, etc.). How can you read through the items in the collection so that you can remove orphans?
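One common pattern, sketched here as plain Java since the issue is precisely that the SDK's state store isn't obvious: persist the set of IDs emitted on the previous run (in whatever checkpoint or crawl-DB facility is available — that persistence is an assumption), then diff it against the IDs seen on the current run to get the orphans to delete:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: compute which previously-indexed IDs are orphans after a fetch.
// `previous` would come from state saved at the end of the last run;
// `current` is what the source system returned this run.
public final class OrphanDiff {
    public static Set<String> orphans(Set<String> previous, Set<String> current) {
        Set<String> gone = new HashSet<>(previous);
        gone.removeAll(current);   // IDs seen last run but missing now
        return gone;
    }
}
```

Each ID in the result would then be turned into a delete rather than a document emit; the exact delete call depends on the SDK version in use.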

Remote deployment of connectors

There seems to be limited documentation for deploying connectors as a remote instance. I would like some additional clarity on this type of implementation. I have changed the following values in gradle.properties, but it doesn't seem to have any effect.

fusionApiTarget=localhost:8765
fusionRpcTarget=localhost:8771

This results in the following loop:
2019-09-24T21:11:48,863 - WARN [conn-rpc-registry-watcher-0:com.lucidworks.fusion.plugin.handler.ConnectorPluginModuleHandler@219] - {} - Service registry not ready - will try again

Also, using an SSH tunnel doesn't seem to help either:

2019-09-24T13:35:21,622 - ERROR [conn-rpc-registry-watcher-0:com.lucidworks.fusion.plugin.handler.ConnectorPluginModuleHandler@198] - {} - Problem discovering connector-services: Status{code=UNAVAILABLE, description=io exception, cause=io.grpc.netty.shaded.io.netty.channel.ConnectTimeoutException: connection timed out: /10.10.123.4:8771
at io.grpc.netty.shaded.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:267)
at io.grpc.netty.shaded.io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)
at io.grpc.netty.shaded.io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:127)
at io.grpc.netty.shaded.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
at io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:464)
at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
at io.grpc.netty.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748) }

My question is: what are the CLI parameters that can override the previous values, and how would you run the connector through an SSH tunnel?

API Documentation?

Is there any Javadoc-style documentation available?

For example, for com.lucidworks.fusion.connector.plugin.api.fetcher.type.content.MessageHelper, what are the parameters the method is expecting? There is the id, a Map for fields, and a Map for metadata. Is there some reference for the expected values?

How to enable the "Purge Items" feature?

Hi, Tsuyoshi from Raytion GmbH here.
I read about the Purge Items feature, which is also partially mentioned in your documentation here.
I assume this is a feature where items stored in the crawl DB that were not fetched in the previous job are cleaned up.
Will those documents also be deleted from the Solr collection (they are also registered as documents)?
If yes, does this feature need to be enabled explicitly? We are currently not able to observe our unvisited items being deleted.
Does the same mechanism also apply to access controls?

Thank you in advance.

Is there a particular method to use to extract content from binary content?

Hi,

In my custom connector, I have a URL to the source system. I'd like to extract the content and place it into a field. Is there an approach that I should use for that? Do I stream the file during the connector fetch to get the content? Do I leave it for another process to handle later? I would most likely have authentication issues and that type of thing to deal with.

Thanks,
Eric

Updates to ACL handling for graph security trimming?

Hi,

Can you let me know if there have been any updates for handling the creation of ACLs for graph security trimming? Currently, we are writing directly to Solr. It would be good to revert to the ACL handling in the API at some point.

How to use code/JavaScript fields?

I want to use JavaScript code in my datasource/connector, but I am facing two problems:

  • The JavaScript code isn't syntax-highlighted. The field opens the editor in a popup, but it is just a plain text editor.
  • The bigger problem: the text doesn't save to the datasource. When I re-open the datasource or use the export/objects API, I don't see the field with the code that I saved.
@Property(
        name = "someJsCode",
        title = "Some Script",
        description = "Some js function",
        hints = {UIHints.CODE, UIHints.JAVASCRIPT}
)
@SchemaAnnotations.StringSchema
String beforeRequestJSCode();

Do you have examples/explanations on how to use code fields?
Thanks

LucidWorks: 5.4.1
SDK: 4.0.0

How can I use the last modified of current items to check whether to run subsequent fetches?

Hi,

My source content has metadata with a last-modified date/time assigned to each item. How can I use the last modified of the current content items to compare with the newly fetched content? Do I need to execute a query on each item to retrieve its last modified from Fusion on each iteration? Or is there something else in the API that will perform the check for me? I'm wondering how this would work.

Regards,
Eric
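A sketch of the comparison logic in plain Java. How the `seen` map persists between job runs (crawl DB, checkpoint, or similar) is an assumption, since the question is precisely about what the SDK provides for this; the sketch only shows the decision itself:

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Sketch: decide whether an item needs re-fetching by comparing its
// source-side lastModified against the timestamp recorded on the last
// visit. New or newer items are fetched; unchanged items are skipped.
public final class ModifiedCheck {
    private final Map<String, Instant> seen = new HashMap<>();

    /** Returns true if the item is new or its source copy is newer. */
    public boolean needsFetch(String id, Instant sourceModified) {
        Instant last = seen.get(id);
        if (last == null || sourceModified.isAfter(last)) {
            seen.put(id, sourceModified);  // record the visit
            return true;
        }
        return false;
    }
}
```

With this shape, no per-item query against Fusion is needed during the crawl: the previous run's timestamps are read once from state and compared in memory.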
