lambda-3 / indra
Indra is a Web Service which allows easy access to different distributional semantics models in several languages.
License: MIT License
Some minor adaptations, such as a stemmer and stopword lists, must be implemented to really support those languages.
See #73
Jaccard Score metric is missing
Why are various documentation links pointing to a relatively empty page? Is this project alive?
We're still using version 3.0. Newer MongoDB releases bring plenty of improvements that we can take advantage of, such as experimenting with the WiredTiger engine, better memory management, etc.
So far this is harmless, but it needs investigation.
WARN org.glassfish.jersey.internal.Errors - The following warnings have been detected:
WARNING: Parameter 1 of type org.lambda3.indra.core.VectorSpaceFactory<? extends org.lambda3.indra.core.VectorSpace> from public org.lambda3.indra.service.impl.InfoResourceImpl(org.lambda3.indra.core.VectorSpaceFactory<? extends org.lambda3.indra.core.VectorSpace>,org.lambda3.indra.core.translation.IndraTranslatorFactory<? extends org.lambda3.indra.core.translation.IndraTranslator>) is not resolvable to a concrete type.
WARNING: Parameter 2 of type org.lambda3.indra.core.translation.IndraTranslatorFactory<? extends org.lambda3.indra.core.translation.IndraTranslator> from public org.lambda3.indra.service.impl.InfoResourceImpl(org.lambda3.indra.core.VectorSpaceFactory<? extends org.lambda3.indra.core.VectorSpace>,org.lambda3.indra.core.translation.IndraTranslatorFactory<? extends org.lambda3.indra.core.translation.IndraTranslator>) is not resolvable to a concrete type.
WARNING: Parameter 1 of type org.lambda3.indra.core.VectorSpaceFactory<? extends org.lambda3.indra.core.VectorSpace> from public org.lambda3.indra.service.impl.InfoResourceImpl(org.lambda3.indra.core.VectorSpaceFactory<? extends org.lambda3.indra.core.VectorSpace>,org.lambda3.indra.core.translation.IndraTranslatorFactory<? extends org.lambda3.indra.core.translation.IndraTranslator>) is not resolvable to a concrete type.
WARNING: Parameter 2 of type org.lambda3.indra.core.translation.IndraTranslatorFactory<? extends org.lambda3.indra.core.translation.IndraTranslator> from public org.lambda3.indra.service.impl.InfoResourceImpl(org.lambda3.indra.core.VectorSpaceFactory<? extends org.lambda3.indra.core.VectorSpace>,org.lambda3.indra.core.translation.IndraTranslatorFactory<? extends org.lambda3.indra.core.translation.IndraTranslator>) is not resolvable to a concrete type.
Users should be able to generate models with their own corpora.
We should rely on pure JAX-RS standards and just adopt jersey.
I want to export the word embedding model to import it into another project, probably gensim or another similar but supported library. How can this be done?
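One possible route, sketched under assumptions rather than taken from the project: if the vectors can be collected as a mapping from term to a list of floats (for instance from the service's vectors endpoint, whose response layout is not shown here), they can be written in the plain-text word2vec format, which gensim reads with `KeyedVectors.load_word2vec_format(path, binary=False)`. The helper name `write_word2vec_text` and the sample vectors are hypothetical.

```python
import io
from typing import Dict, List, TextIO


def write_word2vec_text(vectors: Dict[str, List[float]], out: TextIO) -> None:
    """Write a term -> vector mapping in the plain-text word2vec format."""
    if not vectors:
        raise ValueError("no vectors to export")
    dims = {len(vec) for vec in vectors.values()}
    if len(dims) != 1:
        raise ValueError("all vectors must have the same dimensionality")
    # Header line: vocabulary size and vector dimensionality.
    out.write(f"{len(vectors)} {dims.pop()}\n")
    for term, vec in vectors.items():
        # The format is whitespace-delimited, so multi-word terms need a
        # single token; joining with "_" is an assumption of this sketch.
        token = term.replace(" ", "_")
        out.write(token + " " + " ".join(repr(float(x)) for x in vec) + "\n")


# Hypothetical sample data standing in for vectors obtained from Indra.
sample = {"love": [0.1, 0.2, 0.3], "santa claus": [0.4, 0.5, 0.6]}
buffer = io.StringIO()
write_word2vec_text(sample, buffer)
exported = buffer.getvalue()
print(exported)
```

Writing the same content to a file instead of a `StringIO` buffer gives something gensim should be able to load, provided every term is a single whitespace-free token.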
Decide the appropriate license for the project, if needed at all!
Currently we only support a list of text pairs, so to compare one term against several others, users have to repeat that term in every pair.
We have this:
[{"t1": "a", "t2": "b"}, {"t1": "a", "t2": "c"}, {"t1": "a", "t2": "d"}]
Ideally could be:
{"t1": "a", "t2": ["b", "c", "d"]}
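Until the service accepts the compact form, a client could expand it into the pair list that is supported today. This is a minimal sketch; the function name `expand_one_to_many` is hypothetical and not part of Indra.

```python
from typing import Dict, List, Union

Pair = Dict[str, str]


def expand_one_to_many(request: Dict[str, Union[str, List[str]]]) -> List[Pair]:
    """Turn {"t1": x, "t2": [y1, y2, ...]} into a list of {"t1", "t2"} pairs."""
    t1 = request["t1"]
    t2 = request["t2"]
    # Accept a single string as well, so the current pair form still works.
    targets = [t2] if isinstance(t2, str) else list(t2)
    return [{"t1": t1, "t2": target} for target in targets]


pairs = expand_one_to_many({"t1": "a", "t2": ["b", "c", "d"]})
print(pairs)
```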
We should give instructions on how to instantiate those models.
Indra Version: lambdacube/indra:2.0.2 (docker)
Indra Client Version: 2.0.3-rc
24ef31f
javax.ws.rs: 2.0.1
Jersey: 2.25.1
Jackson: 2.8.6
Request gets serialized with unrecognized field "mt"
Request:
RelatednessPairRequest request = new RelatednessPairRequest()
        .corpus("googlenews300neg")
        .language("EN")
        .scoreFunction(ScoreFunction.COSINE)
        .model("W2V")
        .mt(false)
        .pairs(pairs);
WebTarget webTarget = this.client.target(this.url);
RelatednessPairResponse response = webTarget.request()
        .buildPost(Entity.entity(request, MediaType.APPLICATION_JSON_TYPE))
        .invoke(RelatednessPairResponse.class);
Debug output:
1 > POST http://127.0.0.1:8916/relatedness
1 > Content-Type: application/json
{"corpus":"googlenews300neg","model":"W2V","language":"EN","mt":false,"applyStopWords":null,"minWordLength":-1,"scoreFunction":"COSINE","pairs":[{"t1":"mother","t2":"love"},{"t1":"father","t2":"love"}]}
Nov 09, 2017 8:29:16 PM org.glassfish.jersey.logging.LoggingInterceptor log
INFO: 1 * Client response received on thread main
1 < 400
1 < Connection: close
1 < Content-Length: 421
1 < Content-Type: text/plain
1 < Date: Thu, 09 Nov 2017 19:29:16 GMT
Unrecognized field "mt" (class org.lambda3.indra.client.RelatednessRequest), not marked as ignorable (7 known properties: "minWordLength", "corpus", "language", "model", "pairs", "applyStopWords", "scoreFunction"])
at [Source: org.glassfish.jersey.message.internal.ReaderInterceptorExecutor$UnCloseableInputStream@8d36676; line: 1, column: 70] (through reference chain: org.lambda3.indra.client.RelatednessRequest["mt"])
javax.ws.rs.BadRequestException: HTTP 400 Bad Request
at org.glassfish.jersey.client.JerseyInvocation.convertToException(JerseyInvocation.java:1011)
at org.glassfish.jersey.client.JerseyInvocation.translate(JerseyInvocation.java:819)
at org.glassfish.jersey.client.JerseyInvocation.access$700(JerseyInvocation.java:92)
at org.glassfish.jersey.client.JerseyInvocation$2.call(JerseyInvocation.java:701)
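The error message lists the seven properties this server build accepts, which suggests (as an assumption, not a confirmed diagnosis) that the client and the server image come from different versions. As a stopgap, a caller that builds the JSON itself could drop any field the server does not know before posting. The sketch below uses the payload from the debug output; `strip_unknown_fields` is a hypothetical helper.

```python
import json

# The seven properties named in the server's error message.
KNOWN_FIELDS = {
    "minWordLength", "corpus", "language", "model",
    "pairs", "applyStopWords", "scoreFunction",
}


def strip_unknown_fields(payload: dict) -> dict:
    """Keep only the fields the server reported as known."""
    return {key: value for key, value in payload.items() if key in KNOWN_FIELDS}


payload = {
    "corpus": "googlenews300neg",
    "model": "W2V",
    "language": "EN",
    "mt": False,
    "applyStopWords": None,
    "minWordLength": -1,
    "scoreFunction": "COSINE",
    "pairs": [{"t1": "mother", "t2": "love"}, {"t1": "father", "t2": "love"}],
}
body = json.dumps(strip_unknown_fields(payload))
print(body)
```

This only hides the mismatch; using a client and server built from the same release is likely the cleaner fix.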
Implement an org.lambda3.indra.core.VectorSpace based on the Lucene store and search engine.
If I run the following command, I get an error message, even though 'dep-en-wiki-2014' is listed at http://data.lambda3.org/indra/dumps/.
curl -X POST -H "Content-Type: application/json" -d '{
"corpus": "wiki-2014",
"model": "dep",
"language": "EN",
"terms": ["love", "mother", "santa claus"]
}' "http://indra.lambda3.org/vectors"
{"msg":"Model 'dep-EN-wiki-2014' not found."}
I tried running ./downloader.sh esa-en-wiki-2018 to download the model and try Indra, but it is not connecting.
(I also tried with other models.)
Are they still available?
Make it clear how to start the service, and document its dependencies, client use cases, and limitations.
Hi,
I installed Indra using IndraComposed and am having trouble getting the "nearest neighbors" requests to run. While pair relatedness and OTM nearest neighbors work, I get an error for the nearest neighbors relatedness:
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: http://localhost:8916/neighbors/relatedness
The code I am using is as follows:
import json
import requests

# Assumed header: the original report does not show how `headers` was defined.
headers = {'Content-Type': 'application/json'}

data = {'corpus': 'wiki-2018',
        'model': 'W2V',
        'language': 'EN',
        'topk': '3',
        'terms': ['house', 'engine']
        }
res = requests.post("http://localhost:8916/neighbors/vectors", data=json.dumps(data), headers=headers)
res.raise_for_status()
print('[Nearest neighbor vectors]', res.json())
Implement an org.lambda3.indra.core.translation.IndraTranslator based on the Lucene store and search engine.
The javax.xml.bind.* packages were removed from the standard JDK distribution as of Java 11.
https://www.jesperdj.com/2018/09/30/jaxb-on-java-9-10-11-and-beyond/
I've provided a PR #71 that solves this issue. I hope it is helpful.
$ java --version
openjdk 11.0.10 2021-01-19
OpenJDK Runtime Environment (build 11.0.10+9-Ubuntu-0ubuntu1.20.04)
OpenJDK 64-Bit Server VM (build 11.0.10+9-Ubuntu-0ubuntu1.20.04, mixed mode, sharing)
$ mvn clean verify
[INFO] --- maven-failsafe-plugin:2.20:integration-test (default) @ indra-mongo ---
[WARNING] Error injecting: org.apache.maven.plugin.failsafe.IntegrationTestMojo
java.lang.NoClassDefFoundError: javax/xml/bind/JAXBException
at java.lang.Class.getDeclaredConstructors0 (Native Method)
at java.lang.Class.privateGetDeclaredConstructors (Class.java:3137)
at java.lang.Class.getDeclaredConstructors (Class.java:2357)
at com.google.inject.spi.InjectionPoint.forConstructorOf (InjectionPoint.java:245)
at com.google.inject.internal.ConstructorBindingImpl.create (ConstructorBindingImpl.java:115)
at com.google.inject.internal.InjectorImpl.createUninitializedBinding (InjectorImpl.java:706)
at com.google.inject.internal.InjectorImpl.createJustInTimeBinding (InjectorImpl.java:930)
at com.google.inject.internal.InjectorImpl.createJustInTimeBindingRecursive (InjectorImpl.java:852)
at com.google.inject.internal.InjectorImpl.getJustInTimeBinding (InjectorImpl.java:291)
at com.google.inject.internal.InjectorImpl.getBindingOrThrow (InjectorImpl.java:222)
at com.google.inject.internal.InjectorImpl.getProviderOrThrow (InjectorImpl.java:1040)
at com.google.inject.internal.InjectorImpl.getProvider (InjectorImpl.java:1071)
at com.google.inject.internal.InjectorImpl.getProvider (InjectorImpl.java:1034)
at com.google.inject.internal.InjectorImpl.getInstance (InjectorImpl.java:1086)
at org.eclipse.sisu.space.AbstractDeferredClass.get (AbstractDeferredClass.java:48)
at com.google.inject.internal.ProviderInternalFactory.provision (ProviderInternalFactory.java:85)
at com.google.inject.internal.InternalFactoryToInitializableAdapter.provision (InternalFactoryToInitializableAdapter.java:57)
at com.google.inject.internal.ProviderInternalFactory$1.call (ProviderInternalFactory.java:66)
at com.google.inject.internal.ProvisionListenerStackCallback$Provision.provision (ProvisionListenerStackCallback.java:112)
at com.google.inject.internal.ProvisionListenerStackCallback$Provision.provision (ProvisionListenerStackCallback.java:127)
at com.google.inject.internal.ProvisionListenerStackCallback.provision (ProvisionListenerStackCallback.java:66)
at com.google.inject.internal.ProviderInternalFactory.circularGet (ProviderInternalFactory.java:61)
at com.google.inject.internal.InternalFactoryToInitializableAdapter.get (InternalFactoryToInitializableAdapter.java:47)
at com.google.inject.internal.InjectorImpl$1.get (InjectorImpl.java:1050)
at org.eclipse.sisu.inject.Guice4$1.get (Guice4.java:162)
at org.eclipse.sisu.inject.LazyBeanEntry.getValue (LazyBeanEntry.java:81)
at org.eclipse.sisu.plexus.LazyPlexusBean.getValue (LazyPlexusBean.java:51)
at org.codehaus.plexus.DefaultPlexusContainer.lookup (DefaultPlexusContainer.java:263)
at org.codehaus.plexus.DefaultPlexusContainer.lookup (DefaultPlexusContainer.java:255)
at org.apache.maven.plugin.internal.DefaultMavenPluginManager.getConfiguredMojo (DefaultMavenPluginManager.java:520)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:124)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
at org.apache.maven.cli.MavenCli.execute (MavenCli.java:957)
at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:289)
at org.apache.maven.cli.MavenCli.main (MavenCli.java:193)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke (Method.java:566)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
Caused by: java.lang.ClassNotFoundException: javax.xml.bind.JAXBException
at org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy.loadClass (SelfFirstStrategy.java:50)
at org.codehaus.plexus.classworlds.realm.ClassRealm.unsynchronizedLoadClass (ClassRealm.java:271)
at org.codehaus.plexus.classworlds.realm.ClassRealm.loadClass (ClassRealm.java:247)
at org.codehaus.plexus.classworlds.realm.ClassRealm.loadClass (ClassRealm.java:239)
at java.lang.Class.getDeclaredConstructors0 (Native Method)
at java.lang.Class.privateGetDeclaredConstructors (Class.java:3137)
at java.lang.Class.getDeclaredConstructors (Class.java:2357)
at com.google.inject.spi.InjectionPoint.forConstructorOf (InjectionPoint.java:245)
at com.google.inject.internal.ConstructorBindingImpl.create (ConstructorBindingImpl.java:115)
at com.google.inject.internal.InjectorImpl.createUninitializedBinding (InjectorImpl.java:706)
at com.google.inject.internal.InjectorImpl.createJustInTimeBinding (InjectorImpl.java:930)
at com.google.inject.internal.InjectorImpl.createJustInTimeBindingRecursive (InjectorImpl.java:852)
at com.google.inject.internal.InjectorImpl.getJustInTimeBinding (InjectorImpl.java:291)
at com.google.inject.internal.InjectorImpl.getBindingOrThrow (InjectorImpl.java:222)
at com.google.inject.internal.InjectorImpl.getProviderOrThrow (InjectorImpl.java:1040)
at com.google.inject.internal.InjectorImpl.getProvider (InjectorImpl.java:1071)
at com.google.inject.internal.InjectorImpl.getProvider (InjectorImpl.java:1034)
at com.google.inject.internal.InjectorImpl.getInstance (InjectorImpl.java:1086)
at org.eclipse.sisu.space.AbstractDeferredClass.get (AbstractDeferredClass.java:48)
at com.google.inject.internal.ProviderInternalFactory.provision (ProviderInternalFactory.java:85)
at com.google.inject.internal.InternalFactoryToInitializableAdapter.provision (InternalFactoryToInitializableAdapter.java:57)
at com.google.inject.internal.ProviderInternalFactory$1.call (ProviderInternalFactory.java:66)
at com.google.inject.internal.ProvisionListenerStackCallback$Provision.provision (ProvisionListenerStackCallback.java:112)
at com.google.inject.internal.ProvisionListenerStackCallback$Provision.provision (ProvisionListenerStackCallback.java:127)
at com.google.inject.internal.ProvisionListenerStackCallback.provision (ProvisionListenerStackCallback.java:66)
at com.google.inject.internal.ProviderInternalFactory.circularGet (ProviderInternalFactory.java:61)
at com.google.inject.internal.InternalFactoryToInitializableAdapter.get (InternalFactoryToInitializableAdapter.java:47)
at com.google.inject.internal.InjectorImpl$1.get (InjectorImpl.java:1050)
at org.eclipse.sisu.inject.Guice4$1.get (Guice4.java:162)
at org.eclipse.sisu.inject.LazyBeanEntry.getValue (LazyBeanEntry.java:81)
at org.eclipse.sisu.plexus.LazyPlexusBean.getValue (LazyPlexusBean.java:51)
at org.codehaus.plexus.DefaultPlexusContainer.lookup (DefaultPlexusContainer.java:263)
at org.codehaus.plexus.DefaultPlexusContainer.lookup (DefaultPlexusContainer.java:255)
at org.apache.maven.plugin.internal.DefaultMavenPluginManager.getConfiguredMojo (DefaultMavenPluginManager.java:520)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:124)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
at org.apache.maven.cli.MavenCli.execute (MavenCli.java:957)
at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:289)
at org.apache.maven.cli.MavenCli.main (MavenCli.java:193)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke (Method.java:566)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Indra 2.2.1:
[INFO]
[INFO] Indra .............................................. SUCCESS [ 0.502 s]
[INFO] Indra Essentials Module ............................ SUCCESS [ 3.392 s]
[INFO] Indra Core Module .................................. SUCCESS [ 3.571 s]
[INFO] Indra Web Module ................................... SUCCESS [ 0.154 s]
[INFO] Indra Mongo Module ................................. FAILURE [ 0.265 s]
[INFO] Indra Web Service Module ........................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 7.994 s
[INFO] Finished at: 2021-03-07T11:18:09+01:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-failsafe-plugin:2.20:integration-test (default) on project indra-mongo: Execution default of goal org.apache.maven.plugins:maven-failsafe-plugin:2.20:integration-test failed: A required class was missing while executing org.apache.maven.plugins:maven-failsafe-plugin:2.20:integration-test: javax/xml/bind/JAXBException
Remove sspace library dependency because the license is not suitable for us.
If the client makes a request to the same model and changes only parameters such as scoreFunction or minWordLength, another RelatednessClient is built and cached. A cached client only gets reused if the client repeats the exact same request. Needs investigation.
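One direction the investigation could take, shown here as a language-neutral sketch in Python rather than as Indra's actual code: key the cache on the fields that identify the model (corpus, model, language) and treat the remaining parameters as per-request options. Whether those options can safely be applied per call in Indra is an assumption; `ModelHandle` and `ClientCache` are hypothetical names.

```python
from typing import Dict, Tuple


class ModelHandle:
    """Hypothetical stand-in for an expensive per-model client."""

    def __init__(self, corpus: str, model: str, language: str) -> None:
        self.corpus = corpus
        self.model = model
        self.language = language


class ClientCache:
    """Caches one handle per (corpus, model, language) triple."""

    def __init__(self) -> None:
        self._handles: Dict[Tuple[str, str, str], ModelHandle] = {}
        self.builds = 0  # counts how many handles were actually constructed

    def get(self, request: dict) -> ModelHandle:
        # Only the model identity goes into the key; scoreFunction,
        # minWordLength and similar options are deliberately left out.
        key = (request["corpus"], request["model"], request["language"])
        if key not in self._handles:
            self._handles[key] = ModelHandle(*key)
            self.builds += 1
        return self._handles[key]


cache = ClientCache()
first = cache.get({"corpus": "wiki-2018", "model": "W2V", "language": "EN",
                   "scoreFunction": "COSINE"})
second = cache.get({"corpus": "wiki-2018", "model": "W2V", "language": "EN",
                    "scoreFunction": "EUCLIDEAN", "minWordLength": 3})
print(first is second, cache.builds)
```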
Hi, I am a PhD student working on semantic relatedness. I am trying to use IndraComposed by following the instructions given here: https://github.com/Lambda-3/IndraComposed
However, when I try to download the model with ./downloader.sh w2v-en-googlenews
I get a connection refused error. The full trace is below:
Downloading http://data.lambda3.org/indra/dumps/w2v-en-wiki-2018.annoy.tar.gz ..
--2020-02-05 18:19:09-- http://data.lambda3.org/indra/dumps/w2v-en-wiki-2018.annoy.tar.gz
Resolving data.lambda3.org (data.lambda3.org)... 88.99.85.116
Connecting to data.lambda3.org (data.lambda3.org)|88.99.85.116|:80... failed: Connection refused.
Extracting w2v-en-wiki-2018.annoy.tar.gz
tar: w2v-en-wiki-2018.annoy.tar.gz: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
Can you kindly provide a solution?
Hello,
I created my own word2vec model for English documents with the Indra indexer and got as output a corpus.metadata, a model.metadata and a vectors.bin file.
When I place the vectors.bin and model.metadata files in indracomposed/data/annoy/w2v/en/, the model shows up in the info/resources request.
Where do I have to place the files to use them with Indra?
We are going to use org.lambda3 as the main namespace.
We need to be careful about dependencies and licensing.
https://maven.apache.org/guides/mini/guide-central-repository-upload.html