lambda-3 / indra

Indra is a Web Service which allows easy access to different distributional semantics models in several languages.

License: MIT License

Java 99.72% Shell 0.08% Python 0.16% Dockerfile 0.04%
natural-language-processing distributional-semantics search-engine ai indra semantic-models w2v lsa glove aixprize

indra's Issues

Empty documentation page

Why are various documentation links pointing to a relatively empty page? Is this project alive?

Upgrade MongoDB backend.

We're still using MongoDB 3.0. Newer releases bring plenty of improvements that we can take advantage of, such as experimenting with the WiredTiger storage engine, better memory management, etc.

Jersey Error on initialization

So far this is harmless, but it needs investigation.

WARN org.glassfish.jersey.internal.Errors - The following warnings have been detected:
WARNING: Parameter 1 of type org.lambda3.indra.core.VectorSpaceFactory<? extends org.lambda3.indra.core.VectorSpace> from public org.lambda3.indra.service.impl.InfoResourceImpl(org.lambda3.indra.core.VectorSpaceFactory<? extends org.lambda3.indra.core.VectorSpace>,org.lambda3.indra.core.translation.IndraTranslatorFactory<? extends org.lambda3.indra.core.translation.IndraTranslator>) is not resolvable to a concrete type.
WARNING: Parameter 2 of type org.lambda3.indra.core.translation.IndraTranslatorFactory<? extends org.lambda3.indra.core.translation.IndraTranslator> from public org.lambda3.indra.service.impl.InfoResourceImpl(org.lambda3.indra.core.VectorSpaceFactory<? extends org.lambda3.indra.core.VectorSpace>,org.lambda3.indra.core.translation.IndraTranslatorFactory<? extends org.lambda3.indra.core.translation.IndraTranslator>) is not resolvable to a concrete type.
WARNING: Parameter 1 of type org.lambda3.indra.core.VectorSpaceFactory<? extends org.lambda3.indra.core.VectorSpace> from public org.lambda3.indra.service.impl.InfoResourceImpl(org.lambda3.indra.core.VectorSpaceFactory<? extends org.lambda3.indra.core.VectorSpace>,org.lambda3.indra.core.translation.IndraTranslatorFactory<? extends org.lambda3.indra.core.translation.IndraTranslator>) is not resolvable to a concrete type.
WARNING: Parameter 2 of type org.lambda3.indra.core.translation.IndraTranslatorFactory<? extends org.lambda3.indra.core.translation.IndraTranslator> from public org.lambda3.indra.service.impl.InfoResourceImpl(org.lambda3.indra.core.VectorSpaceFactory<? extends org.lambda3.indra.core.VectorSpace>,org.lambda3.indra.core.translation.IndraTranslatorFactory<? extends org.lambda3.indra.core.translation.IndraTranslator>) is not resolvable to a concrete type.

Export word embeddings

I want to export the word embedding model to import it into another project, probably gensim or another similar but supported library. How can this be done?
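One possible route, sketched here under assumptions rather than as a feature Indra is known to provide: if the term vectors can be collected into a plain dictionary (for example by querying the service for the terms of interest), they can be written in the plain-text word2vec format, which gensim can read. The function name `write_word2vec_text` and the shape of the input dictionary are hypothetical.

```python
from typing import Dict, List


def write_word2vec_text(vectors: Dict[str, List[float]], path: str) -> None:
    """Write term vectors in the plain-text word2vec format.

    `vectors` maps each term to its embedding. How the vectors are obtained
    from Indra is outside this sketch and is an assumption.
    """
    if not vectors:
        raise ValueError("no vectors to write")
    dims = {len(v) for v in vectors.values()}
    if len(dims) != 1:
        raise ValueError("all vectors must have the same dimensionality")
    dim = dims.pop()
    with open(path, "w", encoding="utf-8") as out:
        # Header line: vocabulary size and vector dimensionality.
        out.write(f"{len(vectors)} {dim}\n")
        for term, vec in vectors.items():
            # The format is whitespace-delimited, so multi-word terms are
            # joined with underscores (an illustrative choice).
            token = "_".join(term.split())
            values = " ".join(repr(float(x)) for x in vec)
            out.write(f"{token} {values}\n")
```

A file written this way should be loadable in gensim with `KeyedVectors.load_word2vec_format(path, binary=False)`.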

Licensing model

Decide on the appropriate license for the project, if one is needed at all.

Request method with a term against a list of terms

Currently we only support a list of text pairs, so users who want to compare one term against many must repeat that term in every pair.

We have this:

[{"t1": "a", "t2": "b"}, {"t1": "a", "t2": "c"}, {"t1": "a", "t2": "d"}]

Ideally it could be:

{"t1": "a", "t2": ["b", "c", "d"]}
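Until such a method exists, the proposed shape can be expanded into the currently supported pair list on the client side. The helper below is a minimal sketch; the name `expand_one_to_many` is hypothetical and not part of any Indra client.

```python
from typing import Dict, List, Union

OneToMany = Dict[str, Union[str, List[str]]]


def expand_one_to_many(request: OneToMany) -> List[Dict[str, str]]:
    """Turn {"t1": term, "t2": [terms...]} into a list of {"t1", "t2"} pairs."""
    t1 = request["t1"]
    # One pair per target term, keeping the order of the input list.
    return [{"t1": t1, "t2": t2} for t2 in request["t2"]]
```

For the example above, `expand_one_to_many({"t1": "a", "t2": ["b", "c", "d"]})` yields the three pairs shown under "We have this".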

Indra Client: Unrecognized field "mt"

Indra Version: lambdacube/indra:2.0.2 (docker)
Indra Client Version: 2.0.3-rc 24ef31f
javax.ws.rs: 2.0.1
Jersey: 2.25.1
Jackson: 2.8.6

Problem

Request gets serialized with unrecognized field "mt"

Request:

RelatednessPairRequest request = new RelatednessPairRequest()
                .corpus("googlenews300neg")
                .language("EN")
                .scoreFunction(ScoreFunction.COSINE)
                .model("W2V")
                .mt(false)
                .pairs(pairs);
WebTarget webTarget = this.client.target(this.url);
RelatednessPairResponse response = webTarget.request().buildPost(Entity.entity(request, MediaType.APPLICATION_JSON_TYPE)).invoke(RelatednessPairResponse.class);

Debug output:

1 > POST http://127.0.0.1:8916/relatedness
1 > Content-Type: application/json
{"corpus":"googlenews300neg","model":"W2V","language":"EN","mt":false,"applyStopWords":null,"minWordLength":-1,"scoreFunction":"COSINE","pairs":[{"t1":"mother","t2":"love"},{"t1":"father","t2":"love"}]}

Nov 09, 2017 8:29:16 PM org.glassfish.jersey.logging.LoggingInterceptor log
INFO: 1 * Client response received on thread main
1 < 400
1 < Connection: close
1 < Content-Length: 421
1 < Content-Type: text/plain
1 < Date: Thu, 09 Nov 2017 19:29:16 GMT
Unrecognized field "mt" (class org.lambda3.indra.client.RelatednessRequest), not marked as ignorable (7 known properties: "minWordLength", "corpus", "language", "model", "pairs", "applyStopWords", "scoreFunction"])
 at [Source: org.glassfish.jersey.message.internal.ReaderInterceptorExecutor$UnCloseableInputStream@8d36676; line: 1, column: 70] (through reference chain: org.lambda3.indra.client.RelatednessRequest["mt"])


javax.ws.rs.BadRequestException: HTTP 400 Bad Request

	at org.glassfish.jersey.client.JerseyInvocation.convertToException(JerseyInvocation.java:1011)
	at org.glassfish.jersey.client.JerseyInvocation.translate(JerseyInvocation.java:819)
	at org.glassfish.jersey.client.JerseyInvocation.access$700(JerseyInvocation.java:92)
	at org.glassfish.jersey.client.JerseyInvocation$2.call(JerseyInvocation.java:701)
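The report lists a 2.0.2 server and a 2.0.3-rc client, so a version mismatch is one plausible cause, although that is not confirmed here. As a client-side workaround sketch (not the project's fix), a payload can be reduced to the seven properties named in the server's error message before it is sent. The helper name `strip_unknown_fields` is hypothetical.

```python
from typing import Any, Dict, FrozenSet

# The seven properties the server reports as known in the error above.
KNOWN_PROPERTIES: FrozenSet[str] = frozenset({
    "minWordLength", "corpus", "language", "model",
    "pairs", "applyStopWords", "scoreFunction",
})


def strip_unknown_fields(
    payload: Dict[str, Any],
    known: FrozenSet[str] = KNOWN_PROPERTIES,
) -> Dict[str, Any]:
    """Return a copy of `payload` containing only keys the server accepts."""
    return {key: value for key, value in payload.items() if key in known}
```

Applied to the request body from the debug output, this drops only the "mt" entry and leaves the other fields unchanged.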

Can't get dep word vectors

If I run the following command, I get an error message, even though 'dep-en-wiki-2014' is listed at http://data.lambda3.org/indra/dumps/.

curl -X POST -H "Content-Type: application/json" -d '{
	"corpus": "wiki-2014",
	"model": "dep",
	"language": "EN",
	"terms": ["love", "mother", "santa claus"]
}' "http://indra.lambda3.org/vectors"

"msg":"Model 'dep-EN-wiki-2014' not found."}

Unable to download models

I tried running ./downloader.sh esa-en-wiki-2018 to download the model and try Indra, but it fails to connect.
(I also tried other models.)

Are they still available?

Improve documentation

Document clearly how to start the service, its dependencies, client use cases, and limitations.

400 Client Error: Bad Request for url

Hi,
I installed Indra using IndraComposed and am having trouble getting the "nearest neighbors" running. While pair relatedness, OTM nearest neighbors work, I get an error for the nearest neighbors relatedness:

requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: http://localhost:8916/neighbors/relatedness

The code I am using is as follows:

import json
import requests

# Assumed header; the original report does not show how `headers` was defined.
headers = {'Content-Type': 'application/json'}
data = {'corpus': 'wiki-2018',
        'model': 'W2V',
        'language': 'EN',
        'topk': '3',
        'terms': ['house', 'engine']}
res = requests.post("http://localhost:8916/neighbors/vectors", data=json.dumps(data), headers=headers)
res.raise_for_status()
print('[Nearest neighbor vectors]', res.json())

Cannot build on Java 11 - missing javax.xml.bind.*

The javax.xml.bind.* packages were removed from the standard JDK distribution in Java 11.

https://www.jesperdj.com/2018/09/30/jaxb-on-java-9-10-11-and-beyond/

I've opened PR #71, which solves this issue. I hope it is helpful.

$ java --version
openjdk 11.0.10 2021-01-19
OpenJDK Runtime Environment (build 11.0.10+9-Ubuntu-0ubuntu1.20.04)
OpenJDK 64-Bit Server VM (build 11.0.10+9-Ubuntu-0ubuntu1.20.04, mixed mode, sharing)

$ mvn clean verify

[INFO] --- maven-failsafe-plugin:2.20:integration-test (default) @ indra-mongo ---
[WARNING] Error injecting: org.apache.maven.plugin.failsafe.IntegrationTestMojo
java.lang.NoClassDefFoundError: javax/xml/bind/JAXBException
    at java.lang.Class.getDeclaredConstructors0 (Native Method)
    at java.lang.Class.privateGetDeclaredConstructors (Class.java:3137)
    at java.lang.Class.getDeclaredConstructors (Class.java:2357)
    at com.google.inject.spi.InjectionPoint.forConstructorOf (InjectionPoint.java:245)
    at com.google.inject.internal.ConstructorBindingImpl.create (ConstructorBindingImpl.java:115)
    at com.google.inject.internal.InjectorImpl.createUninitializedBinding (InjectorImpl.java:706)
    at com.google.inject.internal.InjectorImpl.createJustInTimeBinding (InjectorImpl.java:930)
    at com.google.inject.internal.InjectorImpl.createJustInTimeBindingRecursive (InjectorImpl.java:852)
    at com.google.inject.internal.InjectorImpl.getJustInTimeBinding (InjectorImpl.java:291)
    at com.google.inject.internal.InjectorImpl.getBindingOrThrow (InjectorImpl.java:222)
    at com.google.inject.internal.InjectorImpl.getProviderOrThrow (InjectorImpl.java:1040)
    at com.google.inject.internal.InjectorImpl.getProvider (InjectorImpl.java:1071)
    at com.google.inject.internal.InjectorImpl.getProvider (InjectorImpl.java:1034)
    at com.google.inject.internal.InjectorImpl.getInstance (InjectorImpl.java:1086)
    at org.eclipse.sisu.space.AbstractDeferredClass.get (AbstractDeferredClass.java:48)
    at com.google.inject.internal.ProviderInternalFactory.provision (ProviderInternalFactory.java:85)
    at com.google.inject.internal.InternalFactoryToInitializableAdapter.provision (InternalFactoryToInitializableAdapter.java:57)
    at com.google.inject.internal.ProviderInternalFactory$1.call (ProviderInternalFactory.java:66)
    at com.google.inject.internal.ProvisionListenerStackCallback$Provision.provision (ProvisionListenerStackCallback.java:112)
    at com.google.inject.internal.ProvisionListenerStackCallback$Provision.provision (ProvisionListenerStackCallback.java:127)
    at com.google.inject.internal.ProvisionListenerStackCallback.provision (ProvisionListenerStackCallback.java:66)
    at com.google.inject.internal.ProviderInternalFactory.circularGet (ProviderInternalFactory.java:61)
    at com.google.inject.internal.InternalFactoryToInitializableAdapter.get (InternalFactoryToInitializableAdapter.java:47)
    at com.google.inject.internal.InjectorImpl$1.get (InjectorImpl.java:1050)
    at org.eclipse.sisu.inject.Guice4$1.get (Guice4.java:162)
    at org.eclipse.sisu.inject.LazyBeanEntry.getValue (LazyBeanEntry.java:81)
    at org.eclipse.sisu.plexus.LazyPlexusBean.getValue (LazyPlexusBean.java:51)
    at org.codehaus.plexus.DefaultPlexusContainer.lookup (DefaultPlexusContainer.java:263)
    at org.codehaus.plexus.DefaultPlexusContainer.lookup (DefaultPlexusContainer.java:255)
    at org.apache.maven.plugin.internal.DefaultMavenPluginManager.getConfiguredMojo (DefaultMavenPluginManager.java:520)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:124)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:957)
    at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:289)
    at org.apache.maven.cli.MavenCli.main (MavenCli.java:193)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:566)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
Caused by: java.lang.ClassNotFoundException: javax.xml.bind.JAXBException
    at org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy.loadClass (SelfFirstStrategy.java:50)
    at org.codehaus.plexus.classworlds.realm.ClassRealm.unsynchronizedLoadClass (ClassRealm.java:271)
    at org.codehaus.plexus.classworlds.realm.ClassRealm.loadClass (ClassRealm.java:247)
    at org.codehaus.plexus.classworlds.realm.ClassRealm.loadClass (ClassRealm.java:239)
    at java.lang.Class.getDeclaredConstructors0 (Native Method)
    at java.lang.Class.privateGetDeclaredConstructors (Class.java:3137)
    at java.lang.Class.getDeclaredConstructors (Class.java:2357)
    at com.google.inject.spi.InjectionPoint.forConstructorOf (InjectionPoint.java:245)
    at com.google.inject.internal.ConstructorBindingImpl.create (ConstructorBindingImpl.java:115)
    at com.google.inject.internal.InjectorImpl.createUninitializedBinding (InjectorImpl.java:706)
    at com.google.inject.internal.InjectorImpl.createJustInTimeBinding (InjectorImpl.java:930)
    at com.google.inject.internal.InjectorImpl.createJustInTimeBindingRecursive (InjectorImpl.java:852)
    at com.google.inject.internal.InjectorImpl.getJustInTimeBinding (InjectorImpl.java:291)
    at com.google.inject.internal.InjectorImpl.getBindingOrThrow (InjectorImpl.java:222)
    at com.google.inject.internal.InjectorImpl.getProviderOrThrow (InjectorImpl.java:1040)
    at com.google.inject.internal.InjectorImpl.getProvider (InjectorImpl.java:1071)
    at com.google.inject.internal.InjectorImpl.getProvider (InjectorImpl.java:1034)
    at com.google.inject.internal.InjectorImpl.getInstance (InjectorImpl.java:1086)
    at org.eclipse.sisu.space.AbstractDeferredClass.get (AbstractDeferredClass.java:48)
    at com.google.inject.internal.ProviderInternalFactory.provision (ProviderInternalFactory.java:85)
    at com.google.inject.internal.InternalFactoryToInitializableAdapter.provision (InternalFactoryToInitializableAdapter.java:57)
    at com.google.inject.internal.ProviderInternalFactory$1.call (ProviderInternalFactory.java:66)
    at com.google.inject.internal.ProvisionListenerStackCallback$Provision.provision (ProvisionListenerStackCallback.java:112)
    at com.google.inject.internal.ProvisionListenerStackCallback$Provision.provision (ProvisionListenerStackCallback.java:127)
    at com.google.inject.internal.ProvisionListenerStackCallback.provision (ProvisionListenerStackCallback.java:66)
    at com.google.inject.internal.ProviderInternalFactory.circularGet (ProviderInternalFactory.java:61)
    at com.google.inject.internal.InternalFactoryToInitializableAdapter.get (InternalFactoryToInitializableAdapter.java:47)
    at com.google.inject.internal.InjectorImpl$1.get (InjectorImpl.java:1050)
    at org.eclipse.sisu.inject.Guice4$1.get (Guice4.java:162)
    at org.eclipse.sisu.inject.LazyBeanEntry.getValue (LazyBeanEntry.java:81)
    at org.eclipse.sisu.plexus.LazyPlexusBean.getValue (LazyPlexusBean.java:51)
    at org.codehaus.plexus.DefaultPlexusContainer.lookup (DefaultPlexusContainer.java:263)
    at org.codehaus.plexus.DefaultPlexusContainer.lookup (DefaultPlexusContainer.java:255)
    at org.apache.maven.plugin.internal.DefaultMavenPluginManager.getConfiguredMojo (DefaultMavenPluginManager.java:520)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:124)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:957)
    at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:289)
    at org.apache.maven.cli.MavenCli.main (MavenCli.java:193)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:566)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Indra 2.2.1:
[INFO] 
[INFO] Indra .............................................. SUCCESS [  0.502 s]
[INFO] Indra Essentials Module ............................ SUCCESS [  3.392 s]
[INFO] Indra Core Module .................................. SUCCESS [  3.571 s]
[INFO] Indra Web Module ................................... SUCCESS [  0.154 s]
[INFO] Indra Mongo Module ................................. FAILURE [  0.265 s]
[INFO] Indra Web Service Module ........................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  7.994 s
[INFO] Finished at: 2021-03-07T11:18:09+01:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-failsafe-plugin:2.20:integration-test (default) on project indra-mongo: Execution default of goal org.apache.maven.plugins:maven-failsafe-plugin:2.20:integration-test failed: A required class was missing while executing org.apache.maven.plugins:maven-failsafe-plugin:2.20:integration-test: javax/xml/bind/JAXBException

RelatednessClient being initialized multiple times

If the client makes a request to the same model while changing only params such as scoreFunction or minWordLength, another RelatednessClient is built and cached. A cached client is reused only if the client repeats the exact same request. Needs investigation.
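One direction to investigate, sketched here in Python for brevity and not taken from Indra's code: key the cache on the model identity only (corpus, model, language), and treat per-request options as call-time arguments. This assumes those options can be applied per call rather than at construction time, which is exactly what needs checking. The class name `ClientCache` and the factory signature are hypothetical.

```python
from typing import Any, Callable, Dict, Tuple

ModelKey = Tuple[str, str, str]


class ClientCache:
    """Cache one client per (corpus, model, language) triple."""

    def __init__(self, factory: Callable[[str, str, str], Any]) -> None:
        self._factory = factory
        self._clients: Dict[ModelKey, Any] = {}

    def get(self, corpus: str, model: str, language: str,
            **request_params: Any) -> Any:
        # request_params (score function, minimum word length, ...) are
        # deliberately left out of the key, so changing them does not
        # trigger the construction of a new client.
        key = (corpus, model, language)
        if key not in self._clients:
            self._clients[key] = self._factory(corpus, model, language)
        return self._clients[key]
```

With this keying, two requests that differ only in minWordLength share one client, while a request for a different language still builds a new one.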

IndraComposed connection refused while downloading Model

Hi, I am a PhD student working on semantic relatedness. I am trying to use IndraComposed by following the instructions given here: https://github.com/Lambda-3/IndraComposed

However, when I try to download the model with /downloader.sh w2v-en-googlenews, I get a connection refused error. The full trace is below:

Downloading http://data.lambda3.org/indra/dumps/w2v-en-wiki-2018.annoy.tar.gz ..
--2020-02-05 18:19:09--  http://data.lambda3.org/indra/dumps/w2v-en-wiki-2018.annoy.tar.gz
Resolving data.lambda3.org (data.lambda3.org)... 88.99.85.116
Connecting to data.lambda3.org (data.lambda3.org)|88.99.85.116|:80... failed: Connection refused.
Extracting w2v-en-wiki-2018.annoy.tar.gz
tar: w2v-en-wiki-2018.annoy.tar.gz: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now

Can you kindly provide a solution?

How to use own model?

Hello,
I created my own word2vec model for English documents with the Indra indexer and got as output a corpus.metadata, a model.metadata, and a vectors.bin file.

When I place the vectors.bin and model.metadata files in indracomposed/data/annoy/w2v/en/, the model shows up in the info/resources request.

Where do I have to place the files to use them with Indra?
