dice-group / limes

Link Discovery Framework for Metric Spaces.

Home Page: https://limes.demos.dice-research.org/

License: GNU Affero General Public License v3.0

Java 32.63% Shell 0.19% HTML 0.08% JavaScript 60.89% Dockerfile 0.02% Python 1.24% Dart 1.11% Lua 1.04% PHP 1.41% Vue 1.40%
machine-learning artificial-intelligence optimization linked-data semantic-web record-linkage scalability rdf

limes's Introduction

LIMES - Link Discovery Framework for Metric Spaces.

Build Status DockerHub GNU Affero General Public License v3.0 Java 1.8+

Running LIMES

To bundle LIMES as a single JAR file, run

mvn clean package shade:shade -Dmaven.test.skip=true

Then execute it using

java -jar limes-core/target/limes-core-${current-version}.jar

Using Docker

The LIMES server in Docker listens on port 8080. The image accepts the same arguments as limes-core.jar. For example, to run a configuration at ./my-configuration.xml:

docker run -it --rm \
  -v $(pwd):/data \
  dicegroup/limes:latest \
    /data/my-configuration.xml

To run LIMES server:

docker run -it --rm \
  -p 8080:8080 \
  dicegroup/limes:latest \
    -s

To build and run Docker with WordNet:

docker build -f wordnet.Dockerfile . -t limes-wordnet

docker run -it --rm \
  -v $(pwd):/data \
  limes-wordnet \
    /data/my-configuration.xml

Maven

<dependencies>
    <dependency>
        <groupId>org.aksw.limes</groupId>
        <artifactId>limes-core</artifactId>
        <version>1.7.5</version>
    </dependency>
</dependencies>
<repositories>
    <repository>
        <id>maven.aksw.internal</id>
        <name>University Leipzig, AKSW Maven2 Internal Repository</name>
        <url>http://maven.aksw.org/repository/internal/</url>
    </repository>

    <repository>
        <id>maven.aksw.snapshots</id>
        <name>University Leipzig, AKSW Maven2 Snapshot Repository</name>
        <url>http://maven.aksw.org/repository/snapshots/</url>
    </repository>
</repositories>

How to cite

@article{KI_LIMES_2021,
  title={{LIMES - A Framework for Link Discovery on the Semantic Web}},
  author={Axel-Cyrille {Ngonga Ngomo} and Mohamed Ahmed Sherif and Kleanthi Georgala and Mofeed Hassan and Kevin Dreßler and Klaus Lyko and Daniel Obraczka and Tommaso Soru},
  journal={KI-K{\"u}nstliche Intelligenz, German Journal of Artificial Intelligence - Organ des Fachbereichs "Künstliche Intelligenz" der Gesellschaft für Informatik e.V.},
  year={2021},
  url = {https://papers.dice-research.org/2021/KI_LIMES/public.pdf},
  publisher={Springer}
}

More details

limes's People

Contributors

aalexandrasilva, abdullahfathi, aklakan, becker-al, dobraczka, drmalex07, earthquakesan, ironjan, kleanthi, klyko, konradhoeffner, kvndrsslr, lukasbluebaum, mofeed-hassan, mofeedhassan, mommi84, msherif, mueslie, mwauer, psklana, suganya31, swantescholz, vemonet, yamalight

limes's Issues

NoClassDefFoundError when using jar on Windows 7

After trying to load classes in the process of making a new Config the following exception is thrown:
java.lang.NoClassDefFoundError: Could not initialize class org.apache.jena.query.ARQ
at org.aksw.limes.core.gui.util.sparql.SPARQLHelper.queryExecution(SPARQLHelper.java:322)
at org.aksw.limes.core.gui.util.sparql.SPARQLHelper.querySelect(SPARQLHelper.java:377)
at org.aksw.limes.core.gui.util.sparql.SPARQLHelper.rootClassesUncached(SPARQLHelper.java:151)
at org.aksw.limes.core.gui.model.GetClassesTask.call(GetClassesTask.java:72)
at org.aksw.limes.core.gui.model.GetClassesTask.call(GetClassesTask.java:24)
at javafx.concurrent.Task$TaskCallable.call(Task.java:1423)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

Measures and getSimilarity function

Dear all,
The Helios and Dynamic planners can use one child LS as a filter that is applied to the mapping produced by the other child LS. For an LS to be usable as a filter, each similarity measure implemented in the measures package SHOULD implement the function public double getSimilarity(Instance a, Instance b, String property1, String property2);
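
A minimal sketch of what that contract could look like. Only the getSimilarity signature is taken from the issue. The Instance stand-in, the Measure interface name, ExactMatchMeasure and MeasureFilter are hypothetical illustrations and not the LIMES classes.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a LIMES instance: a URI plus property values.
final class Instance {
    private final String uri;
    private final Map<String, Set<String>> properties = new HashMap<>();

    Instance(String uri) { this.uri = uri; }

    String getUri() { return uri; }

    void addProperty(String name, String value) {
        properties.computeIfAbsent(name, k -> new TreeSet<>()).add(value);
    }

    Set<String> getProperty(String name) {
        return properties.getOrDefault(name, Collections.<String>emptySet());
    }
}

// Hypothetical interface name; the method signature is the one quoted in the issue.
interface Measure {
    double getSimilarity(Instance a, Instance b, String property1, String property2);
}

// Toy measure: 1.0 if the two properties share at least one value, else 0.0.
final class ExactMatchMeasure implements Measure {
    @Override
    public double getSimilarity(Instance a, Instance b, String property1, String property2) {
        for (String value : a.getProperty(property1)) {
            if (b.getProperty(property2).contains(value)) {
                return 1.0;
            }
        }
        return 0.0;
    }
}

// Using a measure as a filter: keep a candidate pair only if it reaches the threshold.
final class MeasureFilter {
    static boolean passes(Measure m, Instance a, Instance b,
                          String p1, String p2, double threshold) {
        return m.getSimilarity(a, b, p1, p2) >= threshold;
    }
}
```

With a per-pair getSimilarity available, a planner can re-check each pair of an existing mapping without running the full mapper again.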

SLF4J issue

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/tom/.m2/repository/org/slf4j/slf4j-log4j12/1.7.6/slf4j-log4j12-1.7.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/tom/.m2/repository/org/apache/jena/jena-jdbc-driver-bundle/1.1.2/jena-jdbc-driver-bundle-1.1.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

Error while trying to save config

When using persons11.nt and persons21.nt as endpoints, the first problem is that an ID has to be specified, so the ID should either be mandatory when creating a new config or get a default value.
If the ID is set, the whole GUI closes when trying to save, and the console shows ERROR [RDFConfigurationWriter] Undefined prefix www

GUI: loading wrong filetype as config throws nullpointer exception

As the title says: loading a file of the wrong type as a config throws a NullPointerException after the "unknown filetype" dialogue box.

Steps to reproduce:

  1. Start GUI
  2. File -> Load config
  3. Select random file (e.g. *.png, *.md)
  4. See the "unknown filetype" dialogue box
  5. See the NullPointer dialogue box

Needs a fix.
It would also be better to restrict the file chooser so that only files with the correct extensions can be selected.
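
A minimal sketch of such an extension check, run before the file is handed to a configuration reader. The allowed set (xml and ttl) is an assumption based on the file types mentioned in other issues here, and ConfigFileCheck is a hypothetical helper, not part of LIMES.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class ConfigFileCheck {
    // Assumed set of configuration extensions; adjust to what the readers really support.
    private static final Set<String> ALLOWED = new HashSet<>(Arrays.asList("xml", "ttl"));

    // Returns true only if the file name ends in one of the allowed extensions.
    static boolean hasSupportedExtension(String fileName) {
        if (fileName == null) {
            return false;
        }
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension at all, or a trailing dot
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return ALLOWED.contains(ext);
    }
}
```

In a JavaFX GUI the same list could also be passed to a FileChooser.ExtensionFilter, so that unsupported files are not offered in the dialogue in the first place.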

CSVMappingReader is not working correctly

Using DataSetChooser, I load DBPLINKEDMDB. There are two versions of the dataset (one with tab-separated .csv files, the other with comma-separated .csv files), and the behaviour differs between them.

With the tab-separated files (reference.csv, source.csv, target.csv):
The instance URIs in the caches are wrapped in angle brackets.
The URIs in the referenceMapping don't have these angle brackets, so I am unable to get Instances out of the caches by URI. The problem seems to be line 96 in CSVMappingReader:

m.add(split[0].substring(1, split[0].length() - 1), split[1].substring(1, split[1].length() - 1), 1.0);

This works fine for comma-separated files, but not for tab-separated ones. Given the typical form of the .csv files, I propose changing it to:

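if (split[0].startsWith("\"")) {
    // Quoted fields: drop the surrounding quotes.
    m.add(split[0].substring(1, split[0].length() - 1), split[1].substring(1, split[1].length() - 1), 1.0);
} else {
    // Unquoted fields (tab-separated files): use the values as they are.
    m.add(split[0], split[1], 1.0);
}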

This does not fix another problem in comma-separated files, though. For example, this line in reference2.csv

"<http://dbpedia.org/resource/Elton_John:_Me,_Myself_&_I>","<http://data.linkedmdb.org/resource/film/78002>" 

is split on the wrong comma. In the AMapping that CSVMappingReader produces from the file, the entry therefore looks like this:

[<http://dbpedia.org/resource/Elton_John:_M -> (Myself_&_I>|1.0)]

Also the mapping contains the line

[iddbpedia -> (idlinkedmdb|1.0)]

which probably should not be there.
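
One way to avoid splitting on a comma inside a quoted URI is to track whether the current character is inside double quotes. The following is a minimal sketch under that assumption. QuotedCsvLine is a hypothetical helper, not the CSVMappingReader code, and it does not handle escaped quotes ("") or skip header lines.

```java
import java.util.ArrayList;
import java.util.List;

final class QuotedCsvLine {
    // Splits one line on the delimiter, ignoring delimiters inside double quotes,
    // and drops the surrounding quotes of each field.
    static List<String> split(String line, char delimiter) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes; // toggle the state, do not keep the quote itself
            } else if (c == delimiter && !inQuotes) {
                fields.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString().trim()); // last field has no trailing delimiter
        return fields;
    }
}
```

Called with ',' on the reference2.csv line above, this yields two fields, each still wrapped in angle brackets. Called with '\t' on an unquoted tab-separated line, it returns the values unchanged.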

log4j issue

log4j:WARN No appenders could be found for logger (org.apache.jena.riot.system.stream.JenaIOEnvironment).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

limes.dtd is not found when using build jar

I am testing the built JAR. At first I thought this was a GUI issue, but it also happens on the command line. When I run java -jar git/LIMES-dev2/limes-core/target/limes-core-1.0.0-SNAPSHOT-jar-with-dependencies.jar -f xml path/to/example.xml I get the following error stack trace: http://pastebin.com/GbC3Rfse

My guess is that, because limes.dtd is packaged inside the JAR, it does not exist as a file on disk and cannot be opened as one.
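
If that guess is right, the usual remedy is to read the DTD as a classpath resource rather than through a file path. A minimal sketch, assuming the resource sits on the classpath; ClasspathResource is a hypothetical helper, and the resource path used in the note after the code is an assumption about where the DTD is packaged.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ClasspathResource {
    // Reads a classpath resource fully; works the same from a directory or from inside a JAR.
    static byte[] read(Class<?> anchor, String resourcePath) {
        try (InputStream in = anchor.getResourceAsStream(resourcePath)) {
            if (in == null) {
                throw new IllegalArgumentException("Resource not found on classpath: " + resourcePath);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A reader class could then call something like ClasspathResource.read(SomeReader.class, "/limes.dtd"), where both the class and the path are placeholders.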

Logging path should be configurable

I want to be able to specify where LIMES saves the .log file. At the moment, it writes the file to the same folder as the .xml configuration (which is fine as a default).

Add CHANGELOG file

We need a CHANGELOG.md file that describes the changes in new builds and releases.
It could be generated with a tool such as git-extras, but that requires well-written commit messages.

Mappers: Validate threshold

For string mappers, getMapping throws an InvalidThresholdException and the program terminates if the threshold is 0 or negative.

Please add similar threshold checks to the other Mappers, adapted to their functionality.
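
A minimal sketch of such a guard that other mappers could call at the start of getMapping. The exception class below is a hypothetical stand-in for the one named in the issue, ThresholdCheck is a hypothetical helper, and rejecting NaN is an extra assumption that the issue does not mention.

```java
// Hypothetical stand-in for the exception named in the issue.
final class InvalidThresholdException extends RuntimeException {
    InvalidThresholdException(double threshold) {
        super("Invalid threshold: " + threshold + " (must be greater than 0)");
    }
}

final class ThresholdCheck {
    // Rejects zero, negative and NaN thresholds before any mapping work starts.
    static double requirePositive(double threshold) {
        if (Double.isNaN(threshold) || threshold <= 0.0) {
            throw new InvalidThresholdException(threshold);
        }
        return threshold;
    }
}
```

Mappers with a different valid range (for example distance-based ones) would need their own bounds; this sketch covers only the case described for string mappers.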

GUI: entering non-numerical value in page size throws

Steps to reproduce:

  1. File -> New
  2. Enter text in "Page size" box
  3. Hit "Next"

No error is shown in the GUI, but the console displays the following:

Exception in thread "JavaFX Application Thread" java.lang.NumberFormatException: For input string: "-asd"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:580)
        at java.lang.Integer.parseInt(Integer.java:615)

Input validation is required, and errors should be displayed to the user.
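
A minimal sketch of the validation step, assuming the page size must simply be an integer. PageSizeInput is a hypothetical helper; which integer values are acceptable is left to the caller.

```java
import java.util.OptionalInt;

final class PageSizeInput {
    // Parses the text of the "Page size" box.
    // An empty result means the input is not an integer and an error should be shown.
    static OptionalInt parse(String text) {
        if (text == null) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(text.trim()));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }
}
```

The GUI handler for "Next" could check isPresent() on the result and show a dialogue instead of letting the NumberFormatException reach the JavaFX Application Thread.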

ML algorithms and Execution Engine

Dear all,
If you are responsible for an ML algorithm that requires an execution engine type and a planner type, please pass the config instance as a parameter to your constructor. Do not hardcode engine or planner types. Check the Controller class for how to invoke the execution engine and which parameters the (new) execute function needs. Ideally, the execution engine and your ML class should interact ONLY through this execute function, not through any of the other execute functions.

RDFConfigurationReader file path bug

When handed a relative path, RDFConfigurationReader throws an Exception. Given an absolute path like /resources/lgd-lgd-ml.ttl, it treats it as a relative path (i.e., $CWD/resources/lgd-lgd-ml.ttl).
Probably just a missing +"/"+.

No check in Parser for complex LS with wrong Operator

I am writing a test for the CanonicalPlanner. I wanted to check what happens if the LS has an operator that is not in the list of known operators. Because of the Parser class, the LinkSpecification readSpec function seems to assume that such an input LS is atomic.

EagleTest.testSupervisedBatch occasionally fails

See Travis logs for more info, here's a partial copy of output:

testSupervisedBatch(org.aksw.limes.core.ml.EagleTest)  Time elapsed: 1.122 sec  <<< FAILURE!
java.lang.AssertionError: null
    at org.aksw.limes.core.ml.EagleTest.testSupervisedBatch(EagleTest.java:104)
Running org.aksw.limes.core.ml.DecisionTreeLearningTest
11:00:07.214 [main] INFO  org.aksw.limes.core.ml.algorithm.decisionTreeLearning.DecisionTreeLearning - Getting initial training mapping...
11:00:07.724 [main] INFO  org.aksw.limes.core.ml.algorithm.decisionTreeLearning.DecisionTreeLearning - false & true = false
11:00:07.724 [main] INFO  org.aksw.limes.core.ml.algorithm.decisionTreeLearning.DecisionTreeLearning - Training Data contains only positive/negative examples. Using Wombat
11:00:08.234 [main] INFO  org.aksw.limes.core.ml.algorithm.decisionTreeLearning.DecisionTreeLearning - true & true = true
11:00:08.265 [main] INFO  org.aksw.limes.core.ml.algorithm.decisionTreeLearning.DecisionTreeLearning - Building classifier....
11:00:08.404 [main] INFO  org.aksw.limes.core.ml.algorithm.decisionTreeLearning.DecisionTreeLearning - Parsing tree to LinkSpecification...
11:00:08.407 [main] INFO  org.aksw.limes.core.ml.algorithm.decisionTreeLearning.DecisionTreeLearning - Learned LinkSpecification: (jaccard(x.surname,y.surname)|1.00, 1.0, null, null)
digraph J48Tree {
N0 [label="positive (2.0/1.0)" shape=box style=filled ]
}

Measure Factories

@MSherif : Please remove any Type_of_Metric_Factory.java class from internal measure packages.
The execution engine recognizes only one measure factory, the MeasureFactory inside the measures package.
- We avoid duplicated code.
- We avoid unnecessary maintenance.

{XML,RDF}ConfigurationReader only works correctly with paths relative to working directory

In their constructors, both classes call

super(System.getProperty("user.dir") + "/" + fileNameOrUri);

Given an absolute path, the working directory is therefore always prepended as a prefix. A possible solution could be:

super(fileNameOrUri);

But I don't know whether this would break something else.
Another option would be to add a separate read method that takes absolute paths.
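
A third option is to prepend the working directory only when the given path is relative. A minimal sketch using java.nio.file; ConfigPathResolver is a hypothetical helper, and it covers file paths only, not the URI case that the parameter name fileNameOrUri suggests.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class ConfigPathResolver {
    // Absolute paths are returned unchanged; relative ones are resolved against baseDir.
    static Path resolve(String baseDir, String fileName) {
        Path path = Paths.get(fileName);
        if (path.isAbsolute()) {
            return path.normalize();
        }
        return Paths.get(baseDir).resolve(path).normalize();
    }
}
```

The constructors could then pass ConfigPathResolver.resolve(System.getProperty("user.dir"), fileNameOrUri).toString() to super, which keeps the current behaviour for relative paths.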

Measures that have no implementation for getRuntimeApproximation

Helios creates filterInstructions based on the runtime approximations of the measures included in the filter. I found a set of measures that do not implement getRuntimeApproximation. Please check the measure.measures.MeasureFactory.getMeasure function. If you are responsible for one or more of the measures marked with the comment "//NO getRuntimeApproximation", it would be nice if you could fix it :)

Matching equal language tags

I propose a feature that matches only literals with equal language tags. This increases precision and reduces runtime from |N|*|M| to |N \cap M|.

Example:

dbpedia Germany       linkedgeodata Germany
"Germany"@en          "Germany"@en
"Deutschland"@de      "Deutschland"@en
"Alemagne"@fr         "?"@es

As it is now, there are 9 comparisons; with the new method there would be only 2. The open question is how to treat literals without a language tag: compare them only with other untagged literals, or with everything else as well. This should be an option.

Idea:

<property>rdfs:label RENAME label TAGMATCH notagtoall</property>
<property>rdfs:label RENAME label TAGMATCH notagtonotag</property>
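
A minimal sketch of the proposed matching: the target literals are grouped by language tag, and each source literal is paired only with targets in the same group. TaggedLiteral and TagMatch are hypothetical names. Untagged literals are compared only with each other here, which corresponds to the notagtonotag variant, and tags are compared as exact strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical literal type: a lexical value plus a language tag ("" when absent).
final class TaggedLiteral {
    final String value;
    final String lang;

    TaggedLiteral(String value, String lang) {
        this.value = value;
        this.lang = lang == null ? "" : lang;
    }
}

final class TagMatch {
    // Returns only the source/target pairs whose language tags are equal.
    // Untagged literals share the "" bucket, so they are compared only with each other.
    static List<TaggedLiteral[]> candidatePairs(List<TaggedLiteral> source,
                                                List<TaggedLiteral> target) {
        Map<String, List<TaggedLiteral>> targetByLang = new HashMap<>();
        for (TaggedLiteral t : target) {
            targetByLang.computeIfAbsent(t.lang, k -> new ArrayList<>()).add(t);
        }
        List<TaggedLiteral[]> pairs = new ArrayList<>();
        for (TaggedLiteral s : source) {
            List<TaggedLiteral> sameLang =
                    targetByLang.getOrDefault(s.lang, Collections.<TaggedLiteral>emptyList());
            for (TaggedLiteral t : sameLang) {
                pairs.add(new TaggedLiteral[] {s, t});
            }
        }
        return pairs;
    }
}
```

For the Germany example above this produces the two @en pairs instead of all nine combinations. A notagtoall variant would additionally pair the "" bucket with every literal on the other side.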

MetricFormatException when trying to add source/target properties of example config

When trying to add the source/target properties of the example config the following error is thrown:
Exception in thread "JavaFX Application Thread" org.aksw.limes.core.gui.model.metric.MetricFormatException: id "x.geom:geometry/geos:asWKT" does not confirm to the regex \w+\.\w+:?\w+
at org.aksw.limes.core.gui.model.metric.Property.<init>(Property.java:30)
at org.aksw.limes.core.gui.view.ToolBox.generateProperty(ToolBox.java:157)
.....
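
The rejection can be reproduced with java.util.regex alone: the quoted pattern allows at most one ":" and no "/", so a property path does not match. The sketch below contrasts it with a relaxed pattern. WITH_PATHS is a hypothetical suggestion, not a pattern taken from LIMES, and PropertyIdPatterns is a hypothetical holder class.

```java
import java.util.regex.Pattern;

final class PropertyIdPatterns {
    // Pattern quoted in the exception message.
    static final Pattern CURRENT = Pattern.compile("\\w+\\.\\w+:?\\w+");

    // Hypothetical relaxed pattern: variable, dot, then one or more
    // optionally prefixed names separated by "/".
    static final Pattern WITH_PATHS =
            Pattern.compile("\\w+\\.\\w+(:\\w+)?(/\\w+(:\\w+)?)*");
}
```

Both patterns accept simple ids such as x.rdfs:label, while only the relaxed one accepts x.geom:geometry/geos:asWKT.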

NullPointerException in Parser Constructor

I am writing a test for HeliosPlanner and creating a plan whose filtering instruction has a null measure expression (i.e., a filter with a threshold but no condition). Helios tries to compute the cost of this filtering instruction by finding the metric(s) in the measure expression, and to do so it creates a Parser instance. I have fixed Helios to handle LSs whose filtering instructions have null measure expressions, but the same issue could also be raised elsewhere, e.g. when someone gives an empty LS as input.
