
GraphAware Module for Expiring (Deleting) Nodes and Relationships


neo4j-expire's Introduction

GraphAware Neo4j Expire - RETIRED

GraphAware Neo4j Expire Has Been Retired

As of May 2021, this repository has been retired.

GraphAware Expire is a simple library that automatically deletes nodes and relationships from the database when they've reached their expiration date or time-to-live (TTL).

Community vs Enterprise

This open-source (GPL) version of the module is compatible with GraphAware Framework Community (GPL), which in turn is compatible with Neo4j Community Edition (GPL) only. It will not work with Neo4j Enterprise Edition, which is a proprietary and commercial software product of Neo4j, Inc.

GraphAware offers an Enterprise version of the GraphAware Framework to licensed users of Neo4j Enterprise Edition. Please get in touch to receive access.

Getting the Software

Server Mode

When using Neo4j in the standalone server mode, you will need the GraphAware Neo4j Framework and GraphAware Neo4j Expire .jar files (both of which you can download here) dropped into the plugins directory of your Neo4j installation. After changing a few lines of config (read on) and restarting Neo4j, the module will do its magic.

Embedded Mode / Java Development

Java developers who use Neo4j in embedded mode, and those developing Neo4j server plugins, unmanaged extensions, GraphAware Runtime Modules, or Spring MVC Controllers, can include the Expire module as a dependency in their Java project.

Releases

Releases are synced to the Maven Central repository. When using Maven for dependency management, include the following dependency in your pom.xml:

<dependencies>
    ...
    <dependency>
        <groupId>com.graphaware.neo4j</groupId>
        <artifactId>expire</artifactId>
        <version>3.5.11.54.4</version>
    </dependency>
    ...
</dependencies>

Snapshots

To use the latest development version, just clone this repository, run mvn clean install and change the version in the dependency above to 3.5.11.54.5-SNAPSHOT.

Note on Versioning Scheme

The version number has two parts. The first four numbers indicate compatibility with Neo4j GraphAware Framework. The last number is the version of the Expire library. For example, version 2.3.3.37.1 is version 1 of the Expire library compatible with GraphAware Neo4j Framework 2.3.3.37.
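
As a rough illustration of this scheme, the sketch below splits a version string into its two parts. The class name ExpireVersion and its fields are hypothetical and not part of the Expire library; stripping a -SNAPSHOT suffix is an assumption based on the snapshot version mentioned above.

```java
// Hypothetical helper, not part of GraphAware Expire: splits a version such as
// "2.3.3.37.1" into the framework version and the Expire library version.
final class ExpireVersion {
    final String frameworkVersion; // first four numbers, e.g. "2.3.3.37"
    final String moduleVersion;    // last number, e.g. "1"

    private ExpireVersion(String frameworkVersion, String moduleVersion) {
        this.frameworkVersion = frameworkVersion;
        this.moduleVersion = moduleVersion;
    }

    static ExpireVersion parse(String version) {
        // Assumption: a trailing "-SNAPSHOT" is not part of either number.
        String suffix = "-SNAPSHOT";
        String core = version.endsWith(suffix)
                ? version.substring(0, version.length() - suffix.length())
                : version;
        String[] parts = core.split("\\.");
        if (parts.length != 5) {
            throw new IllegalArgumentException("Expected five dot-separated numbers: " + version);
        }
        String framework = String.join(".", parts[0], parts[1], parts[2], parts[3]);
        return new ExpireVersion(framework, parts[4]);
    }

    public static void main(String[] args) {
        ExpireVersion v = ExpireVersion.parse("2.3.3.37.1");
        System.out.println("Framework " + v.frameworkVersion + ", Expire " + v.moduleVersion);
    }
}
```

Splitting on the last number rather than hard-coding positions would also work; the fixed five-part check is used here only because the text describes exactly four framework numbers plus one library number.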

Setup and Configuration

Server Mode

First, please make sure that the framework is configured by adding dbms.unmanaged_extension_classes=com.graphaware.server=/graphaware to conf/neo4j.conf, as described here.

And add this configuration to register the Expire module:

com.graphaware.runtime.enabled=true

#EM becomes the module ID (you will need to use this ID in other config below):
com.graphaware.module.EM.1=com.graphaware.neo4j.expire.ExpirationModuleBootstrapper

#If you want to delete nodes at a certain time, configure the node property (in this case "expire")
#that holds the expiration time in ms since epoch:
com.graphaware.module.EM.nodeExpirationProperty=expire

#Alternatively, if you want to delete nodes after some time has elapsed since they have been created,
#configure the node property (in this case "ttl") that holds the TTL in ms:
com.graphaware.module.EM.nodeTtlProperty=ttl

#If you want to delete relationships at a certain time, configure the relationship property (in this case "expire")
#that holds the expiration time in ms since epoch:
com.graphaware.module.EM.relationshipExpirationProperty=expire

#Alternatively, if you want to delete relationships after some time has elapsed since they have been created,
#configure the relationship property (in this case "ttl") that holds the TTL in ms:
com.graphaware.module.EM.relationshipTtlProperty=ttl

#If you want to delete expired nodes despite the fact that they still have relationships, set the strategy to "force".
#This setting defaults to "orphan", which will only delete expired nodes with no relationships:
com.graphaware.module.EM.nodeExpirationStrategy=force

#By default, all created/updated nodes and relationships are checked for the presence of expire/ttl property.
#As with most GraphAware Modules, the nodes and relationships this module applies to can be limited by the use of SpEL.
#The two lines below are examples only; leave them out to keep the default behaviour:
com.graphaware.module.EM.node=hasLabel('NodeThatExpiresAtSomePoint')
com.graphaware.module.EM.relationship=isType('TEMPORARY_RELATIONSHIP')

#Optionally, configure the maximum number of nodes/relationships deleted in one transaction. Defaults to 1000.
com.graphaware.module.EM.maxExpirations=5000

Embedded Mode / Java Development

To use the Expire module programmatically, register the module like this:

 GraphAwareRuntime runtime = GraphAwareRuntimeFactory.createRuntime(database); // database is an instance of GraphDatabaseService
 ExpirationModule module = new ExpirationModule("EXP", database,
         ExpirationConfiguration.defaultConfiguration()
                 .withNodeTtlProperty("ttl")
                 .withRelationshipTtlProperty("ttl"));
 runtime.registerModule(module);
 runtime.start();

Alternatively:

 GraphDatabaseService database = new GraphDatabaseFactory().newEmbeddedDatabaseBuilder(pathToDb)
    .loadPropertiesFromFile(this.getClass().getClassLoader().getResource("neo4j.conf").getPath())
    .newGraphDatabase();

 //make sure neo4j.conf contains the lines mentioned in the previous section

Using GraphAware Expire

Apart from the configuration described above, the GraphAware Expire module requires nothing else to function. It will delete nodes and relationships when they've reached their expiration date or TTL. In case both expiration date and TTL are set, the module takes into account whichever one is later. Please note a few more facts of interest:

  • By default, nodes are only deleted if they have no relationships (i.e. all relationships have expired or have been manually deleted), unless the node expiration strategy is set to "force".
  • When the ttl property gets updated, the time-to-live is counted from the moment the node has been updated.
  • At least one of the following must be configured, otherwise it does not make sense to use the module: nodeExpirationProperty, nodeTtlProperty, relationshipExpirationProperty, relationshipTtlProperty.
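
The "whichever one is later" rule can be sketched as follows. This is not the module's actual code: the class and method names are hypothetical, and the sketch assumes the TTL is counted from the time the entity was last written, as described in the list above.

```java
// Hypothetical sketch of the rule "if both an expiration date and a TTL are set,
// the later one wins". All names are illustrative, not taken from the module.
final class EffectiveExpiry {

    /**
     * @param expireAtMillis  value of the expiration property (ms since epoch), or null if unset
     * @param ttlMillis       value of the TTL property (ms), or null if unset
     * @param writtenAtMillis time the entity was created or last updated (ms since epoch)
     * @return the effective expiration time in ms since epoch, or null if neither is set
     */
    static Long compute(Long expireAtMillis, Long ttlMillis, long writtenAtMillis) {
        Long fromTtl = null;
        if (ttlMillis != null) {
            fromTtl = writtenAtMillis + ttlMillis.longValue();
        }
        if (expireAtMillis == null) {
            return fromTtl;
        }
        if (fromTtl == null) {
            return expireAtMillis;
        }
        return Math.max(expireAtMillis.longValue(), fromTtl.longValue());
    }

    public static void main(String[] args) {
        // Written at t=200 with expire=1000 and ttl=5000: the TTL-based time is later.
        System.out.println(compute(1000L, 5000L, 200L));
    }
}
```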

Advanced Config

Nodes and relationships, along with their expiration dates, are stored in Neo4j's legacy index, completely transparently to the user. A GraphAware Framework Timer-Driven Runtime Module checks for expired nodes and relationships every time it is asked to perform work, and deletes the ones that are found.

Please note that the default setting for the Timer-Driven Runtime Module is an "adaptive" strategy that slows down background processing when the database is busy. By default, the maximum delay between invocations is 5 seconds. If you want a shorter and/or more predictable time between a node/relationship reaching its expiration date and actually being deleted, you can change this strategy. For example, if you wanted to check for expired elements every 100ms consistently, you could add the following lines to neo4j.conf:

com.graphaware.runtime.timing.strategy=fixed
com.graphaware.runtime.timing.initialDelay=100
com.graphaware.runtime.timing.delay=100

License

Copyright (c) 2013-2020 GraphAware

GraphAware is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.

neo4j-expire's People

Contributors

bachmanm, ikwattro, jasperblues, michal-trnka, swamwithturtles


neo4j-expire's Issues

Entities deleted prior to expiration not removed from index

If a node/relationship is deleted prior to expiration it is left in the manual index. Since Neo4j recycles IDs, when a node/relationship is assigned an ID that's still in the TTL index after a delete, the new entity is deleted even if the new entity was not assigned a TTL.
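
The failure mode described in this issue can be reproduced in a small in-memory model. This is only a sketch under stated assumptions: the map stands in for the manual index, the set stands in for the store, and none of the names come from the module's code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of an expiry index keyed by entity ID.
final class StaleIndexSketch {
    private final Map<Long, Long> expiryIndex = new HashMap<>(); // entity ID -> expiration time (ms)
    private final Set<Long> liveIds = new HashSet<>();           // IDs of entities that currently exist

    void create(long id, Long ttlMillis, long now) {
        liveIds.add(id);
        if (ttlMillis != null) {
            expiryIndex.put(id, now + ttlMillis);
        }
    }

    void delete(long id, boolean alsoRemoveFromIndex) {
        liveIds.remove(id);
        if (alsoRemoveFromIndex) {
            expiryIndex.remove(id);
        }
    }

    // Deletes every entity whose index entry is due, then drops the entry.
    void expire(long now) {
        Iterator<Map.Entry<Long, Long>> it = expiryIndex.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Long> entry = it.next();
            if (entry.getValue() <= now) {
                liveIds.remove(entry.getKey());
                it.remove();
            }
        }
    }

    boolean exists(long id) {
        return liveIds.contains(id);
    }

    public static void main(String[] args) {
        StaleIndexSketch store = new StaleIndexSketch();
        store.create(7L, 1000L, 0L); // entity 7 with a TTL of 1 s
        store.delete(7L, false);     // deleted early, index entry left behind
        store.create(7L, null, 10L); // ID 7 recycled for an entity without a TTL
        store.expire(2000L);
        System.out.println(store.exists(7L)); // false: the new entity was removed by the stale entry
    }
}
```

Passing true as the second argument to delete models the fix implied by the report: removing the index entry together with the entity, so a recycled ID starts with a clean slate.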

Use Double.valueOf to parse longs

result = Long.parseLong(entity.getProperty(expirationProperty).toString())

Calling toString on the expiration property may return it in scientific notation, which Long.parseLong cannot parse. Instead use:

Double.valueOf((entity.getProperty(expirationProperty).toString()))

Changed in #9 which might get merged, otherwise logged here.
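
A minimal sketch of the parsing problem follows. It is an alternative to the Double.valueOf suggestion rather than the change made in #9: it reads numeric property values directly and falls back to BigDecimal for strings, which accepts both plain and scientific notation. The class and method names are hypothetical.

```java
import java.math.BigDecimal;

// Hypothetical helper: converts a property value to a long number of milliseconds.
final class ExpirationValueParser {

    static long toMillis(Object value) {
        if (value instanceof Number) {
            // Numeric properties need no string round-trip at all.
            return ((Number) value).longValue();
        }
        // BigDecimal parses "1500000000000" as well as "1.5E12".
        return new BigDecimal(value.toString().trim()).longValue();
    }

    public static void main(String[] args) {
        Object stored = 1.5e12; // a double property; its toString() is "1.5E12"
        try {
            Long.parseLong(stored.toString());
        } catch (NumberFormatException e) {
            System.out.println("Long.parseLong fails: " + e.getMessage());
        }
        System.out.println(toMillis(stored));
        System.out.println(toMillis("1.5E12"));
    }
}
```

Note that Double.valueOf(...) returns a Double, so the snippet in the issue would still need .longValue() before the result can be assigned to a long.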

Expire module not working correctly for multi-node Neo4j cluster

We are having issues with the expire module (3.0.6.43.3) running in a multi-node, neo4j cluster (3.0.6).
When setting the attribute 'ttl' on a node in the graph with the value of 10000 (10 seconds), the node is deleted after 10 seconds in a single node neo4j cluster (which is correct). The same test with multiple neo4j nodes results in the node being deleted instantly (not waiting for 10 seconds).

Test cypher:
create (t:Test) return t // node is returned
match (t:Test) return t // again node is returned
match (t:Test) set t.ttl = 10000 return t // node is returned with new ttl attribute
match (t:Test) return t // Even though this cypher was executed within 1 second of the previous one, the node is already gone

Setting the ttl value to 40000 did result in the node existing longer, but it was not 40 seconds (more like 5-6 seconds). What is strange, we were having this problem in test/stage/prod and now the problem only exists in stage. We're wondering if maybe the servers are out of sync (their clocks), maybe this could cause the problem (as we noticed a couple of the prod boxes had been bounced).

Switching a single node from High Availability (multi-node cluster) to a single standalone node results in the expire module working as expected. After switching back to the multi-node cluster, the test no longer works.

Our entries in neo4j.conf:
com.graphaware.runtime.enabled=true
com.graphaware.module.EM.1=com.graphaware.neo4j.expire.ExpirationModuleBootstrapper
com.graphaware.module.EM.relationshipTtlProperty=ttl
com.graphaware.module.EM.nodeTtlProperty=ttl
com.graphaware.runtime.stats.disabled=true
com.graphaware.server.stats.disabled=true
dbms.unmanaged_extension_classes=com.graphaware.server=/graphaware

Some bad defaults in the suggested configuration?

In the readme for the configuration properties to apply to neo4j.conf, we can see:

#By default, all created/updated nodes and relationships are checked for the presence of expire/ttl property.
#As with most GraphAware Modules, nodes and relationships this module applies to can be limited by the use of SPeL, e.g.:
com.graphaware.module.EM.node=hasLabel('NodeThatExpiresAtSomePoint')
com.graphaware.module.EM.relationship=isType('TEMPORARY_RELATIONSHIP')

Doesn't the presence of these uncommented lines mean that it will only be looking at :NodeThatExpiresAtSomePoint nodes and :TEMPORARY_RELATIONSHIP types instead of all updated nodes and relationships? Seems like these should be commented out to adhere to the note about default behavior.

Trigger to set ttl with phase=before not being invoked

Using:

neo4j 3.5.11
graphaware-expire-3.5.11.54.4.jar
graphaware-server-enterprise-all-3.5.11.54.jar

The user is creating two triggers:

CALL apoc.trigger.add('create-ttl-new-pet','UNWIND {createdNodes} as n MATCH(n:Pet) set n.ttl=60000',{phase:'before'})
CALL apoc.trigger.add('create-ttl-new-person','UNWIND {createdNodes} as n MATCH(n:Person) set n.ttl=60000',{phase:'after'})

The ttl set by the trigger with phase=before (label Pet) does not lead to expiration after the allotted time, but the one set with phase=after (label Person) does. Both nodes do display "ttl": 60000 correctly.

Config file:

com.graphaware.runtime.enabled=true
com.graphaware.module.EM.1=com.graphaware.neo4j.expire.ExpirationModuleBootstrapper
#com.graphaware.module.EM.nodeExpirationProperty=expire
com.graphaware.module.EM.nodeTtlProperty=ttl
#com.graphaware.module.EM.relationshipExpirationProperty=expire
com.graphaware.module.EM.relationshipTtlProperty=ttl
com.graphaware.module.EM.nodeExpirationStrategy=force
#As with most GraphAware Modules, nodes and relationships this module applies to can be limited by the use of SPeL, e.g.:
#com.graphaware.module.EM.node=hasLabel('NodeThatExpiresAtSomePoint')
#com.graphaware.module.EM.relationship=isType('TEMPORARY_RELATIONSHIP')
com.graphaware.module.EM.maxExpirations=1000

Tried setting

com.graphaware.module.EM.node=hasLabel('Pet') || hasLabel('Person')

but the issue persists nonetheless. Any insight into this issue is greatly appreciated.

google-analytics within Neo4j

We are seeing the following exception over and over in our Neo4j logs:
2017-02-20 13:34:57.564+0000 WARN [c.g.c.p.GoogleAnalyticsStatsCollector] Unable to collect stats Connect to www.google-analytics.com:80 [www.google-analytics.com/4.59.40.103, www.google-analytics.com/4.59.40.89, www.google-analytics.com/4.59.40.114, www.google-analytics.com/4.59.40.123, www.google-analytics.com/4.59.40.118, www.google-analytics.com/4.59.40.94, www.google-analytics.com/4.59.40.98, www.google-analytics.com/4.59.40.93, www.google-analytics.com/4.59.40.88, www.google-analytics.com/4.59.40.109, www.google-analytics.com/4.59.40.84, www.google-analytics.com/4.59.40.113, www.google-analytics.com/4.59.40.119, www.google-analytics.com/4.59.40.104, www.google-analytics.com/4.59.40.99, www.google-analytics.com/4.59.40.108] failed: Connection timed out (Connection timed out)
org.apache.http.conn.HttpHostConnectException: Connect to www.google-analytics.com:80 [www.google-analytics.com/4.59.40.103, www.google-analytics.com/4.59.40.89, www.google-analytics.com/4.59.40.114, www.google-analytics.com/4.59.40.123, www.google-analytics.com/4.59.40.118, www.google-analytics.com/4.59.40.94, www.google-analytics.com/4.59.40.98, www.google-analytics.com/4.59.40.93, www.google-analytics.com/4.59.40.88, www.google-analytics.com/4.59.40.109, www.google-analytics.com/4.59.40.84, www.google-analytics.com/4.59.40.113, www.google-analytics.com/4.59.40.119, www.google-analytics.com/4.59.40.104, www.google-analytics.com/4.59.40.99, www.google-analytics.com/4.59.40.108] failed: Connection timed out (Connection timed out)

Our network security team will never allow this information to flow through our firewalls. We thought we configured the GraphAware Expire module correctly to turn off this logging but apparently not:
Within the neo4j.conf file:
com.graphaware.runtime.stats.disabled=true
com.graphaware.server.stats.disabled=true

Can you tell us how to turn off google analytics?

Add/Remove labels

Would you consider expansion of scope to:

  • Rather than expire, add or remove configurable labels, such as ActiveProfile, InactiveProfile
  • Trigger from a configurable property, rather than expiration date. This property would take precedence over expirationDate and TTL.

Would you accept a PR for such a feature? No worries if the above is beyond the scope of this module. Wanted to check to avoid doubling up.

Wrong ms value on older article on expire plugin

Not directly related to code but an older article, not sure where to put this.

Looking at this article from 2016:
https://graphaware.com/neo4j/2016/03/15/expiring-data-in-neo4j.html

we see this:

MATCH (p:Person {name:'Michal'})
MATCH (o:Organisation {name:'ACM'})
MERGE (p)-[:MEMBER_OF {ttl:2592000}]->(o)

will cause the relationship to vanish in exactly 30 days (2,592,000 ms).

The ms value here, 2,592,000, looks to be the value in seconds (30*24*60*60); it should be multiplied by 1000 to provide the ms value of 2,592,000,000.
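
The arithmetic can be checked with java.time.Duration, which also avoids hand-written unit conversions when computing a ttl value. The class name TtlMillis is hypothetical and used only for this illustration.

```java
import java.time.Duration;

// Hypothetical helper for computing ttl values in milliseconds.
final class TtlMillis {

    static long ofDays(long days) {
        return Duration.ofDays(days).toMillis();
    }

    public static void main(String[] args) {
        // 30 days expressed in milliseconds.
        System.out.println(ofDays(30)); // 2592000000

        // How long a ttl of 2,592,000 ms actually lasts, in whole minutes.
        System.out.println(Duration.ofMillis(2_592_000L).toMinutes()); // 43
    }
}
```

In other words, a ttl of 2592000 would make the relationship eligible for deletion after roughly 43 minutes rather than 30 days.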

Protected access

Hi, I'm trying to use the neo4j-expire in embedded mode but the constructor for the ExpirationModule is protected instead of public.
