gchq / gaffer

A large-scale entity and relation database supporting aggregation of properties

License: Apache License 2.0

Java 77.87% CSS 0.01% JavaScript 21.98% HTML 0.03% Shell 0.12%
accumulo graph graph-database hadoop big-data aggregation hbase parquet spark

gaffer's Introduction

Gaffer

Gaffer is a graph database framework. It allows the storage of very large graphs containing rich properties on the nodes and edges. Several storage options are available, including Accumulo and an in-memory Java Map Store.

It is designed to be as flexible, scalable and extensible as possible, allowing for rapid prototyping and transition to production systems.

Gaffer offers:

  • Rapid query across very large numbers of nodes and edges
  • Continual ingest of data at very high data rates, and batch bulk ingest of data via MapReduce or Spark
  • Storage of arbitrary Java objects on the nodes and edges
  • Automatic, user-configurable in-database aggregation of rich statistical properties (e.g. counts, histograms, sketches) on the nodes and edges
  • Versatile query-time summarisation, filtering and transformation of data
  • Fine grained data access controls
  • Hooks to apply policy and compliance rules to queries
  • Automated, rule-based removal of data (typically used to age-off old data)
  • Retrieval of graph data into Apache Spark for fast and flexible analysis
  • A fully-featured REST API

To get going with Gaffer, visit our getting started pages (1.x, 2.x). We also have a demo available to try, based around a small UK road use dataset. See the example/road-traffic README to try it out.

Gaffer is under active development. Version 1.0 of Gaffer was released in October 2017, and version 2.0 was released in May 2023.

Contributing

We welcome contributions to the project.

Quickstart

To quickly and easily get access to an environment with everything installed and set up correctly, you can use GitHub Codespaces, or alternatively GitLab GitPod. These provide remote coding environments using VS Code with the required plugins, Java version and Maven preinstalled.

Our Javadoc can be found here. Gaffer's documentation is kept in the gaffer-doc repository and published on GitHub Pages (gchq.github.io).

Local Requirements

For building Gaffer locally you need Java 8 or 11 and Maven installed locally in a *nix environment. MS Windows will work for most purposes, but is not recommended because tests utilising Hadoop fail due to limited Hadoop support on Windows. Gaffer will compile with newer versions of Java, but some tests will fail because of a lack of support for newer Java in certain external dependencies.

To build Gaffer, run mvn clean install -Pquick in the top-level directory. This will build all of Gaffer's core libraries and some examples of how to load and query data.

Contribution Process

Detailed information on our ways of working can be found in our developer docs. In brief:

Inclusion in other projects

Gaffer is hosted on Maven Central and can easily be incorporated into your own Maven projects.

To use Gaffer from the Java API the only required dependencies are the Gaffer graph module and a store module for the specific database technology used to store the data, e.g. for the Accumulo store:

<dependency>
    <groupId>uk.gov.gchq.gaffer</groupId>
    <artifactId>graph</artifactId>
    <version>${gaffer.version}</version>
</dependency>
<dependency>
    <groupId>uk.gov.gchq.gaffer</groupId>
    <artifactId>accumulo-store</artifactId>
    <version>${gaffer.version}</version>
</dependency>

This will include all other mandatory dependencies. Other (optional) components can be added to your project as required.

Related repositories

The gafferpy repository contains a Python shell that can execute operations.

The gaffer-docker repository contains the code needed to run Gaffer using Docker or Kubernetes.

The koryphe repository contains an extensible functions library for filtering, aggregating and transforming data based on the Java Function API. It is a dependency of Gaffer.

License

Gaffer is licensed under the Apache 2 license and is covered by Crown Copyright.

Copyright 2016-2023 Crown Copyright

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

gaffer's People

Contributors

a09631, ac74475, ak8532110, c015dariu, cn337131, ctas582, d21211122, d47853, gaffer01, gchq-11, gchqdev03, gchqdev404, gchqdeveloper1, gchqdeveloper314, gchqdeveloper404, github-actions[bot], james010101101, javadev001001, m29827, m316257, m55624, m607123, nikgil, p013570, p29876, p3430233, r32575, t616178, t92549, tb06904

gaffer's Issues

Add validation to Views

Before using a View in an operation, it would be useful to validate it, similar to how we validate data schemas, to avoid class cast exceptions.

Error while running SimpleQuery.java

I am getting the following error while running SimpleQuery.java:

ubuntu@ip-172-31-24-17:~/Downloads/installs/accumulo-1.6.4$ bin/accumulo gaffer.example.SimpleQuery

Thread "gaffer.example.SimpleQuery" died Failed to create input stream from path: file:/home/ubuntu/Downloads/installs/accumulo-1.6.4/lib/ext/example-0.3.1-SNAPSHOT.jar!/dataSchema.json
java.lang.IllegalArgumentException: Failed to create input stream from path: file:/home/ubuntu/Downloads/installs/accumulo-1.6.4/lib/ext/example-0.3.1-SNAPSHOT.jar!/dataSchema.json
at gaffer.graph.Graph.createInputStream(Graph.java:428)
at gaffer.graph.Graph.<init>(Graph.java:138)
at gaffer.graph.Graph.<init>(Graph.java:120)
at gaffer.graph.Graph.<init>(Graph.java:83)
at gaffer.example.SimpleQuery.run(SimpleQuery.java:93)
at gaffer.example.SimpleQuery.main(SimpleQuery.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.accumulo.start.Main$1.run(Main.java:141)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.file.NoSuchFileException: file:/home/ubuntu/Downloads/installs/accumulo-1.6.4/lib/ext/example-0.3.1-SNAPSHOT.jar!/dataSchema.json
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
at java.nio.file.Files.newByteChannel(Files.java:317)
at java.nio.file.Files.newByteChannel(Files.java:363)
at java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:380)
at java.nio.file.Files.newInputStream(Files.java:108)
at gaffer.graph.Graph.createInputStream(Graph.java:426)
... 11 more

The following are the contents of the .jar file:

ubuntu@ip-172-31-24-17:~/Downloads/installs/accumulo-1.6.4/lib/ext$ jar -tvf example-0.3.1-SNAPSHOT.jar
0 Tue Feb 23 11:05:32 UTC 2016 META-INF/
131 Tue Feb 23 11:05:30 UTC 2016 META-INF/MANIFEST.MF
0 Tue Feb 23 10:51:06 UTC 2016 gaffer/
0 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/
0 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/
0 Tue Feb 23 10:44:58 UTC 2016 gaffer/operation/impl/add/
0 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/
0 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/generate/
0 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/data/
0 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/data/generator/
0 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/
0 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/function/
0 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/function/transform/
0 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/data/
0 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/data/schema/
0 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/generator/
0 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/serialiser/
0 Tue Feb 23 10:51:06 UTC 2016 gaffer/store/
0 Tue Feb 23 10:51:06 UTC 2016 gaffer/store/operation/
0 Tue Feb 23 10:51:06 UTC 2016 gaffer/store/operation/handler/
0 Tue Feb 23 10:51:06 UTC 2016 gaffer/store/schema/
0 Tue Feb 23 10:48:56 UTC 2016 gaffer/graph/
0 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/
0 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/
0 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/function/
0 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/generator/
0 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/
0 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/view/
0 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/schema/
0 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/schema/exception/
3269 Fri Feb 19 07:31:54 UTC 2016 addData.json
1928 Fri Feb 19 07:31:54 UTC 2016 complexQuery.json
1014 Tue Feb 23 10:44:58 UTC 2016 gaffer/operation/impl/add/AddElements.class
2566 Tue Feb 23 10:44:58 UTC 2016 gaffer/operation/impl/add/AddElements$Builder.class
2180 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/Validate$Builder.class
1757 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetElements.class
4990 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetElementsSeed$Builder.class
2684 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetEntitiesBySeed.class
3193 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetRelatedEntities$Builder.class
5044 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetRelatedElements$Builder.class
2740 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetAdjacentEntitySeeds.class
2833 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetRelatedElements.class
2816 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetElementsSeed.class
2678 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetRelatedEdges.class
1079 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetElements$Builder.class
3176 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetEntitiesBySeed$Builder.class
997 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetEdges$Builder.class
1023 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetEntities$Builder.class
4421 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetAdjacentEntitySeeds$Builder.class
2654 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetEdgesBySeed.class
3148 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetRelatedEdges$Builder.class
2986 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetEntities.class
2696 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetRelatedEntities.class
2905 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetEdges.class
3675 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/get/GetEdgesBySeed$Builder.class
3121 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/generate/GenerateObjects$Builder.class
3441 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/generate/GenerateElements.class
3005 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/generate/GenerateElements$Builder.class
3108 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/generate/GenerateObjects.class
2471 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/impl/Validate.class
3219 Tue Feb 23 10:44:58 UTC 2016 gaffer/operation/OperationChain$Builder$TypedBuilder.class
272 Tue Feb 23 10:44:58 UTC 2016 gaffer/operation/VoidInput.class
4824 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/AbstractOperation.class
1490 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/GetOperation$IncludeEdgeType.class
1313 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/GetOperation$IncludeIncomingOutgoingType.class
1490 Tue Feb 23 10:44:58 UTC 2016 gaffer/operation/OperationChain$Builder.class
614 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/OperationException.class
1612 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/GetOperation.class
2207 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/AbstractOperation$Builder.class
7355 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/AbstractGetOperation.class
1948 Tue Feb 23 10:44:58 UTC 2016 gaffer/operation/AbstractValidatable$Builder.class
4767 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/AbstractValidatable.class
656 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/Validatable.class
1185 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/GetOperation$SeedMatchingType.class
1449 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/data/ElementSeed.class
3075 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/data/EdgeSeed.class
2368 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/data/EntitySeed.class
3285 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/data/generator/EntitySeedExtractor.class
2460 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/data/generator/EdgeSeedExtractor.class
1723 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/data/ElementSeed$Matches.class
2888 Tue Feb 23 10:44:58 UTC 2016 gaffer/operation/OperationChain$Builder$TypelessBuilder.class
272 Tue Feb 23 10:44:58 UTC 2016 gaffer/operation/VoidOutput.class
2667 Tue Feb 23 10:44:58 UTC 2016 gaffer/operation/OperationChain.class
227 Tue Feb 23 10:44:58 UTC 2016 gaffer/operation/OperationChain$1.class
4252 Tue Feb 23 10:45:00 UTC 2016 gaffer/operation/AbstractGetOperation$Builder.class
1009 Tue Feb 23 10:44:58 UTC 2016 gaffer/operation/Operation.class
5964 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/SimpleQuery.class
8033 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/ComplexQuery.class
1796 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/function/transform/StarRatingTransform.class
1920 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/data/Review.class
605 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/data/schema/Property.class
418 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/data/schema/TransientProperty.class
477 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/data/schema/Group.class
1957 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/data/Viewing.class
2948 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/data/SampleData.class
2122 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/data/Film.class
2101 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/data/Person.class
1139 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/data/Certificate.class
1929 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/generator/FilmGenerator.class
3166 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/generator/DataGenerator.class
2090 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/generator/ViewingGenerator.class
2078 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/generator/ReviewGenerator.class
1889 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/generator/PersonGenerator.class
1919 Fri Feb 19 07:31:56 UTC 2016 gaffer/example/serialiser/CertificateVisibilitySerialiser.class
1822 Tue Feb 23 10:51:06 UTC 2016 gaffer/store/operation/handler/GenerateObjectsHandler.class
464 Tue Feb 23 10:51:06 UTC 2016 gaffer/store/operation/handler/OperationHandler.class
1793 Tue Feb 23 10:51:06 UTC 2016 gaffer/store/operation/handler/GenerateElementsHandler.class
1558 Tue Feb 23 10:51:06 UTC 2016 gaffer/store/operation/handler/ValidateHandler.class
14948 Tue Feb 23 10:51:06 UTC 2016 gaffer/store/Store.class
707 Tue Feb 23 10:51:06 UTC 2016 gaffer/store/StoreException.class
3700 Tue Feb 23 10:51:06 UTC 2016 gaffer/store/StoreProperties.class
1441 Tue Feb 23 10:51:06 UTC 2016 gaffer/store/schema/StorePropertyDefinition$Builder.class
5122 Tue Feb 23 10:51:06 UTC 2016 gaffer/store/schema/StoreSchema.class
3308 Tue Feb 23 10:51:06 UTC 2016 gaffer/store/schema/StorePropertyDefinition.class
3190 Tue Feb 23 10:51:06 UTC 2016 gaffer/store/schema/StoreSchema$Builder.class
2910 Tue Feb 23 10:51:06 UTC 2016 gaffer/store/schema/StoreElementDefinition.class
1200 Tue Feb 23 10:51:06 UTC 2016 gaffer/store/schema/StoreElementDefinition$Builder.class
1090 Tue Feb 23 10:51:06 UTC 2016 gaffer/store/StoreTrait.class
10615 Tue Feb 23 10:48:56 UTC 2016 gaffer/graph/Graph.class
2879 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/ElementValidator.class
3314 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/TransformOneToManyIterable.class
2065 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/ElementTuple.class
3809 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/Entity.class
766 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/Edge$1.class
671 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/Entity$1.class
3262 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/Properties.class
4960 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/Edge.class
283 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/ElementValueLoader.class
2265 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/function/ElementAggregator$Builder.class
2039 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/function/ElementAggregator.class
2933 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/function/ElementTransformer$Builder.class
1554 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/function/ElementFilter.class
2490 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/function/ElementFilter$Builder.class
1566 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/function/ElementTransformer.class
3848 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/LazyEdge.class
3129 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/ElementComponentKey.class
3163 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/LazyEntity.class
4394 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/Element.class
1435 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/IdentifierType.class
4758 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/LazyProperties.class
1681 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/element/PropertiesTuple.class
3132 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/TransformOneToManyIterable$1.class
2460 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/ValidatedElements.class
739 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/IsEdgeValidator.class
234 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/Validator.class
2677 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/TransformIterable$1.class
1430 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/generator/OneToManyElementGenerator$1.class
2327 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/generator/OneToOneElementGenerator.class
1536 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/generator/OneToOneElementGenerator$1.class
1651 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/generator/OneToManyElementGenerator.class
1611 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/generator/OneToOneElementGenerator$2.class
947 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/generator/ElementGenerator.class
367 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/ElementDefinition.class
5592 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/TypedElementDefinition.class
2237 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/ElementDefinitions$Builder.class
202 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/TypeStore.class
2037 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/TypedElementDefinition$Builder.class
1494 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/view/ViewEdgeDefinition.class
3165 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/view/ViewElementDefinition$Builder.class
1097 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/view/ViewEntityDefinition.class
3888 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/view/ViewEntityDefinition$Builder.class
2975 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/view/View.class
1324 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/view/ViewElementDefinition.class
2487 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/view/View$Builder.class
4067 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/view/ViewElementDefinitionValidator.class
4146 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/view/ViewEdgeDefinition$Builder.class
3702 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/schema/DataEntityDefinition$Builder.class
7046 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/schema/DataElementDefinition.class
3960 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/schema/DataEdgeDefinition$Builder.class
4057 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/schema/DataElementDefinitionValidator.class
2348 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/schema/DataSchema$Builder.class
1105 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/schema/DataEntityDefinition.class
1760 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/schema/DataElementDefinition$Builder.class
1502 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/schema/DataEdgeDefinition.class
6536 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/schema/DataSchema.class
672 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/schema/exception/SchemaException.class
2075 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/TypedEdgeDefinition$Builder.class
1195 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/TypedEntityDefinition$Builder.class
1793 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/TypedEdgeDefinition.class
1396 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/TypedEntityDefinition.class
8212 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/TypedElementDefinitionValidator.class
8827 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/ElementDefinitions.class
399 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/Types.class
422 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/ElementDefinitionWithIds.class
2633 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/elementdefinition/Type.class
659 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/AlwaysValid.class
747 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/IsEntityValidator.class
3191 Tue Feb 23 10:35:06 UTC 2016 gaffer/data/TransformIterable.class
1255 Fri Feb 19 07:31:54 UTC 2016 store.properties
518 Tue Feb 23 11:05:30 UTC 2016 trial.class
900 Fri Feb 19 07:31:54 UTC 2016 storeSchema.json
3467 Fri Feb 19 07:31:54 UTC 2016 dataSchema.json
633 Fri Feb 19 07:31:54 UTC 2016 simpleQuery.json
0 Tue Feb 23 11:05:32 UTC 2016 META-INF/maven/
0 Tue Feb 23 11:05:32 UTC 2016 META-INF/maven/gaffer/
0 Tue Feb 23 11:05:32 UTC 2016 META-INF/maven/gaffer/example/
2591 Fri Feb 19 05:24:26 UTC 2016 META-INF/maven/gaffer/example/pom.xml
107 Fri Feb 19 07:31:58 UTC 2016 META-INF/maven/gaffer/example/pom.properties

It can be seen that dataSchema.json is present at the correct location. I cannot understand why the error is occurring.
I need help to resolve it!
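One plausible reading of the stack trace above: the failing call is java.nio.file.Files.newInputStream, and the path it is given ends in example-0.3.1-SNAPSHOT.jar!/dataSchema.json. The default file system treats that whole string as an ordinary file path and does not look inside the jar, so the entry can be listed by jar -tvf and still not be found. The sketch below reproduces the effect with a throwaway jar. The class and method names are hypothetical, and this is not Gaffer's own code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Hypothetical demo class; it only illustrates the path handling, not Gaffer itself.
final class JarResourceDemo {

    /** Builds a temporary jar holding a single entry named dataSchema.json. */
    static Path createJar() throws IOException {
        Path jar = Files.createTempFile("example", ".jar");
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
            out.putNextEntry(new JarEntry("dataSchema.json"));
            out.write("{}".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return jar;
    }

    /** Treats "<jar>!/<entry>" as a plain file path, as the failing call appears to do. */
    static boolean readableAsPlainPath(Path jar) throws IOException {
        Path insideJar = Paths.get(jar.toString() + "!/dataSchema.json");
        try (InputStream in = Files.newInputStream(insideJar)) {
            return true;
        } catch (NoSuchFileException e) {
            return false; // the default file system does not look inside jars
        }
    }

    /** Opens the same entry through a jar: URL, which understands the "!/" separator. */
    static String readViaJarUrl(Path jar) throws IOException {
        URL url = new URL("jar:" + jar.toUri() + "!/dataSchema.json");
        try (InputStream in = url.openStream();
             ByteArrayOutputStream buffer = new ByteArrayOutputStream()) {
            byte[] chunk = new byte[1024];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path jar = createJar();
        System.out.println("plain path readable: " + readableAsPlainPath(jar));
        System.out.println("via jar URL: " + readViaJarUrl(jar));
    }
}
```

If this reading is right, loading the schema as a classpath resource stream (for example via Class.getResourceAsStream), or keeping the JSON files outside the jar, would avoid the problem.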

Trouble running the Gaffer example programs

Hi, I have built Gaffer and Accumulo on Ubuntu.
The example jar built successfully, and I put example-0.3.1-SNAPSHOT.jar in accumulo/lib/ext.
Then I tried running it to see exactly how it works, using the command:
$ bin/accumulo gaffer.example.SimpleQuery

but I am getting the following error:

ubuntu@ip-172-31-24-17:~/Downloads/installs/accumulo-1.6.4$ bin/accumulo gaffer.example.SimpleQuery
java.lang.NoClassDefFoundError: gaffer/data/generator/ElementGenerator
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2625)
at java.lang.Class.getMethod0(Class.java:2866)
at java.lang.Class.getMethod(Class.java:1676)
at org.apache.accumulo.start.Main.main(Main.java:115)
Caused by: java.lang.ClassNotFoundException: gaffer.data.generator.ElementGenerator
at org.apache.commons.vfs2.impl.VFSClassLoader.findClass(VFSClassLoader.java:175)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 5 more
gaffer.example.SimpleQuery must implement a public static void main(String args[]) method

What am I missing?

Modular schemas

It would be useful to be able to split schemas into small schema modules. For example, different data types could be stored in different schemas, and the data could then be loaded into the same graph by merging the schemas.
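As a rough illustration of the merge step, here is a minimal sketch that models each schema module as a map from element group name to a definition string. That model, the class name and the conflict rule (identical duplicates are allowed, differing ones are rejected) are all assumptions made for the example, not Gaffer's schema classes.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model: a schema module is a map of group name -> definition.
final class SchemaModules {

    /** Merges modules in order, rejecting two different definitions of one group. */
    @SafeVarargs
    static Map<String, String> merge(Map<String, String>... modules) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> module : modules) {
            for (Map.Entry<String, String> entry : module.entrySet()) {
                String existing = merged.putIfAbsent(entry.getKey(), entry.getValue());
                if (existing != null && !existing.equals(entry.getValue())) {
                    throw new IllegalArgumentException(
                            "Conflicting definitions for group " + entry.getKey());
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> films = Collections.singletonMap("film", "entity");
        Map<String, String> viewings = Collections.singletonMap("viewing", "edge");
        System.out.println(merge(films, viewings));
    }
}
```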

Add query time aggregator to the View

Currently query time aggregation just uses the aggregation specified in the schema. It would be useful to be able to provide a different set of aggregation functions.

This issue should add an aggregator section to the View.

The HBase and Accumulo stores would need to be updated to use the new aggregator functions provided in the View.

As part of this ticket, the integration tests should be updated to include tests for query time aggregation.

Fix typo in TextJobInitialiser

Within TextJobInitialiser there is a minor copy & paste error and redundant code that originated from AvroJobInitialiser.

Clean up error handling in the AccumuloStore

Some exceptions are thrown, caught and then re-thrown without any specific handling, just returning an error to the caller. Tidy this up to just throw rather than needlessly catch.
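The pattern being described looks roughly like the sketch below. The fetch method is a hypothetical stand-in for a store call, and none of this is taken from the Accumulo store source.

```java
import java.io.IOException;

// Hypothetical example of the catch-and-rethrow pattern and its tidied form.
final class RethrowCleanup {

    /** Stand-in for a store call that can fail. */
    static String fetch(boolean fail) throws IOException {
        if (fail) {
            throw new IOException("table unavailable");
        }
        return "element";
    }

    /** Before: the catch block adds nothing, it rethrows the same exception. */
    static String noisy(boolean fail) throws IOException {
        try {
            return fetch(fail);
        } catch (IOException e) {
            throw e;
        }
    }

    /** After: the exception simply propagates to the caller. */
    static String tidy(boolean fail) throws IOException {
        return fetch(fail);
    }
}
```

Both versions behave the same for the caller; the second just has less code to read and maintain.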

0.31 RC1 BugFix

Small bug fixes for the RC1 branch:
Null check for an empty group in AbstractCoreKeyAccumuloElementConverter.
Remove maintenance jar building from the accumulo-store; it must be packaged up elsewhere with the functions/serialisers.
Add a null check for the accumulo-store bulk import partitioner option.

Investigate option in DataSchema for additional input validation

We may need to add an additional section to the data schema for extra input validation. This would allow additional validation (could be computationally expensive) to be carried out when elements are added. This validation would not be done as part of the continuous validation added in issue gh-17.

Remove the Accumulo authorisation requirement

Currently all operations executed on an Accumulo store require an "authorisation" option to be set. This requirement should be removed as not all graphs will use Accumulo's visibility column and therefore operations should not be forced to have an authorisation option set.

See AccumuloStore.handleOperation

Stuck in running programs

I have built Gaffer and have Accumulo running on Hadoop on Ubuntu instance.

But I am having trouble running the programs. It would be helpful if you could explain exactly how to run programs using Maven, or whether there is some other way.
I am also awaiting the User Guide!

Refactor AddElementsFromHdfsHandler in AccumuloStore

The AddElementsFromHdfsHandler should be changed to do the following:

By default, the handler should use the options passed to the job through the operation, including the reducer number and partitioner.

If an option on the operation specifies that the user would like standard Gaffer behaviour and the user also provides a splits file, then that file will be used to set the job's number of reducers (any user-provided reducer number will be overridden by the number of splits + 1) and the AccumuloKeyRange partitioner will be used (replacing any previously specified one).

If no splits file has been set and the user would like standard Gaffer behaviour, then Gaffer will query the Accumulo table for its split points, use those to set the number of reducers (number of splits + 1), and use the AccumuloKeyRange partitioner.

General changes will include:
Making clear, with additional comments in the store properties, how the splits file property in the Accumulo store differs in purpose from one provided in an operation.
Changing the default number of reducers for a job from 1 to 25 to better reflect use at scale.
Adding an option to provide a partitioner to the AddElementsFromHdfs operation builder.
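The reducer-count rules proposed above can be sketched as a small decision function. The class, method and parameter names are hypothetical, and split points are modelled as plain strings; this is one reading of the proposal rather than the handler's actual code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed reducer-count rules.
final class ReducerPlanner {

    /** Proposed new default when the operation does not specify a reducer count. */
    static final int DEFAULT_REDUCERS = 25;

    /**
     * @param useGafferDefaults true when the operation asks for standard Gaffer behaviour
     * @param requestedReducers reducer count from the operation options, or null if unset
     * @param fileSplits        split points read from a user-supplied splits file, or null
     * @param tableSplits       split points queried from the Accumulo table
     */
    static int numReducers(boolean useGafferDefaults, Integer requestedReducers,
                           List<String> fileSplits, List<String> tableSplits) {
        if (useGafferDefaults) {
            // The splits file wins if present, otherwise fall back to the table's splits.
            List<String> splits = fileSplits != null ? fileSplits : tableSplits;
            return splits.size() + 1;
        }
        return requestedReducers != null ? requestedReducers : DEFAULT_REDUCERS;
    }

    public static void main(String[] args) {
        List<String> tableSplits = Arrays.asList("g", "n", "t");
        System.out.println(numReducers(true, 10, null, tableSplits));
    }
}
```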

Accumulo store cannot aggregate empty sets of properties

I have some simple edges with no properties.
If I add the same edge twice to the Accumulo store, I would expect to retrieve a single copy of the edge (i.e. the aggregated version).
I actually get a null pointer exception thrown by the AggregatorIterator.

Add extra logging to the Filter processor

If an element fails validation it can be very difficult to find out why. It would be very useful to add additional logging to the Filter processor, so that when an element is filtered out based on a data schema it logs the function name and the property that was invalid.

We have 2 options:

  • When elements are invalid based on a data schema we could log at 'warn' level. When elements are filtered out based on a View it is probably best not to log anything to avoid slowing the operations down - this may require an extra flag to toggle logging on and off in the processor.
  • Log at 'debug' level for all filtering (data schema and view). Then users will have to set the level to 'debug' temporarily to see the extra information.
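A minimal sketch of the second option, using java.util.logging (where FINE plays the role of 'debug'). The Check class is a hypothetical stand-in for a schema validation function bound to one property; it is not Gaffer's Filter processor.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch: log which function rejected which property.
final class LoggingFilter {

    private static final Logger LOGGER = Logger.getLogger(LoggingFilter.class.getName());

    /** A named check on a single property. */
    static final class Check {
        final String functionName;
        final String property;
        final Predicate<Object> test;

        Check(String functionName, String property, Predicate<Object> test) {
            this.functionName = functionName;
            this.property = property;
            this.test = test;
        }
    }

    /** Returns the first failing check, or null if every check passes. */
    static Check firstFailure(Map<String, Object> properties, List<Check> checks) {
        for (Check check : checks) {
            if (!check.test.test(properties.get(check.property))) {
                // Only visible when the logger level is lowered to FINE.
                LOGGER.log(Level.FINE, "Element filtered out: function {0} rejected property {1}",
                        new Object[] {check.functionName, check.property});
                return check;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Check positiveCount = new Check("IsPositive", "count",
                v -> v instanceof Integer && (Integer) v > 0);
        Map<String, Object> properties = Collections.<String, Object>singletonMap("count", 0);
        Check failed = firstFailure(properties, Arrays.asList(positiveCount));
        System.out.println(failed == null ? "valid" : "rejected by " + failed.functionName);
    }
}
```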

Facing problem in building Gaffer

The explanation given for building and deploying Gaffer is pretty short.
It would be great if you could help with this.

I wish to build Gaffer on Ubuntu 14.04. The instructions say to run "mvn clean package" in the top-level directory, i.e. Gaffer.

But when I run it, it is not successful and it throws errors. I feel that I am missing something.
It would be awesome if you could guide me in building it the right way!
Thanks in advance!

Add store validation trait

A new store trait (STORE_VALIDATION) will be used to allow store implementations to continuously validate elements and remove elements that are no longer valid. This will also allow custom age off and expiration filters to be written based on FilterFunctions. For example, you could age off elements based on properties, e.g. (timestamp > x) AND (count < y).
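To make the example rule concrete, here is a small sketch using java.util.function.Predicate. It reads the expression as the condition an element must satisfy to stay valid, which is an assumption, and the Props class is a hypothetical stand-in for an element's properties rather than a Gaffer FilterFunction.

```java
import java.util.function.Predicate;

// Hypothetical sketch of a combined validity rule: (timestamp > x) AND (count < y).
final class AgeOffRule {

    /** Minimal element model holding only the two properties the rule reads. */
    static final class Props {
        final long timestamp;
        final long count;

        Props(long timestamp, long count) {
            this.timestamp = timestamp;
            this.count = count;
        }
    }

    /** An element stays valid while it is newer than minTimestamp and below maxCount. */
    static Predicate<Props> validWhile(long minTimestamp, long maxCount) {
        Predicate<Props> recentEnough = p -> p.timestamp > minTimestamp;
        Predicate<Props> smallEnough = p -> p.count < maxCount;
        return recentEnough.and(smallEnough);
    }

    public static void main(String[] args) {
        Predicate<Props> rule = validWhile(100L, 10L);
        System.out.println(rule.test(new Props(150L, 5L)));
    }
}
```

A store that continuously validates would drop any element for which the predicate returns false.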

Accumulo store should support store validation by adding a new validation iterator.
NOTE - the previous AgeOff accumulo iterator will no longer be used. To migrate to using the new store validator, store schemas will need to be updated to add an AgeOff validator function for each element group that needs to be aged off.

Added a new AccumuloProperty "gaffer.store.accumulo.enable.validator.iterator" to disable the store validation if it is not required.

The previous VALIDATION trait has been removed: validation is handled automatically via a validation operation, so it would not make sense for a store to be unable to validate elements. There is still the option to disable validation.

Add post-filter stage to operations

Rename the current filter stage to pre-filter and create a new stage called post-filter.
This will allow users to filter based on aggregated properties and transient properties (properties created by the transform stage).

The order of operation stages would then be:
pre-filter
aggregate
transform
post-filter

We need to consider also adding a mid-filter:
pre-filter
aggregate
mid-filter
transform
post-filter
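
As a rough illustration of the four-stage order (pre-filter, aggregate, transform, post-filter), the sketch below runs the stages over a toy element type. Everything here is hypothetical and is not Gaffer's API: StagePipelineSketch, Element, and the choice of summing a count property per key are assumptions made purely to show why a post-filter can see aggregated and transformed values that a pre-filter cannot.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of the proposed stage order; not Gaffer code. */
final class StagePipelineSketch {

    /** Toy element: a grouping key plus a single count property. */
    static final class Element {
        final String key;
        final long count;

        Element(final String key, final long count) {
            this.key = key;
            this.count = count;
        }
    }

    static List<Element> run(final List<Element> input,
                             final Predicate<Element> preFilter,
                             final UnaryOperator<Element> transform,
                             final Predicate<Element> postFilter) {
        // Stages 1 and 2: pre-filter the raw elements, then aggregate by summing counts per key.
        final Map<String, Long> summed = new LinkedHashMap<>();
        for (final Element e : input) {
            if (preFilter.test(e)) {
                summed.merge(e.key, e.count, Long::sum);
            }
        }

        // Stages 3 and 4: transform each aggregated element, then post-filter the result.
        final List<Element> result = new ArrayList<>();
        for (final Map.Entry<String, Long> entry : summed.entrySet()) {
            final Element transformed = transform.apply(new Element(entry.getKey(), entry.getValue()));
            if (postFilter.test(transformed)) {
                result.add(transformed);
            }
        }
        return result;
    }
}
```

With a pre-filter of count > 0 and a post-filter of count >= 3, two raw elements with counts 1 and 2 under the same key both pass the pre-filter and survive the post-filter as a single aggregated element with count 3. A single filter stage before aggregation could not express that condition.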

Crown copyright

Some of your files have a Crown copyright notice on them with the Apache 2 licence UNDER it. Point two of the Apache licence reads: "2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form."

It's hard to licence stuff. But these two points in some of your files are contradictory, so your code can't be used. Remove the Crown copyright assertion. You only need the Apache licence at the top of the repo; the rest is covered. Get a better lawyer and check your code before you publish. Consider changing to a BSD or MIT licence.

Very clean neat Java btw.

Accumulo iterators' next() methods need to return the next element

See AbstractElementIteratorReadIntoMemory and AbstractElementIteratorFromBatches.

Currently, multiple calls to next() keep returning the same element. To use the iterators, users must call hasNext() before each call to next().

Also, calling hasNext() multiple times before calling next() will skip elements.
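
One common way to satisfy the java.util.Iterator contract in this situation is to cache a single look-ahead element, so that hasNext() is idempotent and next() always advances. The sketch below is a generic illustration only. LookAheadIterator and its Supplier-based source (which returns null when exhausted) are hypothetical and are not the implementation of AbstractElementIteratorReadIntoMemory or AbstractElementIteratorFromBatches.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

/**
 * Hypothetical look-ahead iterator; not Gaffer's actual class.
 * The source supplier is assumed to return the next element, or null when exhausted.
 */
final class LookAheadIterator<T> implements Iterator<T> {
    private final Supplier<T> source;
    private T cached;          // element fetched by hasNext() but not yet returned
    private boolean exhausted; // true once the source has returned null

    LookAheadIterator(final Supplier<T> source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        // Only fetch when nothing is cached, so repeated calls never skip elements.
        if (cached == null && !exhausted) {
            cached = source.get();
            exhausted = (cached == null);
        }
        return cached != null;
    }

    @Override
    public T next() {
        // Performs its own check, so it works without a prior hasNext() call.
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        final T result = cached;
        cached = null; // clear the cache so the following call advances
        return result;
    }
}
```

Because next() delegates to hasNext() and then clears the cache, callers may call hasNext() any number of times, or not at all, between calls to next().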

Cannot easily create HyperLogLogPlus sketches in JSON

The JSON deserialiser for the HyperLogLogPlus sketch doesn't handle null or empty objects and simply throws an IllegalArgumentException. We should update this to create an empty HyperLogLogPlus sketch to allow objects to be created in JSON.

Currently to create a HyperLogLogPlus object in JSON you have to specify the bytes. We should also update the serialisation and deserialisation methods to allow you to set all parameters in the HyperLogLogPlus object in JSON.

Lack of information regarding potential use cases

It's an interesting project, however I think more information on the use cases you have found for this software would be very helpful. For example, might it be useful in creating a large scale surveillance and control system without respect to the will of the people?

Just curious: why waiting until completion to open source Gaffer2?

Hi. I'm just curious why you're waiting until Gaffer2 is completed before publishing it under an open source license. Is there some reason not to do its development in the open?

(Serious question, not meant facetiously, in case that's not clear. I can think of both good and bad reasons to wait, and wasn't sure which ones were operative here -- so I thought I'd just ask!)

Rename Data Schema to Graph Schema

This will help to make the distinction between the Graph Schema and the Store Schema.
The Graph and REST API should then provide a method to return the GraphStore Schema - the combined schema containing the merged information from both the Graph and Store Schema.

Improve Function library interfaces

Currently it is awkward to write Function implementations due to the inputs and outputs being an Object[]. It would also be nice if we could remove the requirement for the Inputs and Outputs annotation to define the object types.

Ideally a FilterFunction should not have to cast the parameters into the required types, something like:

public class FilterFunctionExample extends FilterFunction {
    @Override
    public boolean filter(final String param1, final Integer param2) {
        // do something
    }
}

The Function package names could also be improved to make it a bit clearer what interfaces you should use.

The method name in FilterFunction could be "isValid" instead of "filter"; this would make it clearer what the method does.

We have investigated several different options, each having different advantages, but none are perfect. The best so far is something like:

public abstract class FilterFunction {
    public abstract boolean isValid(Object[] input);
}

public abstract class SimpleFilterFunction<T> extends FilterFunction {
    @Override
    @SuppressWarnings("unchecked") // the single input is assumed to be of type T
    public boolean isValid(Object[] input) {
        return isValid((T) input[0]);
    }

    protected abstract boolean isValid(T item);
}

public class IsTrue extends SimpleFilterFunction<Boolean> {
    @Override
    public boolean isValid(Boolean item) {
        return Boolean.TRUE.equals(item);
    }
}

Add Accumulo store SplitTable operation

The Accumulo store should have an operation to add split points to a table.
It should be possible to generate these split points by providing a directory of sample data, which will be used to create the splits.

This should be used as part of an operation chain when loading data, particularly for the first time:
First, split the table according to the data.
Then run the bulk import with the Accumulo partitioner option enabled so that it uses the new splits.

Rest Api - Investigate limiting results of an operation chain

Should an N limit apply to only the first step of an operation chain?
e.g. Generate the first N elements, then add all of these N elements.

Or should it apply to every step in an operation chain?
e.g. Return the first N elements matching the given seeds, then pass these N elements as the seeds of a second get operation and limit the results of that operation to N.

Or should it apply to only the last step in an operation chain?
e.g. Return all elements matching the given seeds, then pass these elements as the seeds of a second get operation and return only N results of that operation.

Also investigate what to do when the result of an operation chain is void.
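
To make the three candidate behaviours concrete, the sketch below models an operation chain as a list of functions over lists and applies the limit N at the first step, at every step, or at the last step. It is only an illustration of the question being asked. ChainLimitSketch, LimitMode and the function-based model of an operation are all hypothetical and are not part of Gaffer or its REST API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical model of limiting results in an operation chain; not Gaffer code. */
final class ChainLimitSketch {

    /** Where in the chain the limit N is enforced. */
    enum LimitMode { FIRST_STEP, EVERY_STEP, LAST_STEP }

    static <T> List<T> run(final List<Function<List<T>, List<T>>> steps,
                           final List<T> seeds,
                           final int n,
                           final LimitMode mode) {
        List<T> current = seeds;
        for (int i = 0; i < steps.size(); i++) {
            // Each step consumes the previous step's output as its input.
            current = steps.get(i).apply(current);

            final boolean first = (i == 0);
            final boolean last = (i == steps.size() - 1);
            final boolean truncate = mode == LimitMode.EVERY_STEP
                    || (mode == LimitMode.FIRST_STEP && first)
                    || (mode == LimitMode.LAST_STEP && last);

            // Keep only the first n results of this step when the mode asks for it.
            if (truncate && current.size() > n) {
                current = new ArrayList<>(current.subList(0, n));
            }
        }
        return current;
    }
}
```

In this model only EVERY_STEP and LAST_STEP bound the size of the final result. FIRST_STEP bounds the input to later steps but lets the final result grow beyond N, which may be relevant to the choice.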
