This project forked from jpmml/jpmml-evaluator

Java Evaluator API for PMML

License: GNU Affero General Public License v3.0

JPMML-Evaluator

Java Evaluator API for Predictive Model Markup Language (PMML).

Features

JPMML-Evaluator is the de facto reference implementation of the PMML specification versions 3.0, 3.1, 3.2, 4.0, 4.1, 4.2 and 4.3 for the Java platform.

For more information please see the features.md file.

JPMML-Evaluator is interoperable with most popular statistics and data mining software.

JPMML-Evaluator is fast and memory efficient. It can deliver one million scorings per second even on a desktop computer.

Prerequisites

  • Java 1.8 or newer.

Installation

JPMML-Evaluator library JAR files (together with the accompanying Java source and Javadoc JAR files) are released via the Maven Central Repository.

The current version is 1.3.11 (29 January, 2018).

<dependency>
	<groupId>org.jpmml</groupId>
	<artifactId>pmml-evaluator</artifactId>
	<version>1.3.11</version>
</dependency>
<dependency>
	<groupId>org.jpmml</groupId>
	<artifactId>pmml-evaluator-extension</artifactId>
	<version>1.3.11</version>
</dependency>

Usage

Loading models

JPMML-Evaluator depends on the JPMML-Model library for the PMML class model.

Loading a PMML schema version 3.X or 4.X document into an org.dmg.pmml.PMML instance:

PMML pmml;

try(InputStream is = ...){
	pmml = org.jpmml.model.PMMLUtil.unmarshal(is);
}

If the model type is known, then it is possible to instantiate the corresponding subclass of org.jpmml.evaluator.ModelEvaluator directly:

PMML pmml = ...;

ModelEvaluator<TreeModel> modelEvaluator = new TreeModelEvaluator(pmml);

Otherwise, if the model type is unknown, then the model evaluator instantiation work should be delegated to an instance of class org.jpmml.evaluator.ModelEvaluatorFactory:

PMML pmml = ...;

ModelEvaluatorFactory modelEvaluatorFactory = ModelEvaluatorFactory.newInstance();
 
ModelEvaluator<?> modelEvaluator = modelEvaluatorFactory.newModelEvaluator(pmml);

Model evaluator classes follow functional programming principles and are completely thread safe.

Model evaluator instances are fairly lightweight, which makes them cheap to create and destroy. Nevertheless, long-running applications should maintain a one-to-one mapping between PMML and ModelEvaluator instances for better performance.
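
One way to keep such a one-to-one mapping is a small cache in front of the loading code. The following is a minimal sketch, not part of JPMML-Evaluator: the class name EvaluatorCache and its loader function are hypothetical, and the loader is assumed to unmarshal the PMML document and build the evaluator as shown above. It relies on the thread safety of evaluators stated above, since the same instance is handed to every caller.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/**
 * Hypothetical helper (not part of JPMML-Evaluator): keeps at most one
 * evaluator instance per model key, so that each model is loaded only once.
 */
final class EvaluatorCache<K, E> {

	private final ConcurrentMap<K, E> evaluators = new ConcurrentHashMap<>();

	// Assumption: unmarshals the PMML document for the key and builds its evaluator
	private final Function<K, E> loader;

	EvaluatorCache(Function<K, E> loader){
		this.loader = loader;
	}

	/** Returns the cached evaluator; the loader runs only on the first request for a key. */
	E get(K key){
		return evaluators.computeIfAbsent(key, loader);
	}

	/** Drops a cached evaluator, for example after the model file has been replaced. */
	void evict(K key){
		evaluators.remove(key);
	}
}
```

A long-running application could create one EvaluatorCache keyed by model file path and call get(path) from every request thread. The type parameter E would be bound to Evaluator in real code.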

It is advisable for application code to work against the org.jpmml.evaluator.Evaluator interface:

Evaluator evaluator = (Evaluator)modelEvaluator;

Querying the "data schema" of models

The model evaluator can be queried for the list of input (i.e. independent), target (i.e. primary dependent) and output (i.e. secondary dependent) field definitions, which provide information such as field name, data type, operational type and value domain.

Querying and analyzing input fields:

List<InputField> inputFields = evaluator.getInputFields();
for(InputField inputField : inputFields){
	org.dmg.pmml.DataField pmmlDataField = (org.dmg.pmml.DataField)inputField.getField();
	org.dmg.pmml.MiningField pmmlMiningField = inputField.getMiningField();

	org.dmg.pmml.DataType dataType = inputField.getDataType();
	org.dmg.pmml.OpType opType = inputField.getOpType();

	switch(opType){
		case CONTINUOUS:
			RangeSet<Double> validArgumentRanges = FieldValueUtil.getValidRanges(pmmlDataField);
			break;
		case CATEGORICAL:
		case ORDINAL:
			List<Value> validArgumentValues = FieldValueUtil.getValidValues(pmmlDataField);
			break;
		default:
			break;
	}
}

Querying and analyzing target fields:

List<TargetField> targetFields = evaluator.getTargetFields();
for(TargetField targetField : targetFields){
	org.dmg.pmml.DataField pmmlDataField = targetField.getDataField();
	org.dmg.pmml.MiningField pmmlMiningField = targetField.getMiningField(); // Could be null
	org.dmg.pmml.Target pmmlTarget = targetField.getTarget(); // Could be null

	org.dmg.pmml.DataType dataType = targetField.getDataType();
	org.dmg.pmml.OpType opType = targetField.getOpType();

	switch(opType){
		case CONTINUOUS:
			break;
		case CATEGORICAL:
		case ORDINAL:
			List<Value> validResultValues = FieldValueUtil.getValidValues(pmmlDataField);
			break;
		default:
			break;
	}
}

Querying and analyzing output fields:

List<OutputField> outputFields = evaluator.getOutputFields();
for(OutputField outputField : outputFields){
	org.dmg.pmml.OutputField pmmlOutputField = outputField.getOutputField();

	org.dmg.pmml.DataType dataType = outputField.getDataType(); // Could be null
	org.dmg.pmml.OpType opType = outputField.getOpType(); // Could be null

	boolean finalResult = outputField.isFinalResult();
	if(!finalResult){
		continue;
	}
}

Evaluating models

The PMML scoring operation must be invoked with valid arguments. Otherwise, the behaviour of the model evaluator class is unspecified.

Preparing the argument data record:

Map<FieldName, FieldValue> arguments = new LinkedHashMap<>();

List<InputField> inputFields = evaluator.getInputFields();
for(InputField inputField : inputFields){
	FieldName inputFieldName = inputField.getName();

	// The raw (ie. user-supplied) value could be any Java primitive value
	Object rawValue = ...;

	// The raw value is passed through: 1) outlier treatment, 2) missing value treatment, 3) invalid value treatment and 4) type conversion
	FieldValue inputFieldValue = inputField.prepare(rawValue);

	arguments.put(inputFieldName, inputFieldValue);
}
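
Where the raw values come from is up to the application. As a minimal sketch, assume that each data record arrives as a header line and a data line of a simple comma-separated file without quoting. The class RawRecords and its parse method are hypothetical and not part of JPMML-Evaluator. Empty cells are mapped to null, on the assumption that a null raw value is how the application signals a missing value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of JPMML-Evaluator) for building raw data records. */
final class RawRecords {

	private RawRecords(){
	}

	/**
	 * Pairs column names with cell values, keeping the column order.
	 * Empty cells become null; all other cells stay as trimmed strings.
	 */
	static Map<String, Object> parse(String headerLine, String dataLine){
		// The limit -1 keeps trailing empty cells
		String[] names = headerLine.split(",", -1);
		String[] cells = dataLine.split(",", -1);

		if(names.length != cells.length){
			throw new IllegalArgumentException("Expected " + names.length + " cells, got " + cells.length);
		}

		Map<String, Object> record = new LinkedHashMap<>();
		for(int i = 0; i < names.length; i++){
			String cell = cells[i].trim();

			record.put(names[i].trim(), cell.isEmpty() ? null : cell);
		}

		return record;
	}
}
```

Inside the loop above, the raw value could then be looked up by the string form of the input field name, leaving type conversion to the prepare method.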

Performing the evaluation:

Map<FieldName, ?> results = evaluator.evaluate(arguments);

Extracting primary results from the result data record:

List<TargetField> targetFields = evaluator.getTargetFields();
for(TargetField targetField : targetFields){
	FieldName targetFieldName = targetField.getName();

	Object targetFieldValue = results.get(targetFieldName);
}

The target value is either a Java primitive value (as a wrapper object) or an instance of org.jpmml.evaluator.Computable:

if(targetFieldValue instanceof Computable){
	Computable computable = (Computable)targetFieldValue;

	Object unboxedTargetFieldValue = computable.getResult();
}
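
The same unwrapping can be applied to a whole result data record at once. The sketch below is illustrative only: ComputableLike is a stand-in declared here to keep the example self-contained, and real code would test against org.jpmml.evaluator.Computable instead. The class ResultDecoder is likewise hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-in for org.jpmml.evaluator.Computable, used only to keep this sketch self-contained. */
interface ComputableLike {

	Object getResult();
}

/** Hypothetical helper (not part of JPMML-Evaluator). */
final class ResultDecoder {

	private ResultDecoder(){
	}

	/** Unwraps a computable value; any other value is returned unchanged. */
	static Object decode(Object value){

		if(value instanceof ComputableLike){
			return ((ComputableLike)value).getResult();
		}

		return value;
	}

	/** Decodes every value of a result data record, keeping the field order. */
	static <K> Map<K, Object> decodeAll(Map<K, ?> results){
		Map<K, Object> decoded = new LinkedHashMap<>();

		for(Map.Entry<K, ?> entry : results.entrySet()){
			decoded.put(entry.getKey(), decode(entry.getValue()));
		}

		return decoded;
	}
}
```

Decoding discards the richer value object, so any result feature tests should be performed on the original target value before it is unwrapped.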

The target value may implement interfaces that descend from interface org.jpmml.evaluator.ResultFeature:

// Test for "entityId" result feature
if(targetFieldValue instanceof HasEntityId){
	HasEntityId hasEntityId = (HasEntityId)targetFieldValue;
	HasEntityRegistry<?> hasEntityRegistry = (HasEntityRegistry<?>)evaluator;
	BiMap<String, ? extends Entity> entities = hasEntityRegistry.getEntityRegistry();
	Entity winner = entities.get(hasEntityId.getEntityId());

	// Test for "probability" result feature
	if(targetFieldValue instanceof HasProbability){
		HasProbability hasProbability = (HasProbability)targetFieldValue;
		Double winnerProbability = hasProbability.getProbability(winner.getId());
	}
}

Extracting secondary results from the result data record:

List<OutputField> outputFields = evaluator.getOutputFields();
for(OutputField outputField : outputFields){
	FieldName outputFieldName = outputField.getName();

	Object outputFieldValue = results.get(outputFieldName);
}

The output value is always a Java primitive value (as a wrapper object).

Example applications

Module pmml-evaluator-example exemplifies the use of the JPMML-Evaluator library.

This module can be built using Apache Maven:

mvn clean install

The resulting uber-JAR file target/example-1.4-SNAPSHOT.jar contains the following command-line applications:

  • org.jpmml.evaluator.EvaluationExample. Evaluates a PMML model with data. The predictions are stored.
  • org.jpmml.evaluator.TestingExample. Evaluates a PMML model with data. The predictions are verified against expected predictions data.
  • org.jpmml.evaluator.EnhancementExample. Enhances a PMML model with a ModelVerification element.

Evaluating model model.pmml with data records from input.csv. The predictions are stored to output.csv:

java -cp target/example-1.4-SNAPSHOT.jar org.jpmml.evaluator.EvaluationExample --model model.pmml --input input.csv --output output.csv

Evaluating model model.pmml with data records from input.csv. The predictions are verified against data records from expected-output.csv:

java -cp target/example-1.4-SNAPSHOT.jar org.jpmml.evaluator.TestingExample --model model.pmml --input input.csv --expected-output expected-output.csv

Getting help:

java -cp target/example-1.4-SNAPSHOT.jar <application class name> --help

Support and Documentation

Limited public support is available via the JPMML mailing list.

The Openscoring.io blog contains fully worked out examples about using JPMML-Model and JPMML-Evaluator libraries.

License

JPMML-Evaluator is licensed under the GNU Affero General Public License (AGPL) version 3.0. Other licenses are available on request.

Additional information

Please contact [email protected]
