Module datarush.analytics
Class DecisionTreePredictor
java.lang.Object
  com.pervasive.datarush.operators.AbstractLogicalOperator
    com.pervasive.datarush.operators.StreamingOperator
      com.pervasive.datarush.operators.ExecutableOperator
        com.pervasive.datarush.analytics.util.AbstractPredictor
          com.pervasive.datarush.analytics.decisiontree.predictor.DecisionTreePredictor
All Implemented Interfaces:
LogicalOperator
public final class DecisionTreePredictor extends AbstractPredictor
Operator responsible for predicting outcomes based on a Decision Tree PMML model. Supports most of the PMML decision tree functionality: a subset of the PMML specification, but a superset of the functionality required for C4.5. Specifically, it supports all required elements and attributes, as well as the following optional elements and attributes:
- missingValueStrategy (all strategies supported)
- missingValuePenalty
- noTrueChildStrategy (all strategies supported)
- All predicates: SimplePredicate, CompoundPredicate, SimpleSetPredicate, True, False
- ScoreDistribution
- EmbeddedModel
- Partition
- ModelStats
- ModelExplanation
- Targets
- LocalTransformations
- ModelVerification
- splitCharacteristic
-
-
Constructor Summary
DecisionTreePredictor()
Creates a decision tree predictor with default settings.
-
Method Summary
protected void execute(PMMLModel pmml, RecordValued input, ScalarSettable[] predictedFields)
Called to perform prediction.
String getConfidencePrefix()
Gets the field name prefix to use for confidence.
RecordPort getOutput()
Returns a record port consisting of the original input plus predicted values.
String getRecordCountPrefix()
Gets the field name prefix to use for record counts.
String getWinnerField()
Gets the name of the winner field to output.
boolean isAppendConfidence()
Returns whether to append confidence information.
boolean isAppendRecordCount()
Returns whether to append record count information.
protected RecordTokenType predictedType(PMMLModelSpec modelSpec)
Given the model spec, returns the predicted type.
void setAppendConfidence(boolean appendConfidence)
Sets whether to append confidence information.
void setAppendRecordCount(boolean appendRecordCount)
Sets whether to append record count information.
void setConfidencePrefix(String confidencePrefix)
Sets the field name prefix to use for confidence.
void setRecordCountPrefix(String recordCountPrefix)
Sets the field name prefix to use for record counts.
void setWinnerField(String winnerField)
Sets the name of the winner field to output.
-
Methods inherited from class com.pervasive.datarush.analytics.util.AbstractPredictor
computeMetadata, execute, getInput, getModel, pushPrediction, stepNext
-
Methods inherited from class com.pervasive.datarush.operators.ExecutableOperator
cloneForExecution, getNumInputCopies, getPortSettings, handleInactiveOutput
-
Methods inherited from class com.pervasive.datarush.operators.AbstractLogicalOperator
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
-
-
-
-
Method Detail
-
getOutput
public RecordPort getOutput()
Returns a record port consisting of the original input plus predicted values. The following fields are added:
- winner: The predicted value. This generally corresponds to the target with the highest record count, although the PMML may specify an alternative winner via the "score" attribute. The name "winner" is the default; it is configurable via the winnerField property.
- record_count_targetValue (optional): The record count of the named targetValue. The prefix "record_count_" is the default; it is configurable via the recordCountPrefix property.
- confidence_targetValue (optional): The probability of the named targetValue. The prefix "confidence_" is the default; it is configurable via the confidencePrefix property.
- Overrides: getOutput in class AbstractPredictor
- Returns: a record flow of predicted values and their record counts
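The naming rules for the appended fields can be illustrated with a minimal, self-contained sketch. It does not use the DataRush API: the class OutputFieldNames and its method appendedFields are hypothetical stand-ins, and the ordering of the appended fields (winner first, then record counts, then confidences) is an assumption of this sketch rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for illustration; not part of the DataRush API. */
final class OutputFieldNames {

    /**
     * Lists the names of the fields appended to the input, following the
     * naming rules documented for getOutput(). The ordering used here is
     * an assumption of this sketch.
     */
    static List<String> appendedFields(String winnerField,
                                       String recordCountPrefix,
                                       String confidencePrefix,
                                       boolean appendRecordCount,
                                       boolean appendConfidence,
                                       List<String> targetValues) {
        List<String> names = new ArrayList<>();
        // The winner field is always present.
        names.add(winnerField);
        // One record count field per target value, only when enabled.
        if (appendRecordCount) {
            for (String target : targetValues) {
                names.add(recordCountPrefix + target);
            }
        }
        // One confidence field per target value, only when enabled.
        if (appendConfidence) {
            for (String target : targetValues) {
                names.add(confidencePrefix + target);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> targets = List.of("yes", "no");
        // Documented defaults: both optional groups are switched off.
        System.out.println(appendedFields("winner", "record_count_", "confidence_",
                false, false, targets));
        // Both optional groups switched on.
        System.out.println(appendedFields("winner", "record_count_", "confidence_",
                true, true, targets));
    }
}
```

With the documented defaults only the winner field is appended; enabling both flags adds one record count field and one confidence field per target value.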
-
isAppendRecordCount
public boolean isAppendRecordCount()
Returns whether to append record count information. This is false by default.
- Returns: whether to append record count information.
-
setAppendRecordCount
public void setAppendRecordCount(boolean appendRecordCount)
Sets whether to append record count information. This is false by default.
- Parameters: appendRecordCount - whether to append record count information.
-
getWinnerField
public String getWinnerField()
Gets the name of the winner field to output. This is "winner" by default.
- Returns: the name of the winner field to output.
-
setWinnerField
public void setWinnerField(String winnerField)
Sets the name of the winner field to output. This is "winner" by default.
- Parameters: winnerField - the name of the winner field to output.
-
getRecordCountPrefix
public String getRecordCountPrefix()
Gets the field name prefix to use for record counts. This is "record_count_" by default.
- Returns: the field name prefix to use for record counts.
-
setRecordCountPrefix
public void setRecordCountPrefix(String recordCountPrefix)
Sets the field name prefix to use for record counts. This is "record_count_" by default.
- Parameters: recordCountPrefix - the field name prefix to use for record counts.
-
getConfidencePrefix
public String getConfidencePrefix()
Gets the field name prefix to use for confidence. This is "confidence_" by default.
- Returns: the field name prefix to use for confidence.
-
setConfidencePrefix
public void setConfidencePrefix(String confidencePrefix)
Sets the field name prefix to use for confidence. This is "confidence_" by default.
- Parameters: confidencePrefix - the field name prefix to use for confidence.
-
isAppendConfidence
public boolean isAppendConfidence()
Returns whether to append confidence information. This is false by default.
- Returns: whether to append confidence information.
-
setAppendConfidence
public void setAppendConfidence(boolean appendConfidence)
Sets whether to append confidence information. This is false by default.
- Parameters: appendConfidence - whether to append confidence information.
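The confidence fields are documented as the probability of the named target value, and the winner as generally the target with the highest record count. The sketch below shows one plausible way these quantities could relate. It assumes that the confidence of a target is its record count divided by the total record count at the matched leaf; that formula is an assumption, not something this page confirms. It also ignores a "score" override. The class LeafScores and its methods are hypothetical and not part of the DataRush API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for illustration; not part of the DataRush API. */
final class LeafScores {

    /** Returns the target with the highest record count (first one wins ties). */
    static String winner(Map<String, Double> recordCounts) {
        String best = null;
        double bestCount = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> entry : recordCounts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }

    /**
     * Assumed formula: confidence = record count / total record count.
     * Returns 0.0 for every target when the total is zero.
     */
    static Map<String, Double> confidences(Map<String, Double> recordCounts) {
        double total = 0.0;
        for (double count : recordCounts.values()) {
            total += count;
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : recordCounts.entrySet()) {
            result.put(entry.getKey(), total == 0.0 ? 0.0 : entry.getValue() / total);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> counts = new LinkedHashMap<>();
        counts.put("yes", 30.0);
        counts.put("no", 10.0);
        System.out.println("winner = " + winner(counts));
        System.out.println("confidences = " + confidences(counts));
    }
}
```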
-
predictedType
protected RecordTokenType predictedType(PMMLModelSpec modelSpec)
Description copied from class: AbstractPredictor
Given the model spec, returns the predicted type. This should not include the input type (the input is automatically prepended to the type that is returned).
- Specified by: predictedType in class AbstractPredictor
- Parameters: modelSpec - the model metadata
- Returns: the predicted type
-
execute
protected void execute(PMMLModel pmml, RecordValued input, ScalarSettable[] predictedFields)
Description copied from class: AbstractPredictor
Called to perform prediction. Subclasses are expected to loop over the input by calling AbstractPredictor.stepNext(). For each row of input, subclasses should first set the predicted values in the predictedFields array and then invoke AbstractPredictor.pushPrediction(). Subclasses should not invoke pushEndOfData, since that is handled automatically by the base class.
- Specified by: execute in class AbstractPredictor
- Parameters:
pmml - The input PMML model
input - The input data
predictedFields - An array of fields that reference the predicted field locations. The array positionally corresponds to the type returned by AbstractPredictor.predictedType(PMMLModelSpec).
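The contract described for execute (step through the input, set the predicted values, push the prediction, and leave end-of-data to the base class) can be sketched with simplified stand-in classes. Nothing below is the DataRush API: SketchPredictor, ThresholdPredictor, currentRow, and run are hypothetical, the boolean return type of stepNext() is an assumption, and the PMML model and typed ports are replaced with plain arrays and strings.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for the base class; NOT the DataRush AbstractPredictor. */
abstract class SketchPredictor {
    private Iterator<double[]> rows;
    private double[] current;
    private String[] predicted;
    private final List<String> pushed = new ArrayList<>();

    /** Advances to the next input row; returns false once the input is exhausted. */
    protected final boolean stepNext() {
        if (!rows.hasNext()) {
            return false;
        }
        current = rows.next();
        return true;
    }

    /** The row most recently selected by stepNext(). */
    protected final double[] currentRow() {
        return current;
    }

    /** Emits the values currently held in the predicted fields. */
    protected final void pushPrediction() {
        pushed.add(String.join(",", predicted));
    }

    /** Subclasses implement the per-row prediction loop here. */
    protected abstract void execute(String[] predictedFields);

    /** Drives execute(); finishing the output is the base class's job. */
    final List<String> run(List<double[]> input, int predictedFieldCount) {
        rows = input.iterator();
        predicted = new String[predictedFieldCount];
        pushed.clear();
        execute(predicted);
        // End of data is handled here, not by the subclass.
        return new ArrayList<>(pushed);
    }
}

/** Toy subclass: predicts "yes" when the first input value exceeds 0.5. */
final class ThresholdPredictor extends SketchPredictor {
    @Override
    protected void execute(String[] predictedFields) {
        while (stepNext()) {
            // First set the predicted value for this row...
            predictedFields[0] = currentRow()[0] > 0.5 ? "yes" : "no";
            // ...then push it downstream.
            pushPrediction();
        }
    }

    public static void main(String[] args) {
        List<double[]> input = List.of(new double[] {0.9}, new double[] {0.1});
        System.out.println(new ThresholdPredictor().run(input, 1));
    }
}
```

The subclass never signals end of data itself; in this sketch that responsibility sits in run, mirroring the statement that the base class handles it automatically.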
-
-