Module datarush.analytics
Class DecisionTreePredictor
java.lang.Object
com.pervasive.datarush.operators.AbstractLogicalOperator
com.pervasive.datarush.operators.StreamingOperator
com.pervasive.datarush.operators.ExecutableOperator
com.pervasive.datarush.analytics.util.AbstractPredictor
com.pervasive.datarush.analytics.decisiontree.predictor.DecisionTreePredictor
- All Implemented Interfaces:
LogicalOperator
Operator responsible for predicting outcomes based on a Decision Tree PMML model.
Supports most of the PMML tree model functionality, that is, a subset of the PMML specification but a superset of the functionality required for C4.5.
Specifically, supports all required elements/attributes as well as the following optional elements/attributes:
- missingValueStrategy (all strategies supported)
- missingValuePenalty
- noTrueChildStrategy (all strategies supported)
- All predicates: SimplePredicate, CompoundPredicate, SimpleSetPredicate, True, False
- ScoreDistribution
- EmbeddedModel
- Partition
- ModelStats
- ModelExplanation
- Targets
- LocalTransformations
- ModelVerification
- splitCharacteristic
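To illustrate what one of the optional attributes above controls, the following is a minimal sketch of noTrueChildStrategy as the PMML specification describes it: when no child predicate of a node evaluates to true, returnNullPrediction yields no prediction, while returnLastPrediction yields the score of the last node reached. The Node class, the predict method and the sample tree are hypothetical and written only for this illustration; they are not part of DecisionTreePredictor or of the DataRush API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class NoTrueChildSketch {
    // The two strategy values defined by PMML for noTrueChildStrategy.
    enum NoTrueChildStrategy { RETURN_NULL_PREDICTION, RETURN_LAST_PREDICTION }

    // Hypothetical tree node: a score plus a predicate over the input row.
    static final class Node {
        final String score;
        final Predicate<Map<String, Double>> predicate;
        final List<Node> children = new ArrayList<>();

        Node(String score, Predicate<Map<String, Double>> predicate) {
            this.score = score;
            this.predicate = predicate;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Walks from the root to a leaf, following the first child whose predicate is true.
    // The root predicate is assumed to be True and is not evaluated.
    static String predict(Node root, Map<String, Double> row, NoTrueChildStrategy strategy) {
        Node current = root;
        while (!current.children.isEmpty()) {
            Node next = null;
            for (Node child : current.children) {
                if (child.predicate.test(row)) {
                    next = child;
                    break;
                }
            }
            if (next == null) {
                // No child matched: the strategy decides what is returned.
                return strategy == NoTrueChildStrategy.RETURN_LAST_PREDICTION ? current.score : null;
            }
            current = next;
        }
        return current.score;
    }

    // Sample tree: values of x between 10 and 20 match neither child.
    static Node sampleTree() {
        return new Node("no", r -> true)
                .add(new Node("yes", r -> r.get("x") < 10.0))
                .add(new Node("maybe", r -> r.get("x") > 20.0));
    }

    public static void main(String[] args) {
        Map<String, Double> row = Map.of("x", 15.0);
        System.out.println(predict(sampleTree(), row, NoTrueChildStrategy.RETURN_NULL_PREDICTION));
        System.out.println(predict(sampleTree(), row, NoTrueChildStrategy.RETURN_LAST_PREDICTION));
    }
}
```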
-
Constructor Summary
Constructors
- DecisionTreePredictor(): Creates a decision tree predictor with default settings.
Method Summary
- protected void execute(PMMLModel pmml, RecordValued input, ScalarSettable[] predictedFields): Called to perform prediction.
- getConfidencePrefix(): Gets the field name prefix to use for confidence.
- getOutput(): Returns a record port consisting of the original input plus predicted values.
- getRecordCountPrefix(): Gets the field name prefix to use for record counts.
- getWinnerField(): Gets the name of the winner field to output.
- boolean isAppendConfidence(): Returns whether to append confidence information.
- boolean isAppendRecordCount(): Returns whether to append record count information.
- protected RecordTokenType predictedType(PMMLModelSpec modelSpec): Given the model spec, returns the predicted type.
- void setAppendConfidence(boolean appendConfidence): Sets whether to append confidence information.
- void setAppendRecordCount(boolean appendRecordCount): Sets whether to append record count information.
- void setConfidencePrefix(String confidencePrefix): Sets the field name prefix to use for confidence.
- void setRecordCountPrefix(String recordCountPrefix): Sets the field name prefix to use for record counts.
- void setWinnerField(String winnerField): Sets the name of the winner field to output.
Methods inherited from class com.pervasive.datarush.analytics.util.AbstractPredictor
computeMetadata, execute, getInput, getModel, pushPrediction, stepNext
Methods inherited from class com.pervasive.datarush.operators.ExecutableOperator
cloneForExecution, getNumInputCopies, getPortSettings, handleInactiveOutput
Methods inherited from class com.pervasive.datarush.operators.AbstractLogicalOperator
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
-
Constructor Details
-
DecisionTreePredictor
public DecisionTreePredictor()
Creates a decision tree predictor with default settings.
-
-
Method Details
-
getOutput
Returns a record port consisting of the original input plus predicted values. This adds the following additional fields:
- winner: The predicted value. Generally corresponds to the target with the highest record count, although PMML may specify an alternative winner via the "score" attribute. The name "winner" is the default; this is configurable via the property winnerField.
- record_count_targetValue (optional): The record count of the named targetValue. The prefix "record_count_" is the default; this is configurable via the property recordCountPrefix.
- confidence_targetValue (optional): The probability of the named targetValue. The prefix "confidence_" is the default; this is configurable via the property confidencePrefix.
- Overrides:
getOutput in class AbstractPredictor
- Returns:
a record flow of predicted values and their record counts
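The field naming described above can be sketched as follows. This is a standalone illustration, not the operator's implementation: the class and method names are hypothetical, the target values "yes" and "no" are made up, and the ordering of the appended fields is an assumption. Only the default names ("winner", "record_count_", "confidence_") and the fact that the count and confidence fields are optional come from the description.

```java
import java.util.ArrayList;
import java.util.List;

public class OutputFieldNamesSketch {
    // Builds the names of the fields appended to the original input.
    // Assumption: winner first, then record counts, then confidences.
    static List<String> appendedFields(String winnerField,
                                       boolean appendRecordCount, String recordCountPrefix,
                                       boolean appendConfidence, String confidencePrefix,
                                       List<String> targetValues) {
        List<String> fields = new ArrayList<>();
        fields.add(winnerField);
        if (appendRecordCount) {
            for (String target : targetValues) {
                fields.add(recordCountPrefix + target);
            }
        }
        if (appendConfidence) {
            for (String target : targetValues) {
                fields.add(confidencePrefix + target);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Defaults: only the winner field is appended.
        System.out.println(appendedFields("winner", false, "record_count_",
                false, "confidence_", List.of("yes", "no")));
        // Both optional groups enabled.
        System.out.println(appendedFields("winner", true, "record_count_",
                true, "confidence_", List.of("yes", "no")));
    }
}
```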
-
isAppendRecordCount
public boolean isAppendRecordCount()
Returns whether to append record count information. This is false by default.
- Returns:
whether to append record count information.
-
setAppendRecordCount
public void setAppendRecordCount(boolean appendRecordCount)
Sets whether to append record count information. This is false by default.
- Parameters:
appendRecordCount - whether to append record count information.
-
getWinnerField
Gets the name of the winner field to output. This is "winner" by default.
- Returns:
the name of the winner field to output.
-
setWinnerField
Sets the name of the winner field to output. This is "winner" by default.
- Parameters:
winnerField - the name of the winner field to output.
-
getRecordCountPrefix
Gets the field name prefix to use for record counts. This is "record_count_" by default.
- Returns:
the field name prefix to use for record counts.
-
setRecordCountPrefix
Sets the field name prefix to use for record counts. This is "record_count_" by default.
- Parameters:
recordCountPrefix - the field name prefix to use for record counts.
-
getConfidencePrefix
Gets the field name prefix to use for confidence. This is "confidence_" by default.
- Returns:
the field name prefix to use for confidence.
-
setConfidencePrefix
Sets the field name prefix to use for confidence. This is "confidence_" by default.
- Parameters:
confidencePrefix - the field name prefix to use for confidence.
-
isAppendConfidence
public boolean isAppendConfidence()
Returns whether to append confidence information. This is false by default.
- Returns:
whether to append confidence information.
-
setAppendConfidence
public void setAppendConfidence(boolean appendConfidence)
Sets whether to append confidence information. This is false by default.
- Parameters:
appendConfidence - whether to append confidence information.
-
predictedType
Description copied from class: AbstractPredictor
Given the model spec, returns the predicted type. This should not include the input type (the input is automatically prepended to the type that is returned).
- Specified by:
predictedType in class AbstractPredictor
- Parameters:
modelSpec - the model metadata
- Returns:
the predicted type
-
execute
Description copied from class: AbstractPredictor
Called to perform prediction. Subclasses are expected to loop over the input by calling AbstractPredictor.stepNext(). For each row of input, subclasses should first set the predicted values in the predictedFields array and then invoke AbstractPredictor.pushPrediction(). Subclasses should not invoke pushEndOfData since that is automatically handled by the base class.
- Specified by:
execute in class AbstractPredictor
- Parameters:
pmml - The input PMML model
input - The input data
predictedFields - An array of fields that reference the predicted field locations. The array positionally corresponds to the type returned by AbstractPredictor.predictedType(PMMLModelSpec).
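The loop contract described above can be sketched with stand-in classes. Everything here is hypothetical: SketchPredictor is not the DataRush AbstractPredictor, plain arrays replace PMMLModel, RecordValued and ScalarSettable, and the threshold rule is an invented placeholder for a real model. The sketch only shows the shape of the contract: step to the next row, set the predicted values, push the prediction, and leave end-of-data handling to the base class.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ExecuteLoopSketch {
    // Hypothetical stand-in for the base class; mirrors only the method names in the description.
    abstract static class SketchPredictor {
        private Iterator<double[]> rows;
        private String[] predicted;
        private final List<String> output = new ArrayList<>();
        protected double[] currentRow;
        boolean endOfDataPushed = false;

        // Advances to the next input row; returns false when the input is exhausted.
        protected boolean stepNext() {
            if (!rows.hasNext()) {
                return false;
            }
            currentRow = rows.next();
            return true;
        }

        // Emits the values currently held in the predicted fields.
        protected void pushPrediction() {
            output.add(String.join("|", predicted));
        }

        // Subclasses implement the per-row prediction loop here.
        protected abstract void execute(String[] predictedFields);

        // The base class drives execution and signals end of data itself.
        final List<String> run(List<double[]> input, int predictedFieldCount) {
            rows = input.iterator();
            predicted = new String[predictedFieldCount];
            execute(predicted);
            endOfDataPushed = true;
            return output;
        }
    }

    // Placeholder model: predicts "yes" when the first input value is at least 0.5.
    static final class ThresholdPredictor extends SketchPredictor {
        @Override
        protected void execute(String[] predictedFields) {
            while (stepNext()) {
                predictedFields[0] = currentRow[0] >= 0.5 ? "yes" : "no";
                pushPrediction();
            }
            // No end-of-data call here: the base class handles it.
        }
    }

    public static void main(String[] args) {
        List<double[]> input = List.of(new double[] {0.2}, new double[] {0.9});
        System.out.println(new ThresholdPredictor().run(input, 1));
    }
}
```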
-