java.lang.Object
com.pervasive.datarush.operators.AbstractLogicalOperator
com.pervasive.datarush.operators.StreamingOperator
com.pervasive.datarush.operators.ExecutableOperator
com.pervasive.datarush.analytics.util.AbstractPredictor
com.pervasive.datarush.analytics.cluster.ClusterPredictor
- All Implemented Interfaces:
LogicalOperator
Assigns input data to clusters based on the provided PMML Clustering Model.
If the model provides explicit cluster IDs, they are used for the assignment.
Otherwise, the implicit 1-based index indicating the position in which each cluster
appears in the model is used as the ID.
The input data must contain the same fields as the training data that was used to build
the model (in the PMML model: clustering fields with the attribute "isCenterField" set to
"true"), and these fields must be of type double, float, long or int. The resulting
assignments are part of the output alongside the original input data.
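The ID rule described above can be illustrated with a small, library-free sketch. It is not the DataRush implementation: `SketchCluster` and `ClusterAssignmentSketch` are hypothetical names, the distance measure is assumed to be squared Euclidean (a PMML clustering model may specify a different comparison measure), IDs are represented as strings for simplicity, and the per-cluster fallback for a model that mixes explicit and missing IDs is a guess, since the documentation does not describe that case.

```java
import java.util.List;

/** Hypothetical stand-in for one cluster of a PMML clustering model (not a DataRush type). */
final class SketchCluster {
    final String explicitId;   // null when the model gives no explicit ID
    final double[] center;     // values of the center fields

    SketchCluster(String explicitId, double[] center) {
        this.explicitId = explicitId;
        this.center = center;
    }
}

final class ClusterAssignmentSketch {

    /** Returns the explicit ID if present, otherwise the 1-based position in the model. */
    static String clusterId(List<SketchCluster> clusters, int position) {
        String explicit = clusters.get(position).explicitId;
        return explicit != null ? explicit : Integer.toString(position + 1);
    }

    /** Assumption: squared Euclidean distance; ties go to the earlier cluster. */
    static String assign(List<SketchCluster> clusters, double[] row) {
        if (clusters.isEmpty()) {
            throw new IllegalArgumentException("model has no clusters");
        }
        int best = 0;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int i = 0; i < clusters.size(); i++) {
            double[] center = clusters.get(i).center;
            double distance = 0.0;
            for (int j = 0; j < row.length; j++) {
                double diff = row[j] - center[j];
                distance += diff * diff;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = i;
            }
        }
        return clusterId(clusters, best);
    }

    public static void main(String[] args) {
        // No explicit IDs: the second cluster in the model gets the implicit ID "2".
        List<SketchCluster> implicit = List.of(
                new SketchCluster(null, new double[] {0.0, 0.0}),
                new SketchCluster(null, new double[] {10.0, 10.0}));
        System.out.println(assign(implicit, new double[] {9.0, 8.0})); // prints 2

        // Explicit IDs: the model's own ID is reported instead of the position.
        List<SketchCluster> explicit = List.of(
                new SketchCluster("low", new double[] {0.0, 0.0}),
                new SketchCluster("high", new double[] {10.0, 10.0}));
        System.out.println(assign(explicit, new double[] {9.0, 8.0})); // prints high
    }
}
```

In the real operator the resulting ID is written to the winner field of each output row; the sketch only returns it.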
-
Constructor Summary
Constructors
ClusterPredictor()
-
Method Summary
protected void execute(PMMLModel model, RecordValued input, ScalarSettable[] predictedFields)
    Called to perform prediction.
getOutput()
    Returns a record port consisting of the input plus the assigned cluster IDs.
getWinnerFieldName()
    Gets the name of the winner field in the output.
protected RecordTokenType predictedType(PMMLModelSpec modelSpec)
    Given the model spec, returns the predicted type.
void setWinnerFieldName(String winnerFieldName)
    Sets the name of the winner field in the output.
Methods inherited from class com.pervasive.datarush.analytics.util.AbstractPredictor
computeMetadata, execute, getInput, getModel, pushPrediction, stepNext
Methods inherited from class com.pervasive.datarush.operators.ExecutableOperator
cloneForExecution, getNumInputCopies, getPortSettings, handleInactiveOutput
Methods inherited from class com.pervasive.datarush.operators.AbstractLogicalOperator
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
-
Constructor Details
-
ClusterPredictor
public ClusterPredictor()
-
-
Method Details
-
getWinnerFieldName
Gets the name of the winner field in the output. This is "winner" by default. For every input row, this field contains the ID of the cluster that the row was assigned to.
- Returns:
- the name of the winner field in the output
-
setWinnerFieldName
Sets the name of the winner field in the output. This is "winner" by default. For every input row, this field contains the ID of the cluster that the row was assigned to.
- Parameters:
winnerFieldName - the name of the winner field in the output.
-
getOutput
Returns a record port consisting of the input plus the assigned cluster IDs. If the model provides cluster IDs, these explicit IDs are used in the output. Otherwise, the implicit 1-based index indicating the position in which each cluster appears in the model is used as the ID. The name of the field containing the resulting assignments is configurable via the property winnerFieldName. The default name is "winner".
- Overrides:
getOutput in class AbstractPredictor
- Returns:
- a record flow of original values and their cluster assignments.
-
predictedType
Description copied from class: AbstractPredictor
Given the model spec, returns the predicted type. This should not include the input type (the input is automatically prepended to the type that is returned).
- Specified by:
predictedType in class AbstractPredictor
- Parameters:
modelSpec - the model metadata
- Returns:
- the predicted type
-
execute
Description copied from class: AbstractPredictor
Called to perform prediction. Subclasses are expected to loop over the input by calling AbstractPredictor.stepNext(). For each row of input, subclasses should first set the predicted values in the predictedFields array and then invoke AbstractPredictor.pushPrediction(). Subclasses should not invoke pushEndOfData since that is automatically handled by the base class.
- Specified by:
execute in class AbstractPredictor
- Parameters:
model - The input PMML model
input - The input data
predictedFields - An array of fields that reference the predicted field locations. The array positionally corresponds to the type returned by AbstractPredictor.predictedType(PMMLModelSpec).
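The calling pattern described for execute (step, set the predicted fields, push, and leave end-of-data to the base class) can be sketched with a toy base class. Everything here is a stand-in rather than the DataRush API: `ToyPredictor`, `SignPredictor`, `currentRow()` and `run(...)` are hypothetical, the boolean return of `stepNext()` is an assumption, the model argument is omitted, and a `String[]` replaces the real `ScalarSettable[]`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Toy stand-in for the base-class contract; not the DataRush AbstractPredictor. */
abstract class ToyPredictor {
    private Iterator<double[]> rows;
    private double[] current;
    private final String[] predictedFields = new String[1];
    private final List<String> emitted = new ArrayList<>();
    private boolean endOfDataPushed;

    /** Advances to the next input row; returns false once the input is exhausted. */
    protected final boolean stepNext() {
        if (!rows.hasNext()) {
            return false;
        }
        current = rows.next();
        return true;
    }

    /** Gives the subclass read access to the current input row. */
    protected final double[] currentRow() {
        return current;
    }

    /** Emits the current input row followed by the predicted value. */
    protected final void pushPrediction() {
        emitted.add(Arrays.toString(current) + " -> " + predictedFields[0]);
    }

    /** Subclasses loop over the input here and never signal end of data themselves. */
    protected abstract void execute(String[] predictedFields);

    final List<String> run(List<double[]> input) {
        rows = input.iterator();
        execute(predictedFields);
        endOfDataPushed = true; // the base class, not the subclass, ends the output
        return emitted;
    }

    final boolean isEndOfDataPushed() {
        return endOfDataPushed;
    }
}

/** Hypothetical subclass that predicts the sign of the first input field. */
final class SignPredictor extends ToyPredictor {
    @Override
    protected void execute(String[] predictedFields) {
        while (stepNext()) {
            // First set the predicted value, then push the row.
            predictedFields[0] = currentRow()[0] < 0 ? "negative" : "non-negative";
            pushPrediction();
        }
    }

    public static void main(String[] args) {
        SignPredictor predictor = new SignPredictor();
        List<String> out = predictor.run(List.of(
                new double[] {-1.0, 2.0},
                new double[] {3.0, 4.0}));
        out.forEach(System.out::println);
    }
}
```

The point of the split is that the subclass only decides the predicted values; row iteration, output assembly and termination stay in the base class.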
-