Class ClusterPredictor

All Implemented Interfaces:
LogicalOperator

public final class ClusterPredictor extends AbstractPredictor
Assigns input data to clusters based on the provided PMML Clustering Model. If the model provides explicit cluster IDs, they are used for the assignment. Otherwise, the implicit 1-based index indicating the position at which each cluster appears in the model is used as the ID. The input data must contain the same fields as the training data used to build the model (in the PMML model: clustering fields with the attribute "isCenterField" set to "true"), and these fields must be of type double, float, long, or int. The resulting assignments are included in the output alongside the original input data.
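The ID rule described above can be illustrated with a small, library-independent sketch. It is not the operator's implementation: the Cluster class, the resolveId and assign helpers, the use of squared Euclidean distance, and the representation of IDs as strings are all assumptions made for illustration. The actual comparison measure comes from the PMML model.

```java
import java.util.List;

public class ClusterIdSketch {
    /** Hypothetical stand-in for a PMML cluster: an optional explicit ID plus center coordinates. */
    static final class Cluster {
        final String id;        // null when the model provides no explicit ID
        final double[] center;

        Cluster(String id, double[] center) {
            this.id = id;
            this.center = center;
        }
    }

    /** Returns the explicit ID if present, otherwise the 1-based position in the model. */
    static String resolveId(List<Cluster> clusters, int position) {
        String explicit = clusters.get(position).id;
        return explicit != null ? explicit : String.valueOf(position + 1);
    }

    /** Assigns a row to the closest center; squared Euclidean distance is an assumption. */
    static String assign(List<Cluster> clusters, double[] row) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < clusters.size(); i++) {
            double[] c = clusters.get(i).center;
            double d = 0.0;
            for (int k = 0; k < row.length; k++) {
                double diff = row[k] - c[k];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = i;
            }
        }
        return resolveId(clusters, best);
    }

    public static void main(String[] args) {
        List<Cluster> implicit = List.of(
            new Cluster(null, new double[] {0.0, 0.0}),
            new Cluster(null, new double[] {10.0, 10.0}));
        List<Cluster> explicit = List.of(
            new Cluster("low", new double[] {0.0, 0.0}),
            new Cluster("high", new double[] {10.0, 10.0}));
        System.out.println(assign(implicit, new double[] {9.0, 8.0})); // prints "2"
        System.out.println(assign(explicit, new double[] {1.0, 2.0})); // prints "low"
    }
}
```

In the first call the model has no explicit IDs, so the second cluster is reported by its 1-based position. In the second call the explicit ID is returned unchanged.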
  • Constructor Details

    • ClusterPredictor

      public ClusterPredictor()
  • Method Details

    • getWinnerFieldName

      public String getWinnerFieldName()
      Gets the name of the winner field in the output. This is "winner" by default. For every input row, this field contains the ID of the cluster that the row was assigned to.
      Returns:
      the name of the winner field in the output
    • setWinnerFieldName

      public void setWinnerFieldName(String winnerFieldName)
      Sets the name of the winner field in the output. This is "winner" by default. For every input row, this field contains the ID of the cluster that the row was assigned to.
      Parameters:
      winnerFieldName - the name of the winner field in the output.
    • getOutput

      public RecordPort getOutput()
      Returns a record port consisting of the input plus the assigned cluster IDs. If the model provides explicit cluster IDs, they are used in the output. Otherwise, the implicit 1-based index indicating the position at which each cluster appears in the model is used as the ID. The name of the field containing the resulting assignments is configurable via the property winnerFieldName. The default name is "winner".
      Overrides:
      getOutput in class AbstractPredictor
      Returns:
      a record flow of original values and their cluster assignments.
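The shape of an output row can be pictured with the following sketch, which models a record as an ordered map. The appendWinner helper and the map-based row are hypothetical and do not reflect how RecordPort represents data. The sketch only shows that the input fields are kept and the assignment is added under the configured field name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WinnerFieldSketch {
    /** Returns a copy of the input row with the cluster ID appended under winnerFieldName. */
    static Map<String, Object> appendWinner(Map<String, Object> inputRow,
                                            String winnerFieldName,
                                            Object clusterId) {
        // LinkedHashMap keeps the input fields in their original order.
        Map<String, Object> out = new LinkedHashMap<>(inputRow);
        out.put(winnerFieldName, clusterId);
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("x", 1.5);
        row.put("y", 2.5);
        System.out.println(appendWinner(row, "winner", "2"));  // prints {x=1.5, y=2.5, winner=2}
        System.out.println(appendWinner(row, "cluster", "2")); // prints {x=1.5, y=2.5, cluster=2}
    }
}
```

The second call corresponds to an operator on which setWinnerFieldName("cluster") was called before the graph ran.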
    • predictedType

      protected RecordTokenType predictedType(PMMLModelSpec modelSpec)
      Description copied from class: AbstractPredictor
      Given the model spec, returns the predicted type. This should not include the input type (the input is automatically prepended to the type that is returned).
      Specified by:
      predictedType in class AbstractPredictor
      Parameters:
      modelSpec - the model metadata
      Returns:
      the predicted type
    • execute

      protected void execute(PMMLModel model, RecordValued input, ScalarSettable[] predictedFields)
      Description copied from class: AbstractPredictor
      Called to perform prediction. Subclasses are expected to loop over the input by calling AbstractPredictor.stepNext(). For each row of input, subclasses should first set the predicted values in the predictedFields array and then invoke AbstractPredictor.pushPrediction(). Subclasses should not invoke pushEndOfData since that is automatically handled by the base class.
      Specified by:
      execute in class AbstractPredictor
      Parameters:
      model - The input PMML model
      input - The input data
      predictedFields - An array of fields that reference the predicted field locations. The array positionally corresponds to the type returned by AbstractPredictor.predictedType(PMMLModelSpec).
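The calling contract described for execute can be sketched with stand-in types. Everything below is hypothetical: PredictorBase only mimics the stepNext and pushPrediction calls named above, the boolean return value of stepNext is an assumption, and a String array stands in for ScalarSettable[]. The threshold rule is a placeholder for real model evaluation.

```java
import java.util.ArrayList;
import java.util.List;

public class ExecuteLoopSketch {
    /** Hypothetical stand-in for the base class; it models only the calling contract. */
    abstract static class PredictorBase {
        private final double[][] rows;
        private int cursor = -1;
        final List<String> pushed = new ArrayList<>();
        final String[] predictedFields = new String[1]; // stand-in for ScalarSettable[]

        PredictorBase(double[][] rows) {
            this.rows = rows;
        }

        /** Advances to the next input row; returns false once the input is exhausted (assumed). */
        boolean stepNext() {
            return ++cursor < rows.length;
        }

        double[] currentRow() {
            return rows[cursor];
        }

        /** Emits the values currently held in predictedFields for the current row. */
        void pushPrediction() {
            pushed.add(predictedFields[0]);
        }

        abstract void execute();

        /** Runs the subclass loop; end of data is handled here, not by the subclass. */
        List<String> run() {
            execute();
            return pushed;
        }
    }

    /** Toy subclass: assigns cluster "1" or "2" by thresholding the first value. */
    static final class ThresholdPredictor extends PredictorBase {
        ThresholdPredictor(double[][] rows) {
            super(rows);
        }

        @Override
        void execute() {
            while (stepNext()) {
                // First set the predicted value, then push it.
                predictedFields[0] = currentRow()[0] < 5.0 ? "1" : "2";
                pushPrediction();
            }
        }
    }

    public static void main(String[] args) {
        double[][] rows = {{1.0}, {9.0}, {4.0}};
        System.out.println(new ThresholdPredictor(rows).run()); // prints [1, 2, 1]
    }
}
```

The subclass never signals end of data itself. In this sketch that responsibility sits in run(), mirroring the statement that the base class handles pushEndOfData.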