Class AbstractModelPort<T>

java.lang.Object
com.pervasive.datarush.ports.LogicalPort
com.pervasive.datarush.ports.model.AbstractModelPort<T>
Type Parameters:
T - the type of model.
Direct Known Subclasses:
DoneSignalPort, PMMLPort, SimpleModelPort

public abstract class AbstractModelPort<T> extends LogicalPort
Common base class for all types of model ports. Clients should generally not need to extend this class; instead, they should use one of the predefined subclasses (either SimpleModelPort or PMMLPort).

The term "model" originated in the analytics package because model ports are most commonly used for prediction models (e.g., PMML). Model ports share the following characteristics:

  • Models are assumed to be "small". By "small" we mean small enough to fit into memory.
  • The framework provides built-in support for building partial models and then reducing to a single model. This happens automatically when a parallel operator outputs a model which feeds into a non-parallel operator; the non-parallel operator must specify a merge-handler. Note that MergeModel is a convenient, re-usable model reducer, parameterized with a merge-handler.
  • The framework provides built-in support for replicating a final model to a parallelized operator, a common scenario for parallelized predictors. This happens automatically whenever a model from a non-parallel operator is fed into a parallel operator. In this case, the model is replicated to all partitions. Note that in this scenario, parallel partitions within the same JVM share a single copy of the model. (This is an optimization to reduce memory consumption.) Thus, care must be taken to ensure that predictors do not mutate the model.
It's important to highlight the asymmetry: on a model scatter, we always duplicate the model; on a model gather, we always merge the model. (This is quite different from RecordPorts, where gather and scatter operators are symmetric.)
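The gather/scatter asymmetry described above can be illustrated with a minimal, framework-free sketch. It does not use the DataRush API: the class ModelGatherScatterSketch and its methods mergeCounts and scatter are hypothetical names introduced only for illustration. Here mergeCounts plays the role a merge handler would play on a gather, and scatter hands the same reference to every partition, which is why the merged model is made read-only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names belong to the DataRush API.
public class ModelGatherScatterSketch {

    /** Gather: reduce per-partition partial models (label counts) to a single model. */
    public static Map<String, Long> mergeCounts(List<Map<String, Long>> partials) {
        Map<String, Long> merged = new TreeMap<>();
        for (Map<String, Long> partial : partials) {
            // Sum the counts of labels that appear in more than one partial model.
            partial.forEach((label, count) -> merged.merge(label, count, Long::sum));
        }
        // Read-only view: consumers that share this model cannot mutate it.
        return Collections.unmodifiableMap(merged);
    }

    /** Scatter: every partition receives the same model reference, not a deep copy. */
    public static <T> List<T> scatter(T model, int partitions) {
        List<T> perPartition = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            perPartition.add(model);
        }
        return perPartition;
    }

    public static void main(String[] args) {
        Map<String, Long> p0 = new TreeMap<>();
        p0.put("yes", 2L);
        p0.put("no", 1L);
        Map<String, Long> p1 = new TreeMap<>();
        p1.put("yes", 3L);
        p1.put("no", 3L);

        Map<String, Long> model = mergeCounts(Arrays.asList(p0, p1));
        List<Map<String, Long>> shared = scatter(model, 4);

        System.out.println("merged model: " + model);
        System.out.println("partitions share one instance: " + (shared.get(0) == shared.get(3)));
    }
}
```

Wrapping the merged model in an unmodifiable view is one way to guard against accidental mutation by a consumer; whether a real model type offers such protection depends on that type.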
  • Constructor Details

    • AbstractModelPort

      protected AbstractModelPort(LogicalOperator owner, String name, LogicalPort.Direction direction, boolean optional, ModelStorageHandler<T> storageHandler)
      Subclasses must invoke this constructor
      Parameters:
      owner - the operator that owns this port
      name - the name of the port
      direction - the direction, input vs. output
      optional - whether the port is optional
      storageHandler - a storage handler responsible for persisting the model
  • Method Details

    • getModelClass

      public final Class<T> getModelClass()
      Returns the Java class of the model.
      Returns:
      the Java class of the model
    • getStorageHandler

      protected final ModelStorageHandler<T> getStorageHandler()
      Returns the storage handler responsible for persisting the model
      Returns:
      the storage handler responsible for persisting the model
    • getMetadata

      public abstract AbstractModelPortMetadata getMetadata(T model)
      Returns the metadata associated with the model. (Given a model, we should always be able to get its metadata.)
      Parameters:
      model - the model object
      Returns:
      the metadata
    • getFactory

      public abstract LogicalPortFactory<? extends AbstractModelPort<?>> getFactory()
      Description copied from class: LogicalPort
      Returns the factory that knows how to create ports of this type
      Specified by:
      getFactory in class LogicalPort
      Returns:
      the factory that knows how to create ports of this type
    • setMergeHandler

      public final void setMergeHandler(MetadataCalculationContext ctx, ModelMergeHandler<T> mergeHandler)
      Sets the merge handler for the input port
      Parameters:
      ctx - the metadata context
      mergeHandler - the merge handler to use
    • getModel

      public final T getModel(ExecutionContext ctx)
      Reads the model from this model input port. This method must be invoked at least once for each model input port; additional invocations return the same model reference as the first.
      Parameters:
      ctx - the execution context
      Returns:
      the model from this model input port
    • setModel

      public final void setModel(ExecutionContext ctx, T model)
      Outputs the model on this model output port. This method must be invoked exactly once for each model output port.
      Parameters:
      ctx - the execution context
      model - the model
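
The calling contract of getModel and setModel (set exactly once on an output port; repeated reads on an input port return the same reference) can be modeled with a small stand-alone sketch. The ModelSlot class below is hypothetical and is not part of the DataRush API. In particular, rejecting a second setModel call with an IllegalStateException is an assumption made for illustration; this page does not state how the framework reacts to a violated contract.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration only; ModelSlot is not a DataRush class.
public class ModelSlotSketch {

    /** Single-assignment holder that mimics the set-once / read-many contract. */
    public static final class ModelSlot<T> {
        private final AtomicReference<T> ref = new AtomicReference<>();

        /** Producer side: succeeds only the first time it is called. */
        public void setModel(T model) {
            if (model == null) {
                throw new NullPointerException("model");
            }
            // Assumption: a second assignment is treated as a programming error.
            if (!ref.compareAndSet(null, model)) {
                throw new IllegalStateException("model already set");
            }
        }

        /** Consumer side: every call returns the same reference. */
        public T getModel() {
            T model = ref.get();
            if (model == null) {
                throw new IllegalStateException("model not set yet");
            }
            return model;
        }
    }

    public static void main(String[] args) {
        ModelSlot<StringBuilder> slot = new ModelSlot<>();
        slot.setModel(new StringBuilder("weights"));

        StringBuilder first = slot.getModel();
        StringBuilder second = slot.getModel();
        System.out.println("same reference on repeated reads: " + (first == second));

        try {
            slot.setModel(new StringBuilder("other"));
        } catch (IllegalStateException e) {
            System.out.println("second set rejected: " + e.getMessage());
        }
    }
}
```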