Class AbstractModelPort<T>

  • Type Parameters:
    T - the type of model.
    Direct Known Subclasses:
    DoneSignalPort, PMMLPort, SimpleModelPort

    public abstract class AbstractModelPort<T>
    extends LogicalPort
    Common base class for all types of model ports. Clients should generally not need to extend this class; rather, they should use one of the predefined subclasses (either SimpleModelPort or PMMLPort).

    The term "model" originated in the analytics package because model ports are most commonly used for prediction models (e.g. PMML). Model ports share the following characteristics:

    • Models are assumed to be "small". By "small" we mean small enough to fit into memory.
    • The framework provides built-in support for building partial models and then reducing to a single model. This happens automatically when a parallel operator outputs a model which feeds into a non-parallel operator; the non-parallel operator must specify a merge-handler. Note that MergeModel is a convenient, re-usable model reducer, parameterized with a merge-handler.
    • The framework provides built-in support for replicating a final model to a parallelized operator, a common scenario for parallelized predictors. This happens automatically whenever a model from a non-parallel operator is fed into a parallel operator. In this case, the model is replicated to all partitions. Note that in this scenario, parallel partitions within the same JVM will share a single copy of the model. (This is an optimization to reduce memory consumption.) Thus, care must be taken to ensure that predictors do not mutate the model.
    It's important to highlight the asymmetry: on a model scatter, we always duplicate the model. On a model gather, we always merge the model. (This is quite different from RecordPorts, where gather and scatter operators are symmetric.)
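    The gather/scatter asymmetry described above can be illustrated with a minimal, framework-independent sketch. Everything in it is hypothetical and exists only for illustration: PartialModelMerger stands in for a merge-handler (it is not the framework's ModelMergeHandler interface), MeanModel is an invented "small" model, and gather/scatter are plain helper methods rather than anything the framework exposes. The model is immutable, so sharing one reference across partitions is safe.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a merge-handler; NOT the framework's ModelMergeHandler. */
interface PartialModelMerger<T> {
    T merge(List<T> partials);
}

/** Hypothetical immutable "small" model: a running count and sum. */
final class MeanModel {
    final long count;
    final double sum;

    MeanModel(long count, double sum) {
        this.count = count;
        this.sum = sum;
    }

    double mean() {
        return count == 0 ? 0.0 : sum / count;
    }
}

final class ModelGatherScatterSketch {
    /** Merges partial means by adding their counts and sums. */
    static final PartialModelMerger<MeanModel> MEAN_MERGER = partials -> {
        long count = 0;
        double sum = 0.0;
        for (MeanModel m : partials) {
            count += m.count;
            sum += m.sum;
        }
        return new MeanModel(count, sum);
    };

    /** Gather: the partial models from all partitions are merged into one model. */
    static <T> T gather(List<T> partials, PartialModelMerger<T> merger) {
        return merger.merge(partials);
    }

    /** Scatter: every partition receives the same model reference (no deep copy). */
    static <T> List<T> scatter(T model, int partitions) {
        return Collections.nCopies(partitions, model);
    }

    public static void main(String[] args) {
        List<MeanModel> partials = new ArrayList<>();
        partials.add(new MeanModel(2, 10.0)); // partial model from partition 0
        partials.add(new MeanModel(3, 20.0)); // partial model from partition 1

        MeanModel merged = gather(partials, MEAN_MERGER);
        List<MeanModel> replicas = scatter(merged, 4);

        System.out.println("mean = " + merged.mean());
        System.out.println("shared = " + (replicas.get(0) == replicas.get(3)));
    }
}
```

    Because scatter hands out one shared reference, a predictor that mutated the model would affect every partition in the same JVM. Keeping the model's fields final, as in this sketch, is one way to rule that out.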
    • Constructor Detail

      • AbstractModelPort

        protected AbstractModelPort​(LogicalOperator owner,
                                    String name,
                                    LogicalPort.Direction direction,
                                    boolean optional,
                                    ModelStorageHandler<T> storageHandler)
        Subclasses must invoke this constructor.
        Parameters:
        owner - the operator that owns this port
        name - the name of the port
        direction - the direction, input vs. output
        optional - whether the port is optional
        storageHandler - a storage handler responsible for persisting the model
    • Method Detail

      • getModelClass

        public final Class<T> getModelClass()
        Returns the Java class of the model.
        Returns:
        the Java class of the model
      • getStorageHandler

        protected final ModelStorageHandler<T> getStorageHandler()
        Returns the storage handler responsible for persisting the model.
        Returns:
        the storage handler responsible for persisting the model
      • getMetadata

        public abstract AbstractModelPortMetadata getMetadata​(T model)
        Returns the metadata associated with the model. (Given a model, we should always be able to get its metadata.)
        Parameters:
        model - the model object
        Returns:
        the metadata
      • setMergeHandler

        public final void setMergeHandler​(MetadataCalculationContext ctx,
                                          ModelMergeHandler<T> mergeHandler)
        Sets the merge handler for the input port.
        Parameters:
        ctx - the metadata context
        mergeHandler - the merge handler to use
      • getModel

        public final T getModel​(ExecutionContext ctx)
        Reads the model from this model input port. This method must be invoked at least once for each model input port (additional invocations will return the same model reference as the first).
        Parameters:
        ctx - the execution context
        Returns:
        the model read from this model input port
      • setModel

        public final void setModel​(ExecutionContext ctx,
                                   T model)
        Outputs the model on this model output port. This method must be invoked exactly once for each model output port.
        Parameters:
        ctx - the execution context
        model - the model
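        The calling contract of getModel and setModel (write exactly once on the output side, read the same reference any number of times on the input side) can be sketched with a small stand-alone class. WriteOnceModelSlot is hypothetical and not part of the framework. Throwing IllegalStateException on a second write, or on a read before any write, is an assumption of this sketch; the documentation above does not say how the framework reacts to a violated contract.

```java
import java.util.Objects;

/** Hypothetical stand-in for one model port's storage; NOT a framework class. */
final class WriteOnceModelSlot<T> {
    private T model;
    private boolean written;

    /** Output side: accepts exactly one model; a second write is rejected (assumption). */
    synchronized void set(T value) {
        if (written) {
            throw new IllegalStateException("model already set on this port");
        }
        model = Objects.requireNonNull(value, "model");
        written = true;
    }

    /** Input side: every call returns the same reference as the first call. */
    synchronized T get() {
        if (!written) {
            throw new IllegalStateException("no model has been set");
        }
        return model;
    }
}

final class ModelPortContractSketch {
    public static void main(String[] args) {
        WriteOnceModelSlot<double[]> slot = new WriteOnceModelSlot<>();

        // Producer: publishes its model exactly once.
        slot.set(new double[] {0.5, 1.5});

        // Consumer: repeated reads yield the identical reference.
        double[] first = slot.get();
        double[] second = slot.get();
        System.out.println("same reference: " + (first == second));

        // A second write violates the contract and is rejected in this sketch.
        try {
            slot.set(new double[] {9.0});
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```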