Class NormalizeValues

  • All Implemented Interfaces:
    LogicalOperator, PipelineOperator<RecordPort>, RecordPipelineOperator

    public final class NormalizeValues
    extends AbstractRecordCompositeOperator
    Applies normalization methods to fields within an input data flow. The results of the normalization methods are available in the output flow. By default, all input fields are present in the output along with the calculated normalizations. Whether the input fields are included in the output can be controlled with setIncludeInputFields(boolean).

    Normalization methods require certain statistics about the input data, such as the mean, standard deviation, minimum value, and maximum value. These statistics are captured in a PMMLModel. The statistics can be gathered by an upstream operator such as SummaryStatistics and passed into this operator. If no model is passed in, the statistics are calculated in a first pass over the data and then applied in a second pass.
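
    The two-pass behavior described above can be sketched in plain Java without the library. This is a minimal illustration, not the operator's implementation: FieldStats is a hypothetical stand-in for the PMML model, z-score and min-max scaling are assumed here as two common normalization methods (they are not taken from StatsFunctions.NormalizeMethod), and the standard deviation is assumed to be the population form.

```java
import java.util.Arrays;

public class TwoPassNormalizeSketch {

    /** Hypothetical holder for per-field statistics; stands in for the PMML model. */
    static final class FieldStats {
        final double mean;
        final double stdDev;
        final double min;
        final double max;

        FieldStats(double mean, double stdDev, double min, double max) {
            this.mean = mean;
            this.stdDev = stdDev;
            this.min = min;
            this.max = max;
        }
    }

    /** First pass: scan the values once and collect the statistics. */
    static FieldStats gatherStats(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values to summarize");
        }
        double sum = 0.0;
        double sumSq = 0.0;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            sum += v;
            sumSq += v * v;
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double mean = sum / values.length;
        // Population variance (assumption); clamp tiny negative rounding errors to zero.
        double variance = Math.max(0.0, sumSq / values.length - mean * mean);
        return new FieldStats(mean, Math.sqrt(variance), min, max);
    }

    /** Second pass, z-score: (value - mean) / stdDev. */
    static double[] zScore(double[] values, FieldStats stats) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            // A constant field has stdDev 0; emit 0.0 rather than divide by zero.
            out[i] = stats.stdDev == 0.0 ? 0.0 : (values[i] - stats.mean) / stats.stdDev;
        }
        return out;
    }

    /** Second pass, min-max scaling: (value - min) / (max - min). */
    static double[] minMax(double[] values, FieldStats stats) {
        double range = stats.max - stats.min;
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            // A constant field has range 0; emit 0.0 rather than divide by zero.
            out[i] = range == 0.0 ? 0.0 : (values[i] - stats.min) / range;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] values = {2, 4, 4, 4, 5, 5, 7, 9};
        FieldStats stats = gatherStats(values);          // pass 1
        System.out.println(Arrays.toString(zScore(values, stats)));  // pass 2
        System.out.println(Arrays.toString(minMax(values, stats)));
    }
}
```

    When the statistics arrive from an upstream operator, only the second pass is needed, which is why supplying a model can avoid reading the data twice.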

    See Also:
    SummaryStatistics
    • Constructor Detail

      • NormalizeValues

        public NormalizeValues()
    • Method Detail

      • getModelInput

        public PMMLPort getModelInput()
        Get the optional input port used to read the PMML model containing field statistics needed by normalization methods.

        This port is optional. If a statistics model is not provided, the needed statistics will be calculated using SummaryStatistics.

        Returns:
        input port for the statistics model
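
        The optional nature of this port amounts to a "use the supplied model, otherwise compute one" rule. The following plain-Java sketch shows that rule only; Stats, resolve, and compute are hypothetical names, and the real operator works with PMML ports rather than java.util.Optional.

```java
import java.util.Optional;

public class OptionalStatsSketch {

    /** Hypothetical statistics holder; stands in for the PMML model. */
    static final class Stats {
        final double mean;
        final double stdDev;

        Stats(double mean, double stdDev) {
            this.mean = mean;
            this.stdDev = stdDev;
        }
    }

    /** Use the supplied statistics when present; otherwise compute them from the data. */
    static Stats resolve(Optional<Stats> supplied, double[] values) {
        return supplied.orElseGet(() -> compute(values));
    }

    /** Fallback: compute the mean and the population standard deviation (assumption). */
    static Stats compute(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values to summarize");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / values.length;
        double sumSqDiff = 0.0;
        for (double v : values) {
            sumSqDiff += (v - mean) * (v - mean);
        }
        return new Stats(mean, Math.sqrt(sumSqDiff / values.length));
    }

    public static void main(String[] args) {
        double[] values = {1, 3};
        Stats computed = resolve(Optional.empty(), values);
        Stats supplied = resolve(Optional.of(new Stats(10.0, 2.0)), values);
        System.out.println(computed.mean + " " + supplied.mean);
    }
}
```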
      • setScoreFields

        public void setScoreFields(List<String> scoreFields)
        Set the names of the input fields to normalize. If no field names are provided, all fields will be transformed by default.
        Parameters:
        scoreFields - names of fields to normalize
      • setScoreFields

        public void setScoreFields(String... scoreFields)
        Set the names of the input fields to normalize. If no field names are provided, all fields will be transformed by default.
        Parameters:
        scoreFields - names of fields to normalize
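
        The default described above (no names configured means every field is transformed) can be sketched as a small selection routine. This is an illustration in plain Java, not the operator's code: resolveScoreFields is a hypothetical helper, and rejecting names that are missing from the input is an assumption about validation that the documentation does not state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ScoreFieldSelectionSketch {

    /**
     * Decide which fields to normalize.
     * An empty or null configuration selects every input field.
     */
    static List<String> resolveScoreFields(List<String> configured, List<String> inputFields) {
        if (configured == null || configured.isEmpty()) {
            return new ArrayList<>(inputFields);
        }
        for (String name : configured) {
            // Assumption: an unknown field name is treated as a configuration error.
            if (!inputFields.contains(name)) {
                throw new IllegalArgumentException("unknown field: " + name);
            }
        }
        return new ArrayList<>(configured);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("age", "income", "height");
        System.out.println(resolveScoreFields(Collections.<String>emptyList(), input));
        System.out.println(resolveScoreFields(Arrays.asList("income"), input));
    }
}
```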
      • getScoreFields

        public List<String> getScoreFields()
        Get the names of fields configured to be normalized.
        Returns:
        names of fields to normalize
      • setMethod

        public void setMethod(StatsFunctions.NormalizeMethod method)
        Set the normalization method to use.
        Parameters:
        method - normalization method
      • getIncludeInputFields

        public boolean getIncludeInputFields()
        Get the property that specifies whether or not to include the input fields in the output data.
        Returns:
        true if input fields are included; false if excluded
      • setIncludeInputFields

        public void setIncludeInputFields(boolean includeInputFields)
        Set whether to include the input fields in the output data. Setting this property to true causes the input values to be transferred to the output. Otherwise, the input values are excluded, leaving only the transformed fields in the output data.

        This value is true by default.

        Parameters:
        includeInputFields - true if input fields are included; false if excluded
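
        The effect of this property on the shape of an output record can be sketched as follows. This is plain Java for illustration only: buildOutput is a hypothetical helper, records are modeled as ordered maps, and the "_norm" suffix on output field names is an assumption, since the naming of the calculated fields is not specified here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IncludeInputFieldsSketch {

    /**
     * Build one output record from an input record and its normalized values.
     * Input fields are copied through only when includeInputFields is true.
     */
    static Map<String, Double> buildOutput(Map<String, Double> input,
                                           Map<String, Double> normalized,
                                           boolean includeInputFields) {
        Map<String, Double> output = new LinkedHashMap<>();
        if (includeInputFields) {
            output.putAll(input);
        }
        for (Map.Entry<String, Double> entry : normalized.entrySet()) {
            // Hypothetical naming scheme for the calculated fields.
            output.put(entry.getKey() + "_norm", entry.getValue());
        }
        return output;
    }

    public static void main(String[] args) {
        Map<String, Double> input = new LinkedHashMap<>();
        input.put("age", 30.0);
        Map<String, Double> normalized = new LinkedHashMap<>();
        normalized.put("age", 0.5);

        System.out.println(buildOutput(input, normalized, true).keySet());
        System.out.println(buildOutput(input, normalized, false).keySet());
    }
}
```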
      • compose

        protected void compose(CompositionContext ctx)
        Description copied from class: CompositeOperator
        Compose the body of this operator. Implementations should do the following:
        1. Perform any validation of configuration, input types, etc.
        2. Instantiate and configure sub-operators, adding them to the provided context via the method OperatorComposable.add(O)
        3. Create necessary connections via the method OperatorComposable.connect(P, P). This includes connections from the composite's input ports to sub-operators, connections between sub-operators, and connections from sub-operators' output ports to the composite's output ports
        Specified by:
        compose in class CompositeOperator
        Parameters:
        ctx - the context
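
        The three steps above (validate, add sub-operators, connect ports) can be sketched with a toy context that only records what it is told. Everything in this example is hypothetical: ToyContext is not CompositionContext, operators and ports are plain strings, and the wiring shown (a statistics sub-operator feeding a normalizing sub-operator when no model is connected) is an assumption based on the class description, not the actual body of this method.

```java
import java.util.ArrayList;
import java.util.List;

public class ComposeSketch {

    /** Hypothetical stand-in for a composition context; it records additions and connections. */
    static final class ToyContext {
        final List<String> operators = new ArrayList<>();
        final List<String> connections = new ArrayList<>();

        String add(String operatorName) {
            operators.add(operatorName);
            return operatorName;
        }

        void connect(String fromPort, String toPort) {
            connections.add(fromPort + " -> " + toPort);
        }
    }

    static void compose(ToyContext ctx, List<String> scoreFields, boolean modelConnected) {
        // 1. Validate configuration.
        if (scoreFields == null) {
            throw new IllegalArgumentException("scoreFields must not be null");
        }

        // 2. Instantiate sub-operators and add them to the context.
        String normalize = ctx.add("normalize");

        // 3. Connect composite inputs, sub-operators, and composite outputs.
        ctx.connect("composite.input", normalize + ".input");
        if (modelConnected) {
            ctx.connect("composite.modelInput", normalize + ".model");
        } else {
            // No model supplied: compute the statistics inside the composite.
            String stats = ctx.add("stats");
            ctx.connect("composite.input", stats + ".input");
            ctx.connect(stats + ".output", normalize + ".model");
        }
        ctx.connect(normalize + ".output", "composite.output");
    }

    public static void main(String[] args) {
        ToyContext ctx = new ToyContext();
        compose(ctx, new ArrayList<String>(), false);
        System.out.println(ctx.operators);
        System.out.println(ctx.connections);
    }
}
```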