Class NormalizeValues

All Implemented Interfaces:
LogicalOperator, PipelineOperator<RecordPort>, RecordPipelineOperator

public final class NormalizeValues extends AbstractRecordCompositeOperator
Applies normalization methods to fields of an input data flow. The results of the normalization methods are available in the output flow. By default, the output contains all input fields plus the calculated normalizations. Whether the input fields are included in the output can be controlled with setIncludeInputFields(boolean).

Normalization methods require certain statistics about the input data, such as the mean, standard deviation, minimum value, and maximum value. These statistics are captured in a PMMLModel. The statistics can be gathered by an upstream operator such as SummaryStatistics and passed into this operator. If no statistics model is supplied, the statistics are calculated in a first pass over the data and then applied in a second pass.
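The two-pass behavior described above can be illustrated with a small, self-contained sketch. It does not use the DataFlow API: the class TwoPassNormalizeSketch and its methods are hypothetical names, and z-score and min-max are assumed here as two common normalization formulas, since this page does not list the values of StatsFunctions.NormalizeMethod. The population form of the standard deviation is also an assumption.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names belong to the DataFlow API.
final class TwoPassNormalizeSketch {

    /** Pass 1: scan the data once and return {mean, stdDev, min, max}. */
    static double[] gatherStats(double[] values) {
        double sum = 0.0;
        double sumSq = 0.0;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            sum += v;
            sumSq += v * v;
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double mean = sum / values.length;
        // Population variance (an assumption; a library may use the sample form).
        double variance = Math.max(0.0, sumSq / values.length - mean * mean);
        return new double[] {mean, Math.sqrt(variance), min, max};
    }

    /** Pass 2, z-score: (x - mean) / stdDev. Yields NaN or infinity if stdDev is 0. */
    static double[] zScore(double[] values, double[] stats) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (values[i] - stats[0]) / stats[1];
        }
        return out;
    }

    /** Pass 2, min-max: (x - min) / (max - min). Yields NaN if max equals min. */
    static double[] minMax(double[] values, double[] stats) {
        double range = stats[3] - stats[2];
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (values[i] - stats[2]) / range;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] data = {2, 4, 4, 4, 5, 5, 7, 9};
        // When statistics arrive from upstream, this first pass would be skipped.
        double[] stats = gatherStats(data);
        System.out.println(Arrays.toString(zScore(data, stats)));
        System.out.println(Arrays.toString(minMax(data, stats)));
    }
}
```

Passing a precomputed stats array straight to zScore or minMax corresponds to the case where an upstream operator supplies the statistics model, so only the second pass runs.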

  • Constructor Details

    • NormalizeValues

      public NormalizeValues()
  • Method Details

    • getModelInput

      public PMMLPort getModelInput()
      Get the optional input port used to read the PMML model containing field statistics needed by normalization methods.

      This port is optional. If a statistics model is not provided, the needed statistics will be calculated using SummaryStatistics.

      Returns:
      input port for the statistics model
    • getInput

      public RecordPort getInput()
      Description copied from interface: PipelineOperator
      Returns the input port
      Specified by:
      getInput in interface PipelineOperator<RecordPort>
      Overrides:
      getInput in class AbstractRecordCompositeOperator
      Returns:
      the input port
    • getOutput

      public RecordPort getOutput()
      Description copied from interface: PipelineOperator
      Returns the output port
      Specified by:
      getOutput in interface PipelineOperator<RecordPort>
      Overrides:
      getOutput in class AbstractRecordCompositeOperator
      Returns:
      the output port
    • setScoreFields

      public void setScoreFields(List<String> scoreFields)
      Set the names of the input fields to normalize. If no field names are provided, all fields will be normalized by default.
      Parameters:
      scoreFields - names of fields to normalize
    • setScoreFields

      public void setScoreFields(String... scoreFields)
      Set the names of the input fields to normalize. If no field names are provided, all fields will be normalized by default.
      Parameters:
      scoreFields - names of fields to normalize
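The default described for setScoreFields can be sketched as a small field-resolution step. This is a hypothetical, self-contained illustration rather than the operator's actual code: ScoreFieldSelectionSketch and resolveScoreFields are invented names, and rejecting a name that is missing from the input is an assumption, since this page does not say how unknown field names are handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the DataFlow API.
final class ScoreFieldSelectionSketch {

    /**
     * Returns the fields to normalize: the configured score fields,
     * or every input field when none were configured.
     */
    static List<String> resolveScoreFields(List<String> inputFields, List<String> scoreFields) {
        if (scoreFields == null || scoreFields.isEmpty()) {
            // No explicit selection: fall back to all input fields.
            return new ArrayList<>(inputFields);
        }
        for (String name : scoreFields) {
            // Assumed behavior: an unknown field name is treated as a configuration error.
            if (!inputFields.contains(name)) {
                throw new IllegalArgumentException("Unknown field: " + name);
            }
        }
        return new ArrayList<>(scoreFields);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("age", "income", "height");
        System.out.println(resolveScoreFields(input, new ArrayList<>()));
        System.out.println(resolveScoreFields(input, Arrays.asList("income")));
    }
}
```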
    • getScoreFields

      public List<String> getScoreFields()
      Get the names of fields configured to be normalized.
      Returns:
      names of fields to normalize
    • setMethod

      public void setMethod(StatsFunctions.NormalizeMethod method)
      Set the normalization method to use.
      Parameters:
      method - normalization method
    • getMethod

      public StatsFunctions.NormalizeMethod getMethod()
      Get the normalization method configured.
      Returns:
      normalization method
    • getIncludeInputFields

      public boolean getIncludeInputFields()
      Get the property that specifies whether or not to include the input fields in the output data.
      Returns:
      true if input fields are included; false if excluded
    • setIncludeInputFields

      public void setIncludeInputFields(boolean includeInputFields)
      Set whether or not to include the input fields in the output data. Setting this property to true causes the input values to be transferred to the output. Otherwise, the input values are excluded, leaving only the transformed fields in the output data.

      This value is true by default.

      Parameters:
      includeInputFields - true if input fields are included; false if excluded
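The effect of this property on the shape of the output record can be sketched as follows. The example is hypothetical and self-contained: OutputLayoutSketch is an invented name, and the "_norm" suffix for the normalized fields is purely an assumption, because this page does not state how the transformed fields are named.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the DataFlow API.
final class OutputLayoutSketch {

    /** Builds the list of output field names for a given includeInputFields setting. */
    static List<String> outputFields(List<String> inputFields,
                                     List<String> normalizedFields,
                                     boolean includeInputFields) {
        List<String> out = new ArrayList<>();
        if (includeInputFields) {
            // true: the original input fields are carried through to the output.
            out.addAll(inputFields);
        }
        for (String name : normalizedFields) {
            // Assumed naming scheme for the transformed fields.
            out.add(name + "_norm");
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("age", "income");
        List<String> normalized = Arrays.asList("income");
        System.out.println(outputFields(input, normalized, true));
        System.out.println(outputFields(input, normalized, false));
    }
}
```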
    • compose

      protected void compose(CompositionContext ctx)
      Description copied from class: CompositeOperator
      Compose the body of this operator. Implementations should do the following:
      1. Perform any validation of configuration, input types, etc.
      2. Instantiate and configure sub-operators, adding them to the provided context via the method OperatorComposable.add(O)
      3. Create necessary connections via the method OperatorComposable.connect(P, P). This includes connections from the composite's input ports to sub-operators, connections between sub-operators, and connections from sub-operators' output ports to the composite's output ports
      Specified by:
      compose in class CompositeOperator
      Parameters:
      ctx - the context