- java.lang.Object
  - com.pervasive.datarush.operators.AbstractLogicalOperator
    - com.pervasive.datarush.operators.CompositeOperator
      - com.pervasive.datarush.operators.AbstractRecordCompositeOperator
        - com.pervasive.datarush.analytics.stats.NormalizeValues
-
- All Implemented Interfaces:
LogicalOperator, PipelineOperator<RecordPort>, RecordPipelineOperator
public final class NormalizeValues extends AbstractRecordCompositeOperator
Applies normalization methods to fields within an input data flow. The results of the normalization are available in the output flow. By default, all input fields are present in the output along with the calculated normalizations. Inclusion of the input fields in the output can be controlled with setIncludeInputFields(boolean).
Normalization methods require certain statistics about the input data, such as the mean, standard deviation, minimum value, and maximum value. These statistics are captured in a PMMLModel. They can be gathered by an upstream operator such as SummaryStatistics and passed into this operator. If no model is supplied, the statistics are calculated in a first pass over the data and then applied in a second pass.
- See Also:
SummaryStatistics
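The two-pass behaviour described above can be illustrated in plain Java. This is a minimal sketch, not the operator's implementation: the class `TwoPassNormalization`, the `FieldStats` holder, and the `zScore` and `minMax` helpers are hypothetical names, and the use of a population standard deviation is an assumption. The methods actually offered by StatsFunctions.NormalizeMethod are not listed on this page.

```java
import java.util.Arrays;

public class TwoPassNormalization {

    /** Per-field statistics; a hypothetical stand-in for what the PMML model carries. */
    static final class FieldStats {
        final double mean;
        final double stdDev;
        final double min;
        final double max;

        FieldStats(double mean, double stdDev, double min, double max) {
            this.mean = mean;
            this.stdDev = stdDev;
            this.min = min;
            this.max = max;
        }
    }

    /** First pass: gather the statistics for one non-empty field. */
    static FieldStats gather(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        double sum = 0.0;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            sum += v;
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double mean = sum / values.length;
        double squaredDiffs = 0.0;
        for (double v : values) {
            squaredDiffs += (v - mean) * (v - mean);
        }
        // Assumption: population standard deviation (divide by n).
        double stdDev = Math.sqrt(squaredDiffs / values.length);
        return new FieldStats(mean, stdDev, min, max);
    }

    /** Second pass, z-score variant: (value - mean) / stdDev, or 0 for a constant field. */
    static double[] zScore(double[] values, FieldStats stats) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = stats.stdDev == 0.0 ? 0.0 : (values[i] - stats.mean) / stats.stdDev;
        }
        return out;
    }

    /** Second pass, min-max variant: (value - min) / (max - min), or 0 for a constant field. */
    static double[] minMax(double[] values, FieldStats stats) {
        double range = stats.max - stats.min;
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = range == 0.0 ? 0.0 : (values[i] - stats.min) / range;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] data = {2, 4, 4, 4, 5, 5, 7, 9};
        FieldStats stats = gather(data);  // skipped when statistics arrive from upstream
        // prints [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
        System.out.println(Arrays.toString(zScore(data, stats)));
        System.out.println(Arrays.toString(minMax(data, stats)));
    }
}
```

When a statistics model is supplied on the optional model input, the equivalent of `gather` is not needed and only the second pass runs.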
-
-
Field Summary
-
Fields inherited from class com.pervasive.datarush.operators.AbstractRecordCompositeOperator
input, output
-
-
Constructor Summary
Constructor          Description
NormalizeValues()
-
Method Summary
Modifier and Type, Method, Description
protected void compose(CompositionContext ctx)
    Compose the body of this operator.
boolean getIncludeInputFields()
    Get the property that specifies whether or not to include the input fields in the output data.
RecordPort getInput()
    Returns the input port
StatsFunctions.NormalizeMethod getMethod()
    Get the normalization method configured.
PMMLPort getModelInput()
    Get the optional input port used to read the PMML model containing field statistics needed by normalization methods.
RecordPort getOutput()
    Returns the output port
List<String> getScoreFields()
    Get the names of fields configured to be normalized.
void setIncludeInputFields(boolean includeInputFields)
    Set the indicator of whether or not to include the input fields in the output data.
void setMethod(StatsFunctions.NormalizeMethod method)
    Set the normalization method to use.
void setScoreFields(String... scoreFields)
    Set the names of the input fields to normalize.
void setScoreFields(List<String> scoreFields)
    Set the names of the input fields to normalize.
-
Methods inherited from class com.pervasive.datarush.operators.AbstractLogicalOperator
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface com.pervasive.datarush.operators.LogicalOperator
disableParallelism, getInputPorts, getOutputPorts
-
-
-
-
Method Detail
-
getModelInput
public PMMLPort getModelInput()
Get the optional input port used to read the PMML model containing field statistics needed by normalization methods.
This port is optional. If a statistics model is not provided, the needed statistics will be calculated using SummaryStatistics.
- Returns:
- input port for the statistics model
-
getInput
public RecordPort getInput()
Description copied from interface: PipelineOperator
Returns the input port
- Specified by:
getInput in interface PipelineOperator<RecordPort>
- Overrides:
getInput in class AbstractRecordCompositeOperator
- Returns:
- the input port
-
getOutput
public RecordPort getOutput()
Description copied from interface: PipelineOperator
Returns the output port
- Specified by:
getOutput in interface PipelineOperator<RecordPort>
- Overrides:
getOutput in class AbstractRecordCompositeOperator
- Returns:
- the output port
-
setScoreFields
public void setScoreFields(List<String> scoreFields)
Set the names of the input fields to normalize. If no field names are provided, all fields will be transformed by default.
- Parameters:
scoreFields - names of fields to normalize
-
setScoreFields
public void setScoreFields(String... scoreFields)
Set the names of the input fields to normalize. If no field names are provided, all fields will be transformed by default.
- Parameters:
scoreFields - names of fields to normalize
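The default described for both setScoreFields overloads (an empty selection means every field) can be sketched as follows. This is an illustration only: `ScoreFieldSelection` and `resolve` are hypothetical names, and rejecting an unknown field name with an exception is an assumption about behaviour that this page does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ScoreFieldSelection {

    /** List overload: an empty or null selection falls back to all input fields. */
    static List<String> resolve(List<String> inputFields, List<String> scoreFields) {
        if (scoreFields == null || scoreFields.isEmpty()) {
            return new ArrayList<>(inputFields);
        }
        for (String name : scoreFields) {
            // Assumption: a name that is not an input field is treated as an error.
            if (!inputFields.contains(name)) {
                throw new IllegalArgumentException("Unknown field: " + name);
            }
        }
        return new ArrayList<>(scoreFields);
    }

    /** Varargs overload: delegates to the list overload, mirroring the two setters. */
    static List<String> resolve(List<String> inputFields, String... scoreFields) {
        return resolve(inputFields, Arrays.asList(scoreFields));
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("age", "income", "zip");
        System.out.println(resolve(input));                   // prints [age, income, zip]
        System.out.println(resolve(input, "age", "income"));  // prints [age, income]
    }
}
```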
-
getScoreFields
public List<String> getScoreFields()
Get the names of fields configured to be normalized.
- Returns:
- names of fields to normalize
-
setMethod
public void setMethod(StatsFunctions.NormalizeMethod method)
Set the normalization method to use.
- Parameters:
method - normalization method
-
getMethod
public StatsFunctions.NormalizeMethod getMethod()
Get the normalization method configured.
- Returns:
- normalization method
-
getIncludeInputFields
public boolean getIncludeInputFields()
Get the property that specifies whether or not to include the input fields in the output data.
- Returns:
- true if input fields are included; false if excluded
-
setIncludeInputFields
public void setIncludeInputFields(boolean includeInputFields)
Set the indicator of whether or not to include the input fields in the output data. Setting this property to true causes the input values to be transferred to the output. Otherwise, the input values are excluded, leaving only the transformed fields in the output data.
This value is true by default.
- Parameters:
includeInputFields - true if input fields are included; false if excluded
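The effect of this flag on a single output row can be sketched in plain Java. The class `IncludeInputFieldsSketch`, the `buildOutputRow` helper, and the `age_norm` output field name are all hypothetical; this page does not state how the operator names its normalized fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IncludeInputFieldsSketch {

    /** Builds one output row from the input values and the calculated normalizations. */
    static Map<String, Double> buildOutputRow(Map<String, Double> inputRow,
                                              Map<String, Double> normalized,
                                              boolean includeInputFields) {
        Map<String, Double> out = new LinkedHashMap<>();
        if (includeInputFields) {
            out.putAll(inputRow);   // input values are transferred to the output
        }
        out.putAll(normalized);     // transformed fields are always present
        return out;
    }

    public static void main(String[] args) {
        Map<String, Double> inputRow = new LinkedHashMap<>();
        inputRow.put("age", 40.0);
        Map<String, Double> normalized = new LinkedHashMap<>();
        normalized.put("age_norm", 0.5);  // hypothetical output field name

        System.out.println(buildOutputRow(inputRow, normalized, true));   // prints {age=40.0, age_norm=0.5}
        System.out.println(buildOutputRow(inputRow, normalized, false));  // prints {age_norm=0.5}
    }
}
```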
-
compose
protected void compose(CompositionContext ctx)
Description copied from class: CompositeOperator
Compose the body of this operator. Implementations should do the following:
- Perform any validation of configuration, input types, etc.
- Instantiate and configure sub-operators, adding them to the provided context via the method OperatorComposable.add(O)
- Create necessary connections via the method OperatorComposable.connect(P, P). This includes connections from the composite's input ports to sub-operators, connections between sub-operators, and connections from sub-operators' output ports to the composite's output ports
- Specified by:
compose in class CompositeOperator
- Parameters:
ctx - the context
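The three steps listed for compose (validate, add sub-operators, connect ports) can be sketched with toy types. None of the classes below are DataRush types: `ToyPort`, `ToyOperator`, `ToyContext`, and `ToyComposite` are hypothetical stand-ins that only mimic the add and connect pattern, and the "stats" and "apply" sub-operators are an assumed decomposition, not the operator's actual body.

```java
import java.util.ArrayList;
import java.util.List;

public class ComposeSketch {

    /** Hypothetical stand-in for a port. */
    static final class ToyPort {
        final String name;
        ToyPort(String name) { this.name = name; }
    }

    /** Hypothetical sub-operator with one input and one output port. */
    static final class ToyOperator {
        final ToyPort in;
        final ToyPort out;
        ToyOperator(String name) {
            this.in = new ToyPort(name + ".in");
            this.out = new ToyPort(name + ".out");
        }
    }

    /** Hypothetical context that records added operators and connections. */
    static final class ToyContext {
        final List<ToyOperator> operators = new ArrayList<>();
        final List<String> connections = new ArrayList<>();

        ToyOperator add(ToyOperator op) {
            operators.add(op);
            return op;
        }

        void connect(ToyPort from, ToyPort to) {
            connections.add(from.name + " -> " + to.name);
        }
    }

    /** Hypothetical composite following the validate, add, connect order. */
    static final class ToyComposite {
        final ToyPort input = new ToyPort("composite.in");
        final ToyPort output = new ToyPort("composite.out");
        String method;  // configuration checked during composition

        void compose(ToyContext ctx) {
            // Step 1: validate the configuration.
            if (method == null) {
                throw new IllegalStateException("method must be set");
            }
            // Step 2: instantiate sub-operators and add them to the context.
            ToyOperator stats = ctx.add(new ToyOperator("stats"));
            ToyOperator apply = ctx.add(new ToyOperator("apply"));
            // Step 3: connect composite input -> sub-operators -> composite output.
            ctx.connect(input, stats.in);
            ctx.connect(stats.out, apply.in);
            ctx.connect(apply.out, output);
        }
    }

    public static void main(String[] args) {
        ToyComposite composite = new ToyComposite();
        composite.method = "zscore";
        ToyContext ctx = new ToyContext();
        composite.compose(ctx);
        for (String connection : ctx.connections) {
            System.out.println(connection);
        }
    }
}
```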
-
-