Class CountRanges

  • All Implemented Interfaces:
    LogicalOperator, PipelineOperator<RecordPort>, RecordPipelineOperator

    public class CountRanges
    extends AbstractRecordCompositeOperator
    Determines which range each value in a field falls within and counts the totals. The operation is defined by a list of break points, which are used as the boundaries of the ranges. A list of n breaks defines n+1 range groups, indexed beginning with 1; the first and last groups are unbounded on one side. The range groups are sorted in ascending order according to the Comparable ordering of the field's values. The closure of the range intervals can be adjusted by enabling a closed lower or upper bound, which includes values equal to a boundary in the respective interval. A value can belong to only one range group, so the lower and upper bounds cannot both be closed. Any value not included in a range group, such as null or (when neither bound is closed) a boundary value, is placed in group 0.
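    The grouping rule described above can be illustrated with a small, self-contained sketch. It is not the operator's implementation: the class name RangeGroupSketch, the method rangeGroup, and the choice to throw IllegalArgumentException when both bounds are closed are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the library.
class RangeGroupSketch {

    /**
     * Returns the 1-based range group for a value, or 0 when the value is
     * null or equals a break that no closed bound claims.
     */
    static <T extends Comparable<T>> int rangeGroup(
            T value, List<T> breaks, boolean lowerClosed, boolean upperClosed) {
        if (lowerClosed && upperClosed) {
            // Assumption: how the real operator rejects this is not documented here.
            throw new IllegalArgumentException("both bounds cannot be closed");
        }
        if (value == null) {
            return 0;
        }
        List<T> sorted = new ArrayList<>(breaks);
        Collections.sort(sorted); // breaks are used in ascending order
        for (int i = 0; i < sorted.size(); i++) {
            int cmp = value.compareTo(sorted.get(i));
            if (cmp < 0) {
                return i + 1; // strictly below break i
            }
            if (cmp == 0) {
                if (upperClosed) {
                    return i + 1; // break is the upper bound of group i+1
                }
                if (lowerClosed) {
                    return i + 2; // break is the lower bound of group i+2
                }
                return 0; // open bounds: boundary values go to group 0
            }
        }
        return sorted.size() + 1; // above the last break
    }

    public static void main(String[] args) {
        List<Integer> breaks = List.of(10, 20);
        System.out.println(rangeGroup(5, breaks, false, false));  // 1
        System.out.println(rangeGroup(15, breaks, false, false)); // 2
        System.out.println(rangeGroup(25, breaks, false, false)); // 3
        System.out.println(rangeGroup(10, breaks, false, false)); // 0
        System.out.println(rangeGroup(10, breaks, true, false));  // 2
        System.out.println(rangeGroup(10, breaks, false, true));  // 1
    }
}
```

    With two breaks there are three groups. The boundary value 10 moves between group 0, group 2, and group 1 depending on which bound, if any, is closed.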

    The output of this operator appends a single new field to the original records indicating the range group of the specified field. The statistics output contains the counts for the defined range groups in two fields: the range group index and the total number of values within that group.
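    As a rough picture of the statistics output, the sketch below tallies group indexes that have already been assigned to records. The class RangeCountSketch and the method countGroups are hypothetical. Whether the real statistics output also lists groups with a zero count is not stated here, so this sketch reports only groups that occur.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the library.
class RangeCountSketch {

    /** Tallies how many records carry each range group index. */
    static Map<Integer, Long> countGroups(int[] groupPerRecord) {
        Map<Integer, Long> counts = new TreeMap<>(); // keeps groups in ascending order
        for (int group : groupPerRecord) {
            counts.merge(group, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical group indexes already appended to five records
        int[] groups = {1, 2, 2, 0, 3};
        // One line per (range group index, total) pair
        countGroups(groups).forEach((group, total) ->
                System.out.println(group + "\t" + total));
    }
}
```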

    • Constructor Detail

      • CountRanges

        public CountRanges()
    • Method Detail

      • getStatsOutput

        public RecordPort getStatsOutput()
      • getFieldName

        public String getFieldName()
        Gets the name of the field which will be divided into ranges.
        Returns:
        the field which will have the ranges calculated
      • setFieldName

        public void setFieldName(String fieldName)
        Sets the name of the field which will be divided into ranges.
        Parameters:
        fieldName - the field which will have the ranges calculated
      • getBreaks

        public List getBreaks()
        Gets the values that will be used as the boundaries for the ranges.
        Returns:
        the sorted list of range boundary breaks
      • setBreaks

        public void setBreaks(List breaks)
        Sets the values that will be used as the boundaries for the ranges. These should be of the same type as the selected field. The break values are sorted in ascending order, so the range groups produced are also in ascending order. Unless either the upper or lower bound is closed, the break values themselves are not included in a range group and are instead placed in group 0, which also holds the null values. Every other value in the field is categorized into one of groups 1 through n, where n is the number of break values plus one. Range group 1 includes every value less than the first break, range group 2 includes every value between the first and second breaks, and so on.
        Parameters:
        breaks - the values that will define the boundaries of the ranges
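        Because the breaks are sorted before use, the order in which they are supplied should not matter. The sketch below illustrates this for the default case where neither bound is closed, using a binary search over the sorted breaks. BreakLookupSketch and openGroup are hypothetical names, and the real operator may compute groups differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the library.
class BreakLookupSketch {

    /** Open-bounds lookup: 0 for null or an exact break value, else the 1-based group. */
    static int openGroup(Integer value, List<Integer> breaks) {
        if (value == null) {
            return 0;
        }
        List<Integer> sorted = new ArrayList<>(breaks);
        Collections.sort(sorted);
        int pos = Collections.binarySearch(sorted, value);
        if (pos >= 0) {
            return 0; // value equals a break, so no open interval contains it
        }
        int insertionPoint = -(pos + 1); // number of breaks below the value
        return insertionPoint + 1;       // groups are indexed from 1
    }

    public static void main(String[] args) {
        List<Integer> breaks = List.of(30, 10, 20); // deliberately unsorted
        System.out.println(openGroup(5, breaks));   // 1
        System.out.println(openGroup(15, breaks));  // 2
        System.out.println(openGroup(35, breaks));  // 4
        System.out.println(openGroup(20, breaks));  // 0
    }
}
```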
      • getLowerBoundClosed

        public boolean getLowerBoundClosed()
        Returns true if the lower boundary defined by a break should be included in the range group. The default is false.
        Returns:
        whether to include the lower boundary in range
      • setLowerBoundClosed

        public void setLowerBoundClosed(boolean lowerBoundClosed)
        Sets whether the lower boundary defined by a break should be included in the group. The properties lowerBoundClosed and upperBoundClosed cannot both be set to true.
        Parameters:
        lowerBoundClosed - whether to include the lower boundary in range
      • getUpperBoundClosed

        public boolean getUpperBoundClosed()
        Returns true if the upper boundary defined by a break should be included in the range group. The default is false.
        Returns:
        whether to include the upper boundary in range
      • setUpperBoundClosed

        public void setUpperBoundClosed(boolean upperBoundClosed)
        Sets whether the upper boundary defined by a break should be included in the group. The properties lowerBoundClosed and upperBoundClosed cannot both be set to true.
        Parameters:
        upperBoundClosed - whether to include the upper boundary in range
      • compose

        protected void compose(CompositionContext ctx)
        Description copied from class: CompositeOperator
        Compose the body of this operator. Implementations should do the following:
        1. Perform any validation of configuration, input types, etc.
        2. Instantiate and configure sub-operators, adding them to the provided context via the method OperatorComposable.add(O).
        3. Create necessary connections via the method OperatorComposable.connect(P, P). This includes connections from the composite's input ports to sub-operators, connections between sub-operators, and connections from sub-operators' output ports to the composite's output ports.
        Specified by:
        compose in class CompositeOperator
        Parameters:
        ctx - the context