- All Implemented Interfaces:
LogicalOperator,PipelineOperator<RecordPort>,RecordPipelineOperator
Aggregation: Aggregation.avg(String), Aggregation.count(String),
Aggregation.max(String), Aggregation.min(String),
Aggregation.stddev(String), Aggregation.sum(String), Aggregation.var(String).
The operator uses groups of consecutive equal keys ("key groups") to determine which data values to aggregate. The input data need not be sorted; if it is already sorted, performance will be optimal.
-
Field Summary
Fields inherited from class com.pervasive.datarush.operators.AbstractRecordCompositeOperator
input, output -
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionprotected voidCompose the body of this operator.Returns the aggregations to apply to the data.intReturns a hint as to the number of groups that are expected to be processed.getInput()Returns the input portReturns the prefix to add to key fields.String[]getKeys()Returns the names of the key fields.intReturns the max number of groups to fit into internal memory buffers.Returns the output portbooleanProvides a hint as to whether the number of groups is expected to be small.voidsetAggregations(Aggregation[] aggregations) Sets the aggregations to apply to the data.voidsetAggregations(String aggregationExpression) Sets the aggregations to apply to the data.voidsetFewGroupsHint(boolean fewGroupsHint) Sets a hint as to whether the number of groups is expected to be small.voidsetInitialGroupCapacity(int initialGroupCapacity) Sets a hint as to the number of groups that are expected to be processed.voidsetKeyFieldPrefix(String keyFieldPrefix) Sets the prefix to add to key fields.voidSets the names of the key fields.voidsetMaxGroupCapacity(int maxGroupCapacity) Sets the max number of groups to fit into internal memory buffers.Methods inherited from class com.pervasive.datarush.operators.AbstractLogicalOperator
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyErrorMethods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, waitMethods inherited from interface com.pervasive.datarush.operators.LogicalOperator
disableParallelism, getInputPorts, getOutputPorts
-
Constructor Details
-
Group
public Group()Default constructor. Prior to graph compilation at least one of the following properties must be set: -
Group
Create a new group plan, specifying keys and aggegations.- Parameters:
keys- the names of the key fields. If empty then all of the rows in the input are treated as one group.aggregations- the aggregations to apply to the data.
-
-
Method Details
-
getInput
Description copied from interface:PipelineOperatorReturns the input port- Specified by:
getInputin interfacePipelineOperator<RecordPort>- Overrides:
getInputin classAbstractRecordCompositeOperator- Returns:
- the input port
-
getOutput
Description copied from interface:PipelineOperatorReturns the output port- Specified by:
getOutputin interfacePipelineOperator<RecordPort>- Overrides:
getOutputin classAbstractRecordCompositeOperator- Returns:
- the output port
-
getKeys
Returns the names of the key fields. If empty then all of the rows in the input are treated as one group.- Returns:
- names of the group key fields
-
setKeys
Sets the names of the key fields. If empty then all of the rows in the input are treated as one group.- Parameters:
keys- names of the fields to use as group keys (order is important)
-
getAggregations
Returns the aggregations to apply to the data.- Returns:
- aggregations to apply to the input data
-
setAggregations
Sets the aggregations to apply to the data.- Parameters:
aggregations- the aggregations to apply to the input data
-
setAggregations
Sets the aggregations to apply to the data. The aggregations to apply are expressed using a SQL-like syntax. Multiple aggregations can be contained within one expression. Multiple aggregations should be separated by a comma.Each aggregation contains the aggregation function with the aggregation parameters contained within parentheses. Most aggregations take only one parameter: the name of the field to aggregate. The
distinctkeyword can be used before the first parameter. Doing so enables a distinct aggregation. Use theaskeyword to specify the field name that should be used for the resultant aggregation.Examples of aggregation expressions:
count(field) as "count"sum(field), avg(field), min(field), max(field)
Note that double quotes must be used when a keyword such as an aggregation function name is used as the resultant field name (the
asclause).- Parameters:
aggregationExpression- expression containing aggregations to apply
-
getKeyFieldPrefix
Returns the prefix to add to key fields. Defaults to "".- Returns:
- the prefix to add to key fields.
-
setKeyFieldPrefix
Sets the prefix to add to key fields. Defaults to "".- Parameters:
keyFieldPrefix- key field prefix text
-
getInitialGroupCapacity
public int getInitialGroupCapacity()Returns a hint as to the number of groups that are expected to be processed. When input data is unsorted, we will optimistically buffer input rows in order to attempt to reduce the amount of data to be sorted. This setting is ignore if input data is already sorted.- Returns:
- initial group capacity
-
setInitialGroupCapacity
public void setInitialGroupCapacity(int initialGroupCapacity) Sets a hint as to the number of groups that are expected to be processed. When input data is unsorted, we will optimistically buffer input rows in order to attempt to reduce the amount of data to be sorted. This setting is ignored if input data is already sorted.- Parameters:
initialGroupCapacity- the initial capacity
-
getMaxGroupCapacity
public int getMaxGroupCapacity()Returns the max number of groups to fit into internal memory buffers. A value of zero means that this will grow unbounded. This setting is ignored if input data is already sorted.- Returns:
- the max capacity.
-
setMaxGroupCapacity
public void setMaxGroupCapacity(int maxGroupCapacity) Sets the max number of groups to fit into internal memory buffers. A value of zero means that this will grow unbounded. This setting is ignored if input data is already sorted.- Parameters:
maxGroupCapacity- the max capacity.
-
isFewGroupsHint
public boolean isFewGroupsHint()Provides a hint as to whether the number of groups is expected to be small. If so, the reduction step is performed as a non-parallel operation. By default this is false.- Returns:
- a hint as to whether the number of groups is expected to be small
-
setFewGroupsHint
public void setFewGroupsHint(boolean fewGroupsHint) Sets a hint as to whether the number of groups is expected to be small. If so, the reduction step is performed as a non-parallel operation. By default this is false.- Parameters:
fewGroupsHint- a hint as to whether the number of groups is expected to be small
-
compose
Description copied from class:CompositeOperatorCompose the body of this operator. Implementations should do the following:- Perform any validation of configuration, input types, etc
- Instantiate and configure sub-operators, adding them to the provided context via
the method
OperatorComposable.add(O) - Create necessary connections via the method
OperatorComposable.connect(P, P). This includes connections from the composite's input ports to sub-operators, connections between sub-operators, and connections from sub-operators output ports to the composite's output ports
- Specified by:
composein classCompositeOperator- Parameters:
ctx- the context
-