- java.lang.Object
-
- com.pervasive.datarush.operators.AbstractLogicalOperator
-
- com.pervasive.datarush.operators.CompositeOperator
-
- com.pervasive.datarush.operators.AbstractRecordCompositeOperator
-
- com.pervasive.datarush.operators.group.Group
-
- All Implemented Interfaces:
LogicalOperator, PipelineOperator<RecordPort>, RecordPipelineOperator
public final class Group extends AbstractRecordCompositeOperator
Performs grouping (aggregation) of sorted input data. Available aggregations are provided by the static methods on Aggregation: Aggregation.avg(String), Aggregation.count(String), Aggregation.max(String), Aggregation.min(String), Aggregation.stddev(String), Aggregation.sum(String), Aggregation.var(String).
The operator uses groups of consecutive equal keys ("key groups") to determine which data values to aggregate. The input data need not be sorted; if it is already sorted, performance will be optimal.
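The "key group" idea can be illustrated without the library. The following is a minimal sketch in plain Java, assuming a hypothetical two-column input (a string key and a long value) that is already sorted by key; the class and method names are invented for illustration and are not part of the DataRush API. It walks the input once and emits one count and one sum per run of consecutive equal keys.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: not the library's implementation.
final class KeyGroupSketch {

    /** Result for one key group: the key, the row count, and the sum of values. */
    static final class GroupResult {
        final String key;
        final long count;
        final long sum;

        GroupResult(String key, long count, long sum) {
            this.key = key;
            this.count = count;
            this.sum = sum;
        }
    }

    /**
     * Aggregates runs of consecutive equal keys. The caller is assumed to
     * pass rows that are already sorted by key, so each key forms one run.
     */
    static List<GroupResult> aggregate(String[] keys, long[] values) {
        List<GroupResult> results = new ArrayList<>();
        int start = 0;
        while (start < keys.length) {
            int end = start;
            long sum = 0;
            // Extend the run while the key stays the same.
            while (end < keys.length && keys[end].equals(keys[start])) {
                sum += values[end];
                end++;
            }
            results.add(new GroupResult(keys[start], end - start, sum));
            start = end;
        }
        return results;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "a", "b", "c", "c", "c"};
        long[] values = {1, 2, 10, 5, 5, 5};
        for (GroupResult r : aggregate(keys, values)) {
            System.out.println(r.key + " count=" + r.count + " sum=" + r.sum);
        }
    }
}
```

Because only the current run is held in memory, sorted input needs no buffering in this sketch, which is consistent with the note that already-sorted input gives the best performance.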
-
-
Field Summary
-
Fields inherited from class com.pervasive.datarush.operators.AbstractRecordCompositeOperator
input, output
-
-
Method Summary
Modifier and Type | Method | Description
protected void | compose(CompositionContext ctx) | Compose the body of this operator.
Aggregation[] | getAggregations() | Returns the aggregations to apply to the data.
int | getInitialGroupCapacity() | Returns a hint as to the number of groups that are expected to be processed.
RecordPort | getInput() | Returns the input port.
String | getKeyFieldPrefix() | Returns the prefix to add to key fields.
String[] | getKeys() | Returns the names of the key fields.
int | getMaxGroupCapacity() | Returns the max number of groups to fit into internal memory buffers.
RecordPort | getOutput() | Returns the output port.
boolean | isFewGroupsHint() | Provides a hint as to whether the number of groups is expected to be small.
void | setAggregations(Aggregation[] aggregations) | Sets the aggregations to apply to the data.
void | setAggregations(String aggregationExpression) | Sets the aggregations to apply to the data.
void | setFewGroupsHint(boolean fewGroupsHint) | Sets a hint as to whether the number of groups is expected to be small.
void | setInitialGroupCapacity(int initialGroupCapacity) | Sets a hint as to the number of groups that are expected to be processed.
void | setKeyFieldPrefix(String keyFieldPrefix) | Sets the prefix to add to key fields.
void | setKeys(String[] keys) | Sets the names of the key fields.
void | setMaxGroupCapacity(int maxGroupCapacity) | Sets the max number of groups to fit into internal memory buffers.
-
Methods inherited from class com.pervasive.datarush.operators.AbstractLogicalOperator
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface com.pervasive.datarush.operators.LogicalOperator
disableParallelism, getInputPorts, getOutputPorts
-
-
-
-
Constructor Detail
-
Group
public Group()
Default constructor. Prior to graph compilation at least one of the following properties must be set:
-
Group
public Group(List<String> keys, List<Aggregation> aggregations)
Create a new group plan, specifying keys and aggregations.
- Parameters:
keys - the names of the key fields. If empty then all of the rows in the input are treated as one group.
aggregations - the aggregations to apply to the data.
-
-
Method Detail
-
getInput
public RecordPort getInput()
Description copied from interface: PipelineOperator
Returns the input port
- Specified by:
getInput in interface PipelineOperator<RecordPort>
- Overrides:
getInput in class AbstractRecordCompositeOperator
- Returns:
- the input port
-
getOutput
public RecordPort getOutput()
Description copied from interface: PipelineOperator
Returns the output port
- Specified by:
getOutput in interface PipelineOperator<RecordPort>
- Overrides:
getOutput in class AbstractRecordCompositeOperator
- Returns:
- the output port
-
getKeys
public String[] getKeys()
Returns the names of the key fields. If empty then all of the rows in the input are treated as one group.
- Returns:
- names of the group key fields
-
setKeys
public void setKeys(String[] keys)
Sets the names of the key fields. If empty then all of the rows in the input are treated as one group.
- Parameters:
keys - names of the fields to use as group keys (order is important)
-
getAggregations
public Aggregation[] getAggregations()
Returns the aggregations to apply to the data.
- Returns:
- aggregations to apply to the input data
-
setAggregations
public void setAggregations(Aggregation[] aggregations)
Sets the aggregations to apply to the data.
- Parameters:
aggregations - the aggregations to apply to the input data
-
setAggregations
public void setAggregations(String aggregationExpression)
Sets the aggregations to apply to the data. The aggregations to apply are expressed using a SQL-like syntax. One expression can contain multiple aggregations, separated by commas.
Each aggregation consists of the aggregation function followed by its parameters in parentheses. Most aggregations take only one parameter: the name of the field to aggregate. The distinct keyword can be used before the first parameter. Doing so enables a distinct aggregation. Use the as keyword to specify the field name that should be used for the resultant aggregation.
Examples of aggregation expressions:
count(field) as "count"
sum(field), avg(field), min(field), max(field)
Note that double quotes must be used when a keyword such as an aggregation function name is used as the resultant field name (the as clause).
- Parameters:
aggregationExpression - expression containing aggregations to apply
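The quoting rule for the as clause is easy to get wrong when expressions are assembled programmatically. Below is a minimal sketch of a string builder for such expressions, in plain Java. The helper class, its method names, and the keyword list are assumptions made for illustration (the list simply reuses the function names documented for this class plus distinct and as); none of it is part of the DataRush API, and the real parser may treat other words as keywords.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: builds aggregation expression strings only.
final class AggregationExpressionSketch {

    // Assumed keyword list for this sketch; the actual parser's list is not documented here.
    private static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
            "avg", "count", "max", "min", "stddev", "sum", "var", "distinct", "as"));

    /**
     * Builds one aggregation term such as: sum(distinct amount) as total.
     * A null alias omits the "as" clause. An alias that matches a keyword
     * (compared case-insensitively, which is an assumption) is double-quoted.
     */
    static String term(String function, String field, boolean distinct, String alias) {
        StringBuilder sb = new StringBuilder(function).append('(');
        if (distinct) {
            sb.append("distinct ");
        }
        sb.append(field).append(')');
        if (alias != null) {
            boolean quote = KEYWORDS.contains(alias.toLowerCase(Locale.ROOT));
            sb.append(" as ").append(quote ? "\"" + alias + "\"" : alias);
        }
        return sb.toString();
    }

    /** Joins several terms into one comma-separated expression. */
    static String expression(String... terms) {
        return String.join(", ", terms);
    }

    public static void main(String[] args) {
        System.out.println(term("count", "field", false, "count"));
        System.out.println(expression(
                term("sum", "field", false, null),
                term("avg", "field", false, null)));
    }
}
```

The resulting string would then be passed to setAggregations(String aggregationExpression).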
-
getKeyFieldPrefix
public String getKeyFieldPrefix()
Returns the prefix to add to key fields. Defaults to "".
- Returns:
- the prefix to add to key fields.
-
setKeyFieldPrefix
public void setKeyFieldPrefix(String keyFieldPrefix)
Sets the prefix to add to key fields. Defaults to "".
- Parameters:
keyFieldPrefix - key field prefix text
-
getInitialGroupCapacity
public int getInitialGroupCapacity()
Returns a hint as to the number of groups that are expected to be processed. When input data is unsorted, the operator optimistically buffers input rows in an attempt to reduce the amount of data to be sorted. This setting is ignored if input data is already sorted.
- Returns:
- initial group capacity
-
setInitialGroupCapacity
public void setInitialGroupCapacity(int initialGroupCapacity)
Sets a hint as to the number of groups that are expected to be processed. When input data is unsorted, the operator optimistically buffers input rows in an attempt to reduce the amount of data to be sorted. This setting is ignored if input data is already sorted.
- Parameters:
initialGroupCapacity - the initial capacity
-
getMaxGroupCapacity
public int getMaxGroupCapacity()
Returns the max number of groups to fit into internal memory buffers. A value of zero means that the buffers can grow without bound. This setting is ignored if input data is already sorted.
- Returns:
- the max capacity.
-
setMaxGroupCapacity
public void setMaxGroupCapacity(int maxGroupCapacity)
Sets the max number of groups to fit into internal memory buffers. A value of zero means that the buffers can grow without bound. This setting is ignored if input data is already sorted.
- Parameters:
maxGroupCapacity - the max capacity.
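The two capacity settings describe buffering of unsorted input before the sort. One way to picture this is bounded partial aggregation, sketched below in plain Java. This is an illustration under stated assumptions, not the operator's actual algorithm: the class, the method names, the flush-everything policy when the buffer is full, and the use of a sum as the only aggregation are all invented for the example.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: a guess at the general technique, not the library's code.
final class PartialAggregationSketch {

    /**
     * Pre-aggregates unsorted rows in a bounded in-memory buffer and returns
     * the partial sums that would still have to be sorted and merged.
     * maxGroupCapacity == 0 means the buffer may grow without bound.
     */
    static List<Map.Entry<String, Long>> preAggregate(
            String[] keys, long[] values, int initialGroupCapacity, int maxGroupCapacity) {
        // initialGroupCapacity is used here only as a sizing hint for the map.
        Map<String, Long> buffer = new LinkedHashMap<>(Math.max(1, initialGroupCapacity));
        List<Map.Entry<String, Long>> partials = new ArrayList<>();
        for (int i = 0; i < keys.length; i++) {
            boolean full = maxGroupCapacity > 0 && buffer.size() >= maxGroupCapacity;
            if (full && !buffer.containsKey(keys[i])) {
                flush(buffer, partials); // no room for a new group: emit partials
            }
            buffer.merge(keys[i], values[i], Long::sum);
        }
        flush(buffer, partials);
        return partials;
    }

    private static void flush(Map<String, Long> buffer, List<Map.Entry<String, Long>> partials) {
        for (Map.Entry<String, Long> e : buffer.entrySet()) {
            partials.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
        }
        buffer.clear();
    }

    /** Final step: combine partial sums per key (a TreeMap stands in for sort + merge). */
    static Map<String, Long> reduce(List<Map.Entry<String, Long>> partials) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map.Entry<String, Long> e : partials) {
            totals.merge(e.getKey(), e.getValue(), Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        String[] keys = {"b", "a", "b", "c", "a", "b"};
        long[] values = {1, 2, 3, 4, 5, 6};
        System.out.println(preAggregate(keys, values, 4, 2).size() + " partial rows, capacity 2");
        System.out.println(preAggregate(keys, values, 4, 0).size() + " partial rows, unbounded");
        System.out.println(reduce(preAggregate(keys, values, 4, 2)));
    }
}
```

In this sketch a larger buffer collapses more rows before the sort, at the cost of memory, which is the trade-off the max capacity setting appears to control.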
-
isFewGroupsHint
public boolean isFewGroupsHint()
Provides a hint as to whether the number of groups is expected to be small. If so, the reduction step is performed as a non-parallel operation. By default this is false.
- Returns:
- a hint as to whether the number of groups is expected to be small
-
setFewGroupsHint
public void setFewGroupsHint(boolean fewGroupsHint)
Sets a hint as to whether the number of groups is expected to be small. If so, the reduction step is performed as a non-parallel operation. By default this is false.
- Parameters:
fewGroupsHint - a hint as to whether the number of groups is expected to be small
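A two-phase aggregation shows why a non-parallel reduction can be reasonable when groups are few. The sketch below is plain Java with invented names and a count as the only aggregation; it assumes the input is split into partitions and is not a description of how the library schedules its work. Each partition is counted independently, possibly in parallel, and the small partial results are then merged on a single thread.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Illustration only: hypothetical two-phase count, not the library's code.
final class FewGroupsReductionSketch {

    /** Phase 1: count the keys of one partition; partitions are independent. */
    static Map<String, Long> countPartition(List<String> partition) {
        Map<String, Long> counts = new HashMap<>();
        for (String key : partition) {
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    /** Phase 2: merge the partial counts on a single thread. */
    static Map<String, Long> reduceSequentially(List<Map<String, Long>> partials) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> partial : partials) {
            for (Map.Entry<String, Long> e : partial.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }

    /** Runs phase 1 over all partitions in parallel, then phase 2 sequentially. */
    static Map<String, Long> group(List<List<String>> partitions) {
        List<Map<String, Long>> partials = partitions.parallelStream()
                .map(FewGroupsReductionSketch::countPartition)
                .collect(Collectors.toList());
        return reduceSequentially(partials);
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("x", "y", "x"),
                Arrays.asList("y", "y"),
                Arrays.asList("x"));
        System.out.println(group(partitions));
    }
}
```

With only a handful of groups, each partial map is tiny, so the sequential merge is cheap compared with the cost of redistributing the partial results across workers.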
-
compose
protected void compose(CompositionContext ctx)
Description copied from class: CompositeOperator
Compose the body of this operator. Implementations should do the following:
- Perform any validation of configuration, input types, etc.
- Instantiate and configure sub-operators, adding them to the provided context via the method OperatorComposable.add(O)
- Create necessary connections via the method OperatorComposable.connect(P, P). This includes connections from the composite's input ports to sub-operators, connections between sub-operators, and connections from sub-operators' output ports to the composite's output ports
- Specified by:
compose in class CompositeOperator
- Parameters:
ctx - the context
-
-