- java.lang.Object
-
- com.pervasive.datarush.operators.AbstractLogicalOperator
-
- com.pervasive.datarush.operators.CompositeOperator
-
- com.pervasive.datarush.operators.join.AbstractRelationalJoin
-
- com.pervasive.datarush.operators.join.FilterExistingRows
-
- All Implemented Interfaces:
LogicalOperator
public final class FilterExistingRows extends AbstractRelationalJoin
Filters records on the left based on the presence of matching records on the right. Row selection is controlled by checking whether the key values on the left can be found in the key fields of any record on the right. Records on the left whose keys match those of at least one record on the right are emitted on the output flow. A secondary flow consisting of those records which did not match is also produced; this output is the complement of the primary output with respect to the left input. In terms of relational algebra, this operator simultaneously performs a left semi-join and left anti-join on the two inputs.
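The semi-join and anti-join semantics described above can be illustrated with a minimal plain-Java sketch. It does not use the DataRush API: the class `SemiAntiJoinSketch`, its `Result` holder, and the `filterExisting` method are hypothetical names introduced only for this illustration, and rows are modeled as simple `String[]` arrays with the key in a given column.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: models the operator's two outputs, not its implementation.
final class SemiAntiJoinSketch {

    /** Holds both flows: matched left rows and the complementary rejects. */
    static final class Result {
        final List<String[]> output = new ArrayList<>();
        final List<String[]> rejects = new ArrayList<>();
    }

    /** Splits the left rows by whether their key appears in any right row. */
    static Result filterExisting(List<String[]> left, int leftKeyCol,
                                 List<String[]> right, int rightKeyCol) {
        // Collect the distinct key values present on the right.
        Set<String> rightKeys = new HashSet<>();
        for (String[] row : right) {
            rightKeys.add(row[rightKeyCol]);
        }
        Result result = new Result();
        for (String[] row : left) {
            if (rightKeys.contains(row[leftKeyCol])) {
                result.output.add(row);   // left semi-join: emitted once per left row
            } else {
                result.rejects.add(row);  // left anti-join: no match on the right
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> left = new ArrayList<>();
        left.add(new String[] {"1", "ann"});
        left.add(new String[] {"2", "bob"});
        left.add(new String[] {"3", "cy"});
        List<String[]> right = new ArrayList<>();
        right.add(new String[] {"2"});
        right.add(new String[] {"2"});  // duplicate right keys do not duplicate left rows
        right.add(new String[] {"4"});

        Result r = filterExisting(left, 0, right, 0);
        System.out.println("matched rows:  " + r.output.size());
        System.out.println("rejected rows: " + r.rejects.size());
    }
}
```

Every left row lands in exactly one of the two lists, which mirrors the statement that the rejects flow is the complement of the primary output with respect to the left input.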
Depending on the value of AbstractRelationalJoin.getUseHashJoinHint(), one of two procedures is used, which affects the overall graph behavior:
- If the hash join hint is false, input data is sorted and hash-partitioned by the specified keys (if not already sorted according to upstream metadata). Once sorted and partitioned, the data is then combined in a streaming fashion. Note that if a join condition is specified, this requires buffering on the right-hand side, which increases memory requirements if the right has a large number of records with duplicate keys.
- If the hash join hint is true, a full copy of the data from the right is distributed to the cluster and loaded into memory on each node in the cluster. The left side is not sorted or partitioned. Thus, the right side should always be small.
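The trade-off between the two procedures can be sketched in plain Java on bare integer keys. This is a single-process illustration under simplifying assumptions, not the operator's implementation: partitioning across a cluster and join conditions are not modeled, and the names `JoinStrategySketch`, `sortMergeFilter`, and `hashFilter` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: contrasts a sort-then-stream pass with an in-memory lookup.
final class JoinStrategySketch {

    /** Hint false (sketch): sort both sides, then combine in one streaming pass. */
    static List<List<Integer>> sortMergeFilter(int[] left, int[] right) {
        int[] l = left.clone();
        int[] r = right.clone();
        Arrays.sort(l);
        Arrays.sort(r);
        List<Integer> matched = new ArrayList<>();
        List<Integer> rejected = new ArrayList<>();
        int j = 0;
        for (int key : l) {
            // Advance the right cursor past keys smaller than the current left key.
            while (j < r.length && r[j] < key) {
                j++;
            }
            if (j < r.length && r[j] == key) {
                matched.add(key);
            } else {
                rejected.add(key);
            }
        }
        return Arrays.asList(matched, rejected);
    }

    /** Hint true (sketch): hold all right keys in memory, probe with unsorted left. */
    static List<List<Integer>> hashFilter(int[] left, int[] right) {
        Set<Integer> rightKeys = new HashSet<>();
        for (int key : right) {
            rightKeys.add(key);
        }
        List<Integer> matched = new ArrayList<>();
        List<Integer> rejected = new ArrayList<>();
        for (int key : left) {
            if (rightKeys.contains(key)) {
                matched.add(key);
            } else {
                rejected.add(key);
            }
        }
        return Arrays.asList(matched, rejected);
    }

    public static void main(String[] args) {
        int[] left = {5, 2, 7, 1, 2};
        int[] right = {9, 2, 3, 7, 2};
        System.out.println("sort-merge: " + sortMergeFilter(left, right));
        System.out.println("hash:       " + hashFilter(left, right));
    }
}
```

Both methods select the same left keys. The sort-merge variant pays for sorting but only needs a cursor into the right side, while the hash variant keeps the left side in its original order at the cost of holding every right key in memory, which is why the right side should be small in that mode.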
-
-
Constructor Summary
Constructors
- FilterExistingRows(): Default constructor.
- FilterExistingRows(JoinKey[] joinKeys): Performs a filter with the given set of join keys.
- FilterExistingRows(List<JoinKey> joinKeys): Performs a filter with the given set of join keys.
-
Method Summary
Modifier and Type, Method, Description
- protected void compose(CompositionContext ctx): Compose the body of this operator.
- protected RecordPort composeJoin(CompositionContext ctx, RecordPort left, RecordPort right, JoinKey[] keys)
- RecordPort getRejects(): Returns the port providing the data from the left which failed to match any record on the right.
-
Methods inherited from class com.pervasive.datarush.operators.join.AbstractRelationalJoin
getJoinCondition, getJoinKeys, getLeft, getOutput, getRight, getUseHashJoinHint, newJoinID, setJoinCondition, setJoinCondition, setJoinKeys, setJoinKeys, setJoinKeys, setJoinKeys, setUseHashJoinHint
-
Methods inherited from class com.pervasive.datarush.operators.AbstractLogicalOperator
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
-
-
-
-
Constructor Detail
-
FilterExistingRows
public FilterExistingRows()
Default constructor. Prior to graph compilation, the following property must be set:
-
FilterExistingRows
public FilterExistingRows(JoinKey[] joinKeys)
Performs a filter with the given set of join keys.
- Parameters:
joinKeys - the join keys
-
-
Method Detail
-
getRejects
public RecordPort getRejects()
Returns the port providing the data from the left which failed to match any record on the right.
- Returns:
the rejected data port
-
compose
protected final void compose(CompositionContext ctx)
Description copied from class: CompositeOperator
Compose the body of this operator. Implementations should do the following:
- Perform any validation of configuration, input types, etc.
- Instantiate and configure sub-operators, adding them to the provided context via the method OperatorComposable.add(O).
- Create necessary connections via the method OperatorComposable.connect(P, P). This includes connections from the composite's input ports to sub-operators, connections between sub-operators, and connections from sub-operators' output ports to the composite's output ports.
- Overrides:
compose in class AbstractRelationalJoin
- Parameters:
ctx - the context
-
composeJoin
protected RecordPort composeJoin(CompositionContext ctx, RecordPort left, RecordPort right, JoinKey[] keys)
- Specified by:
composeJoin in class AbstractRelationalJoin
-
-