- java.lang.Object
-
- com.pervasive.datarush.operators.AbstractLogicalOperator
-
- com.pervasive.datarush.operators.StreamingOperator
-
- com.pervasive.datarush.operators.DeferredCompositeOperator
-
- com.pervasive.datarush.operators.AbstractDeferredRecordOperator
-
- com.pervasive.datarush.operators.partition.PartitionHint
-
- All Implemented Interfaces:
LogicalOperator
,PipelineOperator<RecordPort>
,RecordPipelineOperator
public final class PartitionHint extends AbstractDeferredRecordOperator
Forces the input data to be partitioned into parallel streams of data for subsequent parallel operations. If requested, the output will also be sorted. It is also possible to force data to be staged to disk, breaking physical streaming; consumers of the output will run in a physical graph executed after the physical graph executing this operator. Even if not forced, the (re)partitioning may cause staging to occur.Normally these actions happens automatically in a logical graph as required, but this operator provides a mechanism for explicit control.
-
-
Field Summary
-
Fields inherited from class com.pervasive.datarush.operators.AbstractDeferredRecordOperator
input, output
-
-
Constructor Summary
Constructors Constructor Description PartitionHint()
Forces a partitioning into parallel streams of data.PartitionHint(DataDistribution partitioning)
Forces a partitioning into parallel streams of data guaranteeing the specified distribution.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description protected void
compose(DeferredCompositionContext ctx)
Compose the body of this operator.protected void
computeMetadata(StreamingMetadataContext ctx)
Implementations must adhere to all of the contracts specified byStreamingOperator.computeMetadata(com.pervasive.datarush.operators.StreamingMetadataContext)
.DataOrdering
getDataOrdering()
Gets the data ordering of the output.DataDistribution
getPartitioning()
Gets the data distribution of the output.boolean
getPreserveSortOrder()
If true, data order will be preserved ( provided that the data order is known ).boolean
isForceStaging()
Indicates whether data will be forcibly staged to disk.void
setDataOrdering(DataOrdering dataOrdering)
Sets the data ordering of the output.void
setForceStaging(boolean forceStaging)
Sets whether data must be staged to disk.void
setPartitioning(DataDistribution partitioning)
Sets the data distribution of the output.void
setPreserveSortOrder(boolean preserveSortOrder)
Sets whether data order will be preserved ( provided that the data order is known ).-
Methods inherited from class com.pervasive.datarush.operators.AbstractDeferredRecordOperator
getInput, getOutput
-
Methods inherited from class com.pervasive.datarush.operators.DeferredCompositeOperator
computeOutputTypes
-
Methods inherited from class com.pervasive.datarush.operators.AbstractLogicalOperator
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface com.pervasive.datarush.operators.LogicalOperator
disableParallelism, getInputPorts, getOutputPorts
-
-
-
-
Constructor Detail
-
PartitionHint
public PartitionHint()
Forces a partitioning into parallel streams of data. The resulting streams will be partitioned in an unspecified fashion and be unordered.
-
PartitionHint
public PartitionHint(DataDistribution partitioning)
Forces a partitioning into parallel streams of data guaranteeing the specified distribution. The resulting streams will be unordered.- Parameters:
partitioning
- the requested data distribution
-
-
Method Detail
-
getPartitioning
public DataDistribution getPartitioning()
Gets the data distribution of the output.- Returns:
- the requested data distribution
-
setPartitioning
public void setPartitioning(DataDistribution partitioning)
Sets the data distribution of the output. The output is guaranteed to be distributed in the specified manner.- Parameters:
partitioning
- the requested data distribution
-
getDataOrdering
public DataOrdering getDataOrdering()
Gets the data ordering of the output.- Returns:
- the requested data ordering
-
setDataOrdering
public void setDataOrdering(DataOrdering dataOrdering)
Sets the data ordering of the output. The output is guaranteed to be ordered in the specified manner.- Parameters:
dataOrdering
- the requested data ordering
-
isForceStaging
public boolean isForceStaging()
Indicates whether data will be forcibly staged to disk.- Returns:
- whether staging is required
-
setForceStaging
public void setForceStaging(boolean forceStaging)
Sets whether data must be staged to disk.- Parameters:
forceStaging
- indicates whether the data must be staged
-
getPreserveSortOrder
public boolean getPreserveSortOrder()
If true, data order will be preserved ( provided that the data order is known ). Note that this can be a very expensive operation ( when running distributed it requires a re-sort following redistribution), so in-general this option should be avoided. If it is known that the downstream operations require data to be sorted, though, it may be advantageous to specify this option because, in the pseudo-distributed case, it is cheaper than performing a re-sort. By default, we do not preserve sort order.- Returns:
- whether to preserve sort order.
-
setPreserveSortOrder
public void setPreserveSortOrder(boolean preserveSortOrder)
Sets whether data order will be preserved ( provided that the data order is known ). Note that this can be a very expensive operation ( when running distributed it requires a re-sort following redistribution), so in-general this option should be avoided. If it is known that the downstream operations require data to be sorted, though, it may be advantageous to specify this option because, in the pseudo-distributed case, it is cheaper than performing a re-sort. By default, we do not preserve sort order.- Parameters:
preserveSortOrder
- whether to preserve sort order.
-
computeMetadata
protected void computeMetadata(StreamingMetadataContext ctx)
Description copied from class:DeferredCompositeOperator
Implementations must adhere to all of the contracts specified byStreamingOperator.computeMetadata(com.pervasive.datarush.operators.StreamingMetadataContext)
. In addition, DeferredCompositeOperators must declare required metadata so as to satisfy requirements of the operators that are added duringDeferredCompositeOperator.compose(com.pervasive.datarush.operators.DeferredCompositionContext)
.- Specified by:
computeMetadata
in classDeferredCompositeOperator
- Parameters:
ctx
- the context
-
compose
protected void compose(DeferredCompositionContext ctx)
Description copied from class:DeferredCompositeOperator
Compose the body of this operator. Implementations should do the following:- Instantiate and configure sub-operators, adding them to the provided context via
the method
OperatorComposable.add(O)
- Create necessary connections via the method
OperatorComposable.connect(P, P)
. This includes connections from the composite's input ports to sub-operators, connections between sub-operators, and connections from sub-operators output ports to the composite's output ports
- Specified by:
compose
in classDeferredCompositeOperator
- Parameters:
ctx
- the context
- Instantiate and configure sub-operators, adding them to the provided context via
the method
-
-