public final class CollectRecords extends ExecutableOperator implements RecordSinkOperator, RecordCollector
This should only to be used in contexts where the data is known to be small since all the data will be loaded into memory.
After this operator is executed within a dataflow graph, the resultant
data can be accessed using the getOutput()
method. The result
is a RecordTokenList
. This list supports iteration and so can
be used to peruse each resultant row. The result of invoking the getOutput()
method before execution of the graph completes is undefined
This operator is non-parallel.
Constructor and Description |
---|
CollectRecords()
Collects input data.
|
Modifier and Type | Method and Description |
---|---|
protected ExecutableOperator |
cloneForExecution()
Performs a deep copy of the operator for execution.
|
protected void |
computeMetadata(StreamingMetadataContext ctx)
Implementations must adhere to the following contracts
|
protected void |
execute(ExecutionContext ctx)
Executes the operator.
|
int |
getInitialBufferSize()
Gets the initial size, in tokens, to use for the list
collecting input data.
|
RecordPort |
getInput()
Gets the record port providing the input data to the sink.
|
RecordTokenList |
getOutput()
Gets the collected data.
|
void |
setInitialBufferSize(int size)
Sets the initial size, in tokens, to use for the list
collecting input data.
|
getNumInputCopies, getPortSettings, handleInactiveOutput
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
disableParallelism, getInputPorts, getOutputPorts
public CollectRecords()
public RecordPort getInput()
RecordSinkOperator
getInput
in interface RecordSinkOperator
getInput
in interface SinkOperator<RecordPort>
public int getInitialBufferSize()
public void setInitialBufferSize(int size)
size
- the number of tokens expectedpublic RecordTokenList getOutput()
The results are undefined if this method is called before the containing graph has completed execution.
getOutput
in interface RecordCollector
protected void computeMetadata(StreamingMetadataContext ctx)
StreamingOperator
StreamingMetadataContext.parallelize(ParallelismStrategy)
.
RecordPort#setRequiredDataOrdering
, otherwise data may arrive in any order.
RecordPort#setRequiredDataDistribution
, otherwise data will arrive in an unspecified partial distribution
.
RecordPort#getSourceDataDistribution
and RecordPort#getSourceDataOrdering
. These should be
viewed as a hints to help chose a more efficient algorithm. In such cases, though, operators must
still declare data ordering and data distribution requirements; otherwise there is no guarantee that
data will arrive sorted/distributed as required.
RecordPort#setType
.RecordPort#setOutputDataOrdering
RecordPort#setOutputDataDistribution
AbstractModelPort#setMergeHandler
.MergeModel
is a convenient, re-usable model reducer, parameterized with
a merge-handler.
SimpleModelPort
's have no associated metadata and therefore there is
never any output metadata to declare. PMMLPort
's, on the other hand,
do have associated metadata. For all PMMLPorts, implementations must declare
the following:
PMMLPort.setPMMLModelSpec
.
computeMetadata
in class StreamingOperator
ctx
- the contextprotected ExecutableOperator cloneForExecution()
ExecutableOperator
cloneForExecution
in class ExecutableOperator
protected void execute(ExecutionContext ctx)
ExecutableOperator
execute
in class ExecutableOperator
ctx
- context in which to lookup physical ports bound to logical portsCopyright © 2024 Actian Corporation. All rights reserved.