public final class DeriveFields extends AbstractExecutableRecordPipeline
Applying multiple functions to an input record flow within a single dataflow process can be more efficient than applying each function in its own process. Several factors contribute, chiefly avoiding processor cache thrashing, saving data copies, and reducing thread context switching.
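As a rough illustration of the single-process idea (plain Java, not the DataFlow API), the sketch below visits each record once and applies every function to it, instead of running one pass per function. The `FusedPass` and `NamedFunction` classes, the map-based record representation, and the field names are all hypothetical stand-ins chosen for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleFunction;

// Hypothetical stand-in for illustration; none of these names come from DataFlow.
final class FusedPass {

    // Pairs an output field name with the function that computes it.
    static final class NamedFunction {
        final String field;
        final ToDoubleFunction<Map<String, Double>> fn;

        NamedFunction(String field, ToDoubleFunction<Map<String, Double>> fn) {
            this.field = field;
            this.fn = fn;
        }
    }

    // Visits each record once and applies every function to it, rather than
    // making one full pass (and one intermediate copy) per function.
    static List<Map<String, Double>> applyAll(List<Map<String, Double>> records,
                                              List<NamedFunction> functions) {
        List<Map<String, Double>> out = new ArrayList<>();
        for (Map<String, Double> in : records) {
            Map<String, Double> result = new LinkedHashMap<>(in);
            for (NamedFunction f : functions) {
                result.put(f.field, f.fn.applyAsDouble(in));
            }
            out.add(result);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Double>> records = List.of(Map.of("a", 2.0, "b", 3.0));
        List<NamedFunction> functions = List.of(
                new NamedFunction("sum", r -> r.get("a") + r.get("b")),
                new NamedFunction("product", r -> r.get("a") * r.get("b")));
        System.out.println(applyAll(records, functions));
    }
}
```

In this model each function reads the original input record, so the functions do not see each other's results; whether the real operator behaves the same way is not stated here.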
input, output
| Constructor and Description |
|---|
| DeriveFields(): Applies no functions to the input records. |
| DeriveFields(FieldDerivation... derivations): Applies the specified derivations to all input records. |
| DeriveFields(List<FieldDerivation> derivations): Applies the specified derivations to all input records. |
| DeriveFields(List<FieldDerivation> derivations, boolean dropUnderived): Applies the specified derivations to all input records. |
| DeriveFields(String derivationExpression): Applies the specified derivations to all input records. |
| DeriveFields(String derivationExpression, boolean dropUnderived): Applies the specified derivations to all input records. |
| Modifier and Type | Method and Description |
|---|---|
| protected void | computeMetadata(StreamingMetadataContext ctx): Implementations must adhere to the following contracts |
| protected void | execute(ExecutionContext ctx): Executes the operator. |
| List<FieldDerivation> | getDerivedFields(): Get the list of derivations that will be applied. |
| boolean | getDropUnderivedFields(): Indicates whether input fields are dropped from the output. |
| RecordPort | getInput(): Gets the record port providing the input data to the operation. |
| RecordPort | getOutput(): Gets the record port providing the output from the operation. |
| void | setDerivedFields(FieldDerivation... derivations): Set the list of field derivations to apply. |
| void | setDerivedFields(List<FieldDerivation> derivations): Set the list of field derivations to apply. |
| void | setDerivedFields(String derivationExpression): Set the list of field derivations to apply, using a field derivation expression. |
| void | setDropUnderivedFields(boolean dropUnderived): Set whether input fields are dropped from the output. |
cloneForExecution, getNumInputCopies, getPortSettings, handleInactiveOutput
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
disableParallelism, getInputPorts, getOutputPorts
public DeriveFields()
Applies no functions to the input records. Use setDerivedFields(FieldDerivation...) to set the functions to apply.

public DeriveFields(String derivationExpression)
Applies the specified derivations to all input records.
Parameters:
derivationExpression - the expression containing field derivations to apply

public DeriveFields(List<FieldDerivation> derivations)
Applies the specified derivations to all input records.
Parameters:
derivations - the field derivations to apply

public DeriveFields(FieldDerivation... derivations)
Applies the specified derivations to all input records.
Parameters:
derivations - the field derivations to apply

public DeriveFields(String derivationExpression, boolean dropUnderived)
Applies the specified derivations to all input records.
Parameters:
derivationExpression - the expression containing field derivations to apply
dropUnderived - true if input fields should be dropped; false otherwise

public DeriveFields(List<FieldDerivation> derivations, boolean dropUnderived)
Applies the specified derivations to all input records.
Parameters:
derivations - the field derivations to apply
dropUnderived - true if input fields should be dropped; false otherwise

public RecordPort getInput()
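Description copied from class: AbstractExecutableRecordPipeline
Gets the record port providing the input data to the operation.
Specified by: getInput in interface PipelineOperator<RecordPort>
Overrides: getInput in class AbstractExecutableRecordPipeline

public RecordPort getOutput()
Description copied from class: AbstractExecutableRecordPipeline
Gets the record port providing the output from the operation.
Specified by: getOutput in interface PipelineOperator<RecordPort>
Overrides: getOutput in class AbstractExecutableRecordPipeline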
public void setDerivedFields(String derivationExpression)
Set the list of field derivations to apply, using a field derivation expression.
Parameters:
derivationExpression - the field derivations to apply, given as an expression

public void setDerivedFields(List<FieldDerivation> derivations)
Set the list of field derivations to apply. See FieldDerivation.derive(String, ScalarValuedFunction). If multiple derivations apply to an output field, the last one defined is used.
Parameters:
derivations - the field derivations to apply

public void setDerivedFields(FieldDerivation... derivations)
Set the list of field derivations to apply. See FieldDerivation.derive(String, ScalarValuedFunction). If multiple derivations apply to an output field, the last one defined is used.
Parameters:
derivations - the field derivations to apply

public List<FieldDerivation> getDerivedFields()
Get the list of derivations that will be applied.
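The rule stated for setDerivedFields, that the last derivation defined for an output field is the one used, behaves like later entries overwriting earlier ones in a map keyed by field name. The sketch below models only that rule in plain Java; the `LastDerivationWins` and `Derivation` classes are hypothetical stand-ins and are not the DataFlow `FieldDerivation` type.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleFunction;

// Hypothetical model of the "last one defined is used" rule; not DataFlow code.
final class LastDerivationWins {

    // Stand-in for a field derivation: an output field name plus a function.
    static final class Derivation {
        final String outputField;
        final ToDoubleFunction<Map<String, Double>> fn;

        Derivation(String outputField, ToDoubleFunction<Map<String, Double>> fn) {
            this.outputField = outputField;
            this.fn = fn;
        }
    }

    // Keeps one derivation per output field; a later entry replaces an earlier one.
    static Map<String, Derivation> resolve(List<Derivation> derivations) {
        Map<String, Derivation> byField = new LinkedHashMap<>();
        for (Derivation d : derivations) {
            byField.put(d.outputField, d);
        }
        return byField;
    }

    public static void main(String[] args) {
        List<Derivation> derivations = List.of(
                new Derivation("score", r -> r.get("x") + 1.0),
                new Derivation("score", r -> r.get("x") * 10.0));
        Derivation effective = resolve(derivations).get("score");
        // The second definition is the one kept: 4.0 * 10.0
        System.out.println(effective.fn.applyAsDouble(Map.of("x", 4.0))); // prints 40.0
    }
}
```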
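public boolean getDropUnderivedFields()
Indicates whether input fields are dropped from the output.

public void setDropUnderivedFields(boolean dropUnderived)
Set whether input fields are dropped from the output. If true, only derived fields are included in the output. This value is false by default.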
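The effect of this flag on the shape of an output record can be modelled in a few lines of plain Java. This is only a sketch of the documented behavior: `DropUnderivedSketch` and `buildOutput` are hypothetical names, and the handling of a derived field that shares a name with an input field is an assumption of the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the dropUnderived flag; not DataFlow code.
final class DropUnderivedSketch {

    // Builds an output record from an input record and its derived values.
    // When dropUnderived is false, input fields are carried through as well.
    static Map<String, Double> buildOutput(Map<String, Double> input,
                                           Map<String, Double> derived,
                                           boolean dropUnderived) {
        Map<String, Double> output = new LinkedHashMap<>();
        if (!dropUnderived) {
            output.putAll(input);
        }
        // Assumption of this sketch: a derived value replaces an input
        // field that has the same name.
        output.putAll(derived);
        return output;
    }

    public static void main(String[] args) {
        Map<String, Double> input = Map.of("x", 4.0);
        Map<String, Double> derived = Map.of("y", 8.0);
        System.out.println(buildOutput(input, derived, false).keySet()); // x and y
        System.out.println(buildOutput(input, derived, true).keySet());  // prints [y]
    }
}
```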
Parameters:
dropUnderived - indicates whether to drop input fields from the output

protected void computeMetadata(StreamingMetadataContext ctx)
Description copied from class: StreamingOperator
Implementations must adhere to the following contracts:
- Declare parallelizability by calling StreamingMetadataContext.parallelize(ParallelismStrategy).
- Declare any required data ordering by calling RecordPort#setRequiredDataOrdering; otherwise data may arrive in any order.
- Declare any required data distribution by calling RecordPort#setRequiredDataDistribution; otherwise data will arrive in an unspecified partial distribution.
- Operators may query the upstream distribution and ordering by calling RecordPort#getSourceDataDistribution and RecordPort#getSourceDataOrdering. These should be viewed as hints to help choose a more efficient algorithm. In such cases, though, operators must still declare data ordering and data distribution requirements; otherwise there is no guarantee that data will arrive sorted/distributed as required.
- Declare the output type by calling RecordPort#setType.
- Output data ordering and output data distribution are declared by calling RecordPort#setOutputDataOrdering and RecordPort#setOutputDataDistribution.
- Declare a merge handler by calling AbstractModelPort#setMergeHandler. MergeModel is a convenient, re-usable model reducer, parameterized with a merge-handler.
- SimpleModelPorts have no associated metadata and therefore there is never any output metadata to declare. PMMLPorts, on the other hand, do have associated metadata. For all PMMLPorts, implementations must declare the PMML model spec by calling PMMLPort.setPMMLModelSpec.
Overrides: computeMetadata in class StreamingOperator
Parameters:
ctx - the context

protected void execute(ExecutionContext ctx)
Description copied from class: ExecutableOperator
Executes the operator.
Overrides: execute in class ExecutableOperator
Parameters:
ctx - context in which to look up physical ports bound to logical ports

Copyright © 2020 Actian Corporation. All rights reserved.