public class GenerateBagOfWords extends ExecutableOperator implements RecordPipelineOperator
| Constructor | Description |
|---|---|
| GenerateBagOfWords() | Default constructor. |
| GenerateBagOfWords(String textField) | Constructor specifying the tokenized text field for which to generate the bag of words. |
| Modifier and Type | Method | Description |
|---|---|---|
| protected void | computeMetadata(StreamingMetadataContext ctx) | Implementations must adhere to the following contracts |
| protected void | execute(ExecutionContext ctx) | Executes the operator. |
| RecordPort | getInput() | Get the input port of this operator. |
| String | getInputField() | Get the field for which to generate the bag of words. |
| RecordPort | getOutput() | Get the output port of this operator. |
| String | getOutputField() | Get the field that will contain the bag of words. |
| void | setInputField(String textField) | Set the field for which to generate the bag of words. |
| void | setOutputField(String outputField) | Set the field that will contain the bag of words. |
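
To make the term "bag of words" in the summaries above concrete, here is a minimal, self-contained sketch in plain Java. It assumes the usual meaning of the term: each distinct token mapped to its number of occurrences. The class `BagOfWordsSketch` and the method `bagOfWords` are hypothetical names used for illustration only. They are not part of this API, and this page does not state whether the operator's output field stores counts or only the distinct terms.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of the library documented here. */
public class BagOfWordsSketch {

    /** Counts how often each token occurs, keeping first-seen order. */
    public static Map<String, Integer> bagOfWords(List<String> tokens) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : tokens) {
            // Start at 1 for a new token, otherwise add 1 to the existing count.
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-in for the content of one tokenized text field.
        List<String> tokens = Arrays.asList("the", "cat", "saw", "the", "dog");
        System.out.println(bagOfWords(tokens)); // prints {the=2, cat=1, saw=1, dog=1}
    }
}
```

The sketch works on an already tokenized list because the operator likewise expects a tokenized text field as its input.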
Inherited methods:
cloneForExecution, getNumInputCopies, getPortSettings, handleInactiveOutput
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
disableParallelism, getInputPorts, getOutputPorts

public GenerateBagOfWords()
Default constructor. Use setInputField(String) and setOutputField(String) to set the name of the text field to count and the output field.

public GenerateBagOfWords(String textField)
Constructor specifying the tokenized text field for which to generate the bag of words.
Parameters: textField - name of the tokenized text field in the input

public void setInputField(String textField)
Set the field for which to generate the bag of words. If this field does not exist in the input, or is not of type TokenizedText, an exception will be thrown at composition time.
Parameters: textField - name of the tokenized text field in the input

public String getInputField()
Get the field for which to generate the bag of words.

public void setOutputField(String outputField)
Set the field that will contain the bag of words.
Parameters: outputField - name of the term set field in the output

public String getOutputField()
Get the field that will contain the bag of words.

public RecordPort getInput()
Get the input port of this operator.
getInput in interface PipelineOperator<RecordPort>

public RecordPort getOutput()
Get the output port of this operator.
getOutput in interface PipelineOperator<RecordPort>

protected void computeMetadata(StreamingMetadataContext ctx)
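Description copied from class: StreamingOperator
Implementations must adhere to the following contracts:

Parallelizability: implementations must declare it by calling StreamingMetadataContext.parallelize(ParallelismStrategy).

Input record ports: implementations with data ordering requirements must declare them by calling RecordPort#setRequiredDataOrdering, otherwise data may arrive in any order. Implementations with data distribution requirements must declare them by calling RecordPort#setRequiredDataDistribution, otherwise data will arrive in an unspecified partial distribution.
Operators may also consult RecordPort#getSourceDataDistribution and RecordPort#getSourceDataOrdering. These should be viewed as hints to help choose a more efficient algorithm. In such cases, though, operators must still declare data ordering and data distribution requirements; otherwise there is no guarantee that data will arrive sorted/distributed as required.

Output record ports: implementations must declare the output type by calling RecordPort#setType. Output data ordering and output data distribution are declared by calling RecordPort#setOutputDataOrdering and RecordPort#setOutputDataDistribution.

Input model ports: a merge handler is declared by calling AbstractModelPort#setMergeHandler. MergeModel is a convenient, reusable model reducer, parameterized with a merge handler.

Output model ports: SimpleModelPorts have no associated metadata, so there is never any output metadata to declare. PMMLPorts, on the other hand, do have associated metadata. For all PMMLPorts, implementations must declare the PMML model spec by calling PMMLPort.setPMMLModelSpec.

computeMetadata in class StreamingOperator
Parameters: ctx - the context

protected void execute(ExecutionContext ctx)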
Description copied from class: ExecutableOperator
Executes the operator.
execute in class ExecutableOperator
Parameters: ctx - context in which to look up physical ports bound to logical ports

Copyright © 2020 Actian Corporation. All rights reserved.