public class TextFrequencyFilter extends ExecutableOperator implements RecordPipelineOperator
Constructor | Description |
---|---|
TextFrequencyFilter() | Default constructor. |
TextFrequencyFilter(String inputField) | Constructor specifying the input field containing the frequency map that will be filtered. |
TextFrequencyFilter(String inputField, int totalNumber) | Constructor specifying the input field containing the frequency map and setting the total number of top frequencies to keep. |
TextFrequencyFilter(String inputField, int min, int max) | Constructor specifying the input field containing the frequency map and setting the minimum and maximum thresholds of frequencies to keep. |
Modifier and Type | Method | Description |
---|---|---|
protected void | computeMetadata(StreamingMetadataContext ctx) | Implementations must adhere to the following contracts |
protected void | execute(ExecutionContext ctx) | Executes the operator. |
RecordPort | getInput() | Get the input port of this operator. |
String | getInputField() | Get the frequency map field to filter. |
int | getMaxThreshold() | Get the maximum threshold for absolute frequencies when filtering. |
int | getMinThreshold() | Get the minimum threshold for absolute frequencies when filtering. |
RecordPort | getOutput() | Get the output port of this operator. |
String | getOutputField() | Get the output field that will contain the filtered frequency map. |
int | getTotalNumber() | Get the total number of top frequencies to keep. |
void | setInputField(String inputField) | Set the frequency map field to filter. |
void | setMaxThreshold(int maxThreshold) | Set the maximum threshold for absolute frequencies when filtering. |
void | setMinThreshold(int minThreshold) | Set the minimum threshold for absolute frequencies when filtering. |
void | setOutputField(String outputField) | Set the output field that will contain the filtered frequency map. |
void | setTotalNumber(int totalNumber) | Set the total number of top frequencies to keep. |
cloneForExecution, getNumInputCopies, getPortSettings, handleInactiveOutput
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
disableParallelism, getInputPorts, getOutputPorts
public TextFrequencyFilter()
Default constructor. Use setInputField(String) to set the name of the frequency map field.

public TextFrequencyFilter(String inputField)
inputField - name of the frequency map field to filter

public TextFrequencyFilter(String inputField, int totalNumber)
inputField - name of the frequency map field to filter
totalNumber - number of highest frequencies to keep

public TextFrequencyFilter(String inputField, int min, int max)
inputField - name of the frequency map field to filter
min - lowest frequency to keep
max - highest frequency to keep

public void setInputField(String inputField)
Set the frequency map field to filter. If this field does not exist in the input, or is not of type WordMap or NGramMap, an exception will be thrown at composition time.
inputField - name of the frequency map field to filter

public String getInputField()
Get the frequency map field to filter.

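public void setOutputField(String outputField)
Set the output field that will contain the filtered frequency map.
outputField - name of the frequency map output field

public String getOutputField()
Get the output field that will contain the filtered frequency map.
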
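public void setMinThreshold(int minThreshold)
Set the minimum threshold for absolute frequencies when filtering.
minThreshold - minimum frequency to keep

public int getMinThreshold()
Get the minimum threshold for absolute frequencies when filtering.
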
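public void setMaxThreshold(int maxThreshold)
Set the maximum threshold for absolute frequencies when filtering.
maxThreshold - maximum frequency to keep

public int getMaxThreshold()
Get the maximum threshold for absolute frequencies when filtering.
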
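The effect of the minimum and maximum thresholds can be illustrated with a small plain-Java sketch over an ordinary `Map<String, Integer>`. This is not the operator's implementation and does not use the DataFlow API: the class `FrequencyThresholdSketch` and the method `filterByThreshold` are hypothetical names, and treating both bounds as inclusive is an assumption that this page does not confirm.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java illustration only; a hypothetical stand-in, not the DataFlow operator. */
final class FrequencyThresholdSketch {

    /**
     * Returns the entries whose absolute frequency lies between min and max.
     * Assumption: both bounds are inclusive.
     */
    static Map<String, Integer> filterByThreshold(Map<String, Integer> freq, int min, int max) {
        Map<String, Integer> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : freq.entrySet()) {
            int count = e.getValue();
            if (count >= min && count <= max) {
                kept.put(e.getKey(), count); // preserve the input order of surviving entries
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Integer> freq = new LinkedHashMap<>();
        freq.put("the", 120);     // too frequent, dropped by max = 100
        freq.put("dataflow", 7);  // kept
        freq.put("typo", 1);      // too rare, dropped by min = 2
        System.out.println(filterByThreshold(freq, 2, 100)); // prints {dataflow=7}
    }
}
```

In this sketch a high maximum removes very common tokens (stop-word-like entries) and a low minimum removes one-off tokens, which is the usual motivation for filtering a frequency map on both sides.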
public void setTotalNumber(int totalNumber)
Set the total number of top frequencies to keep.
totalNumber - number of top frequencies to keep

public int getTotalNumber()
Get the total number of top frequencies to keep.

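Keeping a total number of top frequencies can be pictured with the following plain-Java sketch. It is an illustration under stated assumptions, not the operator's code: `TopFrequenciesSketch` and `keepTop` are hypothetical names, the sketch keeps the N entries with the highest counts (rather than the N highest distinct count values), and ties are broken by input order. This page does not say how the operator resolves either point.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration only; a hypothetical stand-in, not the DataFlow operator. */
final class TopFrequenciesSketch {

    /**
     * Returns the totalNumber entries with the highest counts, ordered from
     * most to least frequent. Assumption: ties keep their input order,
     * which holds here because List.sort is stable.
     */
    static Map<String, Integer> keepTop(Map<String, Integer> freq, int totalNumber) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(freq.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed());

        // Clamp the limit so negative or oversized values do not throw.
        int limit = Math.min(Math.max(totalNumber, 0), entries.size());
        Map<String, Integer> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries.subList(0, limit)) {
            kept.put(e.getKey(), e.getValue());
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Integer> freq = new LinkedHashMap<>();
        freq.put("typo", 1);
        freq.put("the", 120);
        freq.put("dataflow", 7);
        freq.put("of", 80);
        System.out.println(keepTop(freq, 2)); // prints {the=120, of=80}
    }
}
```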
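public RecordPort getInput()
Get the input port of this operator.
Specified by: getInput in interface PipelineOperator<RecordPort>

public RecordPort getOutput()
Get the output port of this operator.
Specified by: getOutput in interface PipelineOperator<RecordPort>
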
protected void computeMetadata(StreamingMetadataContext ctx)
Description copied from class: StreamingOperator
Implementations must adhere to the following contracts:
- Parallelizability: declare it by calling StreamingMetadataContext.parallelize(ParallelismStrategy).
- Required data ordering: declare it by calling RecordPort#setRequiredDataOrdering, otherwise data may arrive in any order.
- Required data distribution: declare it by calling RecordPort#setRequiredDataDistribution, otherwise data will arrive in an unspecified partial distribution.
- Upstream ordering and distribution: these can be queried by calling RecordPort#getSourceDataDistribution and RecordPort#getSourceDataOrdering. These should be viewed as hints to help choose a more efficient algorithm. In such cases, though, operators must still declare data ordering and data distribution requirements; otherwise there is no guarantee that data will arrive sorted/distributed as required.
- Output type: declare it by calling RecordPort#setType.
- Output data ordering and distribution: declare them by calling RecordPort#setOutputDataOrdering and RecordPort#setOutputDataDistribution.
- Merge handler: declare it by calling AbstractModelPort#setMergeHandler. MergeModel is a convenient, re-usable model reducer, parameterized with a merge-handler.
- Output model ports: SimpleModelPorts have no associated metadata and therefore there is never any output metadata to declare. PMMLPorts, on the other hand, do have associated metadata. For all PMMLPorts, implementations must declare the PMML model spec by calling PMMLPort.setPMMLModelSpec.
computeMetadata in class StreamingOperator
ctx - the context

protected void execute(ExecutionContext ctx)
Description copied from class: ExecutableOperator
Executes the operator.
execute in class ExecutableOperator
ctx - context in which to look up physical ports bound to logical ports

Copyright © 2024 Actian Corporation. All rights reserved.