public final class Sort extends AbstractDeferredRecordOperator
getOutput().
Sort ordering is configurable via the property sortKeys.
A set of key fields must be provided, optionally including the ordering direction for each key.
The specified fields must exist in the input data. If ordering information is omitted, ascending order
is assumed.
null values sort lower than non-null values under ascending order, higher under descending order.
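The key rules above (at least one key, each key naming an existing field, direction optional and ascending when omitted) can be illustrated with a small plain-JDK sketch. It does not use the library: `SortKeySketch`, its nested `Key` class, and the `schema` parameter are hypothetical stand-ins invented for this illustration, and null handling is deliberately left out, so the sample rows contain only non-null values.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Plain-JDK illustration only; none of these names belong to the library.
final class SortKeySketch {

    /** Hypothetical stand-in for a sort key: a field name plus a direction. */
    static final class Key {
        final String field;
        final boolean ascending;

        Key(String field) {
            this(field, true); // direction omitted: ascending is assumed
        }

        Key(String field, boolean ascending) {
            this.field = field;
            this.ascending = ascending;
        }
    }

    /** Returns a copy of the rows sorted by the keys, earlier keys taking priority. */
    static List<Map<String, String>> sort(List<String> schema,
                                          List<Map<String, String>> rows,
                                          Key... keys) {
        if (keys.length == 0) {
            throw new IllegalArgumentException("at least one sort key is required");
        }
        Comparator<Map<String, String>> order = null;
        for (Key key : keys) {
            // Every key must name a field that exists in the input.
            if (!schema.contains(key.field)) {
                throw new IllegalArgumentException("unknown field: " + key.field);
            }
            Comparator<Map<String, String>> byField =
                    Comparator.comparing((Map<String, String> row) -> row.get(key.field));
            if (!key.ascending) {
                byField = byField.reversed();
            }
            order = (order == null) ? byField : order.thenComparing(byField);
        }
        List<Map<String, String>> sorted = new ArrayList<>(rows);
        sorted.sort(order);
        return sorted;
    }
}
```

Calling `SortKeySketch.sort(schema, rows, new SortKeySketch.Key("city"), new SortKeySketch.Key("name", false))` orders rows by `city` ascending and breaks ties by `name` descending.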
Additional parameters may be set on the sort to fine-tune implementation-specific details. The default parameter values are suitable for the majority of uses.
See Also: SortKey, TokenOrder
Modifier and Type | Field | Description |
---|---|---|
static int | DEFAULT_IO_BUFFER_SIZE | File I/O buffers default to 64K. |
static long | DEFAULT_SORT_BUFFER_SIZE | The default sort buffer, 100M. |
static long | SORT_BUFFER_SIZE_MAX | The largest allowable sort buffer, 16G. |
static long | SORT_BUFFER_SIZE_MIN | The smallest allowable sort buffer, 128K. |
input, output
Constructor | Description |
---|---|
Sort() | The default constructor. |
Sort(List<SortKey> keys) | Creates a new instance of Sort, specifying the minimal set of required parameters. |
Sort(SortKey... keys) | Creates a new instance of Sort, specifying the minimal set of required parameters. |
Sort(String... keys) | Creates a new instance of Sort, specifying the minimal set of required parameters. |
Modifier and Type | Method | Description |
---|---|---|
protected void | compose(DeferredCompositionContext ctx) | Compose the body of this operator. |
protected void | computeMetadata(StreamingMetadataContext ctx) | Implementations must adhere to all of the contracts specified by StreamingOperator.computeMetadata(com.pervasive.datarush.operators.StreamingMetadataContext). |
RecordPort | getInput() | Returns the input port. |
long | getIOBufferSize() | Gets the buffer size, in bytes, used for I/O operations on run files. |
int | getMaxMerge() | Gets the maximum number of intermediate result files which should be merged at one time. |
RecordPort | getOutput() | Returns the output port. |
long | getSortBufferSize() | Gets the size of the memory usage target, in bytes, for a sort. |
List<SortKey> | getSortKeys() | Returns the list of sort keys by which we are to sort. |
void | setIOBuffer(String sizeSpecifier) | Sets the size of the memory buffers used for I/O operations on run files during sorting. |
void | setIOBufferSize(long size) | Sets the buffer size, in bytes, used for I/O operations on run files. |
void | setMaxMerge(int maxMerge) | Sets the maximum number of run files to merge at one time. |
void | setSortBuffer(String sizeSpecifier) | Sets an approximate cap on the amount of memory used by the sort. |
void | setSortBufferSize(long size) | Sets the size of the memory usage target, in bytes, for a sort. |
void | setSortKeys(List<SortKey> keys) | Sets the list of sort keys by which we are to sort. |
void | setSortKeys(SortKey... keys) | Sets the list of sort keys by which we are to sort. |
void | setSortKeys(String... keys) | Sets the list of sort keys by which we are to sort. |
String | toString() | |
computeOutputTypes
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
disableParallelism, getInputPorts, getOutputPorts
public static final long SORT_BUFFER_SIZE_MIN
public static final long DEFAULT_SORT_BUFFER_SIZE
public static final long SORT_BUFFER_SIZE_MAX
public static final int DEFAULT_IO_BUFFER_SIZE
public Sort()
public Sort(List<SortKey> keys)
Creates a new instance of Sort, specifying the minimal set of required parameters.
keys - The fields by which to sort.
public Sort(SortKey... keys)
Creates a new instance of Sort, specifying the minimal set of required parameters.
keys - The fields by which to sort.
public RecordPort getInput()
Returns the input port.
getInput in interface PipelineOperator<RecordPort>
getInput in class AbstractDeferredRecordOperator
public RecordPort getOutput()
Returns the output port.
getOutput in interface PipelineOperator<RecordPort>
getOutput in class AbstractDeferredRecordOperator
public void setMaxMerge(int maxMerge)
getIOBufferSize()
bytes.
If not set or set to 0, this first defaults to EngineConfig.Sort.getMaxMerge(). If that is also unspecified or set to 0, this defaults to getSortBufferSize()/getIOBufferSize().
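The fallback order described above can be sketched as a small pure function. This is an illustration of the documented rule, not the library's code: `MaxMergeSketch` and `resolve` are hypothetical names, the engine-level value is passed in as a plain `int` rather than read from `EngineConfig`, and both buffer sizes are assumed to be already resolved to positive byte counts.

```java
// Plain-JDK illustration only; MaxMergeSketch is a hypothetical name.
final class MaxMergeSketch {

    /**
     * Resolves the effective merge width. A value of 0 means "not set" at the
     * operator level and at the engine level.
     */
    static long resolve(int operatorMaxMerge,
                        int engineMaxMerge,
                        long sortBufferSize,
                        long ioBufferSize) {
        if (ioBufferSize <= 0) {
            throw new IllegalArgumentException("I/O buffer size must be positive");
        }
        if (operatorMaxMerge != 0) {
            return operatorMaxMerge;            // explicit setting on the operator wins
        }
        if (engineMaxMerge != 0) {
            return engineMaxMerge;              // otherwise the engine-level setting
        }
        return sortBufferSize / ioBufferSize;   // otherwise derived from the buffer sizes
    }
}
```

Assuming the documented defaults of 100M and 64K are binary multiples (an assumption; the page does not say), the derived value is 104857600 / 65536 = 1600.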
maxMerge - Maximum number of files to merge
public int getMaxMerge()
0 indicates that the cap will be derived from the sort and I/O buffer sizes as described in setMaxMerge(int).
public void setSortBuffer(String sizeSpecifier)
Values are supplied as number strings, supporting an optional size suffix. The common suffixes K, M, and G are supported, having the expected meaning; suffixes are case-insensitive. Omitting the suffix indicates the value is in bytes. Values are limited to the range from 1M to 2G. Values outside of this range will be adjusted to the nearest limit.
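A minimal sketch of how such a size specifier could be parsed and clamped follows. It is not the library's parser: `SizeSpecifierSketch`, `parse`, and `clamp` are hypothetical names, the suffixes are assumed to be binary multiples (K = 1024), and `IllegalArgumentException` stands in for the library's `InvalidPropertyValueException`. The special meaning of 0 ("not specified") is outside the scope of the sketch.

```java
import java.util.Locale;

// Plain-JDK illustration only; SizeSpecifierSketch is a hypothetical name.
final class SizeSpecifierSketch {

    /** Parses strings such as "65536", "64K", "100m" or "2G" into a byte count. */
    static long parse(String spec) {
        String s = spec.trim().toUpperCase(Locale.ROOT); // suffixes are case-insensitive
        if (s.isEmpty()) {
            throw new IllegalArgumentException("empty size specifier");
        }
        long multiplier = 1L; // no suffix: the value is in bytes
        switch (s.charAt(s.length() - 1)) {
            case 'K': multiplier = 1L << 10; break; // assumed binary multiples
            case 'M': multiplier = 1L << 20; break;
            case 'G': multiplier = 1L << 30; break;
            default: break;
        }
        String digits = (multiplier == 1L) ? s : s.substring(0, s.length() - 1).trim();
        try {
            long count = Long.parseLong(digits);
            if (count < 0) {
                throw new IllegalArgumentException("negative size: " + spec);
            }
            return Math.multiplyExact(count, multiplier); // fails on overflow instead of wrapping
        } catch (NumberFormatException | ArithmeticException e) {
            throw new IllegalArgumentException("cannot parse size specifier: " + spec, e);
        }
    }

    /** Adjusts an out-of-range value to the nearest limit. */
    static long clamp(long value, long min, long max) {
        return Math.max(min, Math.min(max, value));
    }
}
```

With the limits stated above, `clamp(parse("512K"), parse("1M"), parse("2G"))` yields the lower limit and `clamp(parse("4g"), parse("1M"), parse("2G"))` yields the upper limit.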
If not specified or value is 0, this first defaults to EngineConfig.Sort.getSortBufferSize(). If that is also unspecified or value is 0, this defaults to DEFAULT_SORT_BUFFER_SIZE.
sizeSpecifier - a string specifying a byte size. Sizes are specified as positive whole numbers with an optional case-insensitive multiplier suffix.
com.pervasive.datarush.graphs.physical.InvalidPropertyValueException - if the size specifier cannot be parsed or specifies a negative size.
public void setSortBufferSize(long size)
size - the approximate memory usage cap for sorting
public long getSortBufferSize()
public void setIOBuffer(String sizeSpecifier)
See setMaxMerge(int) for more details.
Values are supplied as number strings, supporting an optional size suffix. The common suffixes K, M, and G are supported, having the expected meaning; suffixes are case-insensitive. Omitting the suffix indicates the value is in bytes.
If not specified or value is 0, this first defaults to EngineConfig.Sort.getIOBufferSize(). If that is also unspecified or value is 0, this defaults to DEFAULT_IO_BUFFER_SIZE.
sizeSpecifier - a string specifying a byte size. Sizes are specified as positive whole numbers with an optional case-insensitive multiplier suffix.
com.pervasive.datarush.graphs.physical.InvalidPropertyValueException - if the size specifier cannot be parsed or specifies a negative size.
public void setIOBufferSize(long size)
size - Buffer size used for intermediate file operations
public long getIOBufferSize()
public List<SortKey> getSortKeys()
public void setSortKeys(List<SortKey> keys)
keys - the list of sort keys by which we are to sort.
public void setSortKeys(SortKey... keys)
keys - the list of sort keys by which we are to sort.
public void setSortKeys(String... keys)
keys - the list of sort keys by which we are to sort. Ascending order is assumed.
protected void computeMetadata(StreamingMetadataContext ctx)
Implementations must adhere to all of the contracts specified by StreamingOperator.computeMetadata(com.pervasive.datarush.operators.StreamingMetadataContext). In addition, DeferredCompositeOperators must declare required metadata so as to satisfy the requirements of the operators that are added during DeferredCompositeOperator.compose(com.pervasive.datarush.operators.DeferredCompositionContext).
computeMetadata in class DeferredCompositeOperator
ctx - the context
protected void compose(DeferredCompositionContext ctx)
Compose the body of this operator by calling OperatorComposable.add(O) and OperatorComposable.connect(P, P). This includes connections from the composite's input ports to sub-operators, connections between sub-operators, and connections from sub-operators' output ports to the composite's output ports.
compose in class DeferredCompositeOperator
ctx - the context
Copyright © 2020 Actian Corporation. All rights reserved.