- java.lang.Object
-
- com.pervasive.datarush.ports.record.MetadataUtil
-
public class MetadataUtil extends Object
Provides utility methods for working with port metadata. Some common operator requirements such as data ordering or data grouping (explained below) have slightly less strict requirements than is expressible withDataOrdering
andDataDistribution
alone. More specifically, it is always possible to express required metadata for these situations, but this may cause sorting or redistribution of data which is not strictly necessary. The utility methods of this encapsulate the logic of applying these "loose" requirements, permitting operators to be more flexible with respect to the incoming data.Data grouping is a data distribution requirement that all records with a specific set of values [k1, k2, ..., kn] for group key fields must be on the same data partition. Typically it also convenient for all records to be adjacent in the dataflow, though this is not required in all cases. The difference between this and data distribution is that in grouping, how the data is spread across the partitions is unimportant. It can be hash partitioned or range partitioned as long as groups are never split across two or more partitions.
- Since:
- 6.1
-
-
Constructor Summary
Constructors Constructor Description MetadataUtil()
-
Method Summary
All Methods Static Methods Concrete Methods Modifier and Type Method Description static KeyDrivenDataDistribution
negotiate(DataDistribution distribution, String... keys)
Negotiates the required data distribution with the specified distribution using the given grouping fields.static DataOrdering
negotiate(DataOrdering ordering, String... keys)
Negotiates the required sort ordering with the specified ordering using the given key fields.static KeyDrivenDataDistribution
negotiateGrouping(MetadataCalculationContext ctx, RecordPort port, SortKey... keys)
Negotiates the required data distribution on the specified port using the given grouping fields with sort ordering.static KeyDrivenDataDistribution
negotiateGrouping(MetadataCalculationContext ctx, RecordPort port, String... keys)
Negotiates the required data ordering and distribution on the specified port using the given grouping fields.static DataOrdering
negotiateOrdering(MetadataCalculationContext ctx, RecordPort port, String... keys)
Negotiates the required data ordering on the specified port using the given key fields.static void
negotiateParallelismBasedOnSourceAssumingParallelizableRecords(IterativeMetadataContext ctx)
Uses the maximum of the input ports' parallelism as the max parallelism of the operator If there are no parallel inputs, this will useInteger.MAX_VALUE
as its parallelism, forcing a "scatter" from the inputs.
-
-
-
Method Detail
-
negotiateOrdering
public static DataOrdering negotiateOrdering(MetadataCalculationContext ctx, RecordPort port, String... keys)
Negotiates the required data ordering on the specified port using the given key fields. No explicit sort ordering is assumed for key fields; any sorting already on the requested keys is acceptable. The returned ordering can be queried to obtain the negotiated sort ordering of key fields.The negotiated ordering will be set as the required ordering for the port as a result of this method.
- Parameters:
ctx
- the context in which port metadata is resolvedport
- the port on which to negotiate data orderingkeys
- the key fields on which data must be sorted- Returns:
- the negotiated data ordering
-
negotiate
public static DataOrdering negotiate(DataOrdering ordering, String... keys)
Negotiates the required sort ordering with the specified ordering using the given key fields. No explicit sort ordering is assumed for key fields; any sorting already on the requested keys is acceptable. The returned ordering can be queried to obtain the negotiated sort ordering of key fields.- Parameters:
ordering
- the source ordering with which to negotiatekeys
- the key fields on which data must be sorted- Returns:
- the negotiated data ordering
-
negotiateGrouping
public static KeyDrivenDataDistribution negotiateGrouping(MetadataCalculationContext ctx, RecordPort port, String... keys)
Negotiates the required data ordering and distribution on the specified port using the given grouping fields. The resulting settings are guaranteed to produce no groups which are split across partitions. Furthermore, all records in the the data flow will be ordered in some way on the group keys, ensuring all records in the group appear sequentially. No explicit sort ordering is assumed for key fields; any sorting already on the requested keys is accepted.The negotiated ordering and distribution will be set as the required ordering and distribution for the port as a result of this method. Query the port for its required ordering using
RecordPort.getRequiredDataOrdering(MetadataContext)
to obtain the negotiated data ordering.- Parameters:
ctx
- the context in which port metadata is resolvedport
- the port on which to negotiate data orderingkeys
- the key fields on which data must be grouped- Returns:
- the negotiated data distribution
-
negotiateGrouping
public static KeyDrivenDataDistribution negotiateGrouping(MetadataCalculationContext ctx, RecordPort port, SortKey... keys)
Negotiates the required data distribution on the specified port using the given grouping fields with sort ordering. The resulting settings are guaranteed to produce no groups which are split across partitions. Furthermore, all records in the the data flow will be ordered as specified on the group keys, ensuring all records in the group appear sequentially.The negotiated ordering and distribution will be set as the required ordering and distribution for the port as a result of this method. The required ordering will be as requested by the provided keys.
- Parameters:
ctx
- the context in which port metadata is resolvedport
- the port on which to negotiate data orderingkeys
- the key fields on which data must be grouped, along with the required ordering- Returns:
- the negotiated data distribution
-
negotiate
public static KeyDrivenDataDistribution negotiate(DataDistribution distribution, String... keys)
Negotiates the required data distribution with the specified distribution using the given grouping fields. The resulting distribution is guaranteed to produce no groups which are split across partitions. The returned distribution can be queried to obtain the negotiated partition key fields.- Parameters:
distribution
- the source distribution with which to negotiate.keys
- the key fields on which data must be group- Returns:
- the negotiated data distribution
-
negotiateParallelismBasedOnSourceAssumingParallelizableRecords
public static void negotiateParallelismBasedOnSourceAssumingParallelizableRecords(IterativeMetadataContext ctx)
Uses the maximum of the input ports' parallelism as the max parallelism of the operator If there are no parallel inputs, this will useInteger.MAX_VALUE
as its parallelism, forcing a "scatter" from the inputs. If this operator has an explicitOperatorSettings#maxParallelism(int)
specified on it or inherited from its parent, this will favor the explicit setting.In addition, record input ports will have their
iterationParallelizable
flags set to true.Finally, record output ports will have their
outputParallelizable
flags set to true.- Parameters:
ctx
- the metadata context
-
-