Class DynamicRangeDataDistribution


  • public final class DynamicRangeDataDistribution
    extends PartialDynamicDataDistribution
    A distribution where data is range-partitioned by a selected array of keys. Ranges are computed dynamically: the data is sampled, and the split points are set to evenly spaced quantiles of the sample so that the data is roughly evenly distributed across the ranges.
    See Also:
    DataDistribution
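    The sampling-and-quantile approach described above can be illustrated with a minimal standalone sketch. It is not this library's implementation: the class RangeSplitSketch and its methods splitPoints and partitionOf are hypothetical names, keys are simplified to single int values, and the whole sample is assumed to fit in memory.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of the documented API.
public class RangeSplitSketch {

    /** Returns numPartitions - 1 split points taken at evenly spaced quantiles of the sample. */
    public static int[] splitPoints(int[] sample, int numPartitions) {
        if (sample.length == 0 || numPartitions < 1) {
            throw new IllegalArgumentException("need a non-empty sample and at least one partition");
        }
        int[] sorted = sample.clone();
        Arrays.sort(sorted);
        int[] splits = new int[numPartitions - 1];
        for (int i = 1; i < numPartitions; i++) {
            // Position of the i-th quantile boundary within the sorted sample.
            int idx = (int) ((long) i * sorted.length / numPartitions);
            splits[i - 1] = sorted[Math.min(idx, sorted.length - 1)];
        }
        return splits;
    }

    /** Returns the index of the range a key falls into: the number of split points <= key. */
    public static int partitionOf(int key, int[] splits) {
        int p = 0;
        while (p < splits.length && key >= splits[p]) {
            p++;
        }
        return p;
    }

    public static void main(String[] args) {
        int[] sample = {9, 1, 5, 3, 7, 2, 8, 4, 6, 10, 12, 11};
        int[] splits = splitPoints(sample, 3);
        System.out.println(Arrays.toString(splits)); // prints [5, 9]
        System.out.println(partitionOf(7, splits));  // prints 1
    }
}
```

    With a sample of the values 1 through 12 and three partitions, the split points fall at 5 and 9, so each range receives four of the sampled keys. How evenly the full data set is spread depends on how representative the sample is.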
    • Constructor Detail

      • DynamicRangeDataDistribution

        public DynamicRangeDataDistribution(List<String> keys)
        Creates a range distribution for a list of range keys.
        Parameters:
        keys - the range keys
      • DynamicRangeDataDistribution

        public DynamicRangeDataDistribution(String... keys)
        Creates a range distribution for the given range keys.
        Parameters:
        keys - the range keys
    • Method Detail

      • isGroupedBy

        public boolean isGroupedBy(String[] keys)
        Returns true if this range distribution exactly matches the specified list of keys.
        Parameters:
        keys - the range keys
        Returns:
        whether this distribution exactly matches the specified list of keys
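        The difference between an exact match and a looser one (such as a prefix or subset of the keys) can be shown with a small standalone sketch. The class ExactKeyMatchSketch and its method exactlyMatches are hypothetical, and treating key order as significant is an assumption; the description above does not say whether order matters.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the documented isGroupedBy implementation.
public class ExactKeyMatchSketch {

    /**
     * Assumed semantics: both arrays must contain the same keys,
     * in the same order, with nothing extra on either side.
     */
    public static boolean exactlyMatches(String[] distributionKeys, String[] requestedKeys) {
        return Arrays.equals(distributionKeys, requestedKeys);
    }

    public static void main(String[] args) {
        String[] rangeKeys = {"region", "date"};
        System.out.println(exactlyMatches(rangeKeys, new String[] {"region", "date"})); // prints true
        System.out.println(exactlyMatches(rangeKeys, new String[] {"region"}));         // prints false
    }
}
```

        Under this assumption, a request for only a prefix of the range keys does not count as a match.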
      • getKeys

        public String[] getKeys()
        Returns the keys by which the data is range-partitioned.
        Returns:
        the keys by which the data is range-partitioned
      • getAliases

        public AliasSet[] getAliases()
        Description copied from class: DataDistribution
        Returns the fields that are referenced by this distribution. A distribution may validly reference no fields, in which case it should return an empty array. The framework uses this method to validate that the distribution is consistent with the type of the record.
        Specified by:
        getAliases in class DataDistribution
        Returns:
        the fields that are referenced by this distribution