Class FullDataDistribution


  • public final class FullDataDistribution
    extends DataDistribution
    An operator may set this as its requiredDataDistribution to indicate that the data must be sent to all nodes in the cluster (or to all threads in the case of pseudo-distributed operation). Use it only in the rare cases where every node in the cluster must examine the entire dataset, and only for data that is "small" (e.g. a small lookup table), since the data must be replicated to every node in the cluster.
    See Also:
    DataDistribution
    • Field Detail

    • Method Detail

      • remap

        public FullDataDistribution remap(FieldRemapping mapping)
        Because FullDataDistribution does not reference any key names, it is not affected by transformations of the record namespace, so this method simply returns a reference to this, unmodified.
        Specified by:
        remap in class DataDistribution
        Parameters:
        mapping - the field remapping.
        Returns:
        this distribution, unmodified
      • getAliases

        public AliasSet[] getAliases()
        Description copied from class: DataDistribution
        Returns the fields that are referenced by this distribution. Note that it is valid for a distribution to reference no fields, in which case it should return an empty array. This method is used by the framework to validate that the distribution is consistent with the type of the record.
        Specified by:
        getAliases in class DataDistribution
        Returns:
        the fields that are referenced by this distribution
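
The behavior described above can be illustrated with a minimal, self-contained sketch. It does not use the library's classes: SketchDistribution, SketchFullDistribution, referencedFields, and replicate are hypothetical stand-ins, and a plain Map of old to new field names stands in for FieldRemapping. The sketch assumes that a distribution with no key fields reports an empty array of referenced fields, and it models "full" distribution as copying every record to every partition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a distribution base class (not the library's API).
abstract class SketchDistribution {
    /** Returns a distribution that is valid after fields are renamed per the mapping. */
    abstract SketchDistribution remap(Map<String, String> mapping);

    /** Returns the names of the fields this distribution references (possibly none). */
    abstract String[] referencedFields();
}

// Hypothetical stand-in for a full (replicate-everywhere) distribution.
final class SketchFullDistribution extends SketchDistribution {
    @Override
    SketchFullDistribution remap(Map<String, String> mapping) {
        // No key fields are referenced, so renaming fields cannot affect this object.
        return this;
    }

    @Override
    String[] referencedFields() {
        // Assumption: a distribution without keys reports an empty array.
        return new String[0];
    }

    /** Models full distribution: each partition (node or thread) gets its own complete copy. */
    static <T> List<List<T>> replicate(List<T> records, int partitions) {
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            result.add(new ArrayList<>(records));
        }
        return result;
    }
}

final class FullDistributionDemo {
    public static void main(String[] args) {
        SketchFullDistribution full = new SketchFullDistribution();
        // Remapping returns the very same instance.
        System.out.println(full.remap(Map.of("id", "customerId")) == full);
        // No fields are referenced.
        System.out.println(full.referencedFields().length);
        // A two-row lookup table is copied in full to each of three partitions.
        System.out.println(SketchFullDistribution.replicate(List.of("US", "DE"), 3));
    }
}
```

Because replicate makes one full copy per partition, total memory use grows with the number of partitions, which is why the class description restricts this distribution to small datasets.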