Class LogicalStatistics


  • public final class LogicalStatistics
    extends Object
    Miscellaneous utilities and constants associated with the LogicalStatistic class.
    • Field Detail

      • CATEGORY_PLAN

        public static final String CATEGORY_PLAN
        The name of the category for static planning statistics.
        See Also:
        Constant Field Values
      • CATEGORY_ROW_COUNT

        public static final String CATEGORY_ROW_COUNT
        The name of the category for row count statistics.
        See Also:
        Constant Field Values
      • STAGED_DATA_SETS

        public static final CounterDefinition STAGED_DATA_SETS
        A planning statistic that indicates the number of staged datasets. It is defined on connections. Each connection reports a value of 1 if staging was required or a value of 0 if not. When aggregated up to the graph level, this provides the total number of datasets that required staging.
      • REDISTRIBUTED_DATA_SETS

        public static final CounterDefinition REDISTRIBUTED_DATA_SETS
        A planning statistic that indicates the number of redistributed, scattered, or gathered datasets. It is defined on connections. Each connection reports a value of 1 if redistribution was required or a value of 0 if not. When aggregated up to the graph level, this provides the total number of datasets that required redistribution, scattering, or gathering.
      • AUTO_SORTED_DATA_SETS

        public static final CounterDefinition AUTO_SORTED_DATA_SETS
        A planning statistic that indicates the number of datasets that were automatically sorted. It is defined on connections. Each connection reports a value of 1 if an automatic sort was required or a value of 0 if not. When aggregated up to the graph level, this provides the total number of datasets that required an automatic sort.
      • NON_PARALLEL_OPERATIONS

        public static final CounterDefinition NON_PARALLEL_OPERATIONS
        A planning statistic that indicates the number of non-parallel operations. It is defined on operators. Each operator reports a value of 1 if it must run non-parallel or a value of 0 if it runs in parallel. When aggregated up to the graph level, this provides the total number of non-parallel operators. Note that this count only includes executable operators and deferred composite operators.
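Because each planning statistic above is a 0-or-1 indicator, the graph-level total is a plain sum of the indicators. The following is a minimal sketch of that idea only. The `Flag` record and the path strings are hypothetical stand-ins, not the library's CounterDefinition or LogicalStatistic types.

```java
import java.util.List;

public class PlanningFlagSketch {
    // Hypothetical stand-in for one planning statistic on a connection or operator.
    record Flag(String path, int value) {}

    // Sum the 0/1 indicators to obtain the graph-level total.
    static int total(List<Flag> flags) {
        return flags.stream().mapToInt(Flag::value).sum();
    }

    public static void main(String[] args) {
        List<Flag> staged = List.of(
                new Flag("connection-1", 1),   // staging was required
                new Flag("connection-2", 0),   // no staging
                new Flag("connection-3", 1));  // staging was required
        System.out.println(total(staged));     // prints 2
    }
}
```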
      • WRITE_ROW_COUNT

        public static CounterDefinition WRITE_ROW_COUNT
        A connection statistic that provides the number of rows staged (usually due to repartitioning in a cluster or due to forced staging). The counter will be StatisticState#FINISHED when the port reaches end-of-data. This statistic aggregates by summation. Use the utility method aggregateMinOfSumsOfCounts(List) to compute the slowest writer across a set of ports underlying a composite port. This provides a simple measure of progress across partitions and across children of the composite.
      • READ_ROW_COUNT

        public static CounterDefinition READ_ROW_COUNT
        A connection statistic that provides the number of rows read by the downstream operator. The counter will be StatisticState#FINISHED when the port reaches end-of-data. This statistic aggregates by summation. Use the utility method aggregateMinOfSumsOfCounts(List) to compute the slowest writer across a set of ports underlying a composite port. This provides a simple measure of progress across partitions and across children of the composite.
      • SORT_ROW_COUNT

        public static CounterDefinition SORT_ROW_COUNT
        A connection statistic that provides the number of rows implicitly sorted due to metadata mismatch. The counter will be StatisticState#FINISHED when the port reaches end-of-data. This statistic aggregates by summation. Use the utility method aggregateMinOfSumsOfCounts(List) to compute the slowest writer across a set of ports underlying a composite port. This provides a simple measure of progress across partitions and across children of the composite.
    • Method Detail

      • filter

        public static List<LogicalStatistic> filter(List<LogicalStatistic> stats,
                                                    StatisticDefinition<?> definition)
        Utility method to select all statistics that match the given definition.
        Parameters:
        stats - the original statistics
        definition - the type of statistics
        Returns:
        those statistics that match the given definition
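The selection described for filter can be pictured as a simple predicate over the list. This is a sketch under stated assumptions: the `Stat` record and the string definition names are hypothetical stand-ins for LogicalStatistic and StatisticDefinition, and the real method may match definitions differently.

```java
import java.util.List;
import java.util.stream.Collectors;

public class FilterSketch {
    // Hypothetical stand-in for LogicalStatistic: definition name, path, and count.
    record Stat(String definition, String path, long count) {}

    // Keep only the statistics whose definition equals the requested one.
    static List<Stat> filter(List<Stat> stats, String definition) {
        return stats.stream()
                .filter(s -> s.definition().equals(definition))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Stat> stats = List.of(
                new Stat("readRowCount", "port-a", 100),
                new Stat("writeRowCount", "port-a", 120),
                new Stat("readRowCount", "port-b", 80));
        System.out.println(filter(stats, "readRowCount").size()); // prints 2
    }
}
```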
      • filterSuppliers

        public static <T extends StatisticSupplier> List<T> filterSuppliers(List<LogicalStatistic> stats,
                                                                            StatisticDefinition<T> definition)
        Utility method to select all statistic suppliers that match the given definition.
        Type Parameters:
        T - the type of StatisticSupplier
        Parameters:
        stats - the original statistics
        definition - the type of statistics
        Returns:
        those statistic suppliers that match the given definition
      • aggregate

        public static StatisticsMap aggregate(List<LogicalStatistic> details)
        Performs aggregation across logical statistics using the standard aggregation defined for each StatisticDefinition. Statistics are first grouped by definition and then aggregated.
        Parameters:
        details - the list of statistics to aggregate
        Returns:
        the aggregated statistics
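The group-then-aggregate behavior described above can be sketched as follows. Everything here is an assumption for illustration: the `Stat` record, the `rules` map, and the `peakQueueDepth` name are hypothetical, and summation is used as the default only because the row count statistics on this page are documented as aggregating by summation.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.LongBinaryOperator;

public class AggregateSketch {
    // Hypothetical stand-in for LogicalStatistic: definition name and value.
    record Stat(String definition, long value) {}

    // Group by definition, combining values with that definition's rule
    // (summation when no rule is registered).
    static Map<String, Long> aggregate(List<Stat> details,
                                       Map<String, LongBinaryOperator> rules) {
        Map<String, Long> result = new TreeMap<>();
        for (Stat s : details) {
            LongBinaryOperator rule = rules.getOrDefault(s.definition(), Long::sum);
            result.merge(s.definition(), s.value(), rule::applyAsLong);
        }
        return result;
    }

    public static void main(String[] args) {
        LongBinaryOperator max = Math::max;
        // Hypothetical rule: "peakQueueDepth" aggregates by maximum.
        Map<String, LongBinaryOperator> rules = Map.of("peakQueueDepth", max);
        List<Stat> details = List.of(
                new Stat("readRowCount", 10),
                new Stat("readRowCount", 32),
                new Stat("peakQueueDepth", 7),
                new Stat("peakQueueDepth", 4));
        System.out.println(aggregate(details, rules));
        // prints {peakQueueDepth=7, readRowCount=42}
    }
}
```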
      • aggregateMinOfSumsOfCounts

        public static StatisticsMap aggregateMinOfSumsOfCounts(List<LogicalStatistic> details)
        Performs aggregation across the given list of statistics for the special case of monitoring counter progress for a composite port. Statistics are first grouped by LogicalStatistic.definition() and LogicalStatistic.path() and aggregated by summation. The resulting sums are then grouped by LogicalStatistic.definition() and aggregated by taking the minimum. The end result is a mapping from definition to value with a single value per statistic type.
        Parameters:
        details - the non-aggregated statistics.
        Returns:
        a mapping from definition to value.
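The two-step "minimum of sums" aggregation described above can be sketched in plain Java. This is an illustration of the arithmetic only. The `Stat` record and the definition and path strings are hypothetical stand-ins for LogicalStatistic.definition() and LogicalStatistic.path(), and the result is a plain map rather than a StatisticsMap.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class MinOfSumsSketch {
    // Hypothetical stand-in for LogicalStatistic: definition name, path, and count.
    record Stat(String definition, String path, long count) {}

    static Map<String, Long> minOfSums(List<Stat> details) {
        // Step 1: group by (definition, path) and sum, e.g. across partitions.
        Map<String, Map<String, Long>> sums = details.stream().collect(
                Collectors.groupingBy(Stat::definition,
                        Collectors.groupingBy(Stat::path,
                                Collectors.summingLong(Stat::count))));
        // Step 2: per definition, take the minimum of the per-path sums.
        Map<String, Long> result = new TreeMap<>();
        sums.forEach((definition, byPath) ->
                result.put(definition, Collections.min(byPath.values())));
        return result;
    }

    public static void main(String[] args) {
        List<Stat> details = List.of(
                new Stat("readRowCount", "child-a", 100),  // child-a, partition 0
                new Stat("readRowCount", "child-a", 150),  // child-a, partition 1
                new Stat("readRowCount", "child-b", 90),   // child-b, partition 0
                new Stat("readRowCount", "child-b", 60));  // child-b, partition 1
        // child-a sums to 250, child-b sums to 150; the slower child gives 150.
        System.out.println(minOfSums(details)); // prints {readRowCount=150}
    }
}
```

The minimum is what makes this a progress measure: the composite port can only be as far along as its slowest underlying port.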
      • aggregate

        public static StatisticsMap aggregate(LogicalGraphInstanceView graph)
        Performs aggregation across all statistics within a graph.
        Parameters:
        graph - the graph
        Returns:
        a mapping from definition to value.