Class LogicalStatistics

java.lang.Object
com.pervasive.datarush.graphs.LogicalStatistics

public final class LogicalStatistics extends Object
Miscellaneous utilities and constants associated with the LogicalStatistic class.
  • Field Details

    • CATEGORY_PLAN

      public static final String CATEGORY_PLAN
      The name of the category for static planning statistics
    • CATEGORY_ROW_COUNT

      public static final String CATEGORY_ROW_COUNT
      The name of the category for row count statistics
    • STAGED_DATA_SETS

      public static final CounterDefinition STAGED_DATA_SETS
      This is a planning statistic that indicates the number of staged datasets. It is defined on connections. Each connection reports a value of 1 if staging was required or a value of 0 if not. When aggregated up to the graph level, this provides the total number of datasets that required staging.
    • REDISTRIBUTED_DATA_SETS

      public static final CounterDefinition REDISTRIBUTED_DATA_SETS
      This is a planning statistic that indicates the number of redistributed, scattered, or gathered datasets. It is defined on connections. Each connection reports a value of 1 if redistribution was required or a value of 0 if not. When aggregated up to the graph level, this provides the total number of datasets that required redistribution, scattering, or gathering.
    • AUTO_SORTED_DATA_SETS

      public static final CounterDefinition AUTO_SORTED_DATA_SETS
      This is a planning statistic that indicates the number of datasets that were automatically sorted. It is defined on connections. Each connection reports a value of 1 if an automatic sort was required or a value of 0 if not. When aggregated up to the graph level, this provides the total number of datasets that required an automatic sort.
    • NON_PARALLEL_OPERATIONS

      public static final CounterDefinition NON_PARALLEL_OPERATIONS
      This is a planning statistic that indicates the number of non-parallel operations. It is defined on operators. Each operator reports a value of 1 if it must run non-parallel or a value of 0 if it runs in parallel. When aggregated up to the graph level, this provides the total number of non-parallel operators. Note that this count only includes executable operators or deferred composite operators.
    • WRITE_ROW_COUNT

      public static CounterDefinition WRITE_ROW_COUNT
      A connection statistic that provides the number of rows staged (usually due to repartitioning in a cluster or due to forced staging). The counter will be StatisticState.FINISHED when the port reaches end-of-data. This statistic aggregates by summation. Use the utility method aggregateMinOfSumsOfCounts(List) to compute the slowest writer across a set of ports underlying a composite port. This provides a simple measure of progress across partitions and across children of the composite.
    • READ_ROW_COUNT

      public static CounterDefinition READ_ROW_COUNT
      A connection statistic that provides the number of rows read by the downstream operator. The counter will be StatisticState.FINISHED when the port reaches end-of-data. This statistic aggregates by summation. Use the utility method aggregateMinOfSumsOfCounts(List) to compute the slowest writer across a set of ports underlying a composite port. This provides a simple measure of progress across partitions and across children of the composite.
    • SORT_ROW_COUNT

      public static CounterDefinition SORT_ROW_COUNT
      A connection statistic that provides the number of rows implicitly sorted due to metadata mismatch. The counter will be StatisticState.FINISHED when the port reaches end-of-data. This statistic aggregates by summation. Use the utility method aggregateMinOfSumsOfCounts(List) to compute the slowest writer across a set of ports underlying a composite port. This provides a simple measure of progress across partitions and across children of the composite.
    • ROW_COUNTS

      public static final List<CounterDefinition> ROW_COUNTS
      A list consisting of the various row count statistics:
      1. WRITE_ROW_COUNT
      2. SORT_ROW_COUNT
      3. READ_ROW_COUNT
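
The planning statistics above are 0/1 indicators that are summed when aggregated to the graph level. The following is a minimal, library-independent sketch of that summation. The class `PlanningIndicatorSketch`, the method `graphTotal`, and the connection names are hypothetical and are not part of the DataRush API; a plain map stands in for the per-connection values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of com.pervasive.datarush.
class PlanningIndicatorSketch {

    // Sums per-connection 0/1 indicators into a single graph-level total.
    static long graphTotal(Map<String, Integer> perConnection) {
        long total = 0;
        for (int indicator : perConnection.values()) {
            total += indicator;
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed example values: 1 means the connection required staging, 0 means it did not.
        Map<String, Integer> staged = new LinkedHashMap<>();
        staged.put("reader.output -> sort.input", 0);
        staged.put("sort.output -> join.left", 1);
        staged.put("lookup.output -> join.right", 1);

        // Two of the three connections required staging.
        System.out.println(graphTotal(staged)); // prints 2
    }
}
```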
  • Method Details

    • filter

      public static List<LogicalStatistic> filter(List<LogicalStatistic> stats, StatisticDefinition<?> definition)
      Utility method to select all statistics that match the given definition.
      Parameters:
      stats - the original statistics
      definition - the type of statistics
      Returns:
      those statistics that match the given filter
    • filterSuppliers

      public static <T extends StatisticSupplier> List<T> filterSuppliers(List<LogicalStatistic> stats, StatisticDefinition<T> definition)
      Utility method to select all statistics suppliers that match the given definition.
      Type Parameters:
      T - the type of StatisticSupplier
      Parameters:
      stats - the original statistics
      definition - the type of statistics
      Returns:
      those statistics suppliers that match the given definition
    • statistics

      public static List<RuntimeStatistic<?>> statistics(List<LogicalStatistic> logicalStatistics)
      Utility method to extract the LogicalStatistic.statistic() value from each element of the specified list.
      Parameters:
      logicalStatistics - the list of logical statistics
      Returns:
      the runtime statistics
    • aggregate

      public static StatisticsMap aggregate(List<LogicalStatistic> details)
      Performs aggregation across logical statistics using the standard aggregation defined for each StatisticDefinition. Statistics are first grouped by definition and then aggregated.
      Parameters:
      details - the list of statistics to aggregate
      Returns:
      the aggregated statistics
    • aggregateMinOfSumsOfCounts

      public static StatisticsMap aggregateMinOfSumsOfCounts(List<LogicalStatistic> details)
      Performs aggregation across the given list of statistics for the special case of monitoring counter progress for a composite port. Statistics are first grouped by LogicalStatistic.definition() and LogicalStatistic.path() and aggregated by summation. The resulting sums are then grouped by LogicalStatistic.definition() and aggregated by taking the minimum. The end result is a mapping from definition to value with a single value per statistic type.
      Parameters:
      details - the non-aggregated statistics.
      Returns:
      a mapping from definition to value.
    • aggregate

      public static StatisticsMap aggregate(LogicalGraphInstanceView graph)
      Performs aggregation across all statistics within a graph.
      Parameters:
      graph - the graph
      Returns:
      a mapping from definition to value.
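
The two-step aggregation described for aggregateMinOfSumsOfCounts (sum per definition and path, then minimum per definition) can be sketched without the library as follows. This is an illustration of the arithmetic only, not the actual implementation: the `Stat` class is a hypothetical stand-in for LogicalStatistic, with the definition and path reduced to plain strings and the counter value to a `long`.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of com.pervasive.datarush.
class MinOfSumsSketch {

    // Stand-in for LogicalStatistic: definition name, path, and counter value.
    static final class Stat {
        final String definition;
        final String path;
        final long value;

        Stat(String definition, String path, long value) {
            this.definition = definition;
            this.path = path;
            this.value = value;
        }
    }

    static Map<String, Long> minOfSums(List<Stat> details) {
        // Step 1: group by (definition, path) and aggregate by summation,
        // for example across the partitions of one underlying port.
        Map<String, Map<String, Long>> sums = new TreeMap<>();
        for (Stat s : details) {
            sums.computeIfAbsent(s.definition, k -> new TreeMap<>())
                .merge(s.path, s.value, Long::sum);
        }
        // Step 2: group by definition and take the minimum of the per-path sums,
        // which leaves one value per statistic type.
        Map<String, Long> result = new TreeMap<>();
        for (Map.Entry<String, Map<String, Long>> e : sums.entrySet()) {
            result.put(e.getKey(), Collections.min(e.getValue().values()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed example: two child ports, each with two partitions.
        List<Stat> details = List.of(
            new Stat("readRowCount", "child1", 100),
            new Stat("readRowCount", "child1", 150),
            new Stat("readRowCount", "child2", 90),
            new Stat("readRowCount", "child2", 120));

        // child1 sums to 250 and child2 to 210, so the minimum is 210.
        System.out.println(minOfSums(details)); // prints {readRowCount=210}
    }
}
```

Taking the minimum of the per-path sums means the reported value tracks the child that has made the least progress, which is why the result works as a conservative progress measure for a composite port.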