Class EngineConfig.Ports

java.lang.Object
com.pervasive.datarush.graphs.EngineConfig.Ports
Enclosing class:
EngineConfig

public static final class EngineConfig.Ports extends Object
Nested class containing settings specific to ports.
  • Field Details

    • SPOOL_THRESHOLD

      public static final EngineProperty<Integer> SPOOL_THRESHOLD
      Property specifying the number of unread batches which triggers spooling mode for the queue. By default, this is automatically determined.
    • SIZE_BY_READERS

      public static final EngineProperty<Boolean> SIZE_BY_READERS
      Property controlling whether queues are initially sized based on the number of readers. By default, they are.
    • WRITEAHEAD

      public static final EngineProperty<Integer> WRITEAHEAD
      Property specifying the amount of unread data which can be held in queues. By default, this is 2.
    • BATCH_SIZE

      public static final EngineProperty<Integer> BATCH_SIZE
      Property specifying the size of token batches in queues. By default, this is 1024.
  • Method Details

    • getSpoolThreshold

      public int getSpoolThreshold()
      Retrieves the configured threshold at which batches are stored to disk. Queues will only ever store at most this many batches in memory; additional batches will be temporarily written to disk and read back as needed. The default threshold is twice the default parallelism or 16, whichever is greater. Negative values indicate spooling is disabled.
      Returns:
      the maximum number of queued batches to be held in memory
    • spoolThreshold

      public EngineConfig spoolThreshold(int size)
      Specifies the threshold at which the writer begins writing published batches to disk. After the threshold is reached, the writer never blocks due to unread batches. Batches are only written to disk if the (possibly increased) writeahead limit will be exceeded.

      This setting can be used to place an upper bound on the memory used for data queues between dataflow processes. Performance will possibly degrade, but OutOfMemoryErrors which kill the graph will be avoided.

      Parameters:
      size - the number of unread batches at which overflow spooling behavior is triggered
      Returns:
      a new EngineConfig with the settings modified
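The spooling behavior described above can be illustrated with a small self-contained sketch. This is not the DataRush implementation: the queue, the disk spill (modeled as a counter), and the class name `SpoolingQueueSketch` are all hypothetical, showing only the rule that once the unread-batch count reaches the threshold, further batches overflow instead of blocking the writer.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative model of the spool threshold: at most `spoolThreshold`
// batches are held in memory; further published batches "spool" (here,
// just a counter standing in for disk writes) so the writer never blocks.
public class SpoolingQueueSketch {
    private final int spoolThreshold;           // negative would mean spooling disabled
    private final Deque<int[]> inMemory = new ArrayDeque<>();
    private int spooledBatches;                 // stands in for batches written to disk

    public SpoolingQueueSketch(int spoolThreshold) {
        this.spoolThreshold = spoolThreshold;
    }

    /** Publishes a batch, overflowing to "disk" once the in-memory limit is hit. */
    public void publish(int[] batch) {
        if (spoolThreshold >= 0 && inMemory.size() >= spoolThreshold) {
            spooledBatches++;                   // overflow: writer continues unblocked
        } else {
            inMemory.addLast(batch);
        }
    }

    public int inMemoryCount() { return inMemory.size(); }
    public int spooledCount()  { return spooledBatches; }
}
```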
    • noSpooling

      public EngineConfig noSpooling()
      Disables writers writing published batches to disk. All batches are kept in memory. This will increase memory usage, possibly causing OutOfMemoryErrors when queue expansion is occurring, but avoids the performance overhead of disk I/O when memory buffers are full.
      Returns:
      a new EngineConfig with the settings modified
    • isSizeByReaders

      public boolean isSizeByReaders()
      Indicates whether queues will automatically adjust their writeahead limits based on the number of readers.
      Returns:
      true if queues will automatically adjust their sizes
    • sizeByReaders

      public EngineConfig sizeByReaders(boolean enabled)
      Specifies whether the initial writeahead for a port should be automatically determined based on the number of readers. Even if enabled, the writeahead limit will be at least the value specified; the limit is only increased for queues having more readers than the defined limit.

      Enabling this can help reduce contention on queues having a large number of readers.

      Parameters:
      enabled - whether to enable
      Returns:
      a new EngineConfig with the settings modified
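The sizing rule above reduces to a simple maximum. The helper below is hypothetical (not part of the DataRush API), sketching how an effective initial writeahead might be derived: when sizing by readers is enabled, the limit is raised to the reader count but never lowered below the configured value.

```java
// Hypothetical helper illustrating the rule: with sizeByReaders enabled,
// the initial writeahead is the larger of the configured limit and the
// number of readers; otherwise the configured limit is used as-is.
public class WriteaheadSizing {
    public static int effectiveWriteahead(int configured, int readers, boolean sizeByReaders) {
        return sizeByReaders ? Math.max(configured, readers) : configured;
    }
}
```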
    • getWriteahead

      public int getWriteahead()
      Retrieves the configured queue size. The queue size determines the number of token batches that may be present in a queue at a time. The default queue size is 2.
      Returns:
      the configured queue size
    • writeahead

      public EngineConfig writeahead(int size)
      Specifies the number of unread batches which a port can publish before blocking. A batch is considered unread if there exists any reader dataflow process which has not read it.

      Writeahead allows variation in writer performance to be smoothed out by buffering "fast" writers. Increasing this value can also decrease contention between readers, as access is spread across more batches. However, this also increases memory usage.

      Parameters:
      size - the number of batches
      Returns:
      a new EngineConfig with the settings modified
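As an analogy for the writeahead limit, the port's queue behaves like a bounded queue whose capacity is the writeahead value; a writer must wait once that many batches are unread. The sketch below uses a standard `ArrayBlockingQueue` (not DataRush code), where a failed `offer` marks the point at which a real writer would block.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Analogy for the writeahead limit: a bounded queue with capacity equal to
// the writeahead value. offer(...) returns false exactly where a real
// writer would block waiting for readers to consume a batch.
public class WriteaheadSketch {
    public static boolean[] tryPublishThree(int writeahead) {
        BlockingQueue<int[]> queue = new ArrayBlockingQueue<>(writeahead);
        boolean first  = queue.offer(new int[]{1});   // accepted
        boolean second = queue.offer(new int[]{2});   // accepted: limit reached
        boolean third  = queue.offer(new int[]{3});   // rejected: writer would block
        return new boolean[] { first, second, third };
    }

    public static void main(String[] args) {
        boolean[] results = tryPublishThree(2);        // default writeahead per the docs
        System.out.println(results[0] + " " + results[1] + " " + results[2]);
    }
}
```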
    • getBatchSize

      public int getBatchSize()
      Retrieves the configured batch size. The batch size determines the number of tokens that may be contained in a single batch. The default batch size is 1024.
      Returns:
      the configured batch size
    • batchSize

      public EngineConfig batchSize(int size)
      Specifies that the port should publish pushed data in batches of the specified size. Data will not be published to readers until a full batch is ready, end of data is pushed, or the flush() method is called on the port.

      Batching reduces overhead due to synchronization between dataflow processes, as they need to synchronize less frequently. However, large batch sizes can also be detrimental, as they increase both memory usage and latency. Batch size should only be modified with care.

      Parameters:
      size - the batch size
      Returns:
      a new EngineConfig with the settings modified
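The batching rule above can be sketched as a pure function. This is an illustrative model, not the DataRush implementation: tokens accumulate until a full batch of batchSize tokens is ready, and any trailing partial batch is published at end of data, mirroring an explicit flush().

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of token batching: publish a batch once it is full,
// and flush the final partial batch when the data stream ends.
public class BatchingSketch {
    public static List<List<Integer>> toBatches(List<Integer> tokens, int batchSize) {
        List<List<Integer>> batches = new ArrayList<>();
        List<Integer> current = new ArrayList<>(batchSize);
        for (int token : tokens) {
            current.add(token);
            if (current.size() == batchSize) {   // full batch: publish to readers
                batches.add(current);
                current = new ArrayList<>(batchSize);
            }
        }
        if (!current.isEmpty()) {                // end of data: flush partial batch
            batches.add(current);
        }
        return batches;
    }
}
```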