Class EngineConfig.Ports

  • Enclosing class:
    EngineConfig

    public static final class EngineConfig.Ports
    extends Object
    Nested class containing settings specific to ports.
    • Field Detail

      • SPOOL_THRESHOLD

        public static final EngineProperty<Integer> SPOOL_THRESHOLD
        Property specifying the number of unread batches which triggers spooling mode for the queue. By default, this is automatically determined.
      • SIZE_BY_READERS

        public static final EngineProperty<Boolean> SIZE_BY_READERS
        Property controlling whether queues are initially sized based on the number of readers. By default, they are.
      • WRITEAHEAD

        public static final EngineProperty<Integer> WRITEAHEAD
Property specifying the amount of unread data which can be held in queues. By default, this is 2.
      • BATCH_SIZE

        public static final EngineProperty<Integer> BATCH_SIZE
        Property specifying the size of token batches in queues. By default, this is 1024.
    • Method Detail

      • getSpoolThreshold

        public int getSpoolThreshold()
Retrieves the configured threshold at which batches are stored to disk. Queues will only ever store at most this many batches in memory; additional batches will be temporarily written to disk and read back as needed. The default threshold is twice the default parallelism or 16, whichever is greater. Negative values indicate spooling is disabled.
        Returns:
        the maximum number of queued batches to be held in memory
      • spoolThreshold

        public EngineConfig spoolThreshold​(int size)
        Specifies the threshold at which the writer begins writing published batches to disk. After the threshold is reached, the writer never blocks due to unread batches. Batches are only written to disk if the (possibly increased) writeahead limit will be exceeded.

        This setting can be used to place an upper bound on the memory used for data queues between dataflow processes. Performance will possibly degrade, but OutOfMemoryErrors which kill the graph will be avoided.

        Parameters:
        size - the number of unread batches at which overflow spooling behavior is triggered
        Returns:
        a new EngineConfig with the settings modified
        See Also:
        writeahead(int)
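For illustration, a hedged sketch of capping queue memory via spooling. The starting `config` instance is assumed, not taken from this page; only the chained setters are documented above:

```java
// Hypothetical sketch: obtain an EngineConfig from wherever the
// application already has one; only spoolThreshold(int) and
// writeahead(int) below are documented on this page.
EngineConfig tuned = config
    .spoolThreshold(16)  // keep at most 16 unread batches in memory
    .writeahead(4);      // writers may run 4 batches ahead of readers
```

Because each setter returns a new EngineConfig, the original `config` is left unmodified.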
      • noSpooling

        public EngineConfig noSpooling()
Disables the writing of published batches to disk. All batches are kept in memory. This will increase memory usage, possibly causing OutOfMemoryErrors as queues expand, but avoids the performance overhead of disk I/O when memory buffers are full.
        Returns:
        a new EngineConfig with the settings modified
        See Also:
        spoolThreshold(int)
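A hedged sketch of trading memory for I/O performance by disabling spooling, again assuming an existing EngineConfig instance named `config`:

```java
// Hypothetical sketch: keep all queued batches in memory, accepting
// higher memory usage in exchange for avoiding disk I/O.
EngineConfig inMemory = config.noSpooling();
```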
      • isSizeByReaders

        public boolean isSizeByReaders()
        Indicates whether queues will automatically adjust their writeahead limits based on the number of readers.
        Returns:
        true if queues will automatically adjust their sizes
      • sizeByReaders

        public EngineConfig sizeByReaders​(boolean enabled)
Specifies whether the initial writeahead limit for a port should be automatically determined based on the number of readers. Even when enabled, the writeahead limit will be at least the value configured via writeahead(int); the limit is only increased for queues having more readers than that configured limit.

        Enabling this can help reduce contention on queues having a large number of readers.

        Parameters:
enabled - whether to size queues based on their number of readers
        Returns:
        a new EngineConfig with the settings modified
        See Also:
        writeahead(int)
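A sketch of enabling reader-based sizing alongside an explicit floor; the `config` variable is illustrative, and only the two setters are from this page:

```java
// Hypothetical sketch: queues start at a writeahead limit of at least 2,
// but queues with more than 2 readers get a larger initial limit.
EngineConfig fanOut = config
    .writeahead(2)
    .sizeByReaders(true);
```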
      • getWriteahead

        public int getWriteahead()
Retrieves the configured writeahead limit. The writeahead limit determines the number of unread token batches that may be present in a queue at a time. The default is 2.
        Returns:
the configured writeahead limit
      • writeahead

        public EngineConfig writeahead​(int size)
        Specifies the number of unread batches which a port can publish before blocking. A batch is considered unread if there exists any reader dataflow process which has not read it.

        Writeahead allows variation in writer performance to be smoothed out by buffering "fast" writers. Increasing this value can also decrease contention between readers, as access is spread across more batches. However, this also increases memory usage.

        Parameters:
        size - the number of batches
        Returns:
        a new EngineConfig with the settings modified
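The blocking behavior described above can be modeled with a plain bounded queue from the JDK. This is a conceptual illustration only, not the engine's actual queue implementation; the queue capacity stands in for the writeahead limit:

```java
import java.util.concurrent.ArrayBlockingQueue;

// Conceptual model only (not the engine's implementation): a bounded
// queue whose capacity plays the role of the writeahead limit. offer()
// returns false exactly where a real writer would block on unread batches.
public class WriteaheadDemo {
    public static void main(String[] args) {
        int writeahead = 2;                              // default per this page
        ArrayBlockingQueue<int[]> queue = new ArrayBlockingQueue<>(writeahead);

        System.out.println(queue.offer(new int[1024])); // true: batch 1 queued
        System.out.println(queue.offer(new int[1024])); // true: batch 2 queued
        System.out.println(queue.offer(new int[1024])); // false: writer must wait
        queue.poll();                                    // a reader consumes one batch
        System.out.println(queue.offer(new int[1024])); // true: writer proceeds
    }
}
```

Raising the writeahead value in this model simply raises the queue capacity: the writer can run further ahead before it is paused by slow readers.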
      • getBatchSize

        public int getBatchSize()
        Retrieves the configured batch size. The batch size determines the number of tokens that may be contained in a single batch. The default batch size is 1024.
        Returns:
        the configured batch size
      • batchSize

        public EngineConfig batchSize​(int size)
Specifies that the port should publish pushed data in batches of the specified size. Data will not be published to readers until a full batch is ready, end of data is pushed, or the flush() method is called on the port.

Batching reduces overhead due to synchronization between dataflow processes, as they need to synchronize less frequently. However, large batch sizes can also be detrimental, as they increase both memory usage and latency. Batch size should only be modified with care.

        Parameters:
        size - the batch size
        Returns:
        a new EngineConfig with the settings modified
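A sketch of lowering the batch size for a latency-sensitive graph; the value chosen and the `config` variable are illustrative, while the setter and its tradeoffs are documented above:

```java
// Hypothetical sketch: smaller batches publish sooner, reducing
// latency at the cost of more frequent synchronization between
// dataflow processes.
EngineConfig lowLatency = config.batchSize(256);
```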