Class SplittableCompressedFileSplit

java.lang.Object
com.pervasive.datarush.io.FileSplit
com.pervasive.datarush.io.SplittableCompressedFileSplit
All Implemented Interfaces:
DataSplit, Serializable

public class SplittableCompressedFileSplit extends FileSplit
Represents a file split for a compression format that supports splitting. By contrast, CompressedFileSplit handles formats that do not support splitting, where a single split represents the whole file.
  • Constructor Details

    • SplittableCompressedFileSplit

      public SplittableCompressedFileSplit(Path path, long startOffset, long length, CompressionFormat format)
      Constructs a split covering the given byte range of a file stored in a splittable compression format.
      Parameters:
      path - path to the file to read
      startOffset - starting offset of this split within the file
      length - length (in bytes) of the split
      format - compression format (must support splitting)
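
The startOffset and length arguments describe a byte range within the file. As a rough illustration of how such ranges might be produced, the sketch below partitions a file of known size into consecutive ranges. SplitRanges, Range, and partition are hypothetical names that are not part of this API, and the sketch ignores any format-specific boundary rules a real splitter may have to respect, which this page does not describe.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the library: partitions a file into byte ranges. */
final class SplitRanges {

    /** A (startOffset, length) pair, mirroring the constructor's two long arguments. */
    static final class Range {
        final long startOffset;
        final long length;

        Range(long startOffset, long length) {
            this.startOffset = startOffset;
            this.length = length;
        }
    }

    /** Divides [0, fileSize) into consecutive ranges of at most targetSize bytes. */
    static List<Range> partition(long fileSize, long targetSize) {
        if (fileSize < 0 || targetSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and targetSize > 0");
        }
        List<Range> ranges = new ArrayList<>();
        long start = 0;
        while (start < fileSize) {
            // The final range is shortened so it ends exactly at the end of the file.
            long length = Math.min(targetSize, fileSize - start);
            ranges.add(new Range(start, length));
            start += length;
        }
        return ranges;
    }
}
```

Each resulting pair could then be passed as startOffset and length, together with the file's path and compression format, when constructing a split.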
  • Method Details

    • authorize

      public FileSplit authorize(FileClient client)
      Description copied from interface: DataSplit
      Creates an identical split which will use the specified authorization context for access.

      This method is used by clients of the IO APIs which want to provide an alternative to the OS-level authorization inherited from the JVM's execution environment. Data access methods for the split will use the supplied context.

      The authorization context is not a serializable attribute of a data split, as it represents the environment in which the data is accessed, not a property of the data itself. The context is associated with the split as a matter of convenience.

      Specified by:
      authorize in interface DataSplit
      Overrides:
      authorize in class FileSplit
      Parameters:
      client - the authorization context to use for access
      Returns:
      a split using the provided authorization context
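
The description above says the authorization context is not serialized with the split. One common way to get that behavior in Java is a transient field, sketched below. This is only an assumption about how such a design could look: DemoSplit, authorize, and roundTrip here are hypothetical stand-ins, and nothing on this page states how FileSplit is actually implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical illustration only; these are not the library's classes. */
final class TransientContextDemo {

    /** Stand-in for a split that carries a non-serialized access context. */
    static final class DemoSplit implements Serializable {
        private static final long serialVersionUID = 1L;

        final String path;                  // property of the data: serialized
        final transient Object authContext; // property of the environment: skipped

        DemoSplit(String path, Object authContext) {
            this.path = path;
            this.authContext = authContext;
        }

        /** Returns an otherwise identical split that uses the given context. */
        DemoSplit authorize(Object context) {
            return new DemoSplit(path, context);
        }
    }

    /** Serializes and deserializes a split, as shipping it to another JVM would. */
    static DemoSplit roundTrip(DemoSplit split) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(split);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (DemoSplit) in.readObject();
        }
    }
}
```

In this sketch the path survives the round trip while the context comes back as null, so the receiving side would have to call authorize again with a context valid in its own environment.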
    • openSource

      public InputStream openSource() throws IOException
      Description copied from interface: DataSplit
      Opens the underlying source for access. Initially, the stream is positioned at the first byte of the source. Unlike DataSplit.openSplit(int), the caller is responsible for making sure accesses are aligned to split boundaries. The stream is also unbuffered.

      This method may be required for dealing with formats which store metadata at the beginning of the file.

      Specified by:
      openSource in interface DataSplit
      Overrides:
      openSource in class FileSplit
      Returns:
      a reader of the data in the underlying source
      Throws:
      IOException - if an I/O error occurs opening the underlying source
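
Because the stream returned by openSource() starts at the first byte of the source, it suits reading leading metadata. The sketch below shows one way to read a fixed-size header from any InputStream. HeaderPeek and readHeader are hypothetical names, and the header size is whatever the file format in question defines; neither comes from this API.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper, not part of the library: reads leading metadata bytes. */
final class HeaderPeek {

    /**
     * Reads exactly n bytes from the current position of the stream.
     * Loops because a single read() call may return fewer bytes than requested.
     */
    static byte[] readHeader(InputStream source, int n) throws IOException {
        byte[] header = new byte[n];
        int filled = 0;
        while (filled < n) {
            int got = source.read(header, filled, n - filled);
            if (got < 0) {
                throw new EOFException("source ended after " + filled + " of " + n + " header bytes");
            }
            filled += got;
        }
        return header;
    }
}
```

Since the stream is unbuffered, a caller that goes on to read more than a small header may want to wrap it in a java.io.BufferedInputStream first.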
    • openSplit

      public SplitInputStream openSplit(int buffer) throws IOException
      Description copied from interface: DataSplit
      Opens the split for reading using the specified size for the read buffer. The reader will initially be positioned at the first byte of the split. The reader will indicate when the last byte of the split has been read via SplitInputStreamImpl.hasOverrun().
      Specified by:
      openSplit in interface DataSplit
      Overrides:
      openSplit in class FileSplit
      Parameters:
      buffer - the size of the buffer to use for reads, in bytes
      Returns:
      a reader of the data in the split
      Throws:
      IOException - if an I/O error occurs opening the underlying source
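
The overrun indication lets a record reader finish the record in progress when the split ends, then stop. The sketch below illustrates that idea on plain, uncompressed bytes with newline-terminated records. BoundedSplitReader is a hypothetical stand-in written for this example; the library's SplitInputStream may behave differently, and the sketch ignores both buffering and decompression.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stand-in for a split reader; not the library's implementation. */
final class BoundedSplitReader {
    private final InputStream in;
    private final long splitLength;
    private long consumed;

    BoundedSplitReader(InputStream in, long splitLength) {
        this.in = in;
        this.splitLength = splitLength;
    }

    /** Reads one byte, or -1 at end of source. Reading past the split end is allowed. */
    int read() throws IOException {
        int b = in.read();
        if (b >= 0) {
            consumed++;
        }
        return b;
    }

    /** True once the last byte of the split has been consumed. */
    boolean hasOverrun() {
        return consumed >= splitLength;
    }

    /** Counts newline-terminated records that begin before the split end. */
    static int countRecords(InputStream source, long splitLength) throws IOException {
        BoundedSplitReader reader = new BoundedSplitReader(source, splitLength);
        int records = 0;
        while (!reader.hasOverrun()) {
            boolean sawByte = false;
            int b;
            // Finish the current record even if it extends beyond the split.
            while ((b = reader.read()) >= 0 && b != '\n') {
                sawByte = true;
            }
            if (b < 0 && !sawByte) {
                break; // end of source with no partial record pending
            }
            records++;
            if (b < 0) {
                break; // final record had no trailing newline
            }
        }
        return records;
    }
}
```

In this sketch a record that begins inside the split but ends beyond it is still counted, which is why the reader checks hasOverrun() only between records.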
    • getCompressionFormat

      public CompressionFormat getCompressionFormat()
      Gets the compression format of this split.
      Returns:
      the compression format