Interface CompressionFormat

All Superinterfaces:
Serializable
All Known Implementing Classes:
BZipCompression, GZipCompression, SnappyCompression, UncompressedData

public interface CompressionFormat extends Serializable
Provides support for a compression format.
  • Field Details

    • NONE

      static final CompressionFormat NONE
      A format indicating no compression of data.
  • Method Details

    • getName

      String getName()
      Gets the name of the compression algorithm.
      Returns:
      the well-known name of the compression format
    • isSplittable

      boolean isSplittable()
      Indicates whether data files compressed in this format may be split. Most compression formats do not support splitting.
      Returns:
      true if the compression format allows file splits, false otherwise.
    • isCompressedFilename

      boolean isCompressedFilename(String fileName)
      Indicates whether the specified file name denotes a file compressed in this format. Generally, this is determined from the file suffix.
      Parameters:
      fileName - the file name to check
      Returns:
      true if the file is believed to be compressed in this format, false otherwise.
    • getUncompressedFilename

      String getUncompressedFilename(String fileName)
      Strips any file suffix associated with this format from the given file name.
      Parameters:
      fileName - the file name to process
      Returns:
      the expected name of the uncompressed file
    • getCompressedFilename

      String getCompressedFilename(String fileName)
      Adds the common file suffix associated with this format to the given file name.
      Parameters:
      fileName - the file name to process
      Returns:
      the expected name of the compressed file
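
The three file-name methods above are described as suffix operations. The following is a minimal sketch of that idea, assuming a hypothetical format whose suffix is ".gz". The class SuffixNames and its SUFFIX constant are illustrative only and are not part of this API; in particular, whether a real implementation appends the suffix to a name that already carries it is not stated here.

```java
/** Hypothetical helper illustrating suffix-based naming; not part of the documented API. */
final class SuffixNames {
    // Assumed suffix for illustration; each real format defines its own.
    static final String SUFFIX = ".gz";

    /** Returns true when the name ends with the assumed suffix. */
    static boolean isCompressedFilename(String fileName) {
        return fileName.endsWith(SUFFIX);
    }

    /** Strips the suffix if present; otherwise returns the name unchanged. */
    static String getUncompressedFilename(String fileName) {
        return isCompressedFilename(fileName)
                ? fileName.substring(0, fileName.length() - SUFFIX.length())
                : fileName;
    }

    /** Appends the suffix unconditionally (an assumption of this sketch). */
    static String getCompressedFilename(String fileName) {
        return fileName + SUFFIX;
    }
}
```

Under these assumptions, getUncompressedFilename(getCompressedFilename(name)) returns the original name.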
    • decompressStream

      InputStream decompressStream(InputStream in) throws IOException
      Wraps the specified stream in a decompressor for the format.
      Parameters:
      in - the compressed byte stream to decompress
      Returns:
      the uncompressed data stream
      Throws:
      IOException - if an I/O error occurs
    • compressStream

      OutputStream compressStream(OutputStream out) throws IOException
      Wraps the specified stream in a compressor for the format.
      Parameters:
      out - the uncompressed byte stream to compress
      Returns:
      the compressed data stream
      Throws:
      IOException - if an I/O error occurs
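
Both stream methods wrap an existing stream rather than copying data themselves. The sketch below illustrates that wrapping pattern with the JDK's own gzip streams. GzipStreamSketch is a hypothetical stand-in, not the library's GZipCompression class, and roundTrip is a helper added only for demonstration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical stand-in for a gzip-style format; not part of the documented API. */
final class GzipStreamSketch {
    /** Wraps a compressed stream so that reads return uncompressed bytes. */
    static InputStream decompressStream(InputStream in) throws IOException {
        return new GZIPInputStream(in);
    }

    /** Wraps a sink so that written bytes are compressed before reaching it. */
    static OutputStream compressStream(OutputStream out) throws IOException {
        return new GZIPOutputStream(out);
    }

    /** Compresses the text into memory, then reads it back through the decompressor. */
    static String roundTrip(String text) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (OutputStream out = compressStream(sink)) {
                out.write(text.getBytes(StandardCharsets.UTF_8));
            } // closing the wrapper writes the gzip trailer
            byte[] compressed = sink.toByteArray();
            try (InputStream in = decompressStream(new ByteArrayInputStream(compressed))) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The compressing wrapper must be closed (or otherwise finished) before the compressed bytes are read back, which is why the sketch uses try-with-resources.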
    • getSplitIterator

      SplitIterator getSplitIterator(FileClient client, Path filePath, SplitOptions options) throws IOException
      Creates a SplitIterator for the given file path and options. The iterator returned may depend on the compression format of the input, or on whether the input is compressed at all. For instance, compression formats that do not support splitting will create an iterator with a single split: the whole file.
      Parameters:
      client - the file client in use
      filePath - path to the input file
      options - splitting options
      Returns:
      an iterator of the split(s) for the input file
      Throws:
      IOException - if an I/O error occurs
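
The single-split rule described for getSplitIterator can be sketched as follows. This is an illustration under stated assumptions, not the library's algorithm: SplitPlanSketch, Range, and the splitSize parameter are hypothetical, and the real SplitIterator, SplitOptions, and FileClient types are not modelled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical split planner; not part of the documented API. */
final class SplitPlanSketch {
    /** A byte range within a file: start offset and length. */
    static final class Range {
        final long start;
        final long length;

        Range(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    /**
     * Returns one range covering the whole file when the format is not
     * splittable (or the file fits in one split); otherwise returns
     * consecutive ranges of at most splitSize bytes.
     */
    static List<Range> plan(long fileSize, long splitSize, boolean splittable) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<Range> ranges = new ArrayList<>();
        if (!splittable || fileSize <= splitSize) {
            ranges.add(new Range(0, fileSize));
            return ranges;
        }
        for (long start = 0; start < fileSize; start += splitSize) {
            ranges.add(new Range(start, Math.min(splitSize, fileSize - start)));
        }
        return ranges;
    }
}
```

A real splittable format would typically also need to align split boundaries with positions where decompression can restart; the sketch ignores that and divides by byte count only.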
    • decompressSplitStream

      SplitInputStream decompressSplitStream(DataSplit split, InputStream in, int bufferSize) throws IOException
      Wraps the specified stream for a split in a decompressor for the format. The format must be splittable; otherwise, an UnsupportedOperationException is thrown.
      Parameters:
      split - split being read and decompressed
      in - the compressed byte stream for a split to decompress
      bufferSize - the size of the buffer to use when reading the split
      Returns:
      the uncompressed data stream
      Throws:
      IOException - if an I/O error occurs
      UnsupportedOperationException - if the compression format does not support splitting
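
Because decompressSplitStream throws UnsupportedOperationException for formats that cannot be split, a caller may want to check isSplittable first and fall back to decompressStream. The sketch below shows that pattern with a hypothetical, simplified Format interface: its decompressSplitStream omits the DataSplit and bufferSize parameters and returns a plain InputStream rather than a SplitInputStream, and the default method body is an assumption about how a non-splittable format might behave.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical caller-side guard; not part of the documented API. */
final class SplitGuardSketch {
    /** Simplified mirror of three documented methods (signatures reduced for illustration). */
    interface Format {
        boolean isSplittable();

        InputStream decompressStream(InputStream in) throws IOException;

        // Assumed default: formats that do not override this reject split reads.
        default InputStream decompressSplitStream(InputStream in) throws IOException {
            throw new UnsupportedOperationException("format does not support splitting");
        }
    }

    /** Uses the split-aware decompressor only when the format reports it is splittable. */
    static InputStream open(Format format, InputStream in) throws IOException {
        return format.isSplittable()
                ? format.decompressSplitStream(in)
                : format.decompressStream(in);
    }

    /** A pass-through, non-splittable format used only for demonstration. */
    static final Format PASS_THROUGH = new Format() {
        @Override
        public boolean isSplittable() {
            return false;
        }

        @Override
        public InputStream decompressStream(InputStream in) {
            return in;
        }
    };
}
```

With the pass-through format, open returns the stream from decompressStream, while calling decompressSplitStream directly raises the exception.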