Interface CompressionFormat

    • Field Detail

      • NONE

        static final CompressionFormat NONE
        A format indicating no compression of data.
    • Method Detail

      • getName

        String getName()
        Gets the name of the compression algorithm.
        Returns:
        the well-known name of the compression format
      • isSplittable

        boolean isSplittable()
        Indicates whether compressed data files may be split. Generally, compressed data cannot be split.
        Returns:
        true if the compression format allows file splits, false otherwise.
      • isCompressedFilename

        boolean isCompressedFilename(String fileName)
        Indicates whether the specified file name denotes a file compressed in this format. Generally, this is determined from the file suffix.
        Parameters:
        fileName - the file name to check
        Returns:
        true if the file is believed to be compressed in this format, false otherwise.
      • getUncompressedFilename

        String getUncompressedFilename(String fileName)
        Strips any file suffix associated with this format from the given file name.
        Parameters:
        fileName - the file name to process
        Returns:
        the expected name of the uncompressed file
      • getCompressedFilename

        String getCompressedFilename(String fileName)
        Adds the common file suffix associated with this format to the given file name.
        Parameters:
        fileName - the file name to process
        Returns:
        the expected name of the compressed file
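
        The three file-name methods above can be illustrated with a minimal sketch of suffix handling. It is not the library's implementation: the class name SuffixNamingSketch and the ".gz" suffix are assumptions made for illustration, and the match is case-sensitive.

```java
/**
 * Hypothetical suffix-based naming helper, loosely mirroring the three
 * file-name methods documented above. The ".gz" suffix is an assumption.
 */
final class SuffixNamingSketch {
    private static final String SUFFIX = ".gz"; // assumed suffix for this sketch

    static boolean isCompressedFilename(String fileName) {
        // Require a non-empty base name so that ".gz" alone does not match.
        return fileName.length() > SUFFIX.length() && fileName.endsWith(SUFFIX);
    }

    static String getUncompressedFilename(String fileName) {
        // Strip the suffix only when it is present; otherwise return the name unchanged.
        return isCompressedFilename(fileName)
                ? fileName.substring(0, fileName.length() - SUFFIX.length())
                : fileName;
    }

    static String getCompressedFilename(String fileName) {
        // Append the suffix to the given name.
        return fileName + SUFFIX;
    }

    public static void main(String[] args) {
        System.out.println(isCompressedFilename("events.csv.gz"));    // true
        System.out.println(getUncompressedFilename("events.csv.gz")); // events.csv
        System.out.println(getCompressedFilename("events.csv"));      // events.csv.gz
    }
}
```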
      • decompressStream

        InputStream decompressStream(InputStream in)
                              throws IOException
        Wraps the specified stream in a decompressor for the format.
        Parameters:
        in - the compressed byte stream to decompress
        Returns:
        the uncompressed data stream
        Throws:
        IOException - if an I/O error occurs
      • compressStream

        OutputStream compressStream(OutputStream out)
                             throws IOException
        Wraps the specified stream in a compressor for the format.
        Parameters:
        out - the uncompressed byte stream to compress
        Returns:
        the compressed data stream
        Throws:
        IOException - if an I/O error occurs
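
        As a rough illustration of the stream-wrapping contract of decompressStream and compressStream, the sketch below uses the JDK's gzip streams. The class GzipStreamSketch and its roundTrip helper are hypothetical; whether any implementation of this interface is built on java.util.zip is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical gzip-backed stand-in for the two stream-wrapping methods. */
final class GzipStreamSketch {
    static OutputStream compressStream(OutputStream out) throws IOException {
        // Bytes written to the returned stream reach 'out' in compressed form.
        return new GZIPOutputStream(out);
    }

    static InputStream decompressStream(InputStream in) throws IOException {
        // Bytes read from the returned stream are the uncompressed data.
        return new GZIPInputStream(in);
    }

    /** Compresses and then decompresses the data in memory (illustrative helper). */
    static byte[] roundTrip(byte[] data) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (OutputStream out = compressStream(sink)) {
                out.write(data);
            } // closing the wrapper finishes the gzip trailer
            try (InputStream in = decompressStream(new ByteArrayInputStream(sink.toByteArray()))) {
                return in.readAllBytes();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] restored = roundTrip("hello, splits".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(restored, StandardCharsets.UTF_8)); // hello, splits
    }
}
```

        Closing the wrapping output stream before reading the result matters here: the gzip trailer is only written when the compressor is finished.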
      • getSplitIterator

        SplitIterator getSplitIterator(FileClient client,
                                       Path filePath,
                                       SplitOptions options)
                                throws IOException
        Creates a SplitIterator for the given file path and options. The iterator returned may differ depending on the compression format of the input data stream, or on whether the input is compressed at all. For instance, a compression format that does not support splitting creates an iterator with a single split: the whole file.
        Parameters:
        client - the file client in use
        filePath - path to the input file
        options - splitting options
        Returns:
        an iterator of the split(s) for the input file
        Throws:
        IOException - if an I/O error occurs
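
        The single-split behaviour described for getSplitIterator can be sketched as follows. Everything in this example is hypothetical: the Range class stands in for a split, and the maxSplitSize parameter is an assumed option, since the contents of SplitOptions are not described here. The sketch returns a List; calling iterator() on it gives the iteration order.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical split planner illustrating the single-split fallback. */
final class SplitPlanSketch {
    /** Hypothetical split descriptor: a byte range within one file. */
    static final class Range {
        final long offset;
        final long length;

        Range(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    static List<Range> planSplits(long fileLength, long maxSplitSize, boolean splittable) {
        if (fileLength < 0 || maxSplitSize <= 0) {
            throw new IllegalArgumentException("invalid file length or split size");
        }
        List<Range> splits = new ArrayList<>();
        if (!splittable || fileLength <= maxSplitSize) {
            // Non-splittable formats (and small files) yield one split: the whole file.
            splits.add(new Range(0, fileLength));
            return splits;
        }
        // Splittable input: cut the file into consecutive ranges of at most maxSplitSize bytes.
        for (long offset = 0; offset < fileLength; offset += maxSplitSize) {
            splits.add(new Range(offset, Math.min(maxSplitSize, fileLength - offset)));
        }
        return splits;
    }

    public static void main(String[] args) {
        System.out.println(planSplits(250, 100, false).size()); // 1
        System.out.println(planSplits(250, 100, true).size());  // 3
    }
}
```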
      • decompressSplitStream

        SplitInputStream decompressSplitStream(DataSplit split,
                                               InputStream in,
                                               int bufferSize)
                                        throws IOException
        Wraps the specified stream for a split in a decompressor for the format. The format must be splittable; otherwise, an UnsupportedOperationException is thrown.
        Parameters:
        split - split being read and decompressed
        in - the compressed byte stream for a split to decompress
        bufferSize - the size of the buffer to use
        Returns:
        the uncompressed data stream
        Throws:
        IOException - if an I/O error occurs
        UnsupportedOperationException - if the compression format does not support splitting
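
        One way a caller might avoid the UnsupportedOperationException is to check isSplittable() first and fall back to the whole-stream method. The sketch below uses a reduced, hypothetical Format interface with plain InputStream types rather than the real DataSplit and SplitInputStream. Treating bufferSize as the size of a BufferedInputStream buffer is also an assumption.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.InputStream;

/** Hypothetical sketch of guarding the split-stream call with isSplittable(). */
final class SplitGuardSketch {
    /** Reduced, hypothetical stand-in for the documented interface. */
    interface Format {
        boolean isSplittable();
        InputStream decompressStream(InputStream in);
        InputStream decompressSplitStream(InputStream in, int bufferSize);
    }

    /** A format that cannot be split: the split-stream method is unsupported. */
    static final Format NON_SPLITTABLE = new Format() {
        public boolean isSplittable() { return false; }
        public InputStream decompressStream(InputStream in) { return in; }
        public InputStream decompressSplitStream(InputStream in, int bufferSize) {
            throw new UnsupportedOperationException("format does not support splitting");
        }
    };

    /** A pass-through format that can be split (no actual decompression in this sketch). */
    static final Format PASS_THROUGH = new Format() {
        public boolean isSplittable() { return true; }
        public InputStream decompressStream(InputStream in) { return in; }
        public InputStream decompressSplitStream(InputStream in, int bufferSize) {
            // Assumption: bufferSize sizes an internal read buffer.
            return new BufferedInputStream(in, bufferSize);
        }
    };

    /** Uses the split-aware method only when the format supports splitting. */
    static InputStream open(Format format, InputStream in, int bufferSize) {
        return format.isSplittable()
                ? format.decompressSplitStream(in, bufferSize)
                : format.decompressStream(in);
    }

    public static void main(String[] args) {
        InputStream src = new ByteArrayInputStream(new byte[] {1, 2, 3});
        System.out.println(open(NON_SPLITTABLE, src, 16) == src);                   // true
        System.out.println(open(PASS_THROUGH, src, 16) instanceof BufferedInputStream); // true
    }
}
```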