-
- All Superinterfaces:
Serializable
- All Known Implementing Classes:
BZipCompression, GZipCompression, SnappyCompression, UncompressedData
public interface CompressionFormat extends Serializable
Provides support for a compression format.
-
-
Field Summary
static CompressionFormat NONE
A format indicating no compression of data.
-
Method Summary
OutputStream compressStream(OutputStream out)
Wraps the specified stream in a compressor for the format.
SplitInputStream decompressSplitStream(DataSplit split, InputStream in, int bufferSize)
Wraps the specified stream for a split in a decompressor for the format.
InputStream decompressStream(InputStream in)
Wraps the specified stream in a decompressor for the format.
String getCompressedFilename(String fileName)
Adds the common file suffix associated with this format to the given file name.
String getName()
Gets the name of the compression algorithm.
SplitIterator getSplitIterator(FileClient client, Path filePath, SplitOptions options)
Creates a SplitIterator for the given file path and options.
String getUncompressedFilename(String fileName)
Strips any file suffix associated with this format from the given file name.
boolean isCompressedFilename(String fileName)
Indicates whether the specified file name is compressed in this format.
boolean isSplittable()
Indicates whether compressed data files may be split.
-
-
-
Field Detail
-
NONE
static final CompressionFormat NONE
A format indicating no compression of data.
-
-
Method Detail
-
getName
String getName()
Gets the name of the compression algorithm.
- Returns:
- the well-known name of the compression format
-
isSplittable
boolean isSplittable()
Indicates whether compressed data files may be split. Generally, compressed data cannot be split.
- Returns:
- true if the compression format allows file splits, false otherwise
-
isCompressedFilename
boolean isCompressedFilename(String fileName)
Indicates whether the specified file name is compressed in this format. Generally, this is done by looking at the file suffix.
- Parameters:
fileName
- the file name to check
- Returns:
- true if the file is believed to be compressed in this format, false otherwise
-
getUncompressedFilename
String getUncompressedFilename(String fileName)
Strips any file suffix associated with this format from the given file name.
- Parameters:
fileName
- the file name to process
- Returns:
- the expected name of the uncompressed file
-
getCompressedFilename
String getCompressedFilename(String fileName)
Adds the common file suffix associated with this format to the given file name.
- Parameters:
fileName
- the file name to process
- Returns:
- the expected name of the compressed file
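The three file-name methods above all revolve around a format's conventional suffix. The following is a minimal sketch of that idea, not the library's implementation: the class SuffixNaming and the ".gz" suffix are assumptions chosen for illustration, and whether a real implementation guards against appending a suffix twice is not stated in this documentation.

```java
// Hypothetical helper, not part of the documented API. It illustrates
// suffix-based naming for a format whose suffix is ASSUMED to be ".gz".
final class SuffixNaming {
    private static final String SUFFIX = ".gz"; // assumed suffix

    private SuffixNaming() {
    }

    // Mirrors isCompressedFilename: decide by looking at the file suffix.
    static boolean isCompressedFilename(String fileName) {
        return fileName.endsWith(SUFFIX);
    }

    // Mirrors getUncompressedFilename: strip the suffix when it is present,
    // otherwise return the name unchanged.
    static String getUncompressedFilename(String fileName) {
        return isCompressedFilename(fileName)
                ? fileName.substring(0, fileName.length() - SUFFIX.length())
                : fileName;
    }

    // Mirrors getCompressedFilename: append the suffix unconditionally.
    static String getCompressedFilename(String fileName) {
        return fileName + SUFFIX;
    }
}
```

In this sketch, stripping and then re-adding the suffix returns the original compressed name, which is the round-trip property a caller would typically rely on.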
-
decompressStream
InputStream decompressStream(InputStream in) throws IOException
Wraps the specified stream in a decompressor for the format.
- Parameters:
in
- the compressed byte stream to decompress
- Returns:
- the uncompressed data stream
- Throws:
IOException
- if an I/O error occurs
-
compressStream
OutputStream compressStream(OutputStream out) throws IOException
Wraps the specified stream in a compressor for the format.
- Parameters:
out
- the uncompressed byte stream to compress
- Returns:
- the compressed data stream
- Throws:
IOException
- if an I/O error occurs
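The wrapping behavior of compressStream and decompressStream can be sketched with the JDK's own gzip streams. This is an illustration only: the class GzipStreams and its roundTrip helper are hypothetical, and nothing here is claimed to match how GZipCompression is actually implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-in for a gzip-style format; not the library's code.
final class GzipStreams {
    private GzipStreams() {
    }

    // Same shape as compressStream: bytes written to the returned stream
    // are compressed and forwarded to 'out'.
    static OutputStream compressStream(OutputStream out) throws IOException {
        return new GZIPOutputStream(out);
    }

    // Same shape as decompressStream: bytes read from the returned stream
    // are the decompressed form of what 'in' supplies.
    static InputStream decompressStream(InputStream in) throws IOException {
        return new GZIPInputStream(in);
    }

    // Compresses 'data' into memory, then decompresses it again.
    static byte[] roundTrip(byte[] data) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (OutputStream out = compressStream(sink)) {
                out.write(data);
            } // closing the wrapper writes the gzip trailer

            ByteArrayOutputStream result = new ByteArrayOutputStream();
            try (InputStream in = decompressStream(
                    new ByteArrayInputStream(sink.toByteArray()))) {
                byte[] buffer = new byte[4096];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    result.write(buffer, 0, n);
                }
            }
            return result.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that the compressing wrapper has to be closed (or otherwise finished) before the compressed bytes are complete; the try-with-resources block in the sketch takes care of that.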
-
getSplitIterator
SplitIterator getSplitIterator(FileClient client, Path filePath, SplitOptions options) throws IOException
Creates a SplitIterator for the given file path and options. The iterator may vary with the compression format of the input data stream, or with whether the input is compressed at all. For instance, compression formats that do not support splitting will create an iterator with a single split: the whole file.
- Parameters:
client
- the file client in use
filePath
- path to the input file
options
- splitting options
- Returns:
- an iterator of the split(s) for the input file
- Throws:
IOException
- if an I/O error occurs
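The single-split behavior described for getSplitIterator can be sketched as a byte-range planner. Everything below is hypothetical: SplitPlanner, Range, and the targetSplitSize parameter are illustrative names that do not correspond to SplitIterator, DataSplit, or SplitOptions, whose actual fields are not described in this documentation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical planner, not the library's SplitIterator.
final class SplitPlanner {
    private SplitPlanner() {
    }

    // Hypothetical stand-in for a split: a byte range within one file.
    static final class Range {
        final long offset;
        final long length;

        Range(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    // Returns one range covering the whole file when the format is not
    // splittable (or the file fits in one split); otherwise returns
    // consecutive ranges of at most targetSplitSize bytes.
    static List<Range> plan(long fileLength, long targetSplitSize, boolean splittable) {
        List<Range> ranges = new ArrayList<>();
        if (!splittable || targetSplitSize <= 0 || fileLength <= targetSplitSize) {
            ranges.add(new Range(0, fileLength));
            return ranges;
        }
        for (long offset = 0; offset < fileLength; offset += targetSplitSize) {
            long length = Math.min(targetSplitSize, fileLength - offset);
            ranges.add(new Range(offset, length));
        }
        return ranges;
    }
}
```

For a 100-byte file and a 30-byte target, this sketch yields one 100-byte range for a non-splittable format and four ranges (30, 30, 30, and 10 bytes) for a splittable one.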
-
decompressSplitStream
SplitInputStream decompressSplitStream(DataSplit split, InputStream in, int bufferSize) throws IOException
Wraps the specified stream for a split in a decompressor for the format. The format must be splittable; otherwise an UnsupportedOperationException is thrown.
- Parameters:
split
- split being read and decompressed
in
- the compressed byte stream for a split to decompress
bufferSize
-
- Returns:
- the uncompressed data stream
- Throws:
IOException
- if an I/O error occurs
UnsupportedOperationException
- if the compression format does not support splitting
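Because decompressSplitStream is only valid for splittable formats, an implementation would plausibly check that precondition before doing any work. The guard below is a minimal sketch of such a check; SplitGuard and requireSplittable are hypothetical names, and a caller would more typically test isSplittable() first instead of relying on the exception.

```java
// Hypothetical precondition check, not part of the documented API.
final class SplitGuard {
    private SplitGuard() {
    }

    // Throws UnsupportedOperationException when the format cannot be split,
    // matching the contract stated for decompressSplitStream.
    static void requireSplittable(String formatName, boolean splittable) {
        if (!splittable) {
            throw new UnsupportedOperationException(
                    "Compression format " + formatName + " does not support splitting");
        }
    }
}
```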
-
-