- All Superinterfaces:
Serializable
- All Known Implementing Classes:
BZipCompression, GZipCompression, SnappyCompression, UncompressedData
Provides support for a compression format.
-
Field Summary
Fields

| Modifier and Type | Field | Description |
| --- | --- | --- |
| static final CompressionFormat | NONE | A format indicating no compression of data. |

-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| | compressStream | Wraps the specified stream in a compressor for the format. |
| SplitInputStream | decompressSplitStream(DataSplit split, InputStream in, int bufferSize) | Wraps the specified stream for a split in a decompressor for the format. |
| | decompressStream | Wraps the specified stream in a decompressor for the format. |
| | getCompressedFilename(String fileName) | Adds the common file suffix associated with this format to the given file name. |
| String | getName() | Gets the name of the compression algorithm. |
| SplitIterator | getSplitIterator(FileClient client, Path filePath, SplitOptions options) | Creates a SplitIterator for the given file path and options. |
| | getUncompressedFilename(String fileName) | Strips any file suffix associated with this format from the given file name. |
| boolean | isCompressedFilename(String fileName) | Indicates whether the specified file name is compressed in this format. |
| boolean | isSplittable() | Indicates whether compressed data files may be split. |

-
Field Details
-
NONE
A format indicating no compression of data.
-
-
Method Details
-
getName
String getName()
Gets the name of the compression algorithm.
- Returns:
the well-known name of the compression format
-
isSplittable
boolean isSplittable()
Indicates whether compressed data files may be split. Generally, compressed data cannot be split.
- Returns:
true if the compression format allows file splits, false otherwise.
-
isCompressedFilename
boolean isCompressedFilename(String fileName)
Indicates whether the specified file name is compressed in this format. Generally, this is done by looking at the file suffix.
- Parameters:
fileName - the file name to check
- Returns:
true if the file is believed to be compressed in this format, false otherwise.
-
getUncompressedFilename
Strips any file suffix associated with this format from the given file name.
- Parameters:
fileName - the file name to process
- Returns:
the expected name of the uncompressed file
-
getCompressedFilename
Adds the common file suffix associated with this format to the given file name.
- Parameters:
fileName - the file name to process
- Returns:
the expected name of the compressed file
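The three file-name methods above can be illustrated with a minimal sketch. The class `SuffixNames` and the `.gz` suffix used with it are hypothetical stand-ins, not the library's implementation; the sketch assumes a single, case-sensitive suffix per format and that a name with no suffix is returned unchanged when stripping.

```java
/** Hypothetical helper; it only mirrors the method names of the interface. */
final class SuffixNames {
    // The suffix is an assumption for illustration, e.g. ".gz".
    private final String suffix;

    SuffixNames(String suffix) {
        this.suffix = suffix;
    }

    /** True if the name ends with the suffix and has something before it. */
    boolean isCompressedFilename(String fileName) {
        return fileName.length() > suffix.length() && fileName.endsWith(suffix);
    }

    /** Strips the suffix if present; otherwise returns the name unchanged. */
    String getUncompressedFilename(String fileName) {
        return isCompressedFilename(fileName)
                ? fileName.substring(0, fileName.length() - suffix.length())
                : fileName;
    }

    /** Appends the suffix to the given name. */
    String getCompressedFilename(String fileName) {
        return fileName + suffix;
    }
}
```

With `new SuffixNames(".gz")`, the name `data.csv` maps to `data.csv.gz` and back again. A real format may recognize several suffixes or compare case-insensitively.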
-
decompressStream
Wraps the specified stream in a decompressor for the format.
- Parameters:
in - the compressed byte stream to decompress
- Returns:
the uncompressed data stream
- Throws:
IOException - if an I/O error occurs
-
compressStream
Wraps the specified stream in a compressor for the format.
- Parameters:
out - the uncompressed byte stream to compress
- Returns:
the compressed data stream
- Throws:
IOException - if an I/O error occurs
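The stream-wrapping pattern described for compressStream and decompressStream can be sketched with the JDK's own gzip classes. This is an illustration under stated assumptions, not the library's code: the class `GzipWrapSketch` is hypothetical, and the parameter and return types (`OutputStream`, `InputStream`) are assumed from the parameter descriptions above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical sketch of the wrap-a-stream pattern using java.util.zip. */
final class GzipWrapSketch {
    /** Wraps the sink so that bytes written to the result are compressed. */
    static OutputStream compressStream(OutputStream out) throws IOException {
        return new GZIPOutputStream(out);
    }

    /** Wraps the source so that bytes read from the result are decompressed. */
    static InputStream decompressStream(InputStream in) throws IOException {
        return new GZIPInputStream(in);
    }

    /** Compresses and then decompresses the data in memory. */
    static byte[] roundTrip(byte[] data) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            // Closing the wrapper finishes the gzip trailer.
            try (OutputStream zip = compressStream(sink)) {
                zip.write(data);
            }
            try (InputStream unzip =
                     decompressStream(new ByteArrayInputStream(sink.toByteArray()))) {
                return unzip.readAllBytes();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The caller owns the wrapped stream: closing the compressor is what flushes the final block, so the round trip above closes it before reading the bytes back.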
-
getSplitIterator
SplitIterator getSplitIterator(FileClient client, Path filePath, SplitOptions options) throws IOException
Creates a SplitIterator for the given file path and options. The iterator may vary with the compression format of the input data stream, or with whether the input is compressed at all. For instance, compression formats that do not support splitting will create an iterator with a single split: the whole file.
- Parameters:
client - the file client in use
filePath - path to the input file
options - splitting options
- Returns:
an iterator of the split(s) for the input file
- Throws:
IOException - if an I/O error occurs
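The single-split behavior described above can be sketched as follows. Everything here is hypothetical: `SplitPlanSketch`, `Range`, and the fixed `splitSize` parameter stand in for the library's SplitIterator, DataSplit, and SplitOptions types, whose actual shapes are not given in this page. Fixed-size ranges are a simplification; a real splittable format would likely need to align splits to its own block boundaries.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch of planning splits for one input file. */
final class SplitPlanSketch {
    /** A byte range of the file; an assumed stand-in for a split. */
    static final class Range {
        final long offset;
        final long length;

        Range(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Returns one range covering the whole file when the format cannot be
     * split; otherwise returns consecutive ranges of at most splitSize bytes.
     */
    static Iterator<Range> splitIterator(long fileSize, boolean splittable, long splitSize) {
        List<Range> ranges = new ArrayList<>();
        if (!splittable || splitSize <= 0 || fileSize <= splitSize) {
            ranges.add(new Range(0, fileSize));
        } else {
            for (long offset = 0; offset < fileSize; offset += splitSize) {
                ranges.add(new Range(offset, Math.min(splitSize, fileSize - offset)));
            }
        }
        return ranges.iterator();
    }
}
```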
-
decompressSplitStream
SplitInputStream decompressSplitStream(DataSplit split, InputStream in, int bufferSize) throws IOException
Wraps the specified stream for a split in a decompressor for the format. The format must be splittable; otherwise, an UnsupportedOperationException is thrown.
- Parameters:
split - split being read and decompressed
in - the compressed byte stream for a split to decompress
bufferSize -
- Returns:
the uncompressed data stream
- Throws:
IOException - if an I/O error occurs
UnsupportedOperationException - if the compression format does not support splitting
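The splittability contract above suggests a guard at the top of any implementation. The sketch below shows only that guard; `SplitGuardSketch` and its nested `Format` interface are hypothetical, and the method returns the input stream unchanged where a real format would wrap it in its decompressor.

```java
import java.io.InputStream;

/** Hypothetical sketch of the guard implied by the contract above. */
final class SplitGuardSketch {
    /** Assumed minimal view of a format: just the two methods the guard needs. */
    interface Format {
        boolean isSplittable();
        String getName();
    }

    /** Rejects formats that cannot be split before touching the stream. */
    static InputStream decompressSplitStream(Format format, InputStream in) {
        if (!format.isSplittable()) {
            throw new UnsupportedOperationException(
                    format.getName() + " does not support splitting");
        }
        // Placeholder: a real format would wrap `in` in its decompressor here.
        return in;
    }
}
```

Callers that cannot rule out an unsplittable format can check isSplittable() first and fall back to decompressing the whole stream.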
-