public interface DataSplit extends Serializable
DataSplit objects are used to describe how data sources can be divided into pieces which can then be parsed in parallel.

Modifier and Type | Method and Description |
---|---|
DataSplit | authorize(FileClient client) Creates an identical split which will use the specified authorization context for access. |
FileClient | getFileClient() Returns the file client associated with this split. |
long | getLength() Gets the length of the split, in bytes. |
Path | getPath() Gets the path to the file on which the split is defined. |
long | getStartOffset() Gets the byte offset of the beginning of the split. |
InputStream | openSource() Opens the underlying source for access. |
SplitInputStream | openSplit(int buffer) Opens the split for reading using the specified size for the read buffer. |

Path getPath()
Gets the path to the file on which the split is defined. Some splits may not represent a file; in this case, null is returned.

long getStartOffset()
Gets the byte offset of the beginning of the split.

long getLength()
Gets the length of the split, in bytes.
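
As a rough illustration of how a start offset and a length together describe one piece of a source, the sketch below divides a source of known length into contiguous byte ranges. ByteRange and SplitPlanSketch are hypothetical stand-ins and not part of this API; the sketch also ignores record boundaries, which an actual splitter may need to take into account.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a split: a byte range within a source. Not part of the API. */
final class ByteRange {
    final long startOffset; // byte offset of the beginning of the range
    final long length;      // length of the range, in bytes

    ByteRange(long startOffset, long length) {
        this.startOffset = startOffset;
        this.length = length;
    }
}

final class SplitPlanSketch {
    /** Divides totalLength bytes into contiguous ranges of at most splitSize bytes each. */
    static List<ByteRange> divide(long totalLength, long splitSize) {
        if (totalLength < 0 || splitSize <= 0) {
            throw new IllegalArgumentException("totalLength must be >= 0 and splitSize > 0");
        }
        List<ByteRange> ranges = new ArrayList<>();
        for (long start = 0; start < totalLength; start += splitSize) {
            // The last range may be shorter than splitSize.
            long length = Math.min(splitSize, totalLength - start);
            ranges.add(new ByteRange(start, length));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A 10-byte source cut into 4-byte pieces yields ranges 0+4, 4+4, 8+2.
        for (ByteRange r : divide(10, 4)) {
            System.out.println(r.startOffset + "+" + r.length);
        }
    }
}
```

Because the ranges are contiguous and do not overlap, each one can be handed to a separate worker without coordination on byte positions.
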

SplitInputStream openSplit(int buffer) throws IOException
Opens the split for reading using the specified size for the read buffer.
See also: SplitInputStreamImpl.hasOverrun().
Parameters:
buffer - the size of the buffer to use for reads, in bytes
Throws:
IOException - if an I/O error occurs opening the underlying source

InputStream openSource() throws IOException
Opens the underlying source for access. Unlike openSplit(int), the caller is responsible for making sure accesses are aligned to split boundaries. The stream is also unbuffered.
This method may be required for dealing with formats which store metadata at the beginning of the file.
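
The following sketch shows the kind of manual alignment this implies when working with a raw stream: read a fixed-size header from the start of the source, then skip forward to the split's start offset and read exactly its length. It uses only java.io; RawSourceSketch, the 4-byte header, and the sample data are assumptions made for illustration, and a ByteArrayInputStream stands in for the stream that openSource() would return.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class RawSourceSketch {
    /** Reads headerSize bytes of metadata from the current position of the raw stream. */
    static byte[] readHeader(InputStream source, int headerSize) throws IOException {
        byte[] header = source.readNBytes(headerSize);
        if (header.length < headerSize) {
            throw new EOFException("source ended inside the header");
        }
        return header;
    }

    /** Skips exactly n bytes, since InputStream.skip may skip fewer than requested. */
    static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped <= 0) {
                if (in.read() < 0) {
                    throw new EOFException("source ended before the split start");
                }
                skipped = 1;
            }
            n -= skipped;
        }
    }

    /** Reads one split's bytes; consumed is the number of bytes already read from source. */
    static byte[] readRange(InputStream source, long consumed, long startOffset, int length)
            throws IOException {
        skipFully(source, startOffset - consumed); // align to the split boundary
        return source.readNBytes(length);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "HDR:abcdefghij".getBytes(StandardCharsets.US_ASCII);
        try (InputStream in = new ByteArrayInputStream(data)) {
            byte[] header = readHeader(in, 4);    // the first 4 bytes
            byte[] body = readRange(in, 4, 8, 3); // 3 bytes starting at offset 8
            System.out.println(new String(header, StandardCharsets.US_ASCII)); // HDR:
            System.out.println(new String(body, StandardCharsets.US_ASCII));   // efg
        }
    }
}
```
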
Throws:
IOException - if an I/O error occurs opening the underlying source

DataSplit authorize(FileClient client)
Creates an identical split which will use the specified authorization context for access.
This method is used by clients of the IO APIs which want to provide an alternative to the OS-level authorization inherited from the JVM's execution environment. Data access methods for the split will use the supplied context.
The authorization context is not a serializable attribute of a data split, as it represents the environment in which the data is accessed, not a property of the data itself. The context is associated with the split as a matter of convenience.
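
One common way to keep such a context out of the serialized form in Java is a transient field. The sketch below illustrates that general pattern only; AuthContext, SketchSplit, and AuthorizeSketch are hypothetical names, and nothing here is taken from the actual DataSplit implementations, which may handle this differently.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical authorization context. Deliberately not Serializable. */
final class AuthContext {
    final String user;
    AuthContext(String user) { this.user = user; }
}

/** Hypothetical split: the byte range is serialized, the context is not. */
final class SketchSplit implements Serializable {
    private static final long serialVersionUID = 1L;
    final long startOffset;
    final long length;
    private transient AuthContext context; // skipped by Java serialization

    SketchSplit(long startOffset, long length, AuthContext context) {
        this.startOffset = startOffset;
        this.length = length;
        this.context = context;
    }

    /** Returns an identical split bound to newContext; this split is left unchanged. */
    SketchSplit authorize(AuthContext newContext) {
        return new SketchSplit(startOffset, length, newContext);
    }

    AuthContext getContext() { return context; }
}

final class AuthorizeSketch {
    /** Serializes and deserializes a split in memory. */
    static SketchSplit roundTrip(SketchSplit split) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(split);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (SketchSplit) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        SketchSplit original = new SketchSplit(0, 1024, new AuthContext("alice"));
        SketchSplit restored = roundTrip(original);
        System.out.println(restored.getContext() == null); // true: context was not serialized
        SketchSplit usable = restored.authorize(new AuthContext("bob"));
        System.out.println(usable.getContext().user);      // bob
    }
}
```

In this pattern, a split that has been deserialized carries no context until the receiving side supplies one through authorize.
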
Parameters:
client - the authorization context to use for access

FileClient getFileClient()
Returns the file client associated with this split. May return null depending on the state.
Returns:
the FileClient associated with this split

Copyright © 2020 Actian Corporation. All rights reserved.