public class FileSplit extends Object implements DataSplit
DataSplit objects are used to describe how files can be divided into pieces which can then be parsed in parallel.

Constructor and Description |
---|
FileSplit(Path path) Creates a split encompassing the entire file named by the path. |
FileSplit(Path path, long start, long length) Creates a split of the file named by the path. |
FileSplit(String path) Creates a split encompassing the entire file named by the path. |
FileSplit(String path, long start, long length) Creates a split of the file named by the path. |
Modifier and Type | Method and Description |
---|---|
FileSplit | authorize(FileClient client) Creates an identical split which will use the specified authorization context for access. |
long | getEndOffset() Gets the end index of the byte range represented by this split. |
FileClient | getFileClient() Return the file client associated with this split. |
long | getLength() Gets the length of the split, in bytes. |
Path | getPath() Gets the path to the file on which the split is defined. |
long | getStartOffset() Gets the byte offset of the beginning of the split. |
InputStream | openSource() Opens the underlying source for access. |
SplitInputStream | openSplit(int buffer) Opens the split for reading using the specified size for the read buffer. |
String | toString() |
public FileSplit(String path)
Creates a split encompassing the entire file named by the path.
Parameters:
path - the file on which the split is defined
public FileSplit(String path, long start, long length)
Creates a split of the file named by the path.
Parameters:
path - the file on which the split is defined
start - the byte offset in the named file at which the split begins
length - the length of the split, in bytes
public FileSplit(Path path)
Creates a split encompassing the entire file named by the path.
Parameters:
path - the file on which the split is defined
public FileSplit(Path path, long start, long length)
Creates a split of the file named by the path.
Parameters:
path - the file on which the split is defined
start - the byte offset in the named file at which the split begins
length - the length of the split, in bytes
public Path getPath()
Description copied from interface: DataSplit
Gets the path to the file on which the split is defined. Some splits may not represent a file; in this case, null is returned.
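public long getStartOffset()
Description copied from interface: DataSplit
Gets the byte offset of the beginning of the split.
Specified by:
getStartOffset in interface DataSplit
public long getLength()
Description copied from interface: DataSplit
Gets the length of the split, in bytes.
public long getEndOffset()
Gets the end index of the byte range represented by this split.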
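The offset accessors describe a byte range within a file. The sketch below illustrates how a file of known size might be divided into contiguous ranges, the kind of start and length pairs that the three-argument constructors accept. It is self-contained and does not use FileSplit: the `SplitRanges` class, its nested `Range` type, and the `divide` helper are hypothetical names, and treating the end offset as `start + length` (exclusive) is an assumption rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitRanges {
    /** Hypothetical stand-in for a split's byte range; not part of the documented API. */
    public static final class Range {
        private final long start;
        private final long length;

        public Range(long start, long length) {
            this.start = start;
            this.length = length;
        }

        public long getStartOffset() { return start; }

        public long getLength() { return length; }

        /** Assumption: the end offset is exclusive, that is, start + length. */
        public long getEndOffset() { return start + length; }
    }

    /** Divides [0, fileLength) into contiguous ranges of at most maxLength bytes. */
    public static List<Range> divide(long fileLength, long maxLength) {
        if (fileLength < 0 || maxLength <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and maxLength > 0");
        }
        List<Range> ranges = new ArrayList<>();
        long start = 0;
        while (start < fileLength) {
            // The last range may be shorter than maxLength.
            long length = Math.min(maxLength, fileLength - start);
            ranges.add(new Range(start, length));
            start += length;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A 1000-byte file cut into pieces of at most 300 bytes.
        for (Range r : divide(1000, 300)) {
            System.out.println("start=" + r.getStartOffset()
                    + " length=" + r.getLength()
                    + " end=" + r.getEndOffset());
        }
    }
}
```

For a 1000-byte file and a 300-byte maximum, this yields four ranges, the last one starting at offset 900 with a length of 100. Each pair could then be handed to a separate worker for parallel parsing.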
public FileClient getFileClient()
Returns the file client associated with this split.
Specified by:
getFileClient in interface DataSplit
Returns:
the FileClient associated with this split
public InputStream openSource() throws IOException
Description copied from interface: DataSplit
Opens the underlying source for access. Unlike DataSplit.openSplit(int), the caller is responsible for making sure accesses are aligned to split boundaries. The stream is also unbuffered.
This method may be required for dealing with formats which store metadata at the beginning of the file.
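As a rough illustration of what the caller takes on when working with a raw, unbuffered source stream, the sketch below reads a fixed-size header from the start of a stream and then reads exactly one byte range. It uses only java.io and java.nio.file rather than the FileSplit API. The class name `RawSourceSketch`, the helper `readHeaderAndRange`, and the 4-byte header in the demo are assumptions made for the example.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class RawSourceSketch {
    /**
     * Reads the first headerSize bytes, then the bytes in [start, start + length).
     * Offsets are relative to the start of the stream; start must not precede
     * the end of the header. Returns { header, body }.
     */
    public static byte[][] readHeaderAndRange(InputStream raw, int headerSize,
                                              long start, int length) throws IOException {
        if (start < headerSize) {
            throw new IllegalArgumentException("range must begin after the header");
        }
        // The raw stream is assumed to be unbuffered, so buffering is added here.
        DataInputStream in = new DataInputStream(new BufferedInputStream(raw, 8192));

        byte[] header = new byte[headerSize];
        in.readFully(header);

        // Advance from the end of the header to the start of the range.
        long remaining = start - headerSize;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped <= 0) {
                // skip() may make no progress; read one byte to detect end of stream.
                if (in.read() < 0) {
                    throw new EOFException("stream ended before the range start");
                }
                skipped = 1;
            }
            remaining -= skipped;
        }

        // Read exactly the bytes of the range and nothing beyond it.
        byte[] body = new byte[length];
        in.readFully(body);
        return new byte[][] { header, body };
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("split-demo", ".bin");
        try {
            Files.write(file, "HDR1abcdefghij".getBytes(StandardCharsets.US_ASCII));
            try (InputStream raw = Files.newInputStream(file)) {
                byte[][] parts = readHeaderAndRange(raw, 4, 7, 3);
                System.out.println(new String(parts[0], StandardCharsets.US_ASCII));
                System.out.println(new String(parts[1], StandardCharsets.US_ASCII));
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The helper stops reading at the end of the requested range. That is the kind of boundary alignment the description above leaves to the caller.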
Specified by:
openSource in interface DataSplit
Throws:
IOException - if an I/O error occurs opening the underlying source
public SplitInputStream openSplit(int buffer) throws IOException
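Description copied from interface: DataSplit
Opens the split for reading using the specified size for the read buffer.
SplitInputStreamImpl.hasOverrun().
Specified by:
openSplit in interface DataSplit
Parameters:
buffer - the size of the buffer to use for reads, in bytes
Throws:
IOException - if an I/O error occurs opening the underlying source
public FileSplit authorize(FileClient client)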
Description copied from interface: DataSplit
Creates an identical split which will use the specified authorization context for access.
This method is used by clients of the IO APIs which want to provide an alternative to the OS-level authorization inherited from the JVM's execution environment. Data access methods for the split will use the supplied context.
The authorization context is not a serializable attribute of a data split, as it represents the environment in which the data is accessed, not a property of the data itself. The context is associated with the split as a matter of convenience.
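One way to picture the two properties described above (a copy that carries the context, and a context that does not survive serialization) is the following sketch. It is not FileSplit's implementation, which this page does not show. `AuthorizedSplitSketch`, `Client`, `Split`, and `roundTrip` are hypothetical names, and marking the field `transient` is an assumed technique for keeping the context out of the serialized form.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class AuthorizedSplitSketch {
    /** Hypothetical stand-in for an authorization context; deliberately not Serializable. */
    public static final class Client {
        private final String user;

        public Client(String user) { this.user = user; }

        public String getUser() { return user; }
    }

    /** Hypothetical split: the path and byte range are serialized, the client is not. */
    public static final class Split implements Serializable {
        private static final long serialVersionUID = 1L;

        private final String path;
        private final long start;
        private final long length;
        // transient: the context describes the access environment, not the data.
        private final transient Client client;

        public Split(String path, long start, long length) {
            this(path, start, length, null);
        }

        private Split(String path, long start, long length, Client client) {
            this.path = path;
            this.start = start;
            this.length = length;
            this.client = client;
        }

        /** Returns an identical split that uses the given context; this split is unchanged. */
        public Split authorize(Client client) {
            return new Split(path, start, length, client);
        }

        public String getPath() { return path; }

        public long getStartOffset() { return start; }

        public long getLength() { return length; }

        public Client getClient() { return client; }
    }

    /** Serializes and deserializes a split in memory. */
    public static Split roundTrip(Split split) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(split);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Split) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Split authorized = new Split("data.txt", 0L, 100L).authorize(new Client("alice"));
        Split restored = roundTrip(authorized);
        System.out.println("before: " + (authorized.getClient() != null));
        System.out.println("after:  " + (restored.getClient() != null));
    }
}
```

In this sketch the restored copy keeps its path and byte range but has no client, so whoever receives a deserialized split would need to call `authorize` again with a context valid in its own environment.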
Copyright © 2019 Actian Corporation. All rights reserved.