- java.lang.Object
-
- com.pervasive.datarush.io.FileSplit
-
- All Implemented Interfaces:
DataSplit, Serializable
- Direct Known Subclasses:
AzureFileSplit, CompressedFileSplit, SplittableCompressedFileSplit
public class FileSplit extends Object implements DataSplit
Describes a range of bytes from a file. Ranges are identified by a start offset and a length. DataSplit objects are used to describe how files can be divided into pieces which can then be parsed in parallel.
- See Also:
- Serialized Form
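To make the idea of dividing a file into independently parsable pieces concrete, here is a minimal sketch that plans contiguous byte ranges over a file of known length. It does not use the library: `SplitPlanner`, `ByteRange`, and the `plan` method are hypothetical names introduced only for illustration, and the fixed split size is an assumption rather than how the framework necessarily chooses boundaries.

```java
import java.util.ArrayList;
import java.util.List;

class SplitPlanner {
    /** Hypothetical stand-in for a split: a start offset plus a length, in bytes. */
    static final class ByteRange {
        final long start;
        final long length;

        ByteRange(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    /** Divides [0, fileLength) into contiguous ranges of at most splitSize bytes. */
    static List<ByteRange> plan(long fileLength, long splitSize) {
        if (fileLength < 0 || splitSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and splitSize > 0");
        }
        List<ByteRange> ranges = new ArrayList<>();
        for (long start = 0; start < fileLength; start += splitSize) {
            // The last range may be shorter than splitSize.
            ranges.add(new ByteRange(start, Math.min(splitSize, fileLength - start)));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Prints three ranges: (0, 400), (400, 400), (800, 200).
        for (ByteRange r : plan(1000, 400)) {
            System.out.println("start=" + r.start + " length=" + r.length);
        }
    }
}
```

With the real class, each planned range would presumably be passed to the `FileSplit(String path, long start, long length)` constructor documented below.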
-
-
Constructor Summary
Constructors
- FileSplit(Path path)
Creates a split encompassing the entire file named by the path.
- FileSplit(Path path, long start, long length)
Creates a split of the file named by the path.
- protected FileSplit(Path path, long start, long length, FileClient client)
- FileSplit(String path)
Creates a split encompassing the entire file named by the path.
- FileSplit(String path, long start, long length)
Creates a split of the file named by the path.
-
Method Summary
- FileSplit authorize(FileClient client)
Creates an identical split which will use the specified authorization context for access.
- long getEndOffset()
Gets the end index of the byte range represented by this split.
- FileClient getFileClient()
Return the file client associated with this split.
- long getLength()
Gets the length of the split, in bytes.
- Path getPath()
Gets the path to the file on which the split is defined.
- long getStartOffset()
Gets the byte offset of the beginning of the split.
- InputStream openSource()
Opens the underlying source for access.
- SplitInputStream openSplit(int buffer)
Opens the split for reading using the specified size for the read buffer.
- String toString()
-
-
-
Constructor Detail
-
FileSplit
public FileSplit(String path)
Creates a split encompassing the entire file named by the path.
- Parameters:
path - the file on which the split is defined
-
FileSplit
public FileSplit(String path, long start, long length)
Creates a split of the file named by the path.
- Parameters:
path - the file on which the split is defined
start - the byte offset in the named file at which the split begins
length - the length of the split, in bytes
-
FileSplit
public FileSplit(Path path)
Creates a split encompassing the entire file named by the path.
- Parameters:
path - the file on which the split is defined
-
FileSplit
public FileSplit(Path path, long start, long length)
Creates a split of the file named by the path.
- Parameters:
path - the file on which the split is defined
start - the byte offset in the named file at which the split begins
length - the length of the split, in bytes
-
FileSplit
protected FileSplit(Path path, long start, long length, FileClient client)
-
-
Method Detail
-
getPath
public Path getPath()
Description copied from interface: DataSplit
Gets the path to the file on which the split is defined. Some splits may not represent a file; in this case, null is returned.
-
getStartOffset
public long getStartOffset()
Description copied from interface: DataSplit
Gets the byte offset of the beginning of the split.
- Specified by:
getStartOffset in interface DataSplit
- Returns:
the position of the first byte of the split
-
getLength
public long getLength()
Description copied from interface: DataSplit
Gets the length of the split, in bytes.
-
getEndOffset
public long getEndOffset()
Gets the end index of the byte range represented by this split. This offset is exclusive; that is, this represents the offset of the first byte past the end of the split.
- Returns:
the position of the end of the split
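Because the end offset is exclusive, a split covers the half-open interval from its start up to, but not including, its end. The sketch below illustrates that arithmetic with a hypothetical `HalfOpenRange` class (not part of the library); the relationship end = start + length is an assumption inferred from the description above.

```java
class HalfOpenRange {
    final long start;
    final long length;

    HalfOpenRange(long start, long length) {
        this.start = start;
        this.length = length;
    }

    /** Exclusive end: the offset of the first byte past the range. */
    long end() {
        return start + length;
    }

    /** True if the byte at the given offset belongs to this range. */
    boolean contains(long offset) {
        return offset >= start && offset < end();
    }

    /** True if the other range begins exactly where this one stops. */
    boolean isFollowedBy(HalfOpenRange next) {
        return end() == next.start;
    }

    public static void main(String[] args) {
        HalfOpenRange first = new HalfOpenRange(0, 100);
        HalfOpenRange second = new HalfOpenRange(100, 50);
        System.out.println(first.contains(99));          // true
        System.out.println(first.contains(100));         // false: byte 100 belongs to the next range
        System.out.println(first.isFollowedBy(second));  // true
    }
}
```

One consequence of the exclusive convention is that adjacent ranges neither overlap nor leave gaps when the end of one equals the start of the next.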
-
getFileClient
public FileClient getFileClient()
Returns the file client associated with this split. May return null depending on the state.
- Specified by:
getFileClient in interface DataSplit
- Returns:
the FileClient associated with this split
-
openSource
public InputStream openSource() throws IOException
Description copied from interface: DataSplit
Opens the underlying source for access. Initially, the stream is positioned at the first byte of the source. Unlike DataSplit.openSplit(int), the caller is responsible for making sure accesses are aligned to split boundaries. The stream is also unbuffered.
This method may be required for dealing with formats which store metadata at the beginning of the file.
- Specified by:
openSource in interface DataSplit
- Returns:
a reader of the data in the underlying source
- Throws:
IOException - if an I/O error occurs opening the underlying source
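Since a stream opened on the whole source starts at byte 0 and is unbuffered, the caller has to skip to the split's start and stop after its length. The following sketch shows that alignment using only `java.io`; `SourceRangeReader` and `readRange` are hypothetical helpers, and an in-memory array stands in for the stream that `openSource()` would return.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class SourceRangeReader {
    /** Reads exactly length bytes beginning at offset start of the source stream. */
    static byte[] readRange(InputStream source, long start, int length) throws IOException {
        // skip() may skip fewer bytes than requested, so loop until aligned.
        long remaining = start;
        while (remaining > 0) {
            long skipped = source.skip(remaining);
            if (skipped <= 0) {
                // No progress: read one byte to tell a stall apart from end of stream.
                if (source.read() < 0) {
                    throw new EOFException("source ended before offset " + start);
                }
                skipped = 1;
            }
            remaining -= skipped;
        }
        byte[] out = new byte[length];
        int filled = 0;
        while (filled < length) {
            int n = source.read(out, filled, length - filled);
            if (n < 0) {
                throw new EOFException("source ended inside the requested range");
            }
            filled += n;
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        byte[] file = "HEADER|row1|row2|row3".getBytes(StandardCharsets.US_ASCII);
        // Wrap in a buffer ourselves, since the source stream is assumed unbuffered.
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(file))) {
            byte[] range = readRange(in, 7, 4);
            System.out.println(new String(range, StandardCharsets.US_ASCII)); // row1
        }
    }
}
```

The same pattern would let a parser read a header from the start of the stream first and then skip ahead to its own range.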
-
openSplit
public SplitInputStream openSplit(int buffer) throws IOException
Description copied from interface: DataSplit
Opens the split for reading using the specified size for the read buffer. The reader will initially be positioned at the first byte of the split. The reader will indicate when the last byte of the split has been read via SplitInputStreamImpl.hasOverrun().
- Specified by:
openSplit in interface DataSplit
- Parameters:
buffer - the size of the buffer to use for reads, in bytes
- Returns:
a reader of the data in the split
- Throws:
IOException - if an I/O error occurs opening the underlying source
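One common use of an overrun indicator is to let a parser finish a record that straddles the end of its split and then stop. The sketch below models that idea with a hypothetical `TrackingStream`; it is not the library's `SplitInputStream`, and treating `hasOverrun()` as "every byte of the split has been consumed" is an assumption based on the sentence above. The newline-delimited record format is likewise chosen only for illustration.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

class OverrunSketch {
    /** Hypothetical stand-in for a split reader that counts the bytes consumed. */
    static final class TrackingStream extends FilterInputStream {
        private final long splitLength;
        private long position;

        TrackingStream(InputStream in, long splitLength) {
            super(in);
            this.splitLength = splitLength;
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b >= 0) position++;
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) position += n;
            return n;
        }

        /** True once every byte of the split has been consumed. */
        boolean hasOverrun() {
            return position >= splitLength;
        }
    }

    /** Reads the newline-terminated records that start inside the split. */
    static List<String> readRecords(TrackingStream in) throws IOException {
        List<String> records = new ArrayList<>();
        while (!in.hasOverrun()) {
            StringBuilder line = new StringBuilder();
            int b;
            // Finish the current record even if it runs past the split boundary.
            while ((b = in.read()) >= 0 && b != '\n') {
                line.append((char) b);
            }
            if (b < 0 && line.length() == 0) {
                break; // end of source, nothing left to emit
            }
            records.add(line.toString());
        }
        return records;
    }
}
```

For the input `aa\nbbbb\ncc\n` with a split length of 5, this reads `aa` and `bbbb`: the second record begins inside the split and is completed past the boundary, while `cc` is left for whichever reader owns the next split.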
-
authorize
public FileSplit authorize(FileClient client)
Description copied from interface: DataSplit
Creates an identical split which will use the specified authorization context for access.
This method is used by clients of the IO APIs which want to provide an alternative to the OS-level authorization inherited from the JVM's execution environment. Data access methods for the split will use the supplied context.
The authorization context is not a serializable attribute of a data split, as it represents the environment in which the data is accessed, not a property of the data itself. The context is associated with the split as a matter of convenience.
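The practical effect of a non-serialized context is that a split loses it after a serialization round trip and must be authorized again on the receiving side. The sketch below shows one way such behavior can arise in Java, using a `transient` field; `AuthContext`, `Split`, and `roundTrip` are hypothetical, and whether `FileSplit` itself uses a transient field is an assumption, not something this page states.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientContextSketch {
    /** Hypothetical authorization context; deliberately not Serializable. */
    static final class AuthContext {
        final String user;

        AuthContext(String user) {
            this.user = user;
        }
    }

    /** Hypothetical split whose context is excluded from its serialized form. */
    static final class Split implements Serializable {
        private static final long serialVersionUID = 1L;

        final String path;
        final long start;
        final long length;
        // transient: skipped on write, so it is null after deserialization.
        final transient AuthContext context;

        Split(String path, long start, long length, AuthContext context) {
            this.path = path;
            this.start = start;
            this.length = length;
            this.context = context;
        }

        /** Returns a copy with the same byte range that uses the given context. */
        Split authorize(AuthContext newContext) {
            return new Split(path, start, length, newContext);
        }
    }

    /** Serializes the split to memory and reads it back. */
    static Split roundTrip(Split split) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(split);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Split) in.readObject();
        }
    }
}
```

In this model, the path, start, and length survive the round trip while the context comes back as null, which would be consistent with `getFileClient()` being allowed to return null.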
-
-