java.lang.Object
com.pervasive.datarush.io.FileSplit
- All Implemented Interfaces:
DataSplit, Serializable
- Direct Known Subclasses:
AzureFileSplit, CompressedFileSplit, SplittableCompressedFileSplit
Describes a range of bytes from a file. Ranges are identified by a start
offset and a length.
DataSplit objects are used to describe how files
can be divided into pieces which can then be parsed in parallel.
- See Also:
-
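The idea of dividing a file into byte ranges can be sketched without the DataRush classes. The example below is a minimal illustration, not the library's splitting logic: `ByteRange`, `RangePlanner`, and `plan` are hypothetical names, and the fixed maximum range length is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a split: a start offset plus a length, in bytes.
final class ByteRange {
    final long start;
    final long length;

    ByteRange(long start, long length) {
        this.start = start;
        this.length = length;
    }

    @Override
    public String toString() {
        return "start=" + start + ", length=" + length;
    }
}

// Hypothetical planner; not part of the DataRush API.
final class RangePlanner {
    // Cuts a file of fileSize bytes into consecutive ranges of at most maxLength bytes.
    static List<ByteRange> plan(long fileSize, long maxLength) {
        if (fileSize < 0 || maxLength <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and maxLength > 0");
        }
        List<ByteRange> ranges = new ArrayList<>();
        for (long start = 0; start < fileSize; start += maxLength) {
            // The last range is shortened so it stops at the end of the file.
            ranges.add(new ByteRange(start, Math.min(maxLength, fileSize - start)));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A 10-byte file cut into ranges of at most 4 bytes.
        for (ByteRange r : plan(10, 4)) {
            System.out.println(r);
        }
    }
}
```

Each resulting range could then be handed to a separate worker, which is the kind of parallel parsing the class description refers to.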
Constructor Summary
Constructors
Modifier / Constructor / Description
- Creates a split encompassing the entire file named by the path.
- Creates a split of the file named by the path.
- protected FileSplit(Path path, long start, long length, FileClient client)
- Creates a split encompassing the entire file named by the path.
- Creates a split of the file named by the path.
-
Method Summary
Modifier and Type / Method / Description
- authorize(FileClient client): Creates an identical split which will use the specified authorization context for access.
- long getEndOffset(): Gets the end index of the byte range represented by this split.
- getFileClient(): Return the file client associated with this split.
- long getLength(): Gets the length of the split, in bytes.
- getPath(): Gets the path to the file on which the split is defined.
- long getStartOffset(): Gets the byte offset of the beginning of the split.
- openSource(): Opens the underlying source for access.
- openSplit(int buffer): Opens the split for reading using the specified size for the read buffer.
- toString()
-
Constructor Details
-
FileSplit
Creates a split encompassing the entire file named by the path.
- Parameters:
path - the file on which the split is defined
-
FileSplit
Creates a split of the file named by the path.
- Parameters:
path - the file on which the split is defined
start - the byte offset in the named file at which the split begins
length - the length of the split, in bytes
-
FileSplit
Creates a split encompassing the entire file named by the path.
- Parameters:
path - the file on which the split is defined
-
FileSplit
Creates a split of the file named by the path.
- Parameters:
path - the file on which the split is defined
start - the byte offset in the named file at which the split begins
length - the length of the split, in bytes
-
FileSplit
-
-
Method Details
-
getPath
Description copied from interface: DataSplit
Gets the path to the file on which the split is defined.
Some splits may not represent a file; in this case, null is returned.
-
getStartOffset
public long getStartOffset()
Description copied from interface: DataSplit
Gets the byte offset of the beginning of the split.
- Specified by:
getStartOffset in interface DataSplit
- Returns:
- the position of the first byte of the split
-
getLength
public long getLength()
Description copied from interface: DataSplit
Gets the length of the split, in bytes.
-
getEndOffset
public long getEndOffset()
Gets the end index of the byte range represented by this split. This offset is exclusive; that is, this represents the offset of the first byte past the end of the split.
- Returns:
- position of the end of the split
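Because a range is identified by a start offset and a length, an exclusive end offset works out to start plus length. The sketch below illustrates that arithmetic with a hypothetical `HalfOpenRange` class; it is not the FileSplit implementation, and the `end` and `contains` helpers are names invented for the example.

```java
// Hypothetical model of a half-open byte range [start, end); not a DataRush class.
final class HalfOpenRange {
    final long start;
    final long length;

    HalfOpenRange(long start, long length) {
        this.start = start;
        this.length = length;
    }

    // Exclusive end: the offset of the first byte past the range.
    long end() {
        return start + length;
    }

    // True if the given byte offset falls inside the range.
    boolean contains(long offset) {
        return offset >= start && offset < end();
    }

    public static void main(String[] args) {
        HalfOpenRange first = new HalfOpenRange(0, 100);
        // The next range starts exactly at the previous exclusive end.
        HalfOpenRange second = new HalfOpenRange(first.end(), 50);

        System.out.println(first.contains(99));   // true: last byte of the first range
        System.out.println(first.contains(100));  // false: the end offset is exclusive
        System.out.println(second.contains(100)); // true: first byte of the second range
    }
}
```

With an exclusive end, adjacent ranges share a boundary value without sharing a byte, so consecutive ranges cover a file with no gaps and no overlap.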
-
getFileClient
Returns the file client associated with this split. May return null depending on the state.
- Specified by:
getFileClient in interface DataSplit
- Returns:
FileClient associated with this split.
-
openSource
Description copied from interface: DataSplit
Opens the underlying source for access. Initially, the stream is positioned at the first byte of the source. Unlike DataSplit.openSplit(int), the caller is responsible for making sure accesses are aligned to split boundaries. The stream is also unbuffered.
This method may be required for dealing with formats which store metadata at the beginning of the file.
- Specified by:
openSource in interface DataSplit
- Returns:
- a reader of the data in the underlying source
- Throws:
IOException- if an I/O error occurs opening the underlying source
-
openSplit
Description copied from interface: DataSplit
Opens the split for reading using the specified size for the read buffer. The reader will initially be positioned at the first byte of the split. The reader will indicate when the last byte of the split has been read via SplitInputStreamImpl.hasOverrun().
- Specified by:
openSplit in interface DataSplit
- Parameters:
buffer - the size of the buffer to use for reads, in bytes
- Returns:
- a reader of the data in the split
- Throws:
IOException- if an I/O error occurs opening the underlying source
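As a rough analogy for reading a split through a buffered reader positioned at its first byte, the sketch below reads a byte range from a local file using only the Java standard library. It does not call openSplit and does not model overrun detection; `RangeReadSketch` and `readRange` are hypothetical names, and the temporary file exists only for the demonstration.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.SeekableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

// Hypothetical helper; a standard-library analogy, not the DataRush reader.
final class RangeReadSketch {
    // Reads up to length bytes starting at byte offset start,
    // using a read buffer of bufferSize bytes.
    static byte[] readRange(Path file, long start, int length, int bufferSize) throws IOException {
        try (SeekableByteChannel channel = Files.newByteChannel(file, StandardOpenOption.READ)) {
            channel.position(start); // position at the first byte of the range
            InputStream in = new BufferedInputStream(Channels.newInputStream(channel), bufferSize);
            byte[] out = new byte[length];
            int total = 0;
            while (total < length) {
                int n = in.read(out, total, length - total);
                if (n < 0) {
                    break; // end of file reached before the range was complete
                }
                total += n;
            }
            return Arrays.copyOf(out, total);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("range-demo", ".txt");
        try {
            Files.write(file, "0123456789".getBytes(StandardCharsets.US_ASCII));
            byte[] bytes = readRange(file, 3, 4, 8192);
            System.out.println(new String(bytes, StandardCharsets.US_ASCII));
        } finally {
            Files.delete(file);
        }
    }
}
```

The buffer size only affects how many bytes are fetched per underlying read; the bytes returned for a given range are the same for any positive buffer size.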
-
authorize
Description copied from interface: DataSplit
Creates an identical split which will use the specified authorization context for access.
This method is used by clients of the IO APIs which want to provide an alternative to the OS-level authorization inherited from the JVM's execution environment. Data access methods for the split will use the supplied context.
The authorization context is not a serializable attribute of a data split, as it represents the environment in which the data is accessed, not a property of the data itself. The context is associated with the split as a matter of convenience.
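One common way to keep an attribute out of a serialized object in Java is a `transient` field. The sketch below shows that general pattern on a hypothetical `ContextualRange` class, with a plain `String` standing in for the access context; whether FileSplit uses this mechanism is an assumption, not something stated here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class illustrating a non-serialized access context; not a DataRush class.
final class ContextualRange implements Serializable {
    private static final long serialVersionUID = 1L;

    final String path;
    final long start;
    final long length;
    // Stand-in for an authorization context; transient, so Java serialization skips it.
    final transient String context;

    ContextualRange(String path, long start, long length, String context) {
        this.path = path;
        this.start = start;
        this.length = length;
        this.context = context;
    }

    // Returns an otherwise identical range that carries the given context.
    ContextualRange withContext(String newContext) {
        return new ContextualRange(path, start, length, newContext);
    }

    // Serializes and deserializes the range in memory.
    static ContextualRange roundTrip(ContextualRange range) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(range);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (ContextualRange) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ContextualRange authorized = new ContextualRange("data.csv", 0, 10, null).withContext("demo-context");
        ContextualRange copy = roundTrip(authorized);
        System.out.println(copy.path + " " + copy.start + " " + copy.length);
        System.out.println(copy.context); // null: the context did not travel with the data
    }
}
```

In this pattern, a receiver of the deserialized object would have to attach its own context again before accessing the data.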
-
toString
-