Class FileSplit

    • Constructor Detail

      • FileSplit

        public FileSplit​(String path)
        Creates a split encompassing the entire file named by the path.
        Parameters:
        path - the file on which the split is defined
      • FileSplit

        public FileSplit​(String path,
                         long start,
                         long length)
        Creates a split of the file named by the path.
        Parameters:
        path - the file on which the split is defined
        start - the byte offset in the named file at which the split begins
        length - the length of the split, in bytes
      • FileSplit

        public FileSplit​(Path path)
        Creates a split encompassing the entire file named by the path.
        Parameters:
        path - the file on which the split is defined
      • FileSplit

        public FileSplit​(Path path,
                         long start,
                         long length)
        Creates a split of the file named by the path.
        Parameters:
        path - the file on which the split is defined
        start - the byte offset in the named file at which the split begins
        length - the length of the split, in bytes
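        The start and length parameters above can be derived mechanically when a file is divided into fixed-size pieces. The following is a minimal sketch, not part of the documented API: SplitRanges and plan are hypothetical names, and the file size and split size are assumed to be known to the caller. It only computes the (start, length) pairs; each pair could then be handed to the three-argument constructor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the documented API.
class SplitRanges {

    /**
     * Returns {start, length} pairs covering the byte range [0, fileSize)
     * in consecutive chunks of at most splitSize bytes.
     */
    static List<long[]> plan(long fileSize, long splitSize) {
        if (fileSize < 0 || splitSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and splitSize > 0");
        }
        List<long[]> ranges = new ArrayList<>();
        for (long start = 0; start < fileSize; start += splitSize) {
            // The final chunk may be shorter than splitSize.
            long length = Math.min(splitSize, fileSize - start);
            ranges.add(new long[] { start, length });
            // Each pair could be passed on as: new FileSplit(path, start, length)
        }
        return ranges;
    }
}
```

        For a 10-byte file and a split size of 4, the sketch yields three ranges, the last of which is only 2 bytes long.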
      • FileSplit

        protected FileSplit​(Path path,
                            long start,
                            long length,
                            FileClient client)
    • Method Detail

      • getPath

        public Path getPath()
        Description copied from interface: DataSplit
        Gets the path to the file on which the split is defined.

        Some splits may not represent a file; in this case, null is returned.

        Specified by:
        getPath in interface DataSplit
        Returns:
        the path to the underlying source file
      • getStartOffset

        public long getStartOffset()
        Description copied from interface: DataSplit
        Gets the byte offset of the beginning of the split.
        Specified by:
        getStartOffset in interface DataSplit
        Returns:
        the position of the first byte of the split
      • getLength

        public long getLength()
        Description copied from interface: DataSplit
        Gets the length of the split, in bytes.
        Specified by:
        getLength in interface DataSplit
        Returns:
        the size of the split
      • getEndOffset

        public long getEndOffset()
        Gets the end index of the byte range represented by this split. This offset is exclusive; that is, this represents the offset of the first byte past the end of the split.
        Returns:
        position of the end of the split
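        The exclusive end offset described above can be illustrated with a small stand-in class. This is a sketch under the assumption that the end offset equals the start offset plus the length, which follows from the definitions of getStartOffset and getLength; ByteRange is a hypothetical type, not the library's FileSplit.

```java
// Hypothetical stand-in for a split's byte range; not the library's FileSplit.
final class ByteRange {
    final long start;   // offset of the first byte in the range
    final long length;  // number of bytes in the range

    ByteRange(long start, long length) {
        if (start < 0 || length < 0) {
            throw new IllegalArgumentException("start and length must be non-negative");
        }
        this.start = start;
        this.length = length;
    }

    /** Offset of the first byte past the end of the range (exclusive end). */
    long endExclusive() {
        return start + length;
    }

    /** True if the offset lies in [start, endExclusive()). */
    boolean contains(long offset) {
        return offset >= start && offset < endExclusive();
    }
}
```

        With an exclusive end, adjacent ranges meet without overlapping: the end offset of one range is the start offset of the next, and the byte at the end offset belongs only to the later range.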
      • getFileClient

        public FileClient getFileClient()
        Returns the file client associated with this split. May return null, depending on the state of the split.
        Specified by:
        getFileClient in interface DataSplit
        Returns:
        FileClient associated with this split.
      • openSource

        public InputStream openSource()
                               throws IOException
        Description copied from interface: DataSplit
        Opens the underlying source for access. Initially, the stream is positioned at the first byte of the source. Unlike with DataSplit.openSplit(int), the caller is responsible for ensuring that accesses are aligned to split boundaries. The stream is also unbuffered.

        This method may be required for dealing with formats which store metadata at the beginning of the file.

        Specified by:
        openSource in interface DataSplit
        Returns:
        a reader of the data in the underlying source
        Throws:
        IOException - if an I/O error occurs opening the underlying source
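        Since the returned stream starts at the first byte of the source, one plausible use is reading a fixed-size header before processing any split. The sketch below works on any InputStream, such as the one returned by openSource(). HeaderReader, readHeader, and the fixed header size are assumptions made for illustration only.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not part of the documented API.
class HeaderReader {

    /**
     * Reads exactly headerSize bytes from the start of the stream.
     * Throws java.io.EOFException (an IOException) if the stream is shorter.
     */
    static byte[] readHeader(InputStream source, int headerSize) throws IOException {
        byte[] header = new byte[headerSize];
        // readFully loops until the array is filled, so a short read from an
        // unbuffered stream does not leave the header partially populated.
        new DataInputStream(source).readFully(header);
        return header;
    }
}
```

        A caller would typically obtain the stream inside a try-with-resources statement so that it is closed once the header has been read.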
      • openSplit

        public SplitInputStream openSplit​(int buffer)
                                   throws IOException
        Description copied from interface: DataSplit
        Opens the split for reading using the specified size for the read buffer. The reader will initially be positioned at the first byte of the split. The reader will indicate when the last byte of the split has been read via SplitInputStreamImpl.hasOverrun().
        Specified by:
        openSplit in interface DataSplit
        Parameters:
        buffer - the size of the buffer to use for reads, in bytes
        Returns:
        a reader of the data in the split
        Throws:
        IOException - if an I/O error occurs opening the underlying source
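        As a conceptual illustration of reading one split with a caller-chosen buffer size, the sketch below reads a byte range from a plain InputStream. It is not the library's implementation and does not use SplitInputStream; RangeReader and readRange are hypothetical names, and the stream is assumed to be positioned at the first byte of the source.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not the library's split reader.
class RangeReader {

    /** Reads up to length bytes starting at offset start, using a buffer of bufferSize bytes. */
    static byte[] readRange(InputStream in, long start, long length, int bufferSize)
            throws IOException {
        if (start < 0 || length < 0 || bufferSize <= 0) {
            throw new IllegalArgumentException("start/length must be >= 0 and bufferSize > 0");
        }
        // Advance to the first byte of the range. skip() may skip fewer bytes
        // than requested, so loop, and read a single byte when it reports none.
        long toSkip = start;
        while (toSkip > 0) {
            long skipped = in.skip(toSkip);
            if (skipped <= 0) {
                if (in.read() < 0) {
                    break; // end of stream reached before the range starts
                }
                skipped = 1;
            }
            toSkip -= skipped;
        }
        // Copy the range in buffer-sized chunks, never reading past its end.
        byte[] buffer = new byte[bufferSize];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long remaining = length;
        while (remaining > 0) {
            int want = (int) Math.min(buffer.length, remaining);
            int n = in.read(buffer, 0, want);
            if (n < 0) {
                break; // source ended before the range was exhausted
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
        return out.toByteArray();
    }
}
```

        The buffer size only affects how many bytes are requested per read call; the bytes returned for a given range are the same regardless of the size chosen.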
      • authorize

        public FileSplit authorize​(FileClient client)
        Description copied from interface: DataSplit
        Creates an identical split which will use the specified authorization context for access.

        This method is used by clients of the IO APIs which want to provide an alternative to the OS-level authorization inherited from the JVM's execution environment. Data access methods for the split will use the supplied context.

        The authorization context is not a serializable attribute of a data split, as it represents the environment in which the data is accessed, not a property of the data itself. The context is associated with the split as a matter of convenience.

        Specified by:
        authorize in interface DataSplit
        Parameters:
        client - the authorization context to use for access
        Returns:
        a split using the provided authorization context