Interface SplitInputStream

All Known Implementing Classes:
SplitInputStreamImpl

public interface SplitInputStream
Defines an input data stream that works within the boundaries of a split. A split has a beginning offset and an ending offset, which together define its boundaries.

The contract for reading across the end of a split is as follows:

  • A read that encounters a split boundary will return only the data within the split. Data from the current split and the next split are not mixed within the same buffer.
  • Before the read operation that encounters the split boundary returns, an indicator that the boundary has been crossed should be set. This indicator is returned by the hasOverrun() method.
  • The availableInSplit() method should return the number of bytes left in the split (or an estimate) or zero if the split boundary has been crossed.
  • Reading past the split boundary is allowed. This is needed for formats that may need to search for the end of a record that spans splits.
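
The contract above can be sketched with a small in-memory implementation. This is only an illustration under stated assumptions: the interface is re-declared locally with the four documented methods, ArraySplitStream and SplitContractDemo are hypothetical names (this is not SplitInputStreamImpl), and the overrun indicator is assumed to be set as soon as a read consumes the last byte of the split.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Local re-declaration of the documented interface (assumption: these four methods only).
interface SplitInputStream {
    long availableInSplit();
    boolean hasOverrun();
    int read(byte[] buffer, int offset, int length) throws IOException;
    void close();
}

// Hypothetical array-backed stream; the array stands in for the raw data stream.
final class ArraySplitStream implements SplitInputStream {
    private final byte[] data;
    private final int splitEnd;   // exclusive end offset of the split
    private int pos;              // absolute position in the data
    private boolean overrun;      // set once the split boundary has been reached

    ArraySplitStream(byte[] data, int splitStart, int splitEnd) {
        this.data = data;
        this.pos = splitStart;
        this.splitEnd = Math.min(splitEnd, data.length);
    }

    @Override
    public long availableInSplit() {
        return Math.max(0, splitEnd - pos);   // zero once the boundary is crossed
    }

    @Override
    public boolean hasOverrun() {
        return overrun;
    }

    // In-memory reads cannot fail, so this override declares no checked exception.
    @Override
    public int read(byte[] buffer, int offset, int length) {
        if (pos >= data.length) {
            overrun = true;
            return -1;                        // end of data
        }
        // Inside the split, never read past its end; afterwards, read freely.
        int limit = pos < splitEnd ? splitEnd : data.length;
        int n = Math.min(length, limit - pos);
        System.arraycopy(data, pos, buffer, offset, n);
        pos += n;
        if (pos >= splitEnd) {
            overrun = true;                   // set before the read returns
        }
        return n;
    }

    @Override
    public void close() {
        // nothing to release for an array
    }
}

final class SplitContractDemo {
    public static void main(String[] args) {
        byte[] data = "AAAABBBB".getBytes(StandardCharsets.US_ASCII);
        ArraySplitStream in = new ArraySplitStream(data, 0, 4);   // split covers "AAAA"
        byte[] buf = new byte[8];

        int first = in.read(buf, 0, buf.length);    // asks for 8 bytes, clamped at the boundary
        System.out.println(first + " " + in.hasOverrun() + " " + in.availableInSplit());
        // prints "4 true 0"

        int second = in.read(buf, 0, buf.length);   // now returns data beyond the split
        System.out.println(second + " " + new String(buf, 0, second, StandardCharsets.US_ASCII));
        // prints "4 BBBB"
        in.close();
    }
}
```

The first read is clamped so that data from the two splits never shares a buffer; only the second read crosses into the next split.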
  • Method Summary

    Modifier and Type   Method                                        Description
    long                availableInSplit()                            Gets the number of bytes remaining in the split.
    void                close()                                       Close the underlying raw data stream.
    boolean             hasOverrun()                                  Indicates whether the reader has read past the end of the split.
    int                 read(byte[] buffer, int offset, int length)   Read data from the split input stream.
  • Method Details

    • availableInSplit

      long availableInSplit()
      Gets the number of bytes remaining in the split.
      Returns:
      the number of bytes left to read in the split (possibly an estimate), or 0 if the split has been completely read.
    • hasOverrun

      boolean hasOverrun()
      Indicates whether the reader has read past the end of the split.
      Returns:
      true if all bytes in the split have been read, false if any remain.
    • read

      int read(byte[] buffer, int offset, int length) throws IOException
      Read data from the split input stream. By convention, a read that starts before the end of a split returns only data contained within the split, even if more bytes were requested. A subsequent read returns data from beyond the split.
      Parameters:
      buffer - buffer into which the data is read
      offset - offset in buffer at which to start storing data
      length - maximum number of bytes to read
      Returns:
      the number of bytes read, or -1 if the end of data (EOD) has been reached
      Throws:
      IOException - thrown if an I/O error occurs
    • close

      void close()
      Close the underlying raw data stream.
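
A caller can combine read() and hasOverrun() to finish a record that spans the end of a split. The following is a minimal sketch, assuming newline-delimited ASCII records, a split that starts at a record boundary, and the rule that a record belongs to the split in which it starts. The interface is re-declared locally, and InMemorySplit, SplitRecordDemo, readRecords and recordsOf are hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Local re-declaration of the documented interface (assumption: these four methods only).
interface SplitInputStream {
    long availableInSplit();
    boolean hasOverrun();
    int read(byte[] buffer, int offset, int length) throws IOException;
    void close();
}

// Hypothetical array-backed stand-in for a real implementation.
final class InMemorySplit implements SplitInputStream {
    private final byte[] data;
    private final int splitEnd;
    private int pos;
    private boolean overrun;

    InMemorySplit(byte[] data, int start, int end) {
        this.data = data;
        this.pos = start;
        this.splitEnd = Math.min(end, data.length);
    }
    public long availableInSplit() { return Math.max(0, splitEnd - pos); }
    public boolean hasOverrun() { return overrun; }
    public int read(byte[] b, int off, int len) {
        if (pos >= data.length) { overrun = true; return -1; }
        int limit = pos < splitEnd ? splitEnd : data.length;  // clamp inside the split
        int n = Math.min(len, limit - pos);
        System.arraycopy(data, pos, b, off, n);
        pos += n;
        overrun = pos >= splitEnd;
        return n;
    }
    public void close() { }
}

final class SplitRecordDemo {
    // Reads every record that starts inside the split, including one that ends beyond it.
    static List<String> readRecords(SplitInputStream in) throws IOException {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        byte[] buf = new byte[4];
        boolean done = false;
        while (!done) {
            boolean pastSplit = in.hasOverrun();   // true: this read returns data beyond the split
            int n = in.read(buf, 0, buf.length);
            if (n < 0) break;                      // end of data
            for (int i = 0; i < n && !done; i++) {
                if (buf[i] == '\n') {
                    records.add(current.toString());
                    current.setLength(0);
                    done = pastSplit;              // the spanning record is now complete
                } else {
                    current.append((char) buf[i]);
                }
            }
            // Split exhausted and no partial record pending: nothing left to finish.
            if (in.hasOverrun() && current.length() == 0) done = true;
        }
        if (current.length() > 0) records.add(current.toString()); // unterminated last record
        return records;
    }

    // Convenience wrapper for the in-memory case.
    static List<String> recordsOf(String text, int start, int end) {
        SplitInputStream in = new InMemorySplit(text.getBytes(StandardCharsets.US_ASCII), start, end);
        try {
            return readRecords(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) {
        // The split [0, 5) holds "ab\ncd"; the record "cdef" ends in the next split.
        System.out.println(recordsOf("ab\ncdef\ngh\n", 0, 5));   // prints "[ab, cdef]"
    }
}
```

Bytes read beyond the delimiter of the spanning record are discarded here; a reader for the next split would be expected to pick up from its own start offset.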