public class SplitReader extends Reader
SplitReader provides a text view of split data, handling conversion from bytes to characters.
Because splits may fall at arbitrary points within a file, consumers may need to perform additional processing to place themselves in a valid position. SplitReader supports this case by allowing a character sequence to be provided for use as a synchronization marker at the beginning and end of splits.
Normally, reading starts at the beginning of the split, but the reader can be configured to start at the first character after the synchronization marker. Take care when reading from the split beginning, as split boundaries may fall in the middle of an encoded character sequence.
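The boundary hazard can be illustrated with plain JDK classes. The sketch below does not use SplitReader; the class name SplitBoundaryDemo, the helper decodeFrom, and the sample text are hypothetical. It decodes the same UTF-8 bytes from two offsets, one on a character boundary and one in the middle of a two-byte character.

```java
import java.nio.charset.StandardCharsets;

public class SplitBoundaryDemo {
    // Hypothetical helper: decode everything from 'offset' to the end as UTF-8.
    static String decodeFrom(byte[] data, int offset) {
        return new String(data, offset, data.length - offset, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+00E9 is encoded as two bytes in UTF-8 (0xC3 0xA9),
        // so the text below occupies four bytes: 61 C3 A9 62.
        byte[] data = "a\u00E9b".getBytes(StandardCharsets.UTF_8);

        // Offset 1 is a character boundary: the decoded text is intact.
        String clean = decodeFrom(data, 1);
        System.out.println(clean.equals("\u00E9b"));

        // Offset 2 is the continuation byte of U+00E9: the String constructor
        // substitutes the replacement character U+FFFD for the stray byte.
        String damaged = decodeFrom(data, 2);
        System.out.println(damaged.charAt(0) == '\uFFFD');
    }
}
```

Starting after a synchronization marker avoids this situation, provided the marker itself can be located reliably in the encoded data.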
Similarly, reads can continue beyond the end of the split. Callers can either manage this themselves or have the reader stop automatically after the first complete synchronization marker beyond the end of the split.
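As a rough model of how synchronization markers let adjacent splits divide a file without losing or duplicating records, consider the sketch below. It works on an in-memory String with character offsets rather than on a SplitInputStream, and the class SyncMarkerSplitSketch and method readSplit are hypothetical. It also assumes that "beyond the end of the split" means a marker starting at or after the end offset; the real reader's exact boundary rules may differ.

```java
public class SyncMarkerSplitSketch {
    /**
     * Hypothetical model of reading the split [start, end) of 'data'.
     * Assumes 0 <= start <= end <= data.length().
     */
    static String readSplit(String data, int start, int end,
                            String marker, boolean doInitialSync) {
        int pos = start;
        if (doInitialSync) {
            // Skip the partial record at the front: begin at the first
            // character after the first marker found at or after 'start'.
            int first = data.indexOf(marker, start);
            if (first < 0) {
                return ""; // no marker left, so no record begins in this split
            }
            pos = first + marker.length();
        }
        // Keep reading past 'end' until the first complete marker found there
        // has been consumed, or until the data runs out.
        int closing = data.indexOf(marker, end);
        int stop = (closing < 0) ? data.length() : closing + marker.length();
        return data.substring(pos, stop);
    }

    public static void main(String[] args) {
        String data = "aa\nbbb\ncc\n";
        // The boundary at offset 4 falls in the middle of the record "bbb".
        String first = readSplit(data, 0, 4, "\n", false);
        String second = readSplit(data, 4, data.length(), "\n", true);
        // Together the two splits reproduce the data exactly once.
        System.out.println((first + second).equals(data));
    }
}
```

In this model the first split reads past its end to finish the record it started, and the second split skips that same record during its initial synchronization.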
Modifier and Type | Field and Description |
---|---|
static int | DEFAULT_BUFFER - Default buffer size for input to the character decoder, in bytes |
Constructor and Description |
---|
SplitReader(SplitInputStream in, CharsetEncoding charset) - Creates a new reader on the specified stream, using the given encoding properties. |
SplitReader(SplitInputStream in, CharsetEncoding charset, int buffer) - Creates a new reader on the specified stream, using the given encoding properties. |
SplitReader(SplitInputStream in, CharsetEncoding charset, int buffer, String syncMarker, boolean doInitialSync) - Creates a new reader on the specified stream, using the given encoding properties and synchronization marker. |
SplitReader(SplitInputStream in, CharsetEncoding charset, String syncMarker, boolean doInitialSync) - Creates a new reader on the specified stream, using the given encoding properties and synchronization marker. |
Modifier and Type | Method and Description |
---|---|
long | charsRead() - Gets the character offset into the underlying split. |
void | close() |
boolean | hasOverrun() - Indicates whether the reader has passed the end of the underlying split. |
void | mark(int readAheadLimit) |
boolean | markSupported() |
int | read() |
int | read(char[] cbuf) |
int | read(char[] cbuf, int off, int len) |
int | read(CharBuffer target) |
boolean | readIfPresent(char[] chars) - Conditionally reads the input to see if the specified characters are present. |
boolean | readLine(Appendable lineBuffer) - Reads a line of text into the specified buffer. |
boolean | ready() |
void | reset() |
long | skip(long n) |
boolean | skipTo(char[] bytes) - Advances the position of the stream to the first character after the specified pattern. |
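One plausible reading of readIfPresent is "consume the characters if they come next, otherwise leave the position untouched". The sketch below models that reading on a plain java.io.Reader using mark and reset. It is an assumption about the semantics, not the SplitReader implementation, and the class ReadIfPresentSketch and its demo helper are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class ReadIfPresentSketch {
    // Assumed semantics: consume 'chars' only if they are the next characters.
    static boolean readIfPresent(Reader in, char[] chars) throws IOException {
        if (!in.markSupported()) {
            throw new IOException("mark/reset not supported by this reader");
        }
        in.mark(chars.length);
        for (char expected : chars) {
            int c = in.read();
            if (c != expected) { // covers both a mismatch and end of input (-1)
                in.reset();      // rewind so the caller sees the input unchanged
                return false;
            }
        }
        return true;
    }

    // Hypothetical helper: reports the result and whatever input remains.
    static String demo(String text, String prefix) {
        try {
            Reader in = new StringReader(text);
            boolean found = readIfPresent(in, prefix.toCharArray());
            StringBuilder rest = new StringBuilder();
            int c;
            while ((c = in.read()) != -1) {
                rest.append((char) c);
            }
            return found + ":" + rest;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("#comment", "#")); // prefix present and consumed
        System.out.println(demo("value", "#"));    // prefix absent, input untouched
    }
}
```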
public static final int DEFAULT_BUFFER
Default buffer size for input to the character decoder, in bytes.

public SplitReader(SplitInputStream in, CharsetEncoding charset)
Creates a new reader on the specified stream, using the given encoding properties.
Parameters:
in - the input stream on the split being read
charset - character set encoding properties

public SplitReader(SplitInputStream in, CharsetEncoding charset, int buffer)
Creates a new reader on the specified stream, using the given encoding properties.
Parameters:
in - the input stream on the split being read
charset - character set encoding properties
buffer - the size of the decoding input buffer, in bytes

public SplitReader(SplitInputStream in, CharsetEncoding charset, String syncMarker, boolean doInitialSync) throws IOException
Creates a new reader on the specified stream, using the given encoding properties and synchronization marker. The reader will stop providing data after the first complete synchronization marker appearing after the end of the split.
Parameters:
in - the input stream on the split being read
charset - character set encoding properties
syncMarker - the character sequence to use to synchronize the read positions
doInitialSync - indicates whether to synchronize the read position before the first read
Throws:
IOException - if an I/O error occurs while performing initial position synchronization

public SplitReader(SplitInputStream in, CharsetEncoding charset, int buffer, String syncMarker, boolean doInitialSync) throws IOException
Creates a new reader on the specified stream, using the given encoding properties and synchronization marker. The reader will stop providing data after the first complete synchronization marker appearing after the end of the split.
Parameters:
in - the input stream on the split being read
charset - character set encoding properties
buffer - the size of the decoding input buffer, in bytes
syncMarker - the character sequence to use to synchronize the read positions
doInitialSync - indicates whether to synchronize the read position before the first read
Throws:
IOException - if an I/O error occurs while performing initial position synchronization

public int read() throws IOException
Overrides: read in class Reader
Throws: IOException
public int read(char[] cbuf) throws IOException
Overrides: read in class Reader
Throws: IOException

public int read(char[] cbuf, int off, int len) throws IOException
Specified by: read in class Reader
Throws: IOException

public int read(CharBuffer target) throws IOException
Specified by: read in interface Readable
Overrides: read in class Reader
Throws: IOException

public long skip(long n) throws IOException
Overrides: skip in class Reader
Throws: IOException

public boolean markSupported()
Overrides: markSupported in class Reader

public void mark(int readAheadLimit) throws IOException
Overrides: mark in class Reader
Throws: IOException

public void reset() throws IOException
Overrides: reset in class Reader
Throws: IOException

public void close()
public boolean hasOverrun()
Indicates whether the reader has passed the end of the underlying split. Because a character's encoding may span multiple bytes, the split boundary may fall in the middle of a character. Overrun is flagged with the first character whose encoding has a byte beyond the end of the split.
Returns: true if the current read position is beyond the end of the split, false otherwise

public boolean readIfPresent(char[] chars) throws IOException
Conditionally reads the input to see if the specified characters are present.
Parameters:
chars - the character sequence which is to be checked
Returns: true if the sequence was found, false otherwise
Throws:
IOException - if an I/O error occurs during the read

public long charsRead()
Gets the character offset into the underlying split.

public boolean skipTo(char[] bytes) throws IOException
Advances the position of the stream to the first character after the specified pattern.
Parameters:
bytes - the pattern to find in the stream
Returns: false if end of data is reached, true otherwise
Throws:
IOException - if an I/O error occurs while reading the stream

public boolean readLine(Appendable lineBuffer) throws IOException
Reads a line of text into the specified buffer.
Parameters:
lineBuffer - the buffer to which to append line data
Returns: false if end of file has been reached during the read
Throws: IOException
See Also: BufferedReader#readLine()
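The skipTo contract documented above (position the stream on the first character after a pattern, and return false if the data ends first) can be modeled on any java.io.Reader. The following is a minimal sketch under that reading, using a simple sliding window. It is not the SplitReader implementation, and the class SkipToSketch and the helper restAfter are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class SkipToSketch {
    // Advances 'in' to the first character after 'pattern'.
    // Returns false if the input ends before the pattern is found.
    static boolean skipTo(Reader in, char[] pattern) throws IOException {
        if (pattern.length == 0) {
            return true; // an empty pattern matches immediately
        }
        char[] window = new char[pattern.length]; // the most recent characters
        long seen = 0;
        int c;
        while ((c = in.read()) != -1) {
            // Slide the window left by one and append the new character.
            System.arraycopy(window, 1, window, 0, window.length - 1);
            window[window.length - 1] = (char) c;
            seen++;
            if (seen >= pattern.length && Arrays.equals(window, pattern)) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical helper: the text following the pattern, or null if absent.
    static String restAfter(String text, String pattern) {
        try (StringReader in = new StringReader(text)) {
            if (!skipTo(in, pattern.toCharArray())) {
                return null;
            }
            StringBuilder rest = new StringBuilder();
            int c;
            while ((c = in.read()) != -1) {
                rest.append((char) c);
            }
            return rest.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(restAfter("xaaab--rest", "aab"));
        System.out.println(restAfter("abc", "zz"));
    }
}
```

The sliding window compares the last pattern.length characters after every read, which keeps the sketch correct for patterns with repeated prefixes at the cost of extra comparisons.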
Copyright © 2020 Actian Corporation. All rights reserved.