public interface SplitParsingContext
An object representing the context of a data split parsing operation. Parsers use the context to:
- Get the split to be parsed
- Publish parsed records
- Handle parsing errors which may arise
-
-
Method Summary
Modifier and Type  Method                                                 Description
void               bulkPublish(RecordTokenSequence records)               Provides a set of records to publish.
void               discardRecord()                                        Signals that the current record should be ignored.
DataSplit          getSplit()                                             Gets the split being parsed.
void               handleExtraField(String message)                       Reports an extra field being found in a record.
void               handleFieldError(String message)                       Reports a field parsing error.
void               handleMissingFields(String message)                    Reports that a record was found to have missing fields.
void               handleParseException(long offsetInSplit, Exception e)  Reports an exception occurring during parsing of the split.
void               publishRecord()                                        Signals that the current record is ready to be published.
void               startRecord(long offsetInSplit)                        Establishes the context for the current record.
-
-
-
Method Detail
-
getSplit
DataSplit getSplit()
Gets the split being parsed.
Returns:
the split currently being parsed
-
startRecord
void startRecord(long offsetInSplit)
Establishes the context for the current record. Error messages and published records will be associated with this information.
Parameters:
offsetInSplit - the offset within the split at which the record begins. The offset should be measured in bytes or characters, whichever is appropriate for the format.
-
publishRecord
void publishRecord()
Signals that the current record is ready to be published. Field values are set in the buffers provided to DataParser#bindOutput(RecordSettable).
-
discardRecord
void discardRecord()
Signals that the current record should be ignored. Field values in the buffers provided to DataParser#bindOutput(RecordSettable) should be discarded.
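The call order implied by startRecord, publishRecord and discardRecord can be sketched as follows. This is only an illustration: RecordContext and LoggingContext are hypothetical stand-ins declared here so the example compiles on its own, not the library's types, and the newline-delimited format with character offsets is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

final class RecordLifecycleSketch {

    /** Hypothetical stand-in exposing only the three record lifecycle methods. */
    interface RecordContext {
        void startRecord(long offsetInSplit);
        void publishRecord();
        void discardRecord();
    }

    /** Hypothetical context that logs each call so the sequence can be inspected. */
    static final class LoggingContext implements RecordContext {
        final List<String> events = new ArrayList<>();
        @Override public void startRecord(long offsetInSplit) { events.add("start@" + offsetInSplit); }
        @Override public void publishRecord() { events.add("publish"); }
        @Override public void discardRecord() { events.add("discard"); }
    }

    /**
     * Walks newline-delimited text, measuring offsets in characters.
     * Blank lines are discarded; every other line is published.
     */
    static void parseLines(String splitText, RecordContext ctx) {
        int offset = 0;
        while (offset < splitText.length()) {
            int end = splitText.indexOf('\n', offset);
            if (end < 0) {
                end = splitText.length();
            }
            String line = splitText.substring(offset, end);
            ctx.startRecord(offset);      // later errors and output refer to this record
            if (line.trim().isEmpty()) {
                ctx.discardRecord();      // nothing usable: ignore this record
            } else {
                // A real parser would set field values in its bound output buffers here.
                ctx.publishRecord();
            }
            offset = end + 1;             // step past the newline
        }
    }

    /** Runs the sketch on a three-line input whose middle line is blank. */
    static List<String> demo() {
        LoggingContext ctx = new LoggingContext();
        parseLines("a,1\n\nb,2", ctx);
        return ctx.events;
    }
}
```

Calling RecordLifecycleSketch.demo() yields one startRecord call per line, each followed by exactly one of publishRecord or discardRecord, which is the pairing the descriptions above suggest.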
-
bulkPublish
void bulkPublish(RecordTokenSequence records)
Provides a set of records to publish. This method is intended only for column-oriented block formats, which assemble multiple records at once. For row-oriented formats, publishRecord() should be used instead.
Parameters:
records - the records to publish
-
handleFieldError
void handleFieldError(String message)
Reports a field parsing error.
Parameters:
message - additional information about the error. The message is interpreted within the context of the current split, so the message need not identify the split.
-
handleMissingFields
void handleMissingFields(String message)
Reports that a record was found to have missing fields.
Parameters:
message - additional information about the error. The message is interpreted within the context of the current split, so the message need not identify the split.
-
handleExtraField
void handleExtraField(String message)
Reports an extra field being found in a record.
Parameters:
message - additional information about the error. The message is interpreted within the context of the current split, so the message need not identify the split.
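A parser might choose among the three field-level reporting methods roughly as sketched below. FieldErrorContext and CollectingContext are hypothetical stand-ins defined only for this example, and the two-field (name, integer) record layout is an assumption. Whether handleExtraField is meant to be called once per record or once per extra field is not stated here; the sketch calls it once.

```java
import java.util.ArrayList;
import java.util.List;

final class FieldErrorSketch {

    /** Hypothetical stand-in exposing only the three field-level reporting methods. */
    interface FieldErrorContext {
        void handleFieldError(String message);
        void handleMissingFields(String message);
        void handleExtraField(String message);
    }

    /** Hypothetical context that collects reports, tagged by kind, for inspection. */
    static final class CollectingContext implements FieldErrorContext {
        final List<String> reports = new ArrayList<>();
        @Override public void handleFieldError(String message) { reports.add("field: " + message); }
        @Override public void handleMissingFields(String message) { reports.add("missing: " + message); }
        @Override public void handleExtraField(String message) { reports.add("extra: " + message); }
    }

    /**
     * Checks one comma-separated record assumed to hold (name, integer).
     * Returns true if the record is well formed; otherwise reports the problem.
     * Messages describe only the problem, since the context supplies the location.
     */
    static boolean checkRecord(String line, FieldErrorContext ctx) {
        String[] fields = line.split(",", -1);   // -1 keeps trailing empty fields
        if (fields.length < 2) {
            ctx.handleMissingFields("expected 2 fields but found " + fields.length);
            return false;
        }
        if (fields.length > 2) {
            ctx.handleExtraField("expected 2 fields but found " + fields.length);
            return false;
        }
        try {
            Integer.parseInt(fields[1].trim());
        } catch (NumberFormatException e) {
            ctx.handleFieldError("field 2 is not an integer: '" + fields[1] + "'");
            return false;
        }
        return true;
    }
}
```

In a full parser, a false return from a check like this would typically be followed by discardRecord() rather than publishRecord(), though that policy is the parser's decision.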
-
handleParseException
void handleParseException(long offsetInSplit, Exception e)
Reports an exception occurring during parsing of the split. A parser should invoke this when an exception occurs within DataParser#parseSplit(SplitParsingContext).
Parameters:
offsetInSplit - the current offset, in bytes or characters, within the split when the error occurred. The appropriate units for the format should be used.
e - the exception that occurred
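One way to route exceptions to handleParseException is to track the current offset while reading and pass it along from a catch block, as in the sketch below. ExceptionContext and CollectingContext are hypothetical stand-ins for illustration, and the rule that the format rejects tab characters is invented purely to trigger a failure.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

final class ParseExceptionSketch {

    /** Hypothetical stand-in exposing only the exception reporting method. */
    interface ExceptionContext {
        void handleParseException(long offsetInSplit, Exception e);
    }

    /** Hypothetical context that records "offset:message" for each reported exception. */
    static final class CollectingContext implements ExceptionContext {
        final List<String> reports = new ArrayList<>();
        @Override public void handleParseException(long offsetInSplit, Exception e) {
            reports.add(offsetInSplit + ":" + e.getMessage());
        }
    }

    /**
     * Consumes characters until end of input, counting offsets in characters.
     * Any exception is reported with the offset reached so far instead of
     * being allowed to escape. Returns the number of characters accepted.
     */
    static long parseSplit(Reader in, ExceptionContext ctx) {
        long offset = 0;
        try {
            int c;
            while ((c = in.read()) != -1) {
                if (c == '\t') {
                    // Invented rule for the sketch: this format forbids tabs.
                    throw new IOException("unexpected tab character");
                }
                offset++;
            }
        } catch (Exception e) {
            ctx.handleParseException(offset, e);   // offset where parsing stopped
        }
        return offset;
    }
}
```

Because the counter is declared outside the try block, the catch block can still report how far parsing had progressed when the failure occurred.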
-
-