Interface DataFormat.DataParser

  • All Known Implementing Classes:
    AbstractRegexLogFormat.RegexParser
    Enclosing interface:
    DataFormat

    public static interface DataFormat.DataParser
    A parser for record data in binary or text format. A DataParser bridges the gap between records within and outside of a dataflow graph. An implementation represents such a mapping for some concrete external format.
    See Also:
    DataFormat.DataFormatter
    • Method Detail

      • bindOutput

        void bindOutput(RecordSettable target)
        Called to provide the target port onto which parsed records are pushed.

        This method is called once, before any calls to parseSplit(SplitParsingContext) are made. Any one-time initialization for the parser should be performed within the implementation of this method.

        Parameters:
        target - the output port which receives parsed records
      • parseSplit

        void parseSplit(SplitParsingContext ctx)
        Called to convert an input split into records.

        This method may be called one or more times, once for each split needing to be parsed. The implementation is expected to publish all records found within the split to the output passed to bindOutput(RecordSettable). Parsing errors and exceptions should be reported through the provided context.

        Parameters:
        ctx - the context for the split being parsed, through which parsing errors and exceptions are reported
      • release

        void release()
        Called to signal that parsing is complete.

        This method is called once, after the last call to parseSplit(SplitParsingContext). Any allocated resources which need to be released should be handled within the implementation of this method.
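
The call order described above (bindOutput once, parseSplit one or more times, release once) can be illustrated with a minimal sketch. It does not use the real library types: SketchOutput, SketchContext, SketchParser, LineParser, and the reportError method are hypothetical stand-ins invented for this example, since the members of RecordSettable and SplitParsingContext are not documented on this page. Only the three method names and their ordering are taken from the text.

```java
import java.util.ArrayList;
import java.util.List;

public class ParserLifecycleSketch {

    // Hypothetical stand-in for RecordSettable: receives one parsed record at a time.
    interface SketchOutput {
        void push(String record);
    }

    // Hypothetical stand-in for SplitParsingContext: carries the split's text
    // and collects any errors the parser reports.
    static final class SketchContext {
        final String splitText;
        final List<String> errors = new ArrayList<>();

        SketchContext(String splitText) {
            this.splitText = splitText;
        }

        void reportError(String message) {
            errors.add(message);
        }
    }

    // Mirrors the three lifecycle methods documented for DataParser,
    // but with the stand-in types above (an assumption, not the real interface).
    interface SketchParser {
        void bindOutput(SketchOutput target);
        void parseSplit(SketchContext ctx);
        void release();
    }

    // A line-oriented parser: every non-empty line of a split is one record.
    static final class LineParser implements SketchParser {
        private SketchOutput target;

        @Override
        public void bindOutput(SketchOutput target) {
            // One-time initialization happens here, before any parseSplit call.
            this.target = target;
        }

        @Override
        public void parseSplit(SketchContext ctx) {
            if (target == null) {
                throw new IllegalStateException("bindOutput must be called first");
            }
            for (String line : ctx.splitText.split("\n")) {
                if (line.isEmpty()) {
                    // Errors go through the context rather than being thrown.
                    ctx.reportError("empty line skipped");
                    continue;
                }
                target.push(line);
            }
        }

        @Override
        public void release() {
            // Drop references held since bindOutput.
            target = null;
        }
    }

    // Drives the parser in the documented order and returns the records it published.
    public static List<String> run(List<String> splits) {
        List<String> records = new ArrayList<>();
        LineParser parser = new LineParser();
        parser.bindOutput(records::add);
        try {
            for (String split : splits) {
                parser.parseSplit(new SketchContext(split));
            }
        } finally {
            parser.release();
        }
        return records;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("a\nb", "c"))); // prints [a, b, c]
    }
}
```

Calling release in a finally block is a choice made in this sketch so that resources are freed even if a split fails; the page itself only states that release is called once after the last parseSplit call.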