All Implemented Interfaces:
LogicalOperator, RecordSourceOperator, SourceOperator<RecordPort>

public class ReadLog extends AbstractTextReader
Reads a log file as record tokens. The supported log types are enumerated in SupportedLogType. Unless otherwise specified, each log type accepts a format specifier string in the same format that the corresponding log producer accepts. See the LogFormat class for the chosen log type for details.

The output is determined by the format and by the logs being read. The minimum required configuration is the LogType property and the source. By default, the reader attempts to determine the newline sequence used by the log files; if the newline is known in advance, it can be specified explicitly.

Log entries that cannot be parsed are omitted from the output, while individual fields that cannot be parsed are null valued. This operator is parallelizable.
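The failure handling described above (unparseable entries are omitted, unparseable fields become null) can be illustrated with a minimal, JDK-only sketch. It does not use ReadLog or any LogFormat class; the class name LenientLogParseSketch, the three-column entry layout, and the field names are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the ReadLog API.
final class LenientLogParseSketch {

    // Assumed entry layout: "<host> <status> <bytes>", e.g. "10.0.0.1 200 512".
    private static final Pattern ENTRY = Pattern.compile("^(\\S+) (\\S+) (\\S+)$");

    // One parsed entry; numeric fields are nullable.
    static final class Entry {
        final String host;
        final Integer status;
        final Integer bytes;

        Entry(String host, Integer status, Integer bytes) {
            this.host = host;
            this.status = status;
            this.bytes = bytes;
        }
    }

    // A field that cannot be parsed becomes null instead of failing the entry.
    static Integer parseIntOrNull(String text) {
        try {
            return Integer.valueOf(text);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // An entry that does not match the layout at all is left out of the output.
    static List<Entry> parse(List<String> lines) {
        List<Entry> out = new ArrayList<>();
        for (String line : lines) {
            Matcher m = ENTRY.matcher(line);
            if (!m.matches()) {
                continue;
            }
            out.add(new Entry(m.group(1), parseIntOrNull(m.group(2)), parseIntOrNull(m.group(3))));
        }
        return out;
    }
}
```

Under these assumptions, a line that does not fit the layout produces no record, while a line whose last column is "-" produces a record with a null bytes field.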
  • Constructor Details

    • ReadLog

      public ReadLog()
      Reads an empty source with default settings. The source must be set before execution or an error will be raised.
    • ReadLog

      public ReadLog(String pattern)
      Reads all paths matching the specified pattern as log text using default options. Any matching path which is a directory is replaced with all files in the directory; this expansion is not applied recursively.
      Parameters:
      pattern - a path-matching pattern
    • ReadLog

      public ReadLog(Path path)
      Reads the file specified by the path as log text using default options. If the path refers to a directory, all files in the directory are read; this expansion is not applied recursively.
      Parameters:
      path - the path to read
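The non-recursive directory expansion described for the pattern and path constructors can be sketched with the JDK alone. This is not how ReadLog resolves its sources; DirectoryExpansionSketch and expand are hypothetical names, and skipping subdirectories (rather than reporting them) is an assumption made for this sketch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not part of the ReadLog API.
final class DirectoryExpansionSketch {

    // Returns the path itself for a file, or the files directly inside a directory.
    // Subdirectories are skipped, so the expansion is exactly one level deep.
    static List<Path> expand(Path path) {
        if (!Files.isDirectory(path)) {
            return Collections.singletonList(path);
        }
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(path)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)) {
                    files.add(entry);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Directory listings have no guaranteed order; sort for repeatable results.
        Collections.sort(files);
        return files;
    }
}
```

In this sketch, a directory containing a.log and a subdirectory sub/b.log expands to a.log only.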
    • ReadLog

      public ReadLog(ByteSource source)
      Reads the specified data source using default options.
      Parameters:
      source - the data source to read
  • Method Details

    • clone

      public ReadLog clone()
      Creates a copy of the reader with identical settings. This is a deep copy; subsequent changes to the reader are not reflected in the clone, and vice versa.
      Overrides:
      clone in class Object
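The deep-copy contract stated for clone() can be illustrated with a small stand-in class. ReaderSettingsSketch and its fields are hypothetical and do not mirror the internal state of ReadLog; the sketch only shows what "changes are not reflected in the clone" means for mutable state.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not the ReadLog implementation.
final class ReaderSettingsSketch implements Cloneable {

    private String newline;
    private List<String> paths = new ArrayList<>();

    void setNewline(String newline) {
        this.newline = newline;
    }

    String getNewline() {
        return newline;
    }

    void addPath(String path) {
        paths.add(path);
    }

    List<String> getPaths() {
        return Collections.unmodifiableList(paths);
    }

    @Override
    public ReaderSettingsSketch clone() {
        try {
            ReaderSettingsSketch copy = (ReaderSettingsSketch) super.clone();
            // Copy the mutable list so the two objects no longer share state.
            copy.paths = new ArrayList<>(paths);
            return copy;
        } catch (CloneNotSupportedException e) {
            // Cannot happen: this class implements Cloneable.
            throw new AssertionError(e);
        }
    }
}
```

Without the explicit list copy, super.clone() alone would produce a shallow copy in which both objects share one list.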
    • getLogType

      public SupportedLogType getLogType()
      Gets the type of log this operator will read.
      Returns:
      the SupportedLogType of this reader
    • setLogType

      public void setLogType(SupportedLogType logType)
      Sets the type of log this operator will read.
      Parameters:
      logType - the type of log to read
    • getLogFormat

      public LogFormat getLogFormat()
      Gets the log format class which should be used when parsing the logs.
      Returns:
      the log format class to use
    • setLogFormat

      public void setLogFormat(LogFormat logFormat)
      Sets the log format class which should be used when parsing the logs. This setting is mutually exclusive with LogType.
      Parameters:
      logFormat - the log format class to use
    • getLogPattern

      public String getLogPattern()
      Gets the formatting pattern to use when parsing the log.
      Returns:
      the format pattern to use for parsing
    • setLogPattern

      public void setLogPattern(String logPattern)
      Sets the formatting pattern to use when parsing the log.
      Parameters:
      logPattern - the format pattern to use for parsing
    • getNewline

      public String getNewline()
      Gets the newline characters to use when parsing the log.
      Returns:
      the newline characters to use for parsing
    • setNewline

      public void setNewline(String newline)
      Sets the newline characters to use when parsing the log. Setting this value disables automatic newline discovery.
      Parameters:
      newline - the newline characters to use for parsing
    • setAutoDiscoverFormat

      public void setAutoDiscoverFormat(boolean enabled)
      Configures whether, for supported log types, schema discovery is performed when logPattern is not set or when additional information is required. Enabled by default.
      Parameters:
      enabled - indicates whether to enable format discovery
    • setAutoDiscoverNewline

      public void setAutoDiscoverNewline(boolean enabled)
      Configures whether the reader attempts to discover the newline style (UNIX or DOS) used in the source. The discovered newline is then used as the record separator. If enabled, the source will be read just prior to graph execution. If reading multiple files, the newline style is determined using the first file.
      Parameters:
      enabled - indicates whether to enable newline discovery
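The newline discovery described for setAutoDiscoverNewline can be sketched as follows. This is not the reader's actual detection logic: NewlineDiscoverySketch, its method names, and the fallback parameter are hypothetical, and the sketch works on in-memory strings rather than on a real source.

```java
import java.util.List;

// Hypothetical illustration only; not the ReadLog implementation.
final class NewlineDiscoverySketch {

    // Looks at the first line break in the sample: "\r\n" means DOS, a bare "\n" means UNIX.
    // If the sample contains no line break, the caller-supplied fallback is returned.
    static String discover(String sample, String fallback) {
        int lineFeed = sample.indexOf('\n');
        if (lineFeed < 0) {
            return fallback;
        }
        boolean precededByCarriageReturn = lineFeed > 0 && sample.charAt(lineFeed - 1) == '\r';
        return precededByCarriageReturn ? "\r\n" : "\n";
    }

    // With several files, only the first one decides the newline style.
    static String discoverFromFirst(List<String> fileContents, String fallback) {
        if (fileContents.isEmpty()) {
            return fallback;
        }
        return discover(fileContents.get(0), fallback);
    }
}
```

Because only the first file is inspected, a set of files with mixed newline styles would all be split using the style of the first; specifying the newline explicitly avoids that ambiguity.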
    • discoverMetadata

      public FormatAnalyzer.FormatAnalysis discoverMetadata(FileClient ctx)
    • computeFormat

      protected DataFormat computeFormat(CompositionContext ctx)
      Description copied from class: AbstractReader
      Determines the data format for the source. The returned format is used during composition to construct a ReadSource operator. If an implementation supports schema discovery, it must be performed in this method.
      Specified by:
      computeFormat in class AbstractReader
      Parameters:
      ctx - the composition context for the current invocation of AbstractReader.compose(CompositionContext)
      Returns:
      the source format to use