- java.lang.Object
- com.pervasive.datarush.operators.AbstractLogicalOperator
- com.pervasive.datarush.operators.CompositeOperator
- com.pervasive.datarush.operators.io.AbstractReader
- com.pervasive.datarush.operators.io.textfile.AbstractTextReader
- com.pervasive.datarush.operators.io.textfile.ReadLog
All Implemented Interfaces:
LogicalOperator, RecordSourceOperator, SourceOperator<RecordPort>
public class ReadLog extends AbstractTextReader
Reads a log file as record tokens. The supported log types are enumerated in SupportedLogType. Unless otherwise specified, each log type accepts a format specifier string in the same format that the corresponding log producer accepts. See the LogFormat class for the specific log type for more details. The output is determined by the format and the logs being read. The minimum required configuration is the LogType property and the source. By default the reader attempts to determine the newline character used by the log files; if the newline is known in advance, it can be set explicitly. Log entries that cannot be parsed are omitted from the output, and fields that cannot be parsed are null valued. This operator is parallelizable.
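The handling of unparseable input described above (whole entries dropped, individual fields set to null) can be pictured with a small plain-Java sketch. This is not ReadLog's parser and does not use the library; the class name LogParseSketch, its Entry type, and the simplified line layout (host, quoted request, status, bytes) are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustration only: mimics the documented drop/null behavior, not ReadLog itself.
class LogParseSketch {
    // Hypothetical simplified layout: host "request" status bytes
    private static final Pattern LINE =
        Pattern.compile("^(\\S+) \"([^\"]*)\" (\\S+) (\\S+)$");

    // One parsed record; status and bytes are null when their text is not numeric.
    static final class Entry {
        final String host;
        final String request;
        final Integer status;
        final Long bytes;

        Entry(String host, String request, Integer status, Long bytes) {
            this.host = host;
            this.request = request;
            this.status = status;
            this.bytes = bytes;
        }
    }

    static Integer intOrNull(String s) {
        try {
            return Integer.valueOf(s);
        } catch (NumberFormatException e) {
            return null; // field cannot be parsed: null valued
        }
    }

    static Long longOrNull(String s) {
        try {
            return Long.valueOf(s);
        } catch (NumberFormatException e) {
            return null; // field cannot be parsed: null valued
        }
    }

    static List<Entry> parse(List<String> lines) {
        List<Entry> out = new ArrayList<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (!m.matches()) {
                continue; // entry cannot be parsed: omitted from the output
            }
            out.add(new Entry(m.group(1), m.group(2),
                              intOrNull(m.group(3)), longOrNull(m.group(4))));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("10.0.0.1 \"GET / HTTP/1.1\" 200 512");
        lines.add("this line does not match the layout");
        lines.add("10.0.0.2 \"GET /x HTTP/1.1\" 200 -");
        for (Entry e : parse(lines)) {
            System.out.println(e.host + " status=" + e.status + " bytes=" + e.bytes);
        }
    }
}
```

In this sketch the second line is skipped entirely, while the third line is kept with a null bytes field because "-" is not a number.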
-
Field Summary
-
Fields inherited from class com.pervasive.datarush.operators.io.textfile.AbstractTextReader
encodingProps
-
Fields inherited from class com.pervasive.datarush.operators.io.AbstractReader
options, output
-
Constructor Summary
Constructors
Constructor, Description
ReadLog()
Reads an empty source with default settings.
ReadLog(Path path)
Reads the file specified by the path as log text using default options.
ReadLog(ByteSource source)
Reads the specified data source using default options.
ReadLog(String pattern)
Reads all paths matching the specified pattern as log text using default options.
-
Method Summary
All Methods, Instance Methods, Concrete Methods
Modifier and Type, Method, Description
ReadLog clone()
Creates a copy of the reader with identical settings.
protected DataFormat computeFormat(CompositionContext ctx)
Determines the data format for the source.
FormatAnalyzer.FormatAnalysis discoverMetadata(FileClient ctx)
LogFormat getLogFormat()
Gets the log format class which should be used when parsing the logs.
String getLogPattern()
Gets the formatting pattern to use when parsing the log.
SupportedLogType getLogType()
Gets the type of log this operator will read.
String getNewline()
Gets the newline characters to use when parsing the log.
void setAutoDiscoverFormat(boolean enabled)
For supported log types, configures whether schema discovery is performed when logPattern is not set or additional information is required.
void setAutoDiscoverNewline(boolean enabled)
Configures whether the reader attempts to discover the newline style (UNIX or DOS) used in the source.
void setLogFormat(LogFormat logFormat)
Sets the log format class which should be used when parsing the logs.
void setLogPattern(String logPattern)
Sets the formatting pattern to use when parsing the log.
void setLogType(SupportedLogType logType)
Sets the type of log this operator will read.
void setNewline(String newline)
Sets the newline characters to use when parsing the log.
-
Methods inherited from class com.pervasive.datarush.operators.io.textfile.AbstractTextReader
getCharset, getCharsetName, getDecodeBuffer, getEncoding, getErrorAction, getReplacement, setCharset, setCharsetName, setDecodeBuffer, setEncoding, setErrorAction, setReplacement
-
Methods inherited from class com.pervasive.datarush.operators.io.AbstractReader
compose, getExtraFieldAction, getFieldErrorAction, getFieldLengthThreshold, getIncludeSourceInfo, getMissingFieldAction, getOutput, getParseOptions, getPessimisticSplitting, getReadBuffer, getReadOnClient, getRecordWarningThreshold, getSelectedFields, getSource, getSplitOptions, getUseMetadata, setExtraFieldAction, setFieldErrorAction, setFieldLengthThreshold, setIncludeSourceInfo, setMissingFieldAction, setParseErrorAction, setParseOptions, setPessimisticSplitting, setReadBuffer, setReadOnClient, setRecordWarningThreshold, setSelectedFields, setSelectedFields, setSource, setSource, setSource, setSplitOptions, setUseMetadata
-
Methods inherited from class com.pervasive.datarush.operators.AbstractLogicalOperator
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
-
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface com.pervasive.datarush.operators.LogicalOperator
disableParallelism, getInputPorts, getOutputPorts
-
Constructor Detail
-
ReadLog
public ReadLog()
Reads an empty source with default settings. The source must be set before execution or an error will be raised.
See Also:
AbstractReader.setSource(ByteSource)
-
ReadLog
public ReadLog(String pattern)
Reads all paths matching the specified pattern as log text using default options. Any matching path which is a directory is replaced with all files in the directory; this expansion is not applied recursively.
Parameters:
pattern - a path-matching pattern
See Also:
FileClient.matchPaths(String)
-
ReadLog
public ReadLog(Path path)
Reads the file specified by the path as log text using default options. If the path refers to a directory, all files in the directory are read; this expansion is not applied recursively.
Parameters:
path - the path to read
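The one-level directory expansion described for the pattern and path constructors can be sketched with the standard java.nio.file API. This is only an illustration of "all files in the directory, not recursively"; it is not the library's FileClient logic. The class DirectoryExpansionSketch and its expand helper are hypothetical names, and java.nio.file.Path stands in for whatever Path type the constructor accepts.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustration only: a non-recursive directory expansion using the JDK.
class DirectoryExpansionSketch {
    // Returns the files that a one-level expansion of 'path' would cover.
    static List<Path> expand(Path path) throws IOException {
        if (!Files.isDirectory(path)) {
            return Collections.singletonList(path); // a plain file is read as-is
        }
        // Files.list visits direct children only, so subdirectory contents are skipped.
        try (Stream<Path> children = Files.list(path)) {
            return children
                .filter(Files::isRegularFile)
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("logs");
        Files.createFile(dir.resolve("a.log"));
        Path sub = Files.createDirectory(dir.resolve("archive"));
        Files.createFile(sub.resolve("old.log"));
        // Lists a.log only; archive/old.log is not picked up.
        System.out.println(expand(dir));
    }
}
```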
-
ReadLog
public ReadLog(ByteSource source)
Reads the specified data source using default options.
Parameters:
source - the data source to read
-
Method Detail
-
clone
public ReadLog clone()
Creates a copy of the reader with identical settings. This is a deep copy; subsequent changes to the reader are not reflected in the clone and vice versa.
-
getLogType
public SupportedLogType getLogType()
Gets the type of log this operator will read.
Returns:
the SupportedLogType of this reader
-
setLogType
public void setLogType(SupportedLogType logType)
Sets the type of log this operator will read.
Parameters:
logType - the type of log to read
-
getLogFormat
public LogFormat getLogFormat()
Gets the log format class which should be used when parsing the logs.
Returns:
the log format class to use
-
setLogFormat
public void setLogFormat(LogFormat logFormat)
Sets the log format class which should be used when parsing the logs. This setting is mutually exclusive with LogType.
Parameters:
logFormat - the log format class to use
-
getLogPattern
public String getLogPattern()
Gets the formatting pattern to use when parsing the log.
Returns:
the format pattern to use for parsing
-
setLogPattern
public void setLogPattern(String logPattern)
Sets the formatting pattern to use when parsing the log.
Parameters:
logPattern - the format pattern to use for parsing
-
getNewline
public String getNewline()
Gets the newline characters to use when parsing the log.
Returns:
the newline characters to use for parsing
-
setNewline
public void setNewline(String newline)
Sets the newline characters to use when parsing the log. Calling this method disables automatic newline discovery.
Parameters:
newline - the newline characters to use for parsing
-
setAutoDiscoverFormat
public void setAutoDiscoverFormat(boolean enabled)
For supported log types, configures whether schema discovery is performed when logPattern is not set or additional information is required. Defaults to enabled.
Parameters:
enabled - indicates whether to enable format discovery
-
setAutoDiscoverNewline
public void setAutoDiscoverNewline(boolean enabled)
Configures whether the reader attempts to discover the newline style (UNIX or DOS) used in the source. The discovered newline is then used as the record separator. If enabled, the source will be read just prior to graph execution. If reading multiple files, the newline style is determined using the first file.
Parameters:
enabled - indicates whether to enable newline discovery
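To make the idea of newline discovery concrete, here is a minimal plain-Java sketch that inspects a sample of bytes and reports whether the first line break is DOS ("\r\n") or UNIX ("\n"), then uses the result as the record separator. It is not the reader's actual discovery code. The class NewlineDiscoverySketch and its methods are hypothetical, and falling back to "\n" when no line break is found is an assumption of this sketch, not documented behavior.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Illustration only: detects the newline style from the first line break in a sample.
class NewlineDiscoverySketch {
    // Returns "\r\n" (DOS) or "\n" (UNIX), based on the first '\n' found in the sample.
    static String discoverNewline(byte[] sample) {
        for (int i = 0; i < sample.length; i++) {
            if (sample[i] == '\n') {
                boolean precededByCr = i > 0 && sample[i - 1] == '\r';
                return precededByCr ? "\r\n" : "\n";
            }
        }
        return "\n"; // assumption of this sketch: default to UNIX when nothing is found
    }

    // Uses the discovered newline as the record separator.
    static String[] splitRecords(String text, String newline) {
        return text.split(Pattern.quote(newline));
    }

    public static void main(String[] args) {
        String text = "first entry\r\nsecond entry\r\n";
        String newline = discoverNewline(text.getBytes(StandardCharsets.US_ASCII));
        for (String record : splitRecords(text, newline)) {
            System.out.println("[" + record + "]");
        }
    }
}
```

Because only the first file is sampled when several files are read, a mix of UNIX and DOS files would be split with whichever style that first file uses; setting the newline explicitly avoids relying on the sample.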
-
discoverMetadata
public FormatAnalyzer.FormatAnalysis discoverMetadata(FileClient ctx)
-
computeFormat
protected DataFormat computeFormat(CompositionContext ctx)
Description copied from class: AbstractReader
Determines the data format for the source. The returned format is used during composition to construct a ReadSource operator. If an implementation supports schema discovery, it must be performed in this method.
Specified by:
computeFormat in class AbstractReader
Parameters:
ctx - the composition context for the current invocation of AbstractReader.compose(CompositionContext)
Returns:
the source format to use
-