java.lang.Object
com.pervasive.datarush.operators.AbstractLogicalOperator
com.pervasive.datarush.operators.CompositeOperator
com.pervasive.datarush.operators.io.AbstractReader
com.pervasive.datarush.operators.io.textfile.AbstractTextReader
com.pervasive.datarush.operators.io.textfile.ReadLog
- All Implemented Interfaces:
LogicalOperator, RecordSourceOperator, SourceOperator<RecordPort>
Reads a log file as record tokens. The supported log types are
enumerated in SupportedLogType. The various log types can accept
a format specifier string in the same format as would be accepted
by the particular log producer unless otherwise specified.
See the specific LogFormat class for the supported log type
for more details. The output is determined by the format and
the logs being read. The minimum configuration required is
setting the LogType property and the source. By default, the reader
attempts to determine the newline character used by the log files;
this can instead be specified if it is known in advance.
Log entries that cannot be parsed are omitted from the output,
while fields that cannot be parsed are null valued.
This operator is parallelizable.
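The error-handling rule above (unparseable entries are omitted, unparseable fields become null) can be illustrated with a minimal standalone sketch. It does not use the DataRush API; the class `LenientLogParse`, its `Entry` type, and the three-token entry layout are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the DataRush API.
final class LenientLogParse {
    // Assumed entry layout for this sketch: "<host> <status> <bytes>".
    private static final Pattern ENTRY = Pattern.compile("^(\\S+) (\\S+) (\\S+)$");

    /** One output record; numeric fields are null when their text is not a number. */
    static final class Entry {
        public final String host;
        public final Integer status;
        public final Long bytes;

        Entry(String host, Integer status, Long bytes) {
            this.host = host;
            this.status = status;
            this.bytes = bytes;
        }

        @Override
        public String toString() {
            return host + " " + status + " " + bytes;
        }
    }

    private static Integer intOrNull(String text) {
        try {
            return Integer.valueOf(text);
        } catch (NumberFormatException e) {
            return null; // field could not be parsed: null valued
        }
    }

    private static Long longOrNull(String text) {
        try {
            return Long.valueOf(text);
        } catch (NumberFormatException e) {
            return null; // field could not be parsed: null valued
        }
    }

    static List<Entry> parse(List<String> lines) {
        List<Entry> out = new ArrayList<>();
        for (String line : lines) {
            Matcher m = ENTRY.matcher(line);
            if (!m.matches()) {
                continue; // entry could not be parsed: omitted from the output
            }
            out.add(new Entry(m.group(1), intOrNull(m.group(2)), longOrNull(m.group(3))));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("10.0.0.1 200 512", "garbage", "10.0.0.2 200 -");
        for (Entry e : parse(lines)) {
            System.out.println(e);
        }
    }
}
```

In this sketch the second line is dropped entirely, while the third line is kept with a null `bytes` field.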
-
Field Summary
Fields inherited from class com.pervasive.datarush.operators.io.textfile.AbstractTextReader
encodingProps
Fields inherited from class com.pervasive.datarush.operators.io.AbstractReader
options, output
-
Constructor Summary
Constructors
Constructor: Description
ReadLog(): Reads an empty source with default settings.
ReadLog(path): Reads the file specified by the path as log text using default options.
ReadLog(ByteSource source): Reads the specified data source using default options.
ReadLog(pattern): Reads all paths matching the specified pattern as log text using default options.
-
Method Summary
Modifier and Type, Method: Description
clone(): Creates a copy of the reader with identical settings.
protected DataFormat computeFormat(ctx): Determines the data format for the source.
getLogFormat(): Gets the log format class which should be used when parsing the logs.
getLogPattern(): Gets the formatting pattern to use when parsing the log.
getLogType(): Gets the type of log this operator will read.
getNewline(): Gets the newline characters to use when parsing the log.
void setAutoDiscoverFormat(boolean enabled): In supported log types, performs schema discovery if logPattern is not set or additional information is required.
void setAutoDiscoverNewline(boolean enabled): Configures whether the reader attempts to discover the newline style (UNIX or DOS) used in the source.
void setLogFormat(LogFormat logFormat): Sets the log format class which should be used when parsing the logs.
void setLogPattern(String logPattern): Sets the formatting pattern to use when parsing the log.
void setLogType(SupportedLogType logType): Sets the type of log this operator will read.
void setNewline(String newline): Sets the newline characters to use when parsing the log.
Methods inherited from class com.pervasive.datarush.operators.io.textfile.AbstractTextReader
getCharset, getCharsetName, getDecodeBuffer, getEncoding, getErrorAction, getReplacement, setCharset, setCharsetName, setDecodeBuffer, setEncoding, setErrorAction, setReplacement
Methods inherited from class com.pervasive.datarush.operators.io.AbstractReader
compose, getExtraFieldAction, getFieldErrorAction, getFieldLengthThreshold, getIncludeSourceInfo, getMissingFieldAction, getOutput, getParseOptions, getPessimisticSplitting, getReadBuffer, getReadOnClient, getRecordWarningThreshold, getSelectedFields, getSource, getSplitOptions, getUseMetadata, setExtraFieldAction, setFieldErrorAction, setFieldLengthThreshold, setIncludeSourceInfo, setMissingFieldAction, setParseErrorAction, setParseOptions, setPessimisticSplitting, setReadBuffer, setReadOnClient, setRecordWarningThreshold, setSelectedFields, setSelectedFields, setSource, setSource, setSource, setSplitOptions, setUseMetadata
Methods inherited from class com.pervasive.datarush.operators.AbstractLogicalOperator
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface com.pervasive.datarush.operators.LogicalOperator
disableParallelism, getInputPorts, getOutputPorts
-
Constructor Details
-
ReadLog
public ReadLog()
Reads an empty source with default settings. The source must be set before execution or an error will be raised.
-
ReadLog
Reads all paths matching the specified pattern as log text using default options. Any matching path which is a directory is replaced with all files in the directory; this expansion is not applied recursively.
Parameters:
pattern - a path-matching pattern
-
ReadLog
Reads the file specified by the path as log text using default options. If the path refers to a directory, all files in the directory are read; this expansion is not applied recursively.
Parameters:
path - the path to read
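The non-recursive directory expansion described for the path and pattern constructors can be pictured with a minimal standalone sketch. It is one reading of "not applied recursively" using only java.nio.file, not the library's implementation; the class `ShallowExpand` and its `expand` method are hypothetical names, and treating nested directories as skipped is an assumption.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical illustration only; not part of the DataRush API.
final class ShallowExpand {
    /**
     * Returns the path itself if it is not a directory; otherwise returns the
     * regular files directly inside it. Nested directories are not descended into.
     */
    static List<Path> expand(Path path) {
        if (!Files.isDirectory(path)) {
            return Collections.singletonList(path);
        }
        List<Path> files = new ArrayList<>();
        // Files.list enumerates direct children only, so the expansion is one level deep.
        try (Stream<Path> children = Files.list(path)) {
            children.filter(Files::isRegularFile).sorted().forEach(files::add);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("logs");
        Files.createFile(dir.resolve("a.log"));
        Path nested = Files.createDirectory(dir.resolve("nested"));
        Files.createFile(nested.resolve("b.log"));
        // Only a.log is listed; nested/b.log is not visited.
        System.out.println(expand(dir));
    }
}
```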
-
ReadLog
Reads the specified data source using default options.
Parameters:
source - the data source to read
-
-
Method Details
-
clone
Creates a copy of the reader with identical settings. This is a deep copy; subsequent changes in the reader are not reflected in the clone and vice versa.
-
getLogType
Gets the type of log this operator will read.
Returns:
the SupportedLogType of this reader
-
setLogType
Sets the type of log this operator will read.
Parameters:
logType - the type of log to read
-
getLogFormat
Gets the log format class which should be used when parsing the logs.
Returns:
the log format class to use
-
setLogFormat
Sets the log format class which should be used when parsing the logs. This setting is mutually exclusive with LogType.
Parameters:
logFormat - the log format class to use
-
getLogPattern
Gets the formatting pattern to use when parsing the log.
Returns:
the format pattern to use for parsing
-
setLogPattern
Sets the formatting pattern to use when parsing the log.
Parameters:
logPattern - the format pattern to use for parsing
-
getNewline
Gets the newline characters to use when parsing the log.
Returns:
the newline characters to use for parsing
-
setNewline
Sets the newline characters to use when parsing the log. Calling this method disables automatic newline discovery.
Parameters:
newline - the newline characters to use for parsing
-
setAutoDiscoverFormat
public void setAutoDiscoverFormat(boolean enabled)
For supported log types, performs schema discovery if logPattern is not set or if additional information is required. Enabled by default.
Parameters:
enabled - indicates whether to enable format discovery
-
setAutoDiscoverNewline
public void setAutoDiscoverNewline(boolean enabled)
Configures whether the reader attempts to discover the newline style (UNIX or DOS) used in the source. The discovered newline is then used as the record separator. If enabled, the source will be read just prior to graph execution. If reading multiple files, the newline style is determined using the first file.
Parameters:
enabled - indicates whether to enable newline discovery
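As a rough picture of what newline discovery involves, the sketch below inspects the first line break in a sample and classifies it as UNIX ("\n") or DOS ("\r\n"), then uses the result as the record separator. It is a standalone illustration under assumed names (`NewlineSniffer`, `detect`, `detectFromFirst`, `records`), not the reader's actual algorithm; sources are modelled as in-memory strings for simplicity.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the DataRush API.
final class NewlineSniffer {
    static final String UNIX = "\n";
    static final String DOS = "\r\n";

    /** Classifies the first line break in the sample; returns null if there is none. */
    static String detect(String sample) {
        int lf = sample.indexOf('\n');
        if (lf < 0) {
            return null;
        }
        return (lf > 0 && sample.charAt(lf - 1) == '\r') ? DOS : UNIX;
    }

    /** With several sources, only the first one is inspected. */
    static String detectFromFirst(List<String> sources) {
        return sources.isEmpty() ? null : detect(sources.get(0));
    }

    /** Splits text into records using the given newline as a literal separator. */
    static List<String> records(String text, String newline) {
        return Arrays.asList(text.split(Pattern.quote(newline)));
    }

    public static void main(String[] args) {
        List<String> sources = Arrays.asList("a\r\nb\r\n", "c\nd\n");
        String newline = detectFromFirst(sources);
        System.out.println(DOS.equals(newline) ? "DOS" : "UNIX");
        System.out.println(records(sources.get(0), newline));
    }
}
```

Because only the first source is inspected, a later file that uses a different newline style would not be split as expected in this sketch; supplying the newline explicitly avoids that dependence on file order.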
-
discoverMetadata
-
computeFormat
Description copied from class: AbstractReader
Determines the data format for the source. The returned format is used during composition to construct a ReadSource operator. If an implementation supports schema discovery, it must be performed in this method.
Specified by:
computeFormat in class AbstractReader
Parameters:
ctx - the composition context for the current invocation of AbstractReader.compose(CompositionContext)
Returns:
the source format to use
-