- java.lang.Object
-
- com.pervasive.datarush.operators.io.textfile.AbstractRegexLogFormat
-
- com.pervasive.datarush.operators.io.textfile.CLFLogFormat
-
- All Implemented Interfaces:
LogFormat
public class CLFLogFormat extends AbstractRegexLogFormat
Describes the format of a web server log in NCSA Common Log Format (CLF).
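As an illustration of the layout this class describes, the following standalone sketch splits one Common Log Format line into its seven fields using java.util.regex. It does not use the DataRush API. The class name ClfLineSketch, the regular expression, and the field labels are assumptions made for this example; the schema returned by getSchema() may name or type the fields differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the DataRush library.
class ClfLineSketch {
    // CLF layout: host ident authuser [timestamp] "request" status bytes
    private static final Pattern CLF = Pattern.compile(
        "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\d+|-)$");

    // Field labels are assumptions chosen for readability.
    private static final String[] FIELDS = {
        "host", "ident", "authuser", "timestamp", "request", "status", "bytes"
    };

    // Returns the fields of one log line, or an empty Optional if the line is not CLF.
    static Optional<Map<String, String>> parse(String line) {
        Matcher m = CLF.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        Map<String, String> record = new LinkedHashMap<>();
        for (int i = 0; i < FIELDS.length; i++) {
            record.put(FIELDS[i], m.group(i + 1));
        }
        return Optional.of(record);
    }

    public static void main(String[] args) {
        String line = "127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "
            + "\"GET /apache_pb.gif HTTP/1.0\" 200 2326";
        parse(line).ifPresent(r -> r.forEach((k, v) -> System.out.println(k + " = " + v)));
    }
}
```

A hyphen in the ident, authuser, or bytes position means the value is not available, so the sketch keeps every field as a string rather than converting it to a number.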
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class com.pervasive.datarush.operators.io.textfile.AbstractRegexLogFormat
AbstractRegexLogFormat.RegexParser
-
-
Field Summary
Fields
protected RecordTextSchema<?> schema
-
Fields inherited from class com.pervasive.datarush.operators.io.textfile.AbstractRegexLogFormat
formatPattern, logType
-
-
Constructor Summary
Constructors
CLFLogFormat()
Create a log format for accessing common log format data.
CLFLogFormat(String formatPattern)
Create a log format for accessing common log format data.
-
Method Summary
Methods
DataFormat.DataParser createParser(ParsingOptions options, CharsetEncoding charEncoding, String newline)
RecordTextSchema<?> getSchema()
Gets the record schema of the source.
RecordTokenType getType()
Gets the record type associated with the format.
boolean isSplittable()
Indicates if the format supports parsing of subsections of a file.
protected void refreshSchema()
Refresh and recalculate the schema.
-
Methods inherited from class com.pervasive.datarush.operators.io.textfile.AbstractRegexLogFormat
analyzeFormat, getFormatPattern, getLogType, setAnalysis, setFormatPattern
-
-
-
-
Field Detail
-
schema
protected RecordTextSchema<?> schema
-
-
Constructor Detail
-
CLFLogFormat
public CLFLogFormat()
Create a log format for accessing common log format data.
-
CLFLogFormat
public CLFLogFormat(String formatPattern)
Create a log format for accessing common log format data.
- Parameters:
formatPattern
-
-
-
Method Detail
-
getType
public RecordTokenType getType()
Description copied from interface: LogFormat
Gets the record type associated with the format. Records produced by the associated parser or consumed by the associated formatter will be of this type. For many formats, this may be derived from a schema object describing the format layout.
- Returns:
- the format's record type
-
getSchema
public RecordTextSchema<?> getSchema()
Description copied from class: AbstractRegexLogFormat
Gets the record schema of the source.
- Specified by:
getSchema in class AbstractRegexLogFormat
- Returns:
- the record schema of the source
-
refreshSchema
protected void refreshSchema()
Description copied from class: AbstractRegexLogFormat
Refresh and recalculate the schema. This is usually done after changing a setting.
- Specified by:
refreshSchema in class AbstractRegexLogFormat
-
isSplittable
public boolean isSplittable()
Description copied from interface: LogFormat
Indicates if the format supports parsing of subsections of a file. A format should only return true if it can, at least in some situations, support this sort of parsing. If a format requires reading the entire file, it must return false.
If a format is not splittable, a file in the format cannot be parsed in parallel; however, individual files can still be parsed independently in parallel, as when reading the contents of a directory or using a file globbing pattern.
- Specified by:
isSplittable in interface LogFormat
- Overrides:
isSplittable in class AbstractRegexLogFormat
- Returns:
- true if the format supports parsing only a portion of the file, false otherwise
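To make "parsing of subsections of a file" concrete, the sketch below shows one common way a line-oriented format can be read in byte-range splits: each split skips any partial line at its start and keeps reading until the line that begins inside its range is complete. This is a general technique, not the DataRush implementation. The class SplitReadSketch, the method readSplit, and the assumption of "\n" line endings are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of split-based reading; not part of the DataRush library.
class SplitReadSketch {
    // Returns the lines owned by the byte range [start, end).
    // A line belongs to the split in which its first byte falls.
    // Assumes lines are terminated by '\n'.
    static List<String> readSplit(byte[] data, int start, int end) {
        int pos = start;
        // If the split begins in the middle of a line, that line belongs to the
        // previous split, so skip ahead to the byte after the next newline.
        if (start > 0 && data[start - 1] != '\n') {
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++;
        }
        List<String> lines = new ArrayList<>();
        // Read every line that starts before the end of the split, even if the
        // line itself runs past the split boundary.
        while (pos < end && pos < data.length) {
            int eol = pos;
            while (eol < data.length && data[eol] != '\n') {
                eol++;
            }
            lines.add(new String(data, pos, eol - pos, StandardCharsets.UTF_8));
            pos = eol + 1;
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] data = "aaa\nbbb\nccc\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(readSplit(data, 0, 6));
        System.out.println(readSplit(data, 6, 12));
    }
}
```

Because every line is claimed by exactly one split, the splits can be parsed by independent workers and their results combined without duplicates or gaps. A format whose records cannot be located from an arbitrary offset has no such boundary rule, which is the case where isSplittable must return false.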
-
createParser
public DataFormat.DataParser createParser(ParsingOptions options, CharsetEncoding charEncoding, String newline)
-
-