public class ReadFixedText extends AbstractTextReader
The reader requires a FixedWidthTextRecord object to provide field positions as well as parsing and type information for the fields. The schema, in conjunction with any specified field filter, defines the output type of the parser. A schema can be constructed manually via the API provided, although this metadata is often persisted externally. StructuredSchemaReader provides support for reading Pervasive DataIntegrator structured schema descriptors (.schema files) for use with readers.
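To make the role of the schema concrete, the following is a minimal standalone sketch of how field widths, per-field parsers, and a field filter together determine the parsed output. `FixedSchemaSketch`, `Field`, `field`, and `parse` are hypothetical names used for illustration only; they are not the FixedWidthTextRecord API, whose construction methods are not shown on this page.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical stand-in for a fixed-width schema; NOT the FixedWidthTextRecord API.
class FixedSchemaSketch {
    // One field: its name, its width in characters, and how to parse its text.
    static final class Field {
        final String name;
        final int width;
        final Function<String, Object> parser;

        Field(String name, int width, Function<String, Object> parser) {
            this.name = name;
            this.width = width;
            this.parser = parser;
        }
    }

    private final List<Field> fields = new ArrayList<>();

    // Appends a field; positions follow from the order and widths of the fields.
    FixedSchemaSketch field(String name, int width, Function<String, Object> parser) {
        fields.add(new Field(name, width, parser));
        return this;
    }

    // Parses one record. Only fields named in 'selected' appear in the output;
    // a null 'selected' keeps every field.
    Map<String, Object> parse(String line, Set<String> selected) {
        Map<String, Object> out = new LinkedHashMap<>();
        int pos = 0;
        for (Field f : fields) {
            String raw = line.substring(pos, pos + f.width).trim();
            pos += f.width;
            if (selected == null || selected.contains(f.name)) {
                out.put(f.name, f.parser.apply(raw));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        FixedSchemaSketch schema = new FixedSchemaSketch()
                .field("id", 4, s -> Integer.valueOf(s))
                .field("name", 6, s -> s)
                .field("amount", 7, s -> Double.valueOf(s));
        // prints {id=42, name=Alice, amount=12.5}
        System.out.println(schema.parse("0042Alice 0012.50", null));
    }
}
```

Passing a set of field names as the second argument plays the part of the field filter: the output then contains only those fields, in schema order.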
Normally, the output of the parser includes all records in the file, both those with and those without parsing errors. Fields that cannot be parsed are null-valued in the resulting record. If desired, the reader can be configured to filter failed records from the output.
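The behavior described above can be illustrated with a small standalone sketch. It assumes a hypothetical record layout of two integer fields (widths 3 and 4), and `ParseErrorSketch`, `parseRecord`, and `read` are illustrative names, not part of the reader's API. A field that fails to parse becomes null, and a flag decides whether records with failed fields are kept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: mimics the documented behavior, not the reader's implementation.
class ParseErrorSketch {
    // Parses a record of two integer fields (assumed widths 3 and 4).
    // A field that cannot be parsed becomes null instead of failing the record.
    static Integer[] parseRecord(String line) {
        return new Integer[] {
            parseInt(line.substring(0, 3)),
            parseInt(line.substring(3, 7))
        };
    }

    static Integer parseInt(String raw) {
        try {
            return Integer.valueOf(raw.trim());
        } catch (NumberFormatException e) {
            return null; // unparsable field -> null value
        }
    }

    // Returns the parsed records; when dropFailed is true, records containing
    // any failed (null) field are filtered from the output.
    static List<Integer[]> read(List<String> lines, boolean dropFailed) {
        List<Integer[]> out = new ArrayList<>();
        for (String line : lines) {
            Integer[] rec = parseRecord(line);
            boolean failed = Arrays.asList(rec).contains(null);
            if (!(dropFailed && failed)) {
                out.add(rec);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("001 100", "0x2 200", "003abcd");
        System.out.println(read(lines, false).size()); // prints 3
        System.out.println(read(lines, true).size());  // prints 1
    }
}
```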
Since record boundaries occur at known positions, fixed text files can be parsed in parallel.
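As a rough illustration of why known record boundaries allow parallel parsing, the sketch below cuts the input into ranges that start and end on multiples of the record length and parses each range independently. It is a simplified model under stated assumptions: every record occupies exactly `recordLength` characters (any separator included), the whole input is already in memory, and `ParallelSplitSketch`, `splits`, and `readAll` are hypothetical names. It does not reflect how the reader actually splits a source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Illustrative model of boundary-aligned splitting; not the reader's implementation.
class ParallelSplitSketch {
    // Computes [start, end) ranges that each begin on a record boundary.
    static List<int[]> splits(int totalLength, int recordLength, int recordsPerSplit) {
        List<int[]> ranges = new ArrayList<>();
        int step = recordLength * recordsPerSplit;
        for (int start = 0; start < totalLength; start += step) {
            ranges.add(new int[] { start, Math.min(start + step, totalLength) });
        }
        return ranges;
    }

    // Extracts the records of every split in parallel; the ordered stream
    // keeps the results in their original file order.
    static List<String> readAll(String data, int recordLength, int recordsPerSplit) {
        return splits(data.length(), recordLength, recordsPerSplit)
                .parallelStream()
                .flatMap(r -> IntStream.range(0, (r[1] - r[0]) / recordLength)
                        .mapToObj(i -> data.substring(r[0] + i * recordLength,
                                                      r[0] + (i + 1) * recordLength)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [AAAA, BBBB, CCCC, DDDD, EEEE]
        System.out.println(readAll("AAAABBBBCCCCDDDDEEEE", 4, 2));
    }
}
```

No split needs to scan for a delimiter to find its first record, which is what makes the ranges independent of one another.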
encodingProps
options, output
Constructor and Description |
---|
ReadFixedText()
Reads an empty source with default settings.
|
ReadFixedText(ByteSource source)
Reads the specified data source using default
options.
|
ReadFixedText(Path path)
Reads the file specified by the path as fixed text
using default options.
|
ReadFixedText(String pattern)
Reads all paths matching the specified pattern
as fixed text using default options.
|
Modifier and Type | Method and Description |
---|---|
ReadFixedText |
clone() |
protected DataFormat |
computeFormat(CompositionContext ctx)
Determines the data format for the source.
|
String |
getLineComment()
Get the value that indicates a line of data is a comment.
|
String |
getRecordSeparator()
Get the record separator property.
|
FixedWidthTextRecord |
getSchema()
Get the input schema property.
|
void |
setLineComment(String lineComment)
Set the text that represents that a line of input data is a comment.
|
void |
setRecordSeparator(String separator)
Set the string that represents the separator between records in the
input file.
|
void |
setSchema(FixedWidthTextRecord schema)
Set the schema of the input data to read.
|
getCharset, getCharsetName, getDecodeBuffer, getEncoding, getErrorAction, getReplacement, setCharset, setCharsetName, setDecodeBuffer, setEncoding, setErrorAction, setReplacement
compose, getExtraFieldAction, getFieldErrorAction, getFieldLengthThreshold, getIncludeSourceInfo, getMissingFieldAction, getOutput, getParseOptions, getPessimisticSplitting, getReadBuffer, getReadOnClient, getRecordWarningThreshold, getSelectedFields, getSource, getSplitOptions, getUseMetadata, setExtraFieldAction, setFieldErrorAction, setFieldLengthThreshold, setIncludeSourceInfo, setMissingFieldAction, setParseErrorAction, setParseOptions, setPessimisticSplitting, setReadBuffer, setReadOnClient, setRecordWarningThreshold, setSelectedFields, setSelectedFields, setSource, setSource, setSource, setSplitOptions, setUseMetadata
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
disableParallelism, getInputPorts, getOutputPorts
public ReadFixedText()
public ReadFixedText(String pattern)
The schema must be set before execution or an error will be raised.
pattern
- a path-matching pattern
See Also: setSchema(FixedWidthTextRecord), FileClient#matchPaths(String)
public ReadFixedText(Path path)
The schema must be set before execution or an error will be raised.
path
- the path to read
See Also: setSchema(FixedWidthTextRecord)
public ReadFixedText(ByteSource source)
source
- the data source to read
See Also: setSchema(FixedWidthTextRecord)
public ReadFixedText clone()
public void setRecordSeparator(String separator)
By default, the record separator is the native line separator of the platform on which the application is running. This is normally either the Unix/Linux style or the Windows style end-of-record delimiter.
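The effect of the separator setting can be sketched in plain Java. The class below is a hypothetical stand-in (`RecordSeparatorSketch` and its `records` method are not part of this API), and it assumes that the native default corresponds to `System.lineSeparator()`. It shows why a file written on another platform may need an explicit separator.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in that only models the separator property.
class RecordSeparatorSketch {
    // Assumption: the native default corresponds to the JVM's line separator.
    private String separator = System.lineSeparator();

    void setRecordSeparator(String separator) {
        this.separator = separator;
    }

    String getRecordSeparator() {
        return separator;
    }

    // Splits raw text into records on the configured separator, taken literally.
    String[] records(String text) {
        return text.split(Pattern.quote(separator));
    }

    public static void main(String[] args) {
        RecordSeparatorSketch reader = new RecordSeparatorSketch();
        // A Windows-style file read with an explicit Windows-style separator.
        reader.setRecordSeparator("\r\n");
        System.out.println(reader.records("0001AB\r\n0002CD\r\n").length); // prints 2
    }
}
```

If the same text were split on "\n" alone, each record would keep a trailing carriage return, which would shift or corrupt the last fixed-width field.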
separator
- record separator
public String getRecordSeparator()
public void setSchema(FixedWidthTextRecord schema)
schema
- required input schema
public FixedWidthTextRecord getSchema()
public String getLineComment()
public void setLineComment(String lineComment)
By default, this option has a null value, indicating that the data contains no comment lines.
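A minimal sketch of the comment-line behavior follows. It assumes that a line counts as a comment when it starts with the configured text, which this page does not state explicitly, and `LineCommentSketch` and `dataLines` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only; the matching rule (prefix match) is an assumption.
class LineCommentSketch {
    // Returns the lines that should be parsed as records.
    // A null lineComment means the data contains no comment lines.
    static List<String> dataLines(List<String> lines, String lineComment) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (lineComment != null && line.startsWith(lineComment)) {
                continue; // comment line: not a record
            }
            out.add(line);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("# header", "0001AB", "#0002", "0003EF");
        System.out.println(dataLines(lines, "#"));  // prints [0001AB, 0003EF]
        System.out.println(dataLines(lines, null).size()); // prints 4
    }
}
```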
lineComment
- text representing a comment
protected DataFormat computeFormat(CompositionContext ctx)
Description copied from class: AbstractReader
Determines the data format for the source, for use with the ReadSource operator. If an implementation supports schema discovery, it must be performed in this method.
computeFormat in class AbstractReader
ctx
- the composition context for the current invocation
of AbstractReader.compose(CompositionContext)
Copyright © 2016 Actian Corporation. All rights reserved.