public interface DataFormat

A DataFormat object provides the information necessary for reading and writing external data, converting it to and from records in a dataflow graph. Many formats are predefined in the library; an implementation is only required if a new format needs to be defined. Normally, it is not necessary to work directly with formats. Instead, operators are provided which hide the DataFormat object and present a view more appropriate to the specific format. Examples of this technique are the ReadDelimitedText and WriteDelimitedText operators.
See Also: ReadSource, WriteSink

Modifier and Type | Interface and Description |
---|---|
static interface | DataFormat.DataFormatter - A formatter for converting record data to binary or text format. |
static interface | DataFormat.DataParser - A parser for record data in binary or text format. |

Modifier and Type | Method and Description |
---|---|
DataFormat.DataParser | createParser(ParsingOptions options) - Create a new parser for the format using the specified parsing options. |
DataFormat.DataFormatter | createWriter(FormattingOptions options) - Create a new writer for the format using the specified formatting options. |
FileMetadata | getMetadata() - Gets the metadata associated with the format. |
RecordTokenType | getType() - Gets the record type associated with the format. |
boolean | isSplittable() - Indicates if the format supports parsing of subsections of a file. |
FileMetadata | readMetadata(FileClient fileClient, ByteSource source) - Reads the metadata associated with the format. |
void | setMetadata(FileMetadata metadata) - Sets the metadata associated with the format. |
void | writeMetadata(FileMetadata metadata, FileClient fileClient, ByteSink target) - Writes the provided metadata associated with the format. |

RecordTokenType getType()
For many formats, this may be derived from a schema object describing the format layout.

FileMetadata getMetadata()

void setMetadata(FileMetadata metadata)

FileMetadata readMetadata(FileClient fileClient, ByteSource source)
fileClient - client used to read file
source - location of the file

void writeMetadata(FileMetadata metadata, FileClient fileClient, ByteSink target)
metadata - the metadata to write
fileClient - client used to write file
target - location of the file

DataFormat.DataParser createParser(ParsingOptions options)
options - parsing options to use

DataFormat.DataFormatter createWriter(FormattingOptions options)
options - formatting options to use

boolean isSplittable()
A format should return true only if it can, at least in some situations, support parsing a subsection of a file. If a format requires reading the entire file, it must return false.
If a format is not splittable, a file in the format cannot be parsed in parallel; however, individual files can still be parsed independently in parallel, as when reading the contents of a directory or using a file globbing pattern.
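
To illustrate what parsing a subsection of a file can involve, here is a minimal, self-contained sketch. It is not the library's implementation and uses none of its classes: SplitDemo and readSplit are hypothetical names, and the data is assumed to be newline-delimited text. The ownership rule shown (a split handles every record that starts inside its byte range, and a reader that begins mid-record skips ahead to the next line start) is one common convention for splittable text formats, adopted here as an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the DataFlow library.
public class SplitDemo {

    /**
     * Returns the records that START within [start, end) of a
     * newline-delimited buffer. A record that starts inside the range
     * is read to its end, even if it extends past 'end'.
     */
    public static List<String> readSplit(byte[] data, int start, int end) {
        List<String> records = new ArrayList<>();
        int pos = start;

        // If the split begins in the middle of a record, that record belongs
        // to the previous split: skip forward to the next line start.
        if (pos > 0 && data[pos - 1] != '\n') {
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step past the newline
        }

        // Read every record whose first byte lies before 'end'.
        while (pos < end && pos < data.length) {
            int lineEnd = pos;
            while (lineEnd < data.length && data[lineEnd] != '\n') {
                lineEnd++;
            }
            records.add(new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8));
            pos = lineEnd + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbravo\ncharlie\ndelta\n".getBytes(StandardCharsets.UTF_8);
        int mid = data.length / 2; // byte 13 falls inside "charlie"

        // Two independent readers, each given half of the byte range.
        System.out.println(readSplit(data, 0, mid));
        System.out.println(readSplit(data, mid, data.length));
    }
}
```

Running main prints [alpha, bravo, charlie] followed by [delta]: the record that straddles the split boundary is produced exactly once, by the split in which it starts. A format that cannot locate record boundaries from an arbitrary offset (for example, one that depends on state accumulated from the beginning of the file) has no equivalent of the skip step, which is the situation in which false is the appropriate return value.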
Returns: true if the format supports parsing only a portion of the file, false otherwise

Copyright © 2020 Actian Corporation. All rights reserved.