public interface DataFormat
A DataFormat object provides
the necessary information for reading and
writing external data, converting it to
and from records in a dataflow graph.
Many formats are predefined in the library;
an implementation is only required if a new
format needs to be defined. Normally, it
is not necessary to work directly with
formats. Instead, operators are provided
which hide the DataFormat object
and present a view more appropriate to
the specific format. Examples of this technique are
the ReadDelimitedText and WriteDelimitedText
operators.
See Also: ReadSource, WriteSink

| Modifier and Type | Interface and Description |
|---|---|
| static interface | DataFormat.DataFormatter: A formatter for converting record data to binary or text format. |
| static interface | DataFormat.DataParser: A parser for record data in binary or text format. |

| Modifier and Type | Method and Description |
|---|---|
| DataFormat.DataParser | createParser(ParsingOptions options): Create a new parser for the format using the specified parsing options. |
| DataFormat.DataFormatter | createWriter(FormattingOptions options): Create a new writer for the format using the specified formatting options. |
| FileMetadata | getMetadata(): Gets the metadata associated with the format. |
| RecordTokenType | getType(): Gets the record type associated with the format. |
| boolean | isSplittable(): Indicates if the format supports parsing of subsections of a file. |
| FileMetadata | readMetadata(FileClient fileClient, ByteSource source): Reads the metadata associated with the format. |
| void | setMetadata(FileMetadata metadata): Sets the metadata associated with the format. |
| void | writeMetadata(FileMetadata metadata, FileClient fileClient, ByteSink target): Writes the provided metadata associated with the format. |

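The createParser and createWriter entries above make a format act as a factory for a matched parser and writer. The following is a minimal sketch of that pairing. It is not the library's API: every type here (SketchParser, SketchFormatter, SketchFormat, SketchDelimitedFormat) is a hypothetical stand-in, the parse and format methods are assumptions because this page does not list the members of DataParser or DataFormatter, and the ParsingOptions and FormattingOptions arguments are left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical stand-in for DataFormat.DataParser; the real methods are not shown on this page. */
interface SketchParser {
    List<String> parse(String line);
}

/** Hypothetical stand-in for DataFormat.DataFormatter; the real methods are not shown on this page. */
interface SketchFormatter {
    String format(List<String> fields);
}

/** Hypothetical stand-in for the factory half of DataFormat (options arguments omitted). */
interface SketchFormat {
    SketchParser createParser();

    SketchFormatter createWriter();
}

/** A toy delimited-text format. It performs no quoting or escaping. */
final class SketchDelimitedFormat implements SketchFormat {
    private final String separator;

    SketchDelimitedFormat(String separator) {
        if (separator == null || separator.isEmpty()) {
            throw new IllegalArgumentException("separator must be non-empty");
        }
        this.separator = separator;
    }

    @Override
    public SketchParser createParser() {
        // Pattern.quote treats the separator literally; the -1 limit keeps trailing empty fields.
        return line -> new ArrayList<>(Arrays.asList(line.split(Pattern.quote(separator), -1)));
    }

    @Override
    public SketchFormatter createWriter() {
        // The writer uses the same separator, so parse followed by format reproduces the line.
        return fields -> String.join(separator, fields);
    }
}
```

Because both objects come from the same format instance, they share its configuration (here, the separator), which is what allows data written by one to be read back by the other.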
RecordTokenType getType()
For many formats, this may be derived from a schema object describing the format layout.
FileMetadata getMetadata()
void setMetadata(FileMetadata metadata)
FileMetadata readMetadata(FileClient fileClient, ByteSource source)
Parameters:
fileClient - client used to read the file
source - location of the files
void writeMetadata(FileMetadata metadata, FileClient fileClient, ByteSink target)
Parameters:
metadata - the metadata to write
fileClient - client used to write the file
target - location of the files
DataFormat.DataParser createParser(ParsingOptions options)
Parameters:
options - parsing options to use
DataFormat.DataFormatter createWriter(FormattingOptions options)
Parameters:
options - formatting options to use
boolean isSplittable()
A format should only return true if it can,
at least in some situations, support parsing of subsections of a file.
If a format requires reading the entire file, it
must return false.
If a format is not splittable, a file in the format cannot be parsed in parallel; however, individual files can still be parsed independently in parallel, as when reading the contents of a directory or using a file globbing pattern.
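To illustrate how a caller might act on isSplittable(), here is a minimal sketch of a work planner. All names in it (SplittableCheck, FileSplit, SplitPlanner, targetSplitSize) are hypothetical and are not part of the documented API. The sketch assumes a splittable file can be divided into fixed-size byte ranges; real formats typically also need to align ranges to record boundaries, which is not shown.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in exposing only the isSplittable() method described above. */
interface SplittableCheck {
    boolean isSplittable();
}

/** Hypothetical value type: the byte range [offset, offset + length) within one file. */
final class FileSplit {
    final String path;
    final long offset;
    final long length;

    FileSplit(String path, long offset, long length) {
        this.path = path;
        this.offset = offset;
        this.length = length;
    }
}

/** Hypothetical planner; not part of the documented API. */
final class SplitPlanner {
    private SplitPlanner() {
    }

    /**
     * Returns the units of work for one file: several byte ranges when the
     * format is splittable, otherwise a single range covering the whole file.
     */
    static List<FileSplit> plan(SplittableCheck format, String path, long fileSize, long targetSplitSize) {
        if (targetSplitSize <= 0) {
            throw new IllegalArgumentException("targetSplitSize must be positive");
        }
        List<FileSplit> splits = new ArrayList<>();
        if (!format.isSplittable() || fileSize <= targetSplitSize) {
            // The whole file is handled by a single parser.
            splits.add(new FileSplit(path, 0, fileSize));
            return splits;
        }
        for (long offset = 0; offset < fileSize; offset += targetSplitSize) {
            // The final range may be shorter than targetSplitSize.
            long length = Math.min(targetSplitSize, fileSize - offset);
            splits.add(new FileSplit(path, offset, length));
        }
        return splits;
    }
}
```

Calling plan once per file in a directory still yields at least one unit of work per file, so files in a non-splittable format can be processed in parallel with each other even though no single file is divided.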
Returns:
true if the format supports parsing only a portion of the file, false otherwise

Copyright © 2016 Actian Corporation. All rights reserved.