public class DelimitedTextFormat extends Object implements DataFormat
A data format used by ReadDelimitedText and WriteDelimitedText to access data stored as delimited text.

Nested classes inherited from interface DataFormat: DataFormat.DataFormatter, DataFormat.DataParser

| Constructor and Description |
|---|
| DelimitedTextFormat(RecordTextSchema<?> schema, FieldDelimiterSettings delimiters, CharsetEncoding encoding): Create a data format for accessing delimited text data. |
| DelimitedTextFormat(RecordTextSchema<?> schema, FieldDelimiterSettings delimiters, CharsetEncoding encoding, FileMetadata metadata): Create a data format for accessing delimited text data. |
| DelimitedTextFormat(RecordTextSchema<?> schema, FieldDelimiterSettings delimiters, CharsetEncoding encoding, FileMetadata metadata, boolean hasHeader, String lineComment, int skipCount): Create a data format for accessing delimited text data. |
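
The hasHeader, lineComment, and skipCount arguments of the longest constructor all concern lines that are not data records. The following is a minimal, self-contained sketch of one way such options could interact. It does not use the DataFlow API: the class `PreambleSketch` and its method `dataLines` are hypothetical. This page gives no description for skipCount, so the sketch assumes it means a number of leading lines to discard, and it assumes an order of operations (skip, then comments, then header) that the real parser may not share.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the DataFlow API.
class PreambleSketch {

    /**
     * Returns the lines that remain as data records.
     * Assumed order: discard the first skipCount lines, then discard
     * comment lines, then discard the first surviving line if hasHeader is set.
     */
    static List<String> dataLines(List<String> lines, boolean hasHeader,
                                  String lineComment, int skipCount) {
        List<String> out = new ArrayList<>();
        boolean headerPending = hasHeader;
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            if (i < skipCount) {
                continue; // assumed meaning of skipCount: leading lines to drop
            }
            if (lineComment != null && !lineComment.isEmpty()
                    && line.startsWith(lineComment)) {
                continue; // line comment
            }
            if (headerPending) {
                headerPending = false;
                continue; // first surviving record is the header
            }
            out.add(line);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "exported by some tool",
                "# id,name",
                "id,name",
                "1,alice",
                "# trailing note",
                "2,bob");
        // prints [1,alice, 2,bob]
        System.out.println(dataLines(lines, true, "#", 1));
    }
}
```

With hasHeader set to false, lineComment set to null, and skipCount set to 0, the sketch returns every line unchanged, which corresponds to the simplest reading of the three-argument constructor.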
| Modifier and Type | Method and Description |
|---|---|
| DataFormat.DataParser | createParser(ParsingOptions options): Create a new parser for the format using the specified parsing options. |
| DataFormat.DataFormatter | createWriter(FormattingOptions options): Create a new writer for the format using the specified formatting options. |
| FileMetadata | getMetadata(): Gets the metadata associated with the format. |
| RecordTokenType | getType(): Gets the record type associated with the format. |
| boolean | isSplittable(): Indicates if the format supports parsing of subsections of a file. |
| FileMetadata | readMetadata(FileClient fileClient, ByteSource source): Reads the metadata associated with the format. |
| void | setMetadata(FileMetadata metadata): Sets the metadata associated with the format. |
| void | writeMetadata(FileMetadata metadata, FileClient fileClient, ByteSink target): Writes the provided metadata associated with the format. |

public DelimitedTextFormat(RecordTextSchema<?> schema, FieldDelimiterSettings delimiters, CharsetEncoding encoding)
Create a data format for accessing delimited text data.
Parameters:
schema - the schema to use for records. This provides field names as well as formatting information for field values.
delimiters - a description of the delimiters used in the text
encoding - character set definition for data encoding in the text

public DelimitedTextFormat(RecordTextSchema<?> schema, FieldDelimiterSettings delimiters, CharsetEncoding encoding, FileMetadata metadata)
Create a data format for accessing delimited text data.
Parameters:
schema - the schema to use for records. This provides field names as well as formatting information for field values.
delimiters - a description of the delimiters used in the text
encoding - character set definition for data encoding in the text
metadata - the metadata associated with the data

public DelimitedTextFormat(RecordTextSchema<?> schema, FieldDelimiterSettings delimiters, CharsetEncoding encoding, FileMetadata metadata, boolean hasHeader, String lineComment, int skipCount)
Create a data format for accessing delimited text data.
Parameters:
schema - the schema to use for records. This provides field names as well as formatting information for field values.
delimiters - a description of the delimiters used in the text
encoding - character set definition for data encoding
metadata - the metadata associated with the data
hasHeader - indicates whether the first record is a header
lineComment - characters used to indicate line comments in the text
skipCount -

public boolean isSplittable()
Indicates if the format supports parsing of subsections of a file. A format should only return true if it can, at least in some situations, support this sort of parsing. If a format requires reading the entire file, it must return false.
If a format is not splittable, a file in the format cannot be parsed in parallel; however, individual files can still be parsed independently in parallel, as when reading the contents of a directory or using a file globbing pattern.
Generally, delimited text data is splittable. However, if any fields contain the record separator in their delimited value, it may not be.
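
To make the caveat about embedded record separators concrete, here is a small, self-contained sketch. It is not how DataFlow decides splittability; the class `SplitBoundarySketch` and both of its methods are hypothetical, and the sketch assumes `"` as the field quote and `\n` as the record separator. It compares a naive split alignment (jump to the character after the next newline) with record starts found by a quote-aware scan of the whole text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the DataFlow API.
class SplitBoundarySketch {

    /** Naive alignment: the position just after the first newline at or past offset. */
    static int naiveRecordStart(String text, int offset) {
        int nl = text.indexOf('\n', offset);
        return nl < 0 ? text.length() : nl + 1;
    }

    /** Record starts found by scanning from the beginning while tracking quotes. */
    static List<Integer> trueRecordStarts(String text) {
        List<Integer> starts = new ArrayList<>();
        starts.add(0);
        boolean inQuotes = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes; // a doubled quote toggles twice, leaving the state unchanged
            } else if (c == '\n' && !inQuotes && i + 1 < text.length()) {
                starts.add(i + 1);    // newline outside quotes ends a record
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        // Two records; the first has a quoted field containing a newline.
        String text = "1,\"line one\nline two\"\n2,plain\n";
        System.out.println(trueRecordStarts(text));      // prints [0, 22]
        System.out.println(naiveRecordStart(text, 5));   // prints 12, which is inside the quoted field
        System.out.println(naiveRecordStart(text, 15));  // prints 22, a real record start
    }
}
```

A split that begins at offset 5 aligns to position 12, in the middle of the first record, because only a scan from the start of the file knows that the newline at position 11 is quoted. This is the situation in which delimited text may not be safely splittable.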
Specified by:
isSplittable in interface DataFormat
Returns:
true if the format supports parsing only a portion of the file, false otherwise

public RecordTokenType getType()
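Description copied from interface: DataFormat
Gets the record type associated with the format. For many formats, this may be derived from a schema object describing the format layout.
Specified by:
getType in interface DataFormat

public FileMetadata getMetadata()
Description copied from interface: DataFormat
Gets the metadata associated with the format.
Specified by:
getMetadata in interface DataFormat

public void setMetadata(FileMetadata metadata)
Description copied from interface: DataFormat
Sets the metadata associated with the format.
Specified by:
setMetadata in interface DataFormat

public FileMetadata readMetadata(FileClient fileClient, ByteSource source)
Description copied from interface: DataFormat
Reads the metadata associated with the format.
Specified by:
readMetadata in interface DataFormat
Parameters:
fileClient - client used to read the file
source - location of the files

public void writeMetadata(FileMetadata metadata, FileClient fileClient, ByteSink target)
Description copied from interface: DataFormat
Writes the provided metadata associated with the format.
Specified by:
writeMetadata in interface DataFormat
Parameters:
metadata - the metadata to write
fileClient - client used to write the file

public DataFormat.DataParser createParser(ParsingOptions options)
Description copied from interface: DataFormat
Create a new parser for the format using the specified parsing options.
Specified by:
createParser in interface DataFormat
Parameters:
options - parsing options to use

public DataFormat.DataFormatter createWriter(FormattingOptions options)
Description copied from interface: DataFormat
Create a new writer for the format using the specified formatting options.
Specified by:
createWriter in interface DataFormat
Parameters:
options - formatting options to use

Copyright © 2016 Actian Corporation. All rights reserved.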