public class ARFFDataFormat extends Object implements DataFormat
Nested classes/interfaces inherited from interface DataFormat: DataFormat.DataFormatter, DataFormat.DataParser
Constructor and Description |
---|
ARFFDataFormat(ARFFMode mode, TextRecord schema, CharsetEncoding encoding, char fieldDelimiter): Create an ARFF data format for reading an ARFF file. |
ARFFDataFormat(ARFFMode mode, TextRecord schema, CharsetEncoding encoding, String recordSeparator, char fieldDelimiter, String relationName, List<String> comments): Create an ARFF data format for writing to an ARFF file. |
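The write constructor takes a relation name, a list of comments, a record separator, and a field delimiter. To make those parameters concrete, the following is a minimal plain-Java sketch of how such values typically appear in ARFF text: comments as `%` lines, then `@relation`, `@attribute`, and `@data`. It does not call this library. The class `ArffHeaderSketch` and its methods are hypothetical, and where ARFFDataFormat actually places comments or applies the record separator is an assumption, since the documentation above does not say.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a plain-Java picture of ARFF text layout, not the library's writer. */
final class ArffHeaderSketch {

    /**
     * Wraps a value in the delimiter character when it contains whitespace;
     * otherwise returns it unchanged. Delimiter characters embedded in the
     * value are NOT escaped in this sketch.
     */
    static String quoteIfNeeded(String value, char fieldDelimiter) {
        boolean hasSpace = value.chars().anyMatch(Character::isWhitespace);
        return hasSpace ? fieldDelimiter + value + fieldDelimiter : value;
    }

    /**
     * Builds header lines: one '%' line per comment, then @relation,
     * one @attribute line per {name, type} pair, and finally @data.
     */
    static List<String> headerLines(String relationName, List<String> comments,
                                    List<String[]> attributes, char fieldDelimiter) {
        List<String> lines = new ArrayList<>();
        for (String comment : comments) {
            lines.add("% " + comment);
        }
        lines.add("@relation " + quoteIfNeeded(relationName, fieldDelimiter));
        for (String[] attribute : attributes) {
            lines.add("@attribute " + quoteIfNeeded(attribute[0], fieldDelimiter)
                    + " " + attribute[1]);
        }
        lines.add("@data");
        return lines;
    }

    public static void main(String[] args) {
        // Assumption: the record separator is applied after every line.
        String recordSeparator = System.lineSeparator();
        List<String> header = headerLines(
                "weather report",
                List.of("generated for illustration"),
                List.of(new String[] {"outlook", "string"},
                        new String[] {"temperature", "numeric"}),
                '\'');
        System.out.print(String.join(recordSeparator, header) + recordSeparator);
        // One data record: only the value containing a space is quoted.
        System.out.println(String.join(",", quoteIfNeeded("partly cloudy", '\''), "71"));
    }
}
```

In this sketch the relation name `weather report` and the value `partly cloudy` are quoted because they contain spaces, while `outlook` and `71` are written as-is.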
Modifier and Type | Method and Description |
---|---|
DataFormat.DataParser | createParser(ParsingOptions options): Create a new parser for the format using the specified parsing options. |
DataFormat.DataFormatter | createWriter(FormattingOptions options): Create a new writer for the format using the specified formatting options. |
FileMetadata | getMetadata(): Gets the metadata associated with the format. |
RecordTokenType | getType(): Gets the record type associated with the format. |
boolean | isSplittable(): Indicates if the format supports parsing of subsections of a file. |
FileMetadata | readMetadata(FileClient fileClient, ByteSource source): Reads the metadata associated with the format. |
void | setMetadata(FileMetadata metadata): Sets the metadata associated with the format. |
void | writeMetadata(FileMetadata metadata, FileClient fileClient, ByteSink target): Writes the provided metadata associated with the format. |
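Both constructors take an ARFFMode that "indicates the mode for representing record data", but this page does not list the mode's values. The ARFF file format itself defines two row representations, dense and sparse, so the sketch below assumes the mode selects between them. That is an assumption, not something stated here. The class `ArffRowModes` is hypothetical plain Java and does not use this library.

```java
import java.util.StringJoiner;

/** Illustrative only: the two row layouts defined by the ARFF file format. */
final class ArffRowModes {

    /** Dense row: every value, in attribute order, separated by commas. */
    static String dense(String[] values) {
        return String.join(",", values);
    }

    /**
     * Sparse row: only values other than "0" are written, each prefixed by
     * its zero-based attribute index, inside braces.
     */
    static String sparse(String[] values) {
        StringJoiner joiner = new StringJoiner(", ", "{", "}");
        for (int i = 0; i < values.length; i++) {
            if (!"0".equals(values[i])) {
                joiner.add(i + " " + values[i]);
            }
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        String[] row = {"0", "7", "0", "3"};
        System.out.println(dense(row));  // prints 0,7,0,3
        System.out.println(sparse(row)); // prints {1 7, 3 3}
    }
}
```

The sparse form pays off when most values in a record are zero, since omitted positions are implied.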
public ARFFDataFormat(ARFFMode mode, TextRecord schema, CharsetEncoding encoding, String recordSeparator, char fieldDelimiter, String relationName, List<String> comments)
Create an ARFF data format for writing to an ARFF file.
Parameters:
mode - indicates the mode for representing record data
schema - the schema to use for records. This provides field names (and positions) as well as the formatting of field values.
encoding - character set definition for data encoding
recordSeparator - characters to use as a separator between records
fieldDelimiter - the character used to delimit field values when the value contains spaces
relationName - name of the relation
comments - comments to associate with data written using this format

public ARFFDataFormat(ARFFMode mode, TextRecord schema, CharsetEncoding encoding, char fieldDelimiter)
Create an ARFF data format for reading an ARFF file.
Parameters:
mode - indicates the mode for representing record data
schema - the schema to use for records. This provides field names (and positions) as well as the formatting of field values. This schema also controls the output type of the reader.
encoding - character set definition for data encoding
fieldDelimiter - the character used to delimit field values when the value contains spaces

public RecordTokenType getType()
Description copied from interface: DataFormat
Gets the record type associated with the format. For many formats, this may be derived from a schema object describing the format layout.
Specified by: getType in interface DataFormat

public FileMetadata getMetadata()
Description copied from interface: DataFormat
Gets the metadata associated with the format.
Specified by: getMetadata in interface DataFormat

public void setMetadata(FileMetadata metadata)
Description copied from interface: DataFormat
Sets the metadata associated with the format.
Specified by: setMetadata in interface DataFormat

public FileMetadata readMetadata(FileClient fileClient, ByteSource source)
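Description copied from interface: DataFormat
Reads the metadata associated with the format.
Specified by: readMetadata in interface DataFormat
Parameters:
fileClient - client used to read the file
source - location of the files

public void writeMetadata(FileMetadata metadata, FileClient fileClient, ByteSink target)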
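Description copied from interface: DataFormat
Writes the provided metadata associated with the format.
Specified by: writeMetadata in interface DataFormat
Parameters:
metadata - the metadata to write
fileClient - client used to write the file

public DataFormat.DataParser createParser(ParsingOptions options)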
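Description copied from interface: DataFormat
Create a new parser for the format using the specified parsing options.
Specified by: createParser in interface DataFormat
Parameters:
options - parsing options to use

public DataFormat.DataFormatter createWriter(FormattingOptions options)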
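Description copied from interface: DataFormat
Create a new writer for the format using the specified formatting options.
Specified by: createWriter in interface DataFormat
Parameters:
options - formatting options to use

public boolean isSplittable()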
Indicates if the format supports parsing of subsections of a file. A format should only return true if it can, at least in some situations, support this sort of parsing. If a format requires reading the entire file, it must return false.
If a format is not splittable, a single file in the format cannot be parsed in parallel; however, individual files can still be parsed independently in parallel, as when reading the contents of a directory or using a file globbing pattern.
ARFF is a splittable format.
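To show what parsing a subsection of a file can mean in practice, here is a minimal plain-Java sketch of one common convention: each reader is given a byte range and emits only the records whose first byte falls inside it, so adjacent ranges together cover every record exactly once. This is not the library's algorithm. The class `SplitReadSketch` is hypothetical, it assumes newline-terminated records, and it ignores both the ARFF header and quoted values that contain newlines.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: range-based record reading over newline-separated data. */
final class SplitReadSketch {

    /** Returns the records whose first byte lies in the half-open range [start, end). */
    static List<String> recordsInRange(byte[] data, int start, int end) {
        int pos = start;
        if (pos > 0) {
            // Step back one byte and scan to the next newline. If the range
            // begins exactly at a record start, the scan stops immediately;
            // otherwise the partial record belongs to the previous range.
            pos--;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // first byte after the newline
        }
        List<String> records = new ArrayList<>();
        while (pos < end && pos < data.length) {
            int lineEnd = pos;
            while (lineEnd < data.length && data[lineEnd] != '\n') {
                lineEnd++;
            }
            // A record that starts inside the range is read to its end,
            // even if it runs past 'end'.
            records.add(new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8));
            pos = lineEnd + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "a,1\nb,2\nc,3\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(recordsInRange(data, 0, 6));  // prints [a,1, b,2]
        System.out.println(recordsInRange(data, 6, 12)); // prints [c,3]
    }
}
```

The second range starts in the middle of `b,2`, so that record is left to the first reader. This ownership rule is what lets two readers work on the same file without coordinating.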
Specified by: isSplittable in interface DataFormat
Returns: true if the format supports parsing only a portion of the file, false otherwise

Copyright © 2020 Actian Corporation. All rights reserved.