- java.lang.Object
-
- com.pervasive.datarush.operators.io.textfile.ARFFDataFormat
-
- All Implemented Interfaces:
DataFormat
public class ARFFDataFormat extends Object implements DataFormat
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface com.pervasive.datarush.operators.io.DataFormat
DataFormat.DataFormatter, DataFormat.DataParser
-
-
Constructor Summary
Constructors
ARFFDataFormat(ARFFMode mode, TextRecord schema, CharsetEncoding encoding, char fieldDelimiter)
Create an ARFF data format for reading an ARFF file.
ARFFDataFormat(ARFFMode mode, TextRecord schema, CharsetEncoding encoding, String recordSeparator, char fieldDelimiter, String relationName, List<String> comments)
Create an ARFF data format for writing to an ARFF file.
-
Method Summary
DataFormat.DataParser createParser(ParsingOptions options)
Create a new parser for the format using the specified parsing options.
DataFormat.DataFormatter createWriter(FormattingOptions options)
Create a new writer for the format using the specified formatting options.
FileMetadata getMetadata()
Gets the metadata associated with the format.
RecordTokenType getType()
Gets the record type associated with the format.
boolean isSplittable()
Indicates if the format supports parsing of subsections of a file.
FileMetadata readMetadata(FileClient fileClient, ByteSource source)
Reads the metadata associated with the format.
void setMetadata(FileMetadata metadata)
Sets the metadata associated with the format.
void writeMetadata(FileMetadata metadata, FileClient fileClient, ByteSink target)
Writes the provided metadata associated with the format.
-
-
-
Constructor Detail
-
ARFFDataFormat
public ARFFDataFormat(ARFFMode mode, TextRecord schema, CharsetEncoding encoding, String recordSeparator, char fieldDelimiter, String relationName, List<String> comments)
Create an ARFF data format for writing to an ARFF file.
- Parameters:
mode - indicates the mode for representing record data
schema - the schema to use for records. This provides field names (and positions) as well as the formatting of field values.
encoding - character set definition for data encoding
recordSeparator - characters to use as a separator between records
fieldDelimiter - the character used to delimit field values when the value contains spaces
relationName - name of the relation
comments - comments to associate with data written using this format
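To make the writer parameters concrete, the following is a minimal sketch of how relationName, comments, recordSeparator and fieldDelimiter map onto the text of a dense ARFF file. It does not use the DataRush API; the class ArffWriteSketch and its methods are hypothetical, and the attribute types are passed as plain strings rather than derived from a TextRecord schema.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of the DataRush library.
class ArffWriteSketch {

    // Wrap a value in the delimiter character only when it contains a space.
    static String quoteIfNeeded(String value, char fieldDelimiter) {
        return value.indexOf(' ') >= 0 ? fieldDelimiter + value + fieldDelimiter : value;
    }

    // Render comments, relation name, attribute declarations and dense data rows.
    static String render(String relationName, List<String> comments,
                         Map<String, String> attributes, List<String[]> rows,
                         String recordSeparator, char fieldDelimiter) {
        StringBuilder out = new StringBuilder();
        for (String comment : comments) {
            out.append("% ").append(comment).append(recordSeparator);
        }
        out.append("@relation ").append(quoteIfNeeded(relationName, fieldDelimiter))
           .append(recordSeparator);
        for (Map.Entry<String, String> attribute : attributes.entrySet()) {
            out.append("@attribute ").append(quoteIfNeeded(attribute.getKey(), fieldDelimiter))
               .append(' ').append(attribute.getValue()).append(recordSeparator);
        }
        out.append("@data").append(recordSeparator);
        for (String[] row : rows) {
            List<String> cells = new ArrayList<>();
            for (String value : row) {
                cells.add(quoteIfNeeded(value, fieldDelimiter));
            }
            out.append(String.join(",", cells)).append(recordSeparator);
        }
        return out.toString();
    }

    // Small fixed example: one comment, two attributes, one record.
    static String demo() {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("name", "string");
        attributes.put("age", "numeric");
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"Ada Lovelace", "36"});
        List<String> comments = new ArrayList<>();
        comments.add("demo");
        return render("people", comments, attributes, rows, "\n", '\'');
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

The sketch assumes values never contain the delimiter character itself; a real writer would also need an escaping rule for that case.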
-
ARFFDataFormat
public ARFFDataFormat(ARFFMode mode, TextRecord schema, CharsetEncoding encoding, char fieldDelimiter)
Create an ARFF data format for reading an ARFF file.
- Parameters:
mode - indicates the mode for representing record data
schema - the schema to use for records. This provides field names (and positions) as well as the formatting of field values. This schema also controls the output type of the reader.
encoding - character set definition for data encoding
fieldDelimiter - the character used to delimit field values when the value contains spaces
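As an illustration of the role fieldDelimiter plays when reading, the sketch below splits one dense ARFF data row into field values, treating commas inside a delimited value as part of that value. It is a hypothetical stand-alone helper, not the parser that createParser returns, and it assumes no whitespace around commas and no escaped delimiter characters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the DataRush library.
class ArffRowSplitSketch {

    // Split a comma-separated row, honoring values wrapped in fieldDelimiter.
    static List<String> splitRow(String line, char fieldDelimiter) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean quoted = false;
        for (int i = 0; i < line.length(); i++) {
            char ch = line.charAt(i);
            if (ch == fieldDelimiter) {
                quoted = !quoted;               // enter or leave a delimited value
            } else if (ch == ',' && !quoted) {
                fields.add(current.toString()); // a comma outside delimiters ends the field
                current.setLength(0);
            } else {
                current.append(ch);
            }
        }
        fields.add(current.toString());         // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(splitRow("'Ada Lovelace',36,London", '\''));
    }
}
```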
-
-
Method Detail
-
getType
public RecordTokenType getType()
Description copied from interface: DataFormat
Gets the record type associated with the format. Records produced by the associated parser or consumed by the associated formatter will be of this type.
For many formats, this may be derived from a schema object describing the format layout.
- Specified by:
getType in interface DataFormat
- Returns:
- the format's record type
-
getMetadata
public FileMetadata getMetadata()
Description copied from interface: DataFormat
Gets the metadata associated with the format. Records produced by the associated parser or consumed by the associated formatter will use this metadata.
- Specified by:
getMetadata in interface DataFormat
- Returns:
- the format's metadata
-
setMetadata
public void setMetadata(FileMetadata metadata)
Description copied from interface: DataFormat
Sets the metadata associated with the format.
- Specified by:
setMetadata in interface DataFormat
-
readMetadata
public FileMetadata readMetadata(FileClient fileClient, ByteSource source)
Description copied from interface: DataFormat
Reads the metadata associated with the format.
- Specified by:
readMetadata in interface DataFormat
- Parameters:
fileClient - client used to read file
source - location of the files
-
writeMetadata
public void writeMetadata(FileMetadata metadata, FileClient fileClient, ByteSink target)
Description copied from interface: DataFormat
Writes the provided metadata associated with the format.
- Specified by:
writeMetadata in interface DataFormat
- Parameters:
metadata - the metadata to write
fileClient - client used to write file
-
createParser
public DataFormat.DataParser createParser(ParsingOptions options)
Description copied from interface: DataFormat
Create a new parser for the format using the specified parsing options.
- Specified by:
createParser in interface DataFormat
- Parameters:
options - parsing options to use
- Returns:
- a new parser for reading external data
-
createWriter
public DataFormat.DataFormatter createWriter(FormattingOptions options)
Description copied from interface: DataFormat
Create a new writer for the format using the specified formatting options.
- Specified by:
createWriter in interface DataFormat
- Parameters:
options - formatting options to use
- Returns:
- a new formatter for writing external data
-
isSplittable
public boolean isSplittable()
Indicates if the format supports parsing of subsections of a file.
A format should only return true if it can, at least in some situations, support this sort of parsing. If a format requires reading the entire file, it must return false.
If a format is not splittable, a file in the format cannot be parsed in parallel; however, individual files can still be parsed independently in parallel, as when reading the contents of a directory or using a file globbing pattern.
ARFF is a splittable format.
- Specified by:
isSplittable in interface DataFormat
- Returns:
true if the format supports parsing only a portion of the file, false otherwise
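To show what parsing a subsection of a file can look like for a record-per-line format, here is a generic sketch of split-based reading. It illustrates the common technique of aligning each byte range to record boundaries and is not DataRush's actual implementation; the class SplitReadSketch, the newline record separator and the omission of the ARFF header are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the DataRush library.
class SplitReadSketch {

    // Return the records whose first byte lies in the range [start, end).
    static List<String> readSplit(byte[] data, int start, int end) {
        int pos = start;
        if (start > 0) {
            // Back up one byte and skip to just past the next newline. A record that
            // begins before 'start' belongs to the previous split, so it is dropped here.
            pos = start - 1;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++;
        }
        List<String> records = new ArrayList<>();
        while (pos < end && pos < data.length) {
            int eol = pos;
            // A record that starts inside the split is read to its end,
            // even if that end lies beyond 'end'.
            while (eol < data.length && data[eol] != '\n') {
                eol++;
            }
            records.add(new String(data, pos, eol - pos, StandardCharsets.UTF_8));
            pos = eol + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "a,1\nbb,2\nccc,3\n".getBytes(StandardCharsets.UTF_8);
        // Two splits that cut the second record in the middle.
        System.out.println(readSplit(data, 0, 7));
        System.out.println(readSplit(data, 7, data.length));
    }
}
```

Because each record is assigned to exactly one split (the one containing its first byte), the splits can be processed by independent workers without duplicating or losing records.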
-
-