public class ReadARFF extends AbstractTextReader
ARFF can be parsed in parallel under "optimistic" assumptions: namely, that parse splits do not fall in the middle of a delimited field value, somewhere before an escaped record separator. This is assumed by default but can be disabled, at a cost in scalability and performance.
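To illustrate why the optimistic assumption matters, here is a minimal, self-contained sketch. It is not the library's splitting algorithm: the class `SplitSketch`, its methods, and the sample data are hypothetical, single quotes are the only field delimiter considered, and escape sequences are ignored. The optimistic variant treats the first newline after a split offset as a record boundary; the pessimistic variant scans from the start of the data so that newlines inside quoted values are skipped.

```java
/** Hypothetical illustration only; not part of the ReadARFF API. */
class SplitSketch {
    // Three records; the second has a newline inside a quoted field value.
    static final String DATA = "1,'a'\n2,'line one\nline two'\n3,'c'\n";

    /** Optimistic: treat the first '\n' at or after splitOffset as a record boundary. */
    static int optimisticRecordStart(String data, int splitOffset) {
        int nl = data.indexOf('\n', splitOffset);
        return nl < 0 ? data.length() : nl + 1;
    }

    /** Pessimistic: scan from the beginning, tracking quote state, so quoted newlines are skipped. */
    static int pessimisticRecordStart(String data, int splitOffset) {
        boolean inQuotes = false;
        for (int i = 0; i < data.length(); i++) {
            char c = data.charAt(i);
            if (c == '\'') {
                inQuotes = !inQuotes;
            } else if (c == '\n' && !inQuotes && i >= splitOffset) {
                return i + 1;
            }
        }
        return data.length();
    }

    public static void main(String[] args) {
        int split = 10; // lands inside the quoted value of the second record
        System.out.println("optimistic:  " + optimisticRecordStart(DATA, split));
        System.out.println("pessimistic: " + pessimisticRecordStart(DATA, split));
    }
}
```

With the split at offset 10, the optimistic scan resumes at index 18, which is still inside the quoted value, while the pessimistic scan resumes at index 28, the true start of the third record. The pessimistic scan has to read everything before the split, which is one way to picture the scalability cost the description mentions.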
Inherited fields: encodingProps, options, output

| Constructor and Description |
|---|
| ReadARFF() - Reads an empty source with default settings. |
| ReadARFF(ByteSource source) - Reads the specified data source using default options. |
| ReadARFF(Path path) - Reads the file specified by the path as ARFF data using default options. |
| ReadARFF(String pattern) - Reads all paths matching the specified pattern as ARFF data using default options. |

| Modifier and Type | Method and Description |
|---|---|
| protected DataFormat | computeFormat(CompositionContext ctx) - Determines the data format for the source. |
| ARFFAnalyzer.Analysis | discoverMetadata(FileClient ctx) - Gets the metadata for the currently configured data source. |
| char | getFieldDelimiter() - Get the configured field delimiter property value. |
| void | setFieldDelimiter(char fieldDelimiter) - Set the field delimiter to use when reading the file contents. |
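The metadata that discoverMetadata reports presumably comes from the ARFF header. The structure of ARFFAnalyzer.Analysis is not documented on this page, so the following is only a rough, self-contained sketch of reading attribute declarations from a header. The class `ArffHeaderSketch` and its method are hypothetical and do not reproduce the library's analyzer.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical illustration only; not the library's ARFFAnalyzer. */
class ArffHeaderSketch {
    /** Returns attribute name -> declared type, in declaration order, reading up to @data. */
    static Map<String, String> parseAttributes(String arff) {
        Map<String, String> attrs = new LinkedHashMap<>();
        for (String raw : arff.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("%")) {
                continue; // blank line or comment
            }
            String lower = line.toLowerCase(Locale.ROOT);
            if (lower.startsWith("@data")) {
                break; // header ends where the data section begins
            }
            if (!lower.startsWith("@attribute")) {
                continue; // e.g. the @relation line
            }
            String rest = line.substring("@attribute".length()).trim();
            String name;
            String type;
            if (rest.startsWith("'") || rest.startsWith("\"")) {
                // Quoted attribute name, which may contain spaces.
                char quote = rest.charAt(0);
                int end = rest.indexOf(quote, 1);
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated attribute name: " + line);
                }
                name = rest.substring(1, end);
                type = rest.substring(end + 1).trim();
            } else {
                String[] parts = rest.split("\\s+", 2);
                name = parts[0];
                type = parts.length > 1 ? parts[1].trim() : "";
            }
            attrs.put(name, type);
        }
        return attrs;
    }

    public static void main(String[] args) {
        String sample = "% weather data\n@relation weather\n"
                + "@attribute outlook {sunny,overcast,rainy}\n"
                + "@attribute 'temp F' numeric\n@data\nsunny,85\n";
        System.out.println(parseAttributes(sample));
    }
}
```

The sketch keeps declared types as raw strings; mapping them to a DataFormat, as computeFormat would need to, is outside its scope.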
Inherited methods, grouped by declaring type:

getCharset, getCharsetName, getDecodeBuffer, getEncoding, getErrorAction, getReplacement, setCharset, setCharsetName, setDecodeBuffer, setEncoding, setErrorAction, setReplacement

compose, getExtraFieldAction, getFieldErrorAction, getFieldLengthThreshold, getIncludeSourceInfo, getMissingFieldAction, getOutput, getParseOptions, getPessimisticSplitting, getReadBuffer, getReadOnClient, getRecordWarningThreshold, getSelectedFields, getSource, getSplitOptions, getUseMetadata, setExtraFieldAction, setFieldErrorAction, setFieldLengthThreshold, setIncludeSourceInfo, setMissingFieldAction, setParseErrorAction, setParseOptions, setPessimisticSplitting, setReadBuffer, setReadOnClient, setRecordWarningThreshold, setSelectedFields, setSelectedFields, setSource, setSource, setSource, setSplitOptions, setUseMetadata

disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

disableParallelism, getInputPorts, getOutputPorts

public ReadARFF()
See Also: AbstractReader.setSource(ByteSource)

public ReadARFF(String pattern)
Parameters: pattern - a path-matching pattern
See Also: FileClient.matchPaths(String)

public ReadARFF(Path path)
Parameters: path - the path to read

public ReadARFF(ByteSource source)
Parameters: source - the data source to read

public void setFieldDelimiter(char fieldDelimiter)
Parameters: fieldDelimiter - character value to use as the field delimiter

public char getFieldDelimiter()
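As a rough picture of what a configurable field delimiter means for a single data row, the following self-contained sketch splits one line on a caller-supplied delimiter character while leaving delimiters inside quoted values alone. It is an assumption-laden illustration, not the reader's parser: `FieldDelimiterSketch` and `splitRecord` are hypothetical names, and escape sequences and missing-value markers are not handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the parser used by ReadARFF. */
class FieldDelimiterSketch {
    /** Splits one data row on fieldDelimiter, ignoring delimiters inside single or double quotes. */
    static List<String> splitRecord(String line, char fieldDelimiter) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0; // 0 means "not inside a quoted value"
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (quote != 0) {
                if (c == quote) {
                    quote = 0; // closing quote; the quote character itself is dropped
                } else {
                    current.append(c);
                }
            } else if (c == '\'' || c == '"') {
                quote = c; // opening quote
            } else if (c == fieldDelimiter) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // last field has no trailing delimiter
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(splitRecord("sunny,'hot, humid',85", ','));
        System.out.println(splitRecord("sunny;85", ';'));
    }
}
```

Passing the delimiter as a parameter mirrors the idea behind setFieldDelimiter(char): the same row-splitting logic can serve files that separate values with something other than a comma.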
public ARFFAnalyzer.Analysis discoverMetadata(FileClient ctx)
Parameters: ctx - the authorization context to use for accessing the file

protected DataFormat computeFormat(CompositionContext ctx)
Description copied from class: AbstractReader
... ReadSource operator. If an implementation supports schema discovery, it must be performed in this method.
computeFormat in class AbstractReader
Parameters: ctx - the composition context for the current invocation of AbstractReader.compose(CompositionContext)

Copyright © 2016 Actian Corporation. All rights reserved.