public class ReadARFF extends AbstractTextReader
ARFF data can be parsed in parallel under "optimistic" assumptions: namely, that no parse split falls in the middle of a delimited field value at a point before an escaped record separator. This is assumed by default; the assumption can be disabled, at the cost of reduced scalability and performance.
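The optimistic assumption can be illustrated with a small standalone sketch. It does not use the DataFlow API and is not the library's implementation: the class OptimisticSplitSketch, its methods, and the sample data are hypothetical, and it assumes a single quote as the field delimiter and a newline as the record separator. The optimistic search resumes after the next newline following the split offset; the pessimistic search tracks delimiter state from the start of the data, which is one simple way to stay correct at the cost of a sequential scan.

```java
/** Standalone illustration only; hypothetical names, not part of the DataFlow API. */
final class OptimisticSplitSketch {

    /**
     * Optimistic: assume the split offset is not inside a delimited value that
     * contains a record separator, so the next record starts right after the
     * next newline.
     */
    static int optimisticRecordStart(String data, int offset) {
        int nl = data.indexOf('\n', offset);
        return nl < 0 ? data.length() : nl + 1;
    }

    /**
     * Pessimistic: scan from the beginning, tracking whether we are inside a
     * delimited value, so that newlines inside delimiters are not mistaken for
     * record separators. Escaped delimiters are not handled in this sketch.
     */
    static int pessimisticRecordStart(String data, int offset, char delimiter) {
        boolean inside = false;
        for (int i = 0; i < data.length(); i++) {
            char c = data.charAt(i);
            if (c == delimiter) {
                inside = !inside;
            } else if (c == '\n' && !inside && i >= offset) {
                return i + 1;
            }
        }
        return data.length();
    }

    public static void main(String[] args) {
        // The second record contains a newline inside its delimited value.
        String data = "1,'plain'\n2,'two\nlines'\n3,'last'\n";

        // Split inside 'plain': no separator inside the value, both agree.
        System.out.println(optimisticRecordStart(data, 5));         // 10
        System.out.println(pessimisticRecordStart(data, 5, '\''));  // 10

        // Split inside 'two\nlines', before the embedded newline: they differ.
        System.out.println(optimisticRecordStart(data, 14));        // 17 (mid-value)
        System.out.println(pessimisticRecordStart(data, 14, '\'')); // 24 (true record start)
    }
}
```

In the second case the optimistic search lands in the middle of a field value, which is exactly the situation the default assumption rules out. The inherited getPessimisticSplitting and setPessimisticSplitting methods listed on this page suggest the property that switches the behavior, though the page does not say so explicitly.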
encodingProps
options, output
Constructor | Description |
---|---|
ReadARFF() | Reads an empty source with default settings. |
ReadARFF(ByteSource source) | Reads the specified data source using default options. |
ReadARFF(Path path) | Reads the file specified by the path as ARFF data using default options. |
ReadARFF(String pattern) | Reads all paths matching the specified pattern as ARFF data using default options. |
Modifier and Type | Method | Description |
---|---|---|
protected DataFormat | computeFormat(CompositionContext ctx) | Determines the data format for the source. |
ARFFAnalyzer.Analysis | discoverMetadata(FileClient ctx) | Gets the metadata for the currently configured data source. |
char | getFieldDelimiter() | Get the configured field delimiter property value. |
void | setFieldDelimiter(char fieldDelimiter) | Set the field delimiter to use when reading the file contents. |
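As a rough illustration of why the field delimiter setting matters, the following standalone sketch splits one ARFF data row. It does not call ReadARFF: the class FieldDelimiterSketch and its splitRow method are hypothetical, and it assumes that the field delimiter is the quote character enclosing a value and that a comma separates fields.

```java
import java.util.ArrayList;
import java.util.List;

/** Standalone illustration only; hypothetical names, not part of the DataFlow API. */
final class FieldDelimiterSketch {

    /**
     * Splits a data row on commas, ignoring commas that appear between a pair
     * of field delimiters. The delimiters themselves are dropped from the
     * values. Escaped delimiters are not handled in this sketch.
     */
    static List<String> splitRow(String row, char fieldDelimiter) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inside = false;
        for (int i = 0; i < row.length(); i++) {
            char c = row.charAt(i);
            if (c == fieldDelimiter) {
                inside = !inside;
            } else if (c == ',' && !inside) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        String row = "5.1,'Iris, setosa',3.5";
        // Matching delimiter: the quoted comma stays inside one value.
        System.out.println(splitRow(row, '\'').size()); // 3
        // Mismatched delimiter: the quoted comma splits the value in two.
        System.out.println(splitRow(row, '"').size());  // 4
    }
}
```

With a delimiter that does not match the data, the quoted value is broken apart and the row yields an extra field, which is the kind of mismatch a configurable delimiter is meant to avoid.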
getCharset, getCharsetName, getDecodeBuffer, getEncoding, getErrorAction, getReplacement, setCharset, setCharsetName, setDecodeBuffer, setEncoding, setErrorAction, setReplacement
compose, getExtraFieldAction, getFieldErrorAction, getFieldLengthThreshold, getIncludeSourceInfo, getMissingFieldAction, getOutput, getParseOptions, getPessimisticSplitting, getReadBuffer, getReadOnClient, getRecordWarningThreshold, getSelectedFields, getSource, getSplitOptions, getUseMetadata, setExtraFieldAction, setFieldErrorAction, setFieldLengthThreshold, setIncludeSourceInfo, setMissingFieldAction, setParseErrorAction, setParseOptions, setPessimisticSplitting, setReadBuffer, setReadOnClient, setRecordWarningThreshold, setSelectedFields, setSelectedFields, setSource, setSource, setSource, setSplitOptions, setUseMetadata
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
disableParallelism, getInputPorts, getOutputPorts
public ReadARFF()
See also: AbstractReader.setSource(ByteSource)
public ReadARFF(String pattern)
pattern - a path-matching pattern
See also: FileClient.matchPaths(String)
public ReadARFF(Path path)
path - the path to read
public ReadARFF(ByteSource source)
source - the data source to read
public void setFieldDelimiter(char fieldDelimiter)
fieldDelimiter - character value to use as the field delimiter
public char getFieldDelimiter()
public ARFFAnalyzer.Analysis discoverMetadata(FileClient ctx)
ctx - the authorization context to use for accessing the file
protected DataFormat computeFormat(CompositionContext ctx)
Description from AbstractReader: determines the data format for the source, for use by the ReadSource operator. If an implementation supports schema discovery, it must be performed in this method.
computeFormat in class AbstractReader
ctx - the composition context for the current invocation of AbstractReader.compose(CompositionContext)
Copyright © 2021 Actian Corporation. All rights reserved.