Module datarush.library
Class AbstractTextWriter
- java.lang.Object
  - com.pervasive.datarush.operators.AbstractLogicalOperator
    - com.pervasive.datarush.operators.CompositeOperator
      - com.pervasive.datarush.operators.io.AbstractWriter
        - com.pervasive.datarush.operators.io.textfile.AbstractTextWriter
- All Implemented Interfaces:
LogicalOperator, RecordSinkOperator, SinkOperator<RecordPort>
- Direct Known Subclasses:
WriteARFF, WriteDelimitedText, WriteFixedText
public abstract class AbstractTextWriter extends AbstractWriter
A generic writer of text data representing a stream of records. It adds the basic attributes any such writer needs beyond those of a standard byte-oriented writer, namely information on how to encode characters into bytes.
- See Also:
CharsetEncoding
-
-
Field Summary
Fields
protected CharsetEncoding encodingProps
Container for character encoding related attributes
-
Fields inherited from class com.pervasive.datarush.operators.io.AbstractWriter
input, options
-
-
Constructor Summary
Constructors
protected AbstractTextWriter()
Writes text to an empty target with default settings.
protected AbstractTextWriter(boolean provideDoneSignal)
Writes text to an empty target with default settings, optionally providing a port for signaling completion of the write.
protected AbstractTextWriter(Path path, WriteMode mode)
Writes text to the specified path in the given mode, using default settings.
protected AbstractTextWriter(ByteSink target, WriteMode mode)
Writes text to the specified target sink in the given mode.
protected AbstractTextWriter(String path, WriteMode mode)
Writes text to the specified path in the given mode, using default settings.
-
Method Summary
Methods
Charset getCharset()
Gets the character set used by the data sink.
String getCharsetName()
Gets the name of the character set used by the data sink.
int getEncodeBuffer()
Gets the size of the buffer, in bytes, used to encode character data.
CharsetEncoding getEncoding()
Gets the character set encoding properties.
CodingErrorAction getErrorAction()
Gets the configured encoding error action.
String getReplacement()
Gets the text used by the replacement error action.
void setCharset(Charset charset)
Sets the character set used by the data sink.
void setCharsetName(String charsetName)
Sets the character set used by the data sink.
void setEncodeBuffer(int size)
Sets the size of the buffer, in bytes, used to encode character data.
void setEncoding(CharsetEncoding settings)
Sets the properties that control character set encoding.
void setErrorAction(CodingErrorAction errorAction)
Sets the encoding error action.
void setReplacement(String replacement)
Sets the error policy to replacement with the specified string.
-
Methods inherited from class com.pervasive.datarush.operators.io.AbstractWriter
compose, computeFormat, getFormatOptions, getInput, getMode, getSaveMetadata, getTarget, getWriteBuffer, getWriteOnClient, getWriteSingleSink, isIgnoreSortOrder, setFormatOptions, setIgnoreSortOrder, setMode, setSaveMetadata, setTarget, setTarget, setTarget, setWriteBuffer, setWriteOnClient, setWriteSingleSink
-
Methods inherited from class com.pervasive.datarush.operators.AbstractLogicalOperator
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface com.pervasive.datarush.operators.LogicalOperator
disableParallelism, getInputPorts, getOutputPorts
-
-
-
-
Field Detail
-
encodingProps
protected final CharsetEncoding encodingProps
Container for character encoding related attributes
-
-
Constructor Detail
-
AbstractTextWriter
protected AbstractTextWriter()
Writes text to an empty target with default settings. The target must be set before execution or an error will be raised.
- See Also:
AbstractWriter.setTarget(ByteSink)
-
AbstractTextWriter
protected AbstractTextWriter(boolean provideDoneSignal)
Writes text to an empty target with default settings, optionally providing a port for signaling completion of the write. The target must be set before execution or an error will be raised.
- Parameters:
provideDoneSignal - indicates whether a done signal port should be created
- See Also:
AbstractWriter.setTarget(ByteSink)
-
AbstractTextWriter
protected AbstractTextWriter(String path, WriteMode mode)
Writes text to the specified path in the given mode, using default settings.
If the writer is parallelized, the path is interpreted as a directory in which each partition writes a fragment of the entire input stream. Otherwise, it is interpreted as the file to write.
- Parameters:
path - the path to which to write
mode - how to handle existing files
-
AbstractTextWriter
protected AbstractTextWriter(Path path, WriteMode mode)
Writes text to the specified path in the given mode, using default settings.
If the writer is parallelized, the path is interpreted as a directory in which each partition writes a fragment of the entire input stream. Otherwise, it is interpreted as the file to write.
- Parameters:
path - the path to which to write
mode - how to handle existing files
-
AbstractTextWriter
protected AbstractTextWriter(ByteSink target, WriteMode mode)
Writes text to the specified target sink in the given mode.
The writer can only be parallelized if the sink is fragmentable. In that case, each partition is written as an independent sink. Otherwise, the writer runs non-parallel.
- Parameters:
target - the sink to which to write
mode - how to handle an existing sink
-
-
Method Detail
-
getEncodeBuffer
public int getEncodeBuffer()
Gets the size of the buffer, in bytes, used to encode character data.
- Returns:
the encoding buffer size
-
setEncodeBuffer
public void setEncodeBuffer(int size)
Sets the size of the buffer, in bytes, used to encode character data. By default, the size is derived automatically from the character set and the write buffer size.
- Parameters:
size - the encoding buffer size to use
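To illustrate what an encode buffer of a fixed byte size does, here is a minimal sketch that uses only the JDK's `CharsetEncoder`. It is not DataRush code: the class `EncodeBufferDemo`, the method `encodeChunked`, and the minimum-size guard are all hypothetical, and how AbstractTextWriter actually derives or uses its buffer is not stated in this page.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;

class EncodeBufferDemo {
    /** Encodes text through a reusable byte buffer of the given size (hypothetical helper). */
    static byte[] encodeChunked(String text, Charset charset, int bufferSize) {
        CharsetEncoder encoder = charset.newEncoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);
        // Assumed guard: the buffer must hold at least one encoded surrogate pair,
        // otherwise the loop below could overflow without making progress.
        int minSize = 2 * (int) Math.ceil(encoder.maxBytesPerChar());
        if (bufferSize < minSize) {
            throw new IllegalArgumentException("buffer must be at least " + minSize + " bytes");
        }
        CharBuffer in = CharBuffer.wrap(text);
        ByteBuffer buffer = ByteBuffer.allocate(bufferSize);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        CoderResult result;
        do {
            // Fill the buffer, then hand its contents to the sink and reuse it.
            result = encoder.encode(in, buffer, true);
            drain(buffer, sink);
        } while (result.isOverflow());
        do {
            // Flush any bytes the encoder is still holding back.
            result = encoder.flush(buffer);
            drain(buffer, sink);
        } while (result.isOverflow());
        return sink.toByteArray();
    }

    /** Copies the filled part of the buffer to the sink and clears the buffer. */
    private static void drain(ByteBuffer buffer, ByteArrayOutputStream sink) {
        buffer.flip();
        sink.write(buffer.array(), 0, buffer.limit());
        buffer.clear();
    }
}
```

In this sketch a smaller buffer only means more round trips to the sink; the encoded bytes are the same for any valid size.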
-
getEncoding
public CharsetEncoding getEncoding()
Gets the character set encoding properties.
- Returns:
properties used for character set encoding
-
setEncoding
public void setEncoding(CharsetEncoding settings)
Sets the properties that control character set encoding.
- Parameters:
settings - character set encoding properties
-
getCharset
public Charset getCharset()
Gets the character set used by the data sink.
- Returns:
the character set of the target
-
setCharset
public void setCharset(Charset charset)
Sets the character set used by the data sink. By default, ISO-8859-1 is used.
- Parameters:
charset - the character set to use
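Because the default is ISO-8859-1, the choice of character set changes the bytes that reach the sink. The following sketch uses only JDK classes to compare encoded sizes. `CharsetSizeDemo` and `encodedLength` are hypothetical names introduced for illustration, and the code does not call the DataRush API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetSizeDemo {
    /** Number of bytes the text occupies when encoded with the given charset. */
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café", four characters
        // ISO-8859-1 uses one byte per character.
        System.out.println(encodedLength(text, StandardCharsets.ISO_8859_1)); // prints 4
        // UTF-8 needs two bytes for U+00E9.
        System.out.println(encodedLength(text, StandardCharsets.UTF_8));      // prints 5
    }
}
```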
-
getCharsetName
public String getCharsetName()
Gets the name of the character set used by the data sink.
- Returns:
the name of the character set of the target
-
setCharsetName
public void setCharsetName(String charsetName)
Sets the character set used by the data sink.
- Parameters:
charsetName - name of the character set
- Throws:
InvalidPropertyValueException - if the named character set is not supported
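A caller may want to validate a charset name before passing it to this setter. The sketch below does that with the JDK's `Charset.isSupported`. The class `CharsetNameCheck` and the method `isUsable` are hypothetical, and whether DataRush performs the same check internally is an assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

class CharsetNameCheck {
    /** Returns true if the JVM can provide a charset with this name. */
    static boolean isUsable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // The name itself is syntactically illegal (for example, it contains a space).
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isUsable("UTF-8"));           // prints true
        System.out.println(isUsable("no-such-charset")); // prints false
    }
}
```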
-
getErrorAction
public CodingErrorAction getErrorAction()
Gets the configured encoding error action.
- Returns:
encoding error action
-
setErrorAction
public void setErrorAction(CodingErrorAction errorAction)
Sets the encoding error action. The error action determines how errors are handled when encoding the input data into the configured character set. The default action is to replace the faulty data with a replacement character.
- Parameters:
errorAction - encoding error action
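The three standard `CodingErrorAction` values behave as follows in the JDK's own `CharsetEncoder`. This is a sketch of the JDK semantics only: `ErrorActionDemo` and `encode` are hypothetical names, and it is an assumption that the writer maps each action to equivalent behavior.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

class ErrorActionDemo {
    /** Encodes text, applying the same action to malformed and unmappable input. */
    static byte[] encode(String text, Charset charset, CodingErrorAction action) {
        try {
            ByteBuffer out = charset.newEncoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action)
                .encode(CharBuffer.wrap(text));
            byte[] bytes = new byte[out.remaining()];
            out.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            // Raised only when the action is REPORT and an error is encountered.
            throw new UncheckedIOException(e);
        }
    }
}
```

Encoding `"a\u20acb"` (the euro sign has no ISO-8859-1 mapping) with ISO-8859-1 yields the bytes of `a?b` under `REPLACE`, the bytes of `ab` under `IGNORE`, and an exception under `REPORT`.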
-
getReplacement
public String getReplacement()
Gets the text used by the replacement error action. This value is only used if the error action is to replace.
- Returns:
replacement text
-
setReplacement
public void setReplacement(String replacement)
Sets the error policy to replacement with the specified string. By default, "?" is used.
- Parameters:
replacement - replacement value to use for encoding errors
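For comparison, the JDK offers a similar setting through `CharsetEncoder.replaceWith`. The sketch below substitutes a caller-chosen string for unencodable characters; `ReplacementDemo` and `encodeLatin1` are hypothetical names. Note that the JDK limits the replacement to the encoder's maximum bytes per character (one byte for ISO-8859-1). Whether this writer imposes a comparable limit is not stated here.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class ReplacementDemo {
    /** Encodes to ISO-8859-1, writing the replacement for each unencodable character. */
    static byte[] encodeLatin1(String text, String replacement) {
        Charset charset = StandardCharsets.ISO_8859_1;
        CharsetEncoder encoder = charset.newEncoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE)
            // Throws IllegalArgumentException if the replacement is empty or too long.
            .replaceWith(replacement.getBytes(charset));
        try {
            ByteBuffer out = encoder.encode(CharBuffer.wrap(text));
            byte[] bytes = new byte[out.remaining()];
            out.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            // Not expected with REPLACE, but the checked exception must be handled.
            throw new UncheckedIOException(e);
        }
    }
}
```

With these settings, `encodeLatin1("a\u20acb", "_")` produces the bytes of `a_b`.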
-
-