public final class WriteStagingDataset extends AbstractWriter
It is generally best to perform parallel writes, creating multiple files. Because the staging format is not splittable, a single file cannot be divided among several readers; writing multiple files is what allows reads to be parallelized effectively.
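The effect of file count on read parallelism can be illustrated with a small standalone sketch. It does not use the library; the class and method names are hypothetical, and it assumes each non-splittable file is consumed by exactly one reader.

```java
// Standalone illustration; not part of the library API.
final class ReadParallelismSketch {

    // Assumption: a non-splittable file is read by exactly one reader,
    // so the number of files caps the number of readers doing useful work.
    static int effectiveReaders(int fileCount, int availableReaders) {
        if (fileCount < 0 || availableReaders < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        return Math.min(fileCount, availableReaders);
    }

    public static void main(String[] args) {
        // One large file: only one of eight readers has work.
        System.out.println(effectiveReaders(1, 8));
        // Eight files: all eight readers can proceed in parallel.
        System.out.println(effectiveReaders(8, 8));
    }
}
```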
WriteStagingDataset
input, options
Constructor and Description |
---|
WriteStagingDataset()
Writes to an empty target with default settings.
|
WriteStagingDataset(boolean provideDoneSignal)
Writes to an empty target with default settings,
optionally providing a port for signaling
completion of the write.
|
WriteStagingDataset(ByteSink target,
WriteMode mode)
Writes to the specified target sink using default
options.
|
WriteStagingDataset(Path path,
WriteMode mode)
Writes to the specified path in the
given mode, using default options.
|
WriteStagingDataset(String path,
WriteMode mode)
Writes to the specified path in the
given mode, using default options.
|
Modifier and Type | Method and Description |
---|---|
protected DataFormat |
computeFormat(CompositionContext ctx)
Determines the data format for the target.
|
DatasetMetadata |
discoverMetadata(FileClient client)
Gets the metadata for the currently configured data target.
|
int |
getBlockSize()
Gets the block size, in rows, used for encoding data.
|
DatasetStorageFormat |
getFormat()
Gets the data set format used to store data.
|
void |
setBlockSize(int blockSize)
Sets the block size, in rows, used for encoding data.
|
void |
setFormat(DatasetStorageFormat format)
Sets the data set format used to store data.
|
compose, getFormatOptions, getInput, getMode, getSaveMetadata, getTarget, getWriteBuffer, getWriteOnClient, getWriteSingleSink, isIgnoreSortOrder, setFormatOptions, setIgnoreSortOrder, setMode, setSaveMetadata, setTarget, setTarget, setTarget, setWriteBuffer, setWriteOnClient, setWriteSingleSink
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
disableParallelism, getInputPorts, getOutputPorts
public WriteStagingDataset()
See also: AbstractWriter.setTarget(ByteSink)
public WriteStagingDataset(boolean provideDoneSignal)
provideDoneSignal - indicates whether a done signal port should be created
See also: AbstractWriter.setTarget(ByteSink)
public WriteStagingDataset(String path, WriteMode mode)
If the writer is parallelized, this is interpreted as a directory in which each partition will write a fragment of the entire input stream. Otherwise, it is interpreted as the file to write.
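The two interpretations of the path can be sketched as follows. This is a standalone model, not library code: it uses the standard `java.nio.file` types rather than the library's own path type, the fragment naming scheme (`part-N`) is an assumption made only for illustration, and a partition count of 1 is treated as the non-parallel case.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Standalone illustration; not part of the library API.
final class TargetLayoutSketch {

    // Returns the outputs a writer would produce for the given target.
    // Non-parallel: the target itself is the file.
    // Parallel: the target is a directory holding one fragment per partition.
    static List<Path> plannedOutputs(String target, int partitions) {
        if (partitions < 1) {
            throw new IllegalArgumentException("partitions must be >= 1");
        }
        Path base = Paths.get(target);
        List<Path> outputs = new ArrayList<>();
        if (partitions == 1) {
            outputs.add(base);
        } else {
            for (int i = 0; i < partitions; i++) {
                // "part-N" is a hypothetical fragment name, not the real scheme.
                outputs.add(base.resolve("part-" + i));
            }
        }
        return outputs;
    }

    public static void main(String[] args) {
        System.out.println(plannedOutputs("staging/orders", 1));
        System.out.println(plannedOutputs("staging/orders", 3));
    }
}
```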
path - the path to which to write
mode - how to handle existing files
public WriteStagingDataset(Path path, WriteMode mode)
If the writer is parallelized, this is interpreted as a directory in which each partition will write a fragment of the entire input stream. Otherwise, it is interpreted as the file to write.
path - the path to which to write
mode - how to handle existing files
public WriteStagingDataset(ByteSink target, WriteMode mode)
The writer can only be parallelized if the sink is fragmentable. In this case, each partition will be written as an independent sink. Otherwise, the writer will run non-parallel.
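The fragmentability rule can be modeled in a few lines. This is a minimal sketch under stated assumptions: `SketchSink` and `writerPartitions` are hypothetical stand-ins invented for the example and are not the library's `ByteSink` API.

```java
// Standalone illustration; not part of the library API.
final class SinkParallelismSketch {

    // Hypothetical stand-in for a sink that may or may not be fragmentable.
    interface SketchSink {
        boolean isFragmentable();
    }

    // A fragmentable sink gets one independent fragment per partition;
    // any other sink forces the writer to run non-parallel.
    static int writerPartitions(SketchSink sink, int requestedPartitions) {
        if (requestedPartitions < 1) {
            throw new IllegalArgumentException("requestedPartitions must be >= 1");
        }
        return sink.isFragmentable() ? requestedPartitions : 1;
    }

    public static void main(String[] args) {
        SketchSink directoryLike = () -> true;  // can be split into fragments
        SketchSink singleStream = () -> false;  // one indivisible stream
        System.out.println(writerPartitions(directoryLike, 4));
        System.out.println(writerPartitions(singleStream, 4));
    }
}
```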
target - the sink to which to write
mode - how to handle an existing sink
public void setFormat(DatasetStorageFormat format)
Sets the data set format used to store data. Related format: DatasetStorageFormat.COMPACT_ROW.
format - the format to use when writing
public DatasetStorageFormat getFormat()
public int getBlockSize()
public void setBlockSize(int blockSize)
Sets the block size, in rows, used for encoding data. Related format: DatasetStorageFormat.COLUMNAR.
Using larger values may increase efficiency, but at a cost of using more memory.
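The memory side of this trade-off can be estimated with simple arithmetic. The sketch below is not library code: it assumes the writer holds roughly one full block of rows in memory before encoding it, and the average row width is a figure you would supply from your own data.

```java
// Standalone illustration; not part of the library API.
final class BlockMemorySketch {

    // Rough estimate of bytes buffered per block, assuming one full block
    // of rows is held in memory before it is encoded.
    static long approxBlockBytes(int blockSizeRows, int avgRowBytes) {
        if (blockSizeRows < 0 || avgRowBytes < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        // Widen to long before multiplying to avoid int overflow.
        return (long) blockSizeRows * avgRowBytes;
    }

    public static void main(String[] args) {
        int avgRowBytes = 64; // hypothetical average row width
        System.out.println(approxBlockBytes(1_000, avgRowBytes));
        System.out.println(approxBlockBytes(100_000, avgRowBytes));
    }
}
```

Under these assumptions, raising the block size from 1,000 to 100,000 rows raises the per-block buffer by the same factor of 100, and each parallel partition would hold its own buffer.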
blockSize - the size of encoded data blocks
protected DataFormat computeFormat(CompositionContext ctx)
From AbstractWriter: determines the data format for the target (see the WriteSink operator). If an implementation supports schema discovery, it must be performed in this method.
computeFormat in class AbstractWriter
ctx - the composition context for the current invocation of AbstractWriter.compose(CompositionContext)
public DatasetMetadata discoverMetadata(FileClient client)
client - the file client
Copyright © 2015 Actian Corporation. All Rights Reserved.