Class BasicByteSink

java.lang.Object
com.pervasive.datarush.operators.io.BasicByteSink
All Implemented Interfaces:
ByteSink

public class BasicByteSink extends Object implements ByteSink
A data source identified by a Path. Typically, these represent writable files. In the case of a fragmented (parallel) write, this would represent the directory containing the individual partition fragment files.
  • Constructor Details

    • BasicByteSink

      public BasicByteSink(String path)
      Creates a data sink for the named path. If used with a parallel write, the path is assumed to be a directory; output files will be written in this directory.

      The last element of the path is used to automatically determine if the target is compressed, and if so, what the format is. If this is not desired, use BasicByteSink(Path, CompressionFormat) instead. In a parallel write using automatic compression as described above, the directory name will have the compressed file suffix removed.

      Parameters:
      path - the file/directory to write
    • BasicByteSink

      public BasicByteSink(Path path)
      Creates a data sink for the named path. If used with a parallel write, the path is assumed to be a directory; output files will be written in this directory.

      The last element of the path is used to automatically determine if the target is compressed, and if so, what the format is. If this is not desired, use BasicByteSink(Path, CompressionFormat) instead. In a parallel write using automatic compression as described above, the directory name will have the compressed file suffix removed.

      Parameters:
      path - the file/directory to write
    • BasicByteSink

      public BasicByteSink(Path path, CompressionFormat format)
      Creates a data sink for the named path. Data will be written using the specified compression format.

      If used with a parallel write, the path is assumed to be a directory; output files will be written in this directory.

      Parameters:
      path - the file/directory to write
      format - the compression format of the sink file(s)
  • Method Details

    • getPath

      public Path getPath()
      Gets the path identifying the byte sink.
      Returns:
      the path
    • getCompression

      public CompressionFormat getCompression()
      Gets the compression format used for writing.
      Returns:
      the requested sink compression format
    • getClient

      public FileClient getClient()
      Returns:
      the client
    • authorize

      public ByteSink authorize(FileClient client)
      Description copied from interface: ByteSink
      Creates a new sink with the same properties, but using the specified authorization.

      If a sink is supposed to be used with a specific authorization context, this method should be called to produce a new sink to use.

      Specified by:
      authorize in interface ByteSink
      Parameters:
      client - the authorization context to use for access
      Returns:
      a sink using the provided authorization context
    • open

      public OutputStream open(WriteMode mode) throws IOException
      Description copied from interface: ByteSink
      Opens the sink for writing in the specified mode. For some sinks, such as one representing the console, the mode may be irrelevant.
      Specified by:
      open in interface ByteSink
      Parameters:
      mode - indicates how to handle an existing source
      Returns:
      a writer for sending bytes to the sink
      Throws:
      IOException - if an I/O error occurs opening the sink
    • openForRead

      public InputStream openForRead() throws IOException
      Description copied from interface: ByteSink
      Opens the sink for reading. This may be needed for appends where there is metadata which must be read. For some sinks, such as one representing standard output, this operation may not be valid. Calls to this method should be guarded with a check of ByteSink.appendsExisting(WriteMode).
      Specified by:
      openForRead in interface ByteSink
      Returns:
      a reader for reading the existing target sink
      Throws:
      IOException - if an I/O error occurs opening the sink
    • supportsRandomAccess

      public boolean supportsRandomAccess()
      Description copied from interface: ByteSink
      Indicates whether the sink supports random access. In some cases, it may be desirable to perform random writes to the sink.
      Specified by:
      supportsRandomAccess in interface ByteSink
      Returns:
      true if the sink supports random access, false otherwise.
      See Also:
    • openChannel

      public FileChannel openChannel() throws IOException
      Description copied from interface: ByteSink
      Opens the sink for random access. Not all sources support random access; check by calling ByteSink.supportsRandomAccess() first.
      Specified by:
      openChannel in interface ByteSink
      Returns:
      a channel for random access to the sink
      Throws:
      IOException - if an I/O error occurs opening the sink
    • isFragmentable

      public boolean isFragmentable()
      Description copied from interface: ByteSink
      Indicates whether the sink can be broken into subordinate sinks for parallel writes.

      Typically, this is only true if the sink represents a path in some hierarchical store, such as a file system.

      Specified by:
      isFragmentable in interface ByteSink
      Returns:
      true if the sink can be "fragmented", false otherwise.
      See Also:
    • fragmentForPartition

      public ByteSink fragmentForPartition(int fragmentId)
      Description copied from interface: ByteSink
      Creates a new subordinate sink. The new sink represents a portion (or fragment) of the complete data set which is written to the master.

      Fragments are used for parallel writes; each logical partition is written to a separate fragment created from the provided source.

      Specified by:
      fragmentForPartition in interface ByteSink
      Parameters:
      fragmentId - a partition identifier
      Returns:
      a new sink for writing the identified partition.
    • cleanForOverwrite

      public void cleanForOverwrite(int partitionCount) throws IOException
      Description copied from interface: ByteSink
      Erases contents of the sink which do not correspond to expected partition fragments.

      This method will be called by one of the nodes during a parallel write in overwrite mode to ensure only the data from the current write is present on completion of the write.

      Specified by:
      cleanForOverwrite in interface ByteSink
      Parameters:
      partitionCount - the number of partitions expected
      Throws:
      IOException - if an error occurs while attempting to delete extraneous files from the target location
    • appendsExisting

      public boolean appendsExisting(WriteMode mode)
      Description copied from interface: ByteSink
      Indicates whether a write in the given mode represents an append to an existing sink.

      This is used to determine whether file metadata needs to be written or updated. It is not always the case that this can be determined solely from the mode. For instance, an append to a non-existent file needs metadata to be written, but an append to an existing file may need to update metadata. Another example is console output, which should always write metadata, even though the sink always exists.

      Specified by:
      appendsExisting in interface ByteSink
      Parameters:
      mode - the intended write mode
      Returns:
      true if the write represents the creation of a new dataset, false otherwise.
    • validate

      public void validate(WriteMode mode, boolean forFragments) throws IOException
      Description copied from interface: ByteSink
      Performs validation of the sink configuration. This checks things such as the existence and accessibility of the sink.

      The caller should indicate how the write will be performed so that appropriate checks can be performed. For instance, if no overwriting is allowed, the target is checked to see if it already exists.

      Specified by:
      validate in interface ByteSink
      Parameters:
      mode - the intended write mode
      forFragments - indicates whether the sink will be fragmented for the write
      Throws:
      IOException - if an I/O error occurs while validating the source
    • resolveChildFragment

      public ByteSink resolveChildFragment(String childPath)
      Creates a new subordinate sink. The new sink represents a portion (or fragment) of the complete data set which is written to the master.

      Fragments are used for parallel writes; each fragment generated will be a relative path from the original target of the sink,

      Parameters:
      the - relative path of the new target
      Returns:
      a new sink for writing the identified child partition.