-
- All Superinterfaces:
InputStreamSupplier
- All Known Implementing Classes:
BasicByteSource, ConcatenatedByteSource, GlobbingByteSource
public interface ByteSource extends InputStreamSupplier
An abstract source of bytes. ByteSource objects represent entities existing outside of a logical graph, such as files and sockets, which can be read as a stream of bytes. These can then be used in conjunction with DataFormat objects to produce records which then flow through the dataflow graph, most commonly to load persisted data from disk.
Generally, it is not necessary to implement or even directly use ByteSource objects. Most read operators provide a more convenient interface which hides the object; see AbstractReader as an example.
By default, sources use OS-level authorization inherited from the execution environment, but can be configured to use more complex authentication mechanisms to provide an authorization context.
-
-
Method Summary
Modifier and Type | Method | Description
ByteSource | authorize(FileClient client) | Creates a new source with the same properties, but using the specified authorization.
SplitIterator | generateSplits(SplitOptions options) | Gets an iterator producing a set of DataSplit objects covering the source.
InputStream | open() | Opens the source for reading.
ByteSource | validate() | Performs validation of the source configuration.
-
-
-
Method Detail
-
authorize
ByteSource authorize(FileClient client)
Creates a new source with the same properties, but using the specified authorization. If a source is supposed to be used with a specific authorization context, this method should be called to produce a new source to use.
- Parameters:
client
- the authorization context to use for access
- Returns:
- a source using the provided authorization context
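The "new source with the same properties" contract can be illustrated with a minimal sketch. `Client` and `PathSource` below are hypothetical stand-ins for FileClient and a ByteSource implementation, not the library's real classes, and the sketch assumes the receiver is left unmodified, which the description suggests but does not state outright.

```java
// Hypothetical stand-in for FileClient; the real type is not shown in this page.
final class Client {
    final String user;

    Client(String user) {
        this.user = user;
    }
}

// Hypothetical stand-in for a ByteSource implementation.
final class PathSource {
    final String path;
    final Client client; // null here means "inherit OS-level authorization"

    PathSource(String path, Client client) {
        this.path = path;
        this.client = client;
    }

    // Mirrors authorize(FileClient): same properties, different authorization.
    // Assumption: the receiver itself is not changed.
    PathSource authorize(Client newClient) {
        return new PathSource(path, newClient);
    }
}

class AuthorizeExample {
    public static void main(String[] args) {
        PathSource plain = new PathSource("/data/input.csv", null);
        PathSource authorized = plain.authorize(new Client("etl-service"));

        System.out.println(plain.client == null);                // true
        System.out.println(authorized.client.user);              // etl-service
        System.out.println(authorized.path.equals(plain.path));  // true
    }
}
```

Under this reading, callers should keep and use the returned source; calling authorize and discarding the result would have no effect.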
-
validate
ByteSource validate() throws IOException
Performs validation of the source configuration. This checks things such as the existence and accessibility of the source. It may also rewrite the source to an equivalent one, performing file glob and directory expansion.
- Returns:
- a valid source equivalent to this one
- Throws:
IOException
- if an I/O error occurs while validating the source
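Because validation may return a rewritten source, callers should continue with the returned object rather than the original. The sketch below shows one way glob expansion could behave, using only `java.nio.file`; `GlobSketch` is a hypothetical stand-in and does not reflect how GlobbingByteSource is actually implemented.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in: either an unexpanded glob or a fixed list of files.
final class GlobSketch {
    final Path dir;
    final String glob;      // null once the glob has been expanded
    final List<Path> files; // empty until the glob has been expanded

    GlobSketch(Path dir, String glob, List<Path> files) {
        this.dir = dir;
        this.glob = glob;
        this.files = files;
    }

    // Mirrors validate(): checks accessibility and returns an equivalent,
    // expanded source. Fails with an IOException if nothing readable matches.
    GlobSketch validate() throws IOException {
        if (glob == null) {
            return this; // already expanded, nothing to rewrite
        }
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path p : stream) {
                if (Files.isReadable(p)) {
                    matches.add(p);
                }
            }
        }
        if (matches.isEmpty()) {
            throw new NoSuchFileException(dir.toString(), null, "no files match " + glob);
        }
        Collections.sort(matches);
        return new GlobSketch(dir, null, matches);
    }
}

class ValidateExample {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("bytesource-sketch");
        Files.write(dir.resolve("a.csv"), new byte[] {1});
        Files.write(dir.resolve("b.csv"), new byte[] {2});
        Files.write(dir.resolve("notes.txt"), new byte[] {3});

        GlobSketch validated =
            new GlobSketch(dir, "*.csv", Collections.emptyList()).validate();
        System.out.println(validated.files.size()); // 2
    }
}
```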
-
open
InputStream open() throws IOException
Opens the source for reading. The caller is responsible for closing the returned InputStream.
- Specified by:
open in interface InputStreamSupplier
- Returns:
- a reader of the bytes from the source
- Throws:
IOException
- if an I/O error occurs while opening the source
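Since the caller owns the returned stream, a try-with-resources block is the usual way to guarantee it is closed. A minimal sketch follows; `OpenableSource` is a hypothetical one-method stand-in for ByteSource so the example compiles on its own, and `InputStream.readAllBytes` requires Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in exposing only the open() method of ByteSource.
interface OpenableSource {
    InputStream open() throws IOException;
}

class OpenExample {
    // Reads the whole source; try-with-resources closes the stream
    // whether reading succeeds or throws.
    static byte[] readAll(OpenableSource source) throws IOException {
        try (InputStream in = source.open()) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        OpenableSource source =
            () -> new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(readAll(source), StandardCharsets.UTF_8)); // hello
    }
}
```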
-
generateSplits
SplitIterator generateSplits(SplitOptions options) throws IOException
Gets an iterator producing a set of DataSplit objects covering the source. The source is split as requested in the specified options, within the source's ability to meet the requirements.
- Parameters:
options
- configurable options to use in generating the splits
- Returns:
- an iterator over valid splits of the source
- Throws:
IOException
- if an I/O error occurs while generating splits
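The idea of splits "covering the source" within the source's abilities can be sketched with plain byte ranges. Everything below is hypothetical: `RangeSplit` stands in for DataSplit, the `targetSize` and `splittable` parameters stand in for whatever SplitOptions and the source actually expose, and a `java.util.Iterator` is used in place of SplitIterator.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for DataSplit: a contiguous byte range of the source.
final class RangeSplit {
    final long offset;
    final long length;

    RangeSplit(long offset, long length) {
        this.offset = offset;
        this.length = length;
    }
}

class SplitSketch {
    // Produces ranges that together cover [0, sourceLength) with no gaps or overlap.
    // A source that cannot be split (for example, a socket) ignores the requested
    // size and yields a single split covering everything.
    static Iterator<RangeSplit> generateSplits(long sourceLength, long targetSize, boolean splittable) {
        List<RangeSplit> splits = new ArrayList<>();
        if (!splittable || targetSize <= 0) {
            splits.add(new RangeSplit(0, sourceLength));
            return splits.iterator();
        }
        for (long offset = 0; offset < sourceLength; offset += targetSize) {
            splits.add(new RangeSplit(offset, Math.min(targetSize, sourceLength - offset)));
        }
        return splits.iterator();
    }

    public static void main(String[] args) {
        Iterator<RangeSplit> it = generateSplits(10, 4, true);
        while (it.hasNext()) {
            RangeSplit s = it.next();
            System.out.println(s.offset + "+" + s.length); // 0+4, then 4+4, then 8+2
        }
    }
}
```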