com.pervasive.datarush.io (Dataflow Library Distribution Project 8.0.0-98 API)

Provides classes and interfaces for performing file-like I/O operations. DataRush uses a model for file access based on the concept of paths, each of which is a generic "file system" location. Paths are represented in a URI-like fashion, having a scheme and a scheme-specific component. This abstraction provides a consistent mechanism for interacting with data, and it can be extended so that existing operators automatically support many different sources of data.
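The scheme / scheme-specific split described above can be illustrated with the standard `java.net.URI` class. This is only an analogy: it does not use the DataRush `Path` or `Paths` classes, and the class name `PathSchemeSketch` and the example location are assumptions made for illustration.

```java
import java.net.URI;

// Illustration only: uses java.net.URI, not the DataRush Path API.
final class PathSchemeSketch {

    // Returns the scheme of a URI-like location string.
    public static String schemeOf(String location) {
        return URI.create(location).getScheme();
    }

    // Returns the part of the location that follows "scheme:".
    public static String schemeSpecificPartOf(String location) {
        return URI.create(location).getSchemeSpecificPart();
    }

    public static void main(String[] args) {
        // Hypothetical location; parsing it does not require it to exist.
        String location = "ftp://example.com/data/input.txt";
        System.out.println("scheme: " + schemeOf(location));
        System.out.println("scheme-specific part: " + schemeSpecificPartOf(location));
    }
}
```

The scheme is what would let a library choose how to reach the data, while the scheme-specific part is interpreted by whatever handles that scheme.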

There are three main components to this model:

Paths, which, as mentioned above, refer to locations where data resides. Paths are merely syntactic entities; path objects do not expose access to the referenced data directly. Paths may be syntactically valid without pointing to an existing location.
File systems, which represent a logical storage location. Every path has an associated file system; a file system may have many paths associated with it.
File system providers, which provide the means of accessing data located on a file system via a path. Every file system has an associated provider; a provider may have multiple file systems associated with it.
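The relationship between the three components can be sketched as a lookup from a path's scheme to the provider that serves it. Everything named here (`ProviderModelSketch`, `SketchProvider`, `register`, `providerFor`) is hypothetical and is not the DataRush `FileSystemProvider` or `FileSystem` interface; `java.net.URI` again stands in for a path.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of "path -> provider" resolution; not the DataRush API.
final class ProviderModelSketch {

    // Stand-in for a provider: the only component that actually touches data.
    interface SketchProvider {
        String describe(URI path);
    }

    private final Map<String, SketchProvider> providersByScheme = new HashMap<>();

    // One provider may be registered for several schemes.
    public void register(String scheme, SketchProvider provider) {
        providersByScheme.put(scheme, provider);
    }

    // Resolves the provider for a path purely from its scheme.
    // The path is only syntax here; nothing checks that the location exists.
    public SketchProvider providerFor(URI path) {
        SketchProvider provider = providersByScheme.get(path.getScheme());
        if (provider == null) {
            throw new IllegalArgumentException("No provider for scheme: " + path.getScheme());
        }
        return provider;
    }

    public static void main(String[] args) {
        ProviderModelSketch registry = new ProviderModelSketch();
        SketchProvider remote = path -> "remote resource " + path.getPath();
        registry.register("ftp", remote);
        registry.register("sftp", remote);

        URI path = URI.create("sftp://example.com/data/input.txt");
        System.out.println(registry.providerFor(path).describe(path));
    }
}
```

Under this kind of design, code written against paths stays unchanged when a new scheme is added; only a new provider needs to be registered.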

Of these items, users only need to be aware of paths and the utility classes surrounding them. File systems and file system providers are implementation-specific and only required when developing support for a new type of path.

Additionally, the DataRush model has the concept of a file split, similar to that found in Hadoop. Splits are used to parallelize processing on files. File system providers should support splitting files where possible so that reads can be parallelized.
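As a rough sketch of the idea, a file can be divided into contiguous byte ranges that independent readers process in parallel. The class `SplitSketch`, the nested `ByteRange` type, and the `split` method are hypothetical names for illustration; they are not the DataRush `FileSplit`, `SplitOptions`, or `SplitIterator` types, and the fixed-size policy is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical fixed-size splitting; not the DataRush split implementation.
final class SplitSketch {

    // A half-open byte range [offset, offset + length) within a file.
    static final class ByteRange {
        final long offset;
        final long length;

        ByteRange(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    // Divides a file of the given length into ranges of at most splitSize bytes.
    public static List<ByteRange> split(long fileLength, long splitSize) {
        if (fileLength < 0 || splitSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and splitSize > 0");
        }
        List<ByteRange> ranges = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += splitSize) {
            // The final range may be shorter than splitSize.
            ranges.add(new ByteRange(offset, Math.min(splitSize, fileLength - offset)));
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (ByteRange range : split(10, 4)) {
            System.out.println(range.offset + " +" + range.length);
        }
    }
}
```

This sketch ignores record boundaries. A reader working on one range would typically also need a rule for records that straddle two ranges, and formats that cannot be read from an arbitrary offset may have to be treated as a single split.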

Interface Summary
Interface	Description
DataSplit	Describes a range of bytes from a data source.
DirectoryFilter	A filter for selecting paths.
FileSystem	Describes the file system identified by a path scheme.
FileSystemProvider	Provides basic operations on paths for a specific path scheme or schemes.
InputStreamSupplier	An abstract factory for input streams.
IOChannelStatsCollector	Gathers statistics for an I/O channel.
IOMonitoringContext	Provides a context for instrumenting I/O operations.
Path	An abstract identifier for a resource.
PathDetails	Describes a `Path` along with its metadata.
PathGlob
SplitInputStream	Interface defining an input data stream that works within the boundaries of a defined split.
SplitIterator	A forward-only iterator over data splits with associated locality information.

Class Summary
Class	Description
BasicPathDetails
BinaryBuilder	A buffer for building variable-length binary valued data.
BinaryReader	Provides extended data access methods on binary data flows.
BuiltinStreamProvider	Provides access to built-in data streams.
CharsetEncoding	Describes the encoding format of character data.
CompressedFileSplit	Describes a range of bytes from a compressed file.
CompressionSplitIterator
FileClient	Provides access to files and directories.
FileSplit	Describes a range of bytes from a file.
FTPFileSystemProvider	Provides access to FTP resources as a file system.
FTPPath
InputStreamSuppliers	Contains various factory methods and utilities for creating `InputStreamSupplier` instances.
LocalFileSystemProvider	Provides access to the local file system.
Paths	A factory for creating `Path` objects.
PortRange
SFTPFileSystemProvider	Provides access to SFTP resources as a file system.
SingleSplitIterator	A split iterator containing a single split.
SplitInputStreamImpl	A wrapper for input streams providing windowing behavior.
SplitOptions	Settings which control the generation of splits on files.
SplitReader	A character based reader for splits.
SplittableCompressedFileSplit	Represents a file split for a compression format that supports splitting.
UnixStyleGlobbing	Provides UNIX-style globbing over paths.
UnixStyleGlobbing.GlobDefinition	Provides information for performing globbing.
URLFileSystemProvider	Provides generic access to URL resources.

Enum Summary
Enum	Description
BasicPathDetails.ObjectType
FTPPath.FTPProtocol
IOChannelOperation	Valid operations on an I/O byte channel, such as a file or network socket.
WriteMode	Enumerates the possible file dispositions for writing.

Exception Summary
Exception	Description
EOFException	An exception indicating end-of-file has been unexpectedly reached on a stream.
FileAlreadyExistsException	An I/O exception indicating the file in question already exists.
