Module datarush.commons
Package com.pervasive.datarush.io
Provides classes and interfaces for performing file-like I/O operations. DataRush
uses a model for file access based on the concept of paths: generic
"file system" locations. Paths are represented in a URI-like fashion, having a
scheme and a scheme-specific component. This abstraction provides a
consistent mechanism for interacting with data, and it can be extended so that
existing operators automatically support many different sources of data.
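To make the "scheme plus scheme-specific component" idea concrete, here is a minimal sketch that uses only the standard `java.net.URI` class, not the DataRush API. The class name `PathSchemeSketch`, the example locations, and the choice to treat a missing scheme as `file` are all assumptions made for illustration.

```java
import java.net.URI;

public class PathSchemeSketch {
    /**
     * Returns the scheme of a URI-like path string.
     * Falling back to "file" when no scheme is present is an assumption
     * of this sketch, not documented DataRush behavior.
     */
    public static String schemeOf(String path) {
        String scheme = URI.create(path).getScheme();
        return scheme == null ? "file" : scheme;
    }

    /** Returns everything after the "scheme:" prefix (or the whole string if there is none). */
    public static String schemeSpecificPartOf(String path) {
        return URI.create(path).getSchemeSpecificPart();
    }

    public static void main(String[] args) {
        // Hypothetical locations; none of them needs to exist for parsing to succeed.
        String[] examples = {
            "ftp://example.org/pub/data.csv",
            "sftp://example.org/home/user/data.csv",
            "/tmp/data.csv"
        };
        for (String p : examples) {
            System.out.println(schemeOf(p) + " | " + schemeSpecificPartOf(p));
        }
    }
}
```

Parsing succeeds whether or not the location exists, which matches the point that a path is a syntactic entity rather than a handle on data.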
There are three main components to this model:
- Paths, which, as mentioned above, refer to locations where data resides. Paths are merely syntactic entities; path objects do not expose access to the referenced data directly. Paths may be syntactically valid without pointing to an existing location.
- File systems, which represent a logical storage location. Every path has an associated file system; a file system may have many paths associated with it.
- File system providers, which provide the means of accessing data located on a file system via a path. Every file system has an associated provider; a provider may have multiple file systems associated with it.
Of these items, users only need to be aware of paths and the utility classes surrounding them. File systems and file system providers are implementation-specific and only required when developing support for a new type of path.
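The division of labor described above can be sketched in plain Java. This is an illustration of the concept only: `SketchPath`, `SketchProvider`, `MemoryProvider`, the registry, and the `mem` scheme are all hypothetical names invented here, and none of them is part of the DataRush API. For brevity, the file system layer is folded into the provider.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class PathModelSketch {
    /** Hypothetical path type: purely syntactic, exposes no data access. */
    public static final class SketchPath {
        private final URI uri;
        public SketchPath(String text) { this.uri = URI.create(text); }
        public String scheme() { return uri.getScheme(); }
        public String location() { return uri.getPath(); }
    }

    /** Hypothetical provider: the only component that actually touches data. */
    public interface SketchProvider {
        InputStream open(SketchPath path) throws IOException;
    }

    /** Toy provider that serves in-memory content for the made-up "mem" scheme. */
    public static final class MemoryProvider implements SketchProvider {
        private final Map<String, byte[]> files = new HashMap<>();

        public void put(String location, String content) {
            files.put(location, content.getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public InputStream open(SketchPath path) throws IOException {
            byte[] data = files.get(path.location());
            if (data == null) {
                throw new IOException("No such location: " + path.location());
            }
            return new ByteArrayInputStream(data);
        }
    }

    /** Registry mapping a scheme to the provider that handles it. */
    public static final Map<String, SketchProvider> PROVIDERS = new HashMap<>();

    /** Resolves the provider from the path's scheme and reads the content as UTF-8 text. */
    public static String readAll(String pathText) {
        SketchPath path = new SketchPath(pathText);
        SketchProvider provider = PROVIDERS.get(path.scheme());
        if (provider == null) {
            throw new IllegalArgumentException("No provider for scheme: " + path.scheme());
        }
        try (InputStream in = provider.open(path)) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch the caller works only with path strings. Supporting a new kind of location means registering another provider under a new scheme, and code that calls `readAll` does not change.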
Additionally, the DataRush model has the concept of a file split, similar to that found in Hadoop. Splits are used to parallelize processing on files. File system providers should provide support for splitting files when possible to get parallelism when reading files.
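As a rough illustration of what splitting means, the sketch below divides a file of known size into contiguous byte ranges that could each be handed to a separate reader. The names `FileSplitSketch`, `Split`, and `computeSplits`, and the fixed target split size, are assumptions of this example and do not reflect how DataRush generates splits.

```java
import java.util.ArrayList;
import java.util.List;

public class FileSplitSketch {
    /** A byte range [offset, offset + length) within a file. */
    public static final class Split {
        public final long offset;
        public final long length;

        public Split(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Divides a file of the given size into contiguous ranges of at most
     * targetSplitSize bytes. The final range may be shorter than the others.
     */
    public static List<Split> computeSplits(long fileSize, long targetSplitSize) {
        if (fileSize < 0 || targetSplitSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and targetSplitSize > 0");
        }
        List<Split> splits = new ArrayList<>();
        for (long offset = 0; offset < fileSize; offset += targetSplitSize) {
            long length = Math.min(targetSplitSize, fileSize - offset);
            splits.add(new Split(offset, length));
        }
        return splits;
    }

    public static void main(String[] args) {
        for (Split s : computeSplits(250, 100)) {
            System.out.println("offset=" + s.offset + " length=" + s.length);
        }
    }
}
```

A practical reader also has to deal with records that straddle a range boundary, which this sketch does not attempt.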
Class descriptions:
- A buffer for building variable-length binary valued data.
- Provides extended data access methods on binary data flows.
- Provides access to built-in data streams.
- Describes the encoding format of character data.
- Describes a range of bytes from a compressed file.
- Describes a range of bytes from a data source.
- A filter for selecting paths.
- An exception indicating end-of-file has been unexpectedly reached on a stream.
- An I/O exception indicating the file in question already exists.
- Provides access to files and directories.
- Describes a range of bytes from a file.
- Describes the file system identified by a path scheme.
- Provides basic operations on paths for a specific path scheme or schemes.
- Provides access to FTP resources as a file system.
- An abstract factory for input streams.
- Contains various factory methods and utilities for creating InputStreamSupplier's.
- Valid operations on an I/O byte channel, such as a file or network socket.
- Gathers statistics for an I/O channel.
- Provides a context for instrumenting I/O operations.
- Provides access to the local file system.
- An abstract identifier for a resource.
- Describes a Path along with its metadata.
- A factory for creating Path objects.
- Provides access to SFTP resources as a file system.
- A split iterator containing a single split.
- Interface defining an input data stream that works within the boundaries of a defined split.
- A wrapper for input streams providing windowing behavior.
- A forward-only iterator over data splits with associated locality information.
- Settings which control the generation of splits on files.
- A character based reader for splits.
- Represents a file split for a compression format that supports splitting.
- Provides UNIX-style globbing over paths.
- Provides information for performing globbing.
- Provides generic access to URL resources.
- Enumerates the possible file dispositions for writing.