Class FileClient

java.lang.Object
com.pervasive.datarush.io.FileClient

public final class FileClient extends Object
Provides access to files and directories. Path objects only point to a (possible) location in an associated file system. A FileClient provides methods for checking whether the location actually exists and accessing the contents of the referenced file.

A given FileClient encompasses an authorization context for accesses. A FileMetaConfiguration supplies the accessor with any configuration required to authenticate with file systems and gain authorization to perform requested actions. While a number of file systems require no additional access rights beyond those inherited from the JVM's execution environment, others, such as cloud storage providers, require additional information to grant access.

  • Method Details

    • defaultIndexPathBase

      public static File defaultIndexPathBase()
      The default value for the index path base. This is set to the value of the java.io.tmpdir system property.
      Returns:
      the default value for index path base.
    • basicClient

      public static FileClient basicClient()
      Creates a new client using only the OS-level credentials inherited from the JVM's execution environment.

      This client is sufficiently authorized for local file system access, but may not work for accessing remote systems.

      Returns:
      a minimally authorized client
    • configuredClient

      public static FileClient configuredClient(FileMetaConfiguration configuration, ModuleConfiguration moduleConfiguration)
      Creates a new client, authorized according to the given configuration. If an access requires authorization, the configuration will be queried to obtain credentials to authenticate with the appropriate system.
      Parameters:
      configuration - the configuration
      moduleConfiguration - the module configuration
      Returns:
      a new client having the specified configuration
    • configuredClient

      public static FileClient configuredClient(FileMetaConfiguration configuration, ModuleConfiguration moduleConfiguration, File indexPathBase, Path[] tempFsRootDirs)
      Creates a new client, authorized according to the given configuration. If an access requires authorization, the configuration will be queried to obtain credentials to authenticate with the appropriate system.
      Parameters:
      configuration - the configuration
      moduleConfiguration - the module configuration
      indexPathBase - the base path to use for temporary index files
      tempFsRootDirs - the root directories to use by default for temp storage
      Returns:
      a new client having the specified configuration
    • instrumentWith

      public FileClient instrumentWith(IOMonitoringContext monitor)
      Creates a new client which instruments I/O operations using the specified context. The resulting client preserves the configuration of the original.
      Parameters:
      monitor - the monitoring context to use. null disables monitoring for the client.
      Returns:
      a new client with the same authorization, but different I/O monitoring
    • withNetworkConfiguration

      public FileClient withNetworkConfiguration(NetworkConfiguration networkConfiguration)
      Creates a new client with the specified network configuration. The resulting client preserves the configuration of the original.
      Parameters:
      networkConfiguration - the network configuration to use
      Returns:
      a new client with the specified network configuration
    • getConfiguration

      public FileMetaConfiguration getConfiguration()
      Returns the meta-configuration containing the configuration of this file client.
      Returns:
      the meta-configuration
    • getModuleConfiguration

      public ModuleConfiguration getModuleConfiguration()
      Returns the module configuration for this file client.
      Returns:
      the module configuration
    • getNetworkConfiguration

      public NetworkConfiguration getNetworkConfiguration()
      Returns the network configuration used by this file client.
      Returns:
      the network configuration to use
    • getMonitoring

      public IOMonitoringContext getMonitoring()
      Returns the monitoring used to instrument I/O operations by the client.
      Returns:
      the active monitoring context, null if none is active.
    • getTempFilesystemRootDirectories

      public Path[] getTempFilesystemRootDirectories()
      Returns the root directories used by default for temp storage by this client.
      Returns:
      the root directories used by default
    • supportsRandomAccess

      public boolean supportsRandomAccess(Path path)
      Indicates whether the filesystem corresponding to this path supports random access.
      Parameters:
      path - the path
      Returns:
      whether the filesystem corresponding to this path supports random access
    • supportsDirectories

      public boolean supportsDirectories(Path path)
      Indicates whether the filesystem corresponding to this path supports directories.
      Parameters:
      path - the path
      Returns:
      whether the filesystem corresponding to this path supports directories
    • createDirectories

      public Path createDirectories(Path path) throws IOException
      Creates the directory identified by the given path, creating any necessary parent directories.
      Parameters:
      path - the directory to create
      Returns:
      the path to the directory
      Throws:
      IOException - if I/O errors occur or if any parent directories could not be created
    • createDirectory

      public Path createDirectory(Path path) throws IOException
      Creates the directory identified by the given path. All parent directories must already exist. Use createDirectories(Path) if nonexistent parents need to be created.
      Parameters:
      path - the directory to create
      Returns:
      the path to the directory
      Throws:
      IOException - if I/O errors occur or if the parent directory does not exist
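
      The difference between createDirectory(Path) and createDirectories(Path) can be illustrated with a minimal sketch. It is an analogy built on the standard java.nio.file API, whose methods of the same names behave similarly for local paths; it does not call FileClient, and the class name DirectoryCreationSketch and the java.nio.file.Path type used here are assumptions of the example, not part of this library.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Stand-in built on java.nio.file; it does not call FileClient.
final class DirectoryCreationSketch {

    // Tries the single-level call first, then the parent-creating call,
    // and reports what happened at each step.
    static String demo() {
        try {
            Path root = Files.createTempDirectory("dir-sketch");
            Path nested = root.resolve("a").resolve("b");

            boolean singleLevelFailed;
            try {
                Files.createDirectory(nested);   // parent "a" does not exist yet
                singleLevelFailed = false;
            } catch (NoSuchFileException e) {
                singleLevelFailed = true;
            }

            Files.createDirectories(nested);     // creates "a" and then "b"
            boolean nowExists = Files.isDirectory(nested);

            // Clean up from the deepest directory outwards.
            Files.delete(nested);
            Files.delete(nested.getParent());
            Files.delete(root);
            return singleLevelFailed + "," + nowExists;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```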
    • createNewFile

      public boolean createNewFile(Path path) throws IOException
      Creates the file identified by the given path. All parent directories must already exist. Use createDirectories(Path) if nonexistent parents need to be created.

      If the underlying file system does not provide an atomic test-and-create, a race condition is possible in which two callers both receive true.

      Parameters:
      path - the file to create
      Returns:
      true if the file was created, false if the file already existed.
      Throws:
      IOException - if I/O errors occur or if the parent directory does not exist
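
      The true/false contract described above corresponds to an atomic create-if-absent operation. The following is a minimal sketch of that contract using java.nio.file.Files.createFile, which checks for existence and creates the file in a single atomic step on the local file system. The class AtomicCreateSketch is hypothetical and uses java.nio.file.Path rather than this library's Path; how FileClient behaves on a given scheme depends on the underlying file system.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative stand-in built on java.nio.file; not part of the FileClient API.
final class AtomicCreateSketch {

    // Returns true if this call created the file, false if it already existed.
    static boolean createNewFile(Path path) throws IOException {
        try {
            Files.createFile(path);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }

    // Creates the same file twice in a fresh temporary directory
    // and reports the result of each attempt.
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("atomic-create");
            Path file = dir.resolve("marker.txt");
            boolean first = createNewFile(file);
            boolean second = createNewFile(file);
            Files.delete(file);
            Files.delete(dir);
            return first + "," + second;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

      When the check and the create are separate steps, two callers can both pass the check before either creates the file, which is the race the note above warns about.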
    • delete

      public void delete(Path path) throws IOException
      Deletes the file or directory identified by the specified path. It is not an error to delete a non-existent path.
      Parameters:
      path - the file or directory to delete
      Throws:
      IOException - if I/O errors occur or if the path identifies a non-empty directory
    • delete

      public void delete(Path path, boolean recursively) throws IOException
      Deletes the file or directory identified by the specified path. If the path is a directory, the contents will be deleted recursively as indicated. It is not an error to delete a non-existent path.
      Parameters:
      path - the file or directory to delete
      recursively - indicates whether deletes of directories should recursively delete the contents
      Throws:
      IOException - if I/O errors occur or if the path identifies a non-empty directory and a recursive delete was not requested
    • newInputStream

      public InputStream newInputStream(Path path) throws IOException
      Opens the specified file for reading. If the path identifies a directory, all files in the directory are read as a concatenated stream, in the order they are returned by listFiles(Path). The read starts at the first byte of the (first) file. The returned stream is generally not buffered; consult specific path scheme implementations for details.

      If the FileClient has been instrumented with monitoring, the resulting stream will be registered in statistics for the file system type.

      Parameters:
      path - the file or directory to read
      Returns:
      a stream for reading the contents of the path
      Throws:
      IOException - if I/O errors occur
      FileNotFoundException - if the specified path does not exist
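
      The directory case can be pictured as a concatenation of the individual file streams. The sketch below shows the idea with java.io.SequenceInputStream over local files. It is only an analogy: the class ConcatenatedReadSketch is hypothetical, it uses java.nio.file.Path, and it orders files by name as an assumption, whereas this method uses the order returned by listFiles(Path).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Stand-in built on java.io and java.nio.file; it does not call FileClient.
final class ConcatenatedReadSketch {

    // Opens a file directly, or a directory as one stream over its regular files.
    // Every file is opened up front for brevity; a fuller version would open lazily.
    static InputStream open(Path path) throws IOException {
        if (!Files.isDirectory(path)) {
            return Files.newInputStream(path);
        }
        List<Path> files;
        try (Stream<Path> children = Files.list(path)) {
            files = children.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        List<InputStream> parts = new ArrayList<>();
        for (Path file : files) {
            parts.add(Files.newInputStream(file));
        }
        return new SequenceInputStream(Collections.enumeration(parts));
    }

    // Drains a stream into a UTF-8 string.
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    // Writes two small files and reads the directory back as one stream.
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("concat-read");
            Path a = dir.resolve("a.txt");
            Path b = dir.resolve("b.txt");
            Files.write(a, "foo".getBytes(StandardCharsets.UTF_8));
            Files.write(b, "bar".getBytes(StandardCharsets.UTF_8));
            String contents;
            try (InputStream in = open(dir)) {
                contents = readAll(in);
            }
            Files.delete(a);
            Files.delete(b);
            Files.delete(dir);
            return contents;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```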
    • newOutputStream

      public OutputStream newOutputStream(Path path, WriteMode mode) throws IOException
      Opens the specified file for writing. Behavior depends on the provided mode and whether the target file exists. If the target file does not exist, it is created and the resulting stream is positioned at the first byte. Otherwise, if the target file exists:
      • If the mode is CREATE_NEW, an error is raised.
      • If the mode is OVERWRITE, the existing file is replaced with the bytes written to the resulting stream.
      • If the mode is APPEND, the resulting stream is positioned after the last byte of the existing file.

      The returned stream is generally not buffered; consult specific path scheme implementations for details.

      If the FileClient has been instrumented with monitoring, the resulting stream will be registered in statistics for the file system type.

      Parameters:
      path - the file to write
      mode - how to handle writes to existing files
      Returns:
      a stream for writing (or appending) the contents of the file
      Throws:
      IOException - if I/O errors occur or if the target is a directory
      IllegalArgumentException - if the path is not for a scheme supported by the provider
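
      The three modes described above have close counterparts among the standard java.nio.file open options, which makes their behavior easy to try on a local file. The sketch below is an analogy only: the Mode enum is a hypothetical local mirror of the mode names listed above, not this library's WriteMode type, and the class WriteModeSketch uses java.nio.file.Path.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.OpenOption;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Stand-in built on java.nio.file; it does not call FileClient.
final class WriteModeSketch {

    // Hypothetical local mirror of the three modes named in the text.
    enum Mode { CREATE_NEW, OVERWRITE, APPEND }

    // Maps each mode to the java.nio.file options with matching behavior.
    static OpenOption[] optionsFor(Mode mode) {
        switch (mode) {
            case CREATE_NEW:
                return new OpenOption[] {
                    StandardOpenOption.WRITE, StandardOpenOption.CREATE_NEW };
            case OVERWRITE:
                return new OpenOption[] {
                    StandardOpenOption.WRITE, StandardOpenOption.CREATE,
                    StandardOpenOption.TRUNCATE_EXISTING };
            default: // APPEND
                return new OpenOption[] {
                    StandardOpenOption.WRITE, StandardOpenOption.CREATE,
                    StandardOpenOption.APPEND };
        }
    }

    static void write(Path path, Mode mode, String text) throws IOException {
        try (OutputStream out = Files.newOutputStream(path, optionsFor(mode))) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
    }

    static String read(Path path) throws IOException {
        return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    }

    // Exercises each mode against the same file and records the outcomes.
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("write-mode");
            Path file = dir.resolve("out.txt");
            write(file, Mode.CREATE_NEW, "one");    // file absent: created
            write(file, Mode.APPEND, "+two");       // positioned after the last byte
            String appended = read(file);
            write(file, Mode.OVERWRITE, "three");   // previous contents replaced
            String overwritten = read(file);
            boolean rejected;
            try {
                write(file, Mode.CREATE_NEW, "x");  // file exists: error
                rejected = false;
            } catch (FileAlreadyExistsException e) {
                rejected = true;
            }
            Files.delete(file);
            Files.delete(dir);
            return appended + "|" + overwritten + "|" + rejected;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```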
    • newFileChannel

      public FileChannel newFileChannel(Path path) throws IOException
      Opens the specified file for random access. Not all schemes support random access; consult specific implementations for details.

      If the FileClient has been instrumented with monitoring, the resulting channel will be registered in statistics for the file system type.

      Parameters:
      path - the file to open
      Returns:
      a channel for random access to the contents of the file
      Throws:
      IOException - if I/O errors occur
      FileNotFoundException - if the specified path does not exist
      IllegalArgumentException - if the path is not for a scheme supported by the provider
    • exists

      public boolean exists(Path path) throws IOException
      Indicates whether the specified path represents an existing file or directory.
      Parameters:
      path - the path to test
      Returns:
      true if the path exists, false otherwise.
      Throws:
      IOException - if I/O errors occur
    • getDetails

      public PathDetails getDetails(Path path) throws IOException
      Returns metadata associated with the specified path.

      If a number of queries on the metadata for a path are going to be made, it can be more efficient to use this method to get a "snapshot", as only one request is performed instead of many.

      Parameters:
      path - the path for which to get metadata
      Returns:
      metadata for the specified file system object, if it exists. null if it does not exist.
      Throws:
      IOException - if I/O errors occur
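
      The snapshot idea is the same one behind java.nio.file.Files.readAttributes, which fetches several attributes in one request instead of issuing one request per query. The sketch below shows that pattern on a local file as an analogy; the class MetadataSnapshotSketch is hypothetical, and the accessors available on this library's PathDetails are not shown here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Stand-in built on java.nio.file; it does not call FileClient.
final class MetadataSnapshotSketch {

    // Reads the attributes once, then answers several questions from the snapshot.
    static String demo() {
        try {
            Path file = Files.createTempFile("snapshot", ".txt");
            Files.write(file, "hello".getBytes(StandardCharsets.UTF_8));

            // One request to the file system...
            BasicFileAttributes snapshot =
                    Files.readAttributes(file, BasicFileAttributes.class);

            // ...followed by queries that touch only the in-memory snapshot.
            String result = snapshot.isDirectory() + ","
                    + snapshot.isRegularFile() + ","
                    + snapshot.size();

            Files.delete(file);
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

      A snapshot reflects the state at the time of the request, so later changes to the file are not visible through it.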
    • getLength

      public long getLength(Path path) throws IOException
      Returns the length of the file represented by the given path.
      Parameters:
      path - the path to test
      Returns:
      the length of the file
      Throws:
      IOException - if I/O errors occur
    • isDirectory

      public boolean isDirectory(Path path) throws IOException
      Indicates whether the specified path represents a directory.
      Parameters:
      path - the path to test
      Returns:
      true if the path represents a directory, false otherwise.
      Throws:
      IOException - if I/O errors occur
    • isFile

      public boolean isFile(Path path) throws IOException
      Indicates whether the specified path represents a file.
      Parameters:
      path - the path to test
      Returns:
      true if the path represents a file, false otherwise.
      Throws:
      IOException - if I/O errors occur
    • isReadable

      public boolean isReadable(Path path) throws IOException
      Indicates whether the specified path can be read.
      Parameters:
      path - the path to test
      Returns:
      true if the path represents a readable file or directory, false otherwise.
      Throws:
      IOException - if I/O errors occur
    • isWritable

      public boolean isWritable(Path path) throws IOException
      Indicates whether the specified path can be written.
      Parameters:
      path - the path to test
      Returns:
      true if the path represents a writable file or directory, false otherwise.
      Throws:
      IOException - if I/O errors occur
    • isCreatable

      public boolean isCreatable(Path path) throws IOException
      Indicates whether the specified path can be created.
      Parameters:
      path - the path to test
      Returns:
      true if the path represents a nonexistent file or directory which could be created, false otherwise.
      Throws:
      IOException - if I/O errors occur
    • isHidden

      public boolean isHidden(Path path) throws IOException
      Indicates whether the associated path is hidden by the file system.
      Parameters:
      path - the path to test
      Returns:
      true if the path is hidden, false otherwise.
      Throws:
      IOException - if I/O errors occur
    • getSplitIterator

      public SplitIterator getSplitIterator(Path path) throws IOException
      Computes data splits over the specified file. Files are split using the default mechanisms of the file system. If the file system represents remotely located data, the location of the split is indicated.
      Parameters:
      path - the file for which to get splits
      Returns:
      an iterator generating the data splits over the target
      Throws:
      IOException - if an I/O error occurs
    • getSplitIterator

      public SplitIterator getSplitIterator(Path path, SplitOptions options) throws IOException
      Computes data splits over the specified file. Files are split following the provided options, to the degree that the file system can do so. If the file system represents remotely located data, the location of the split is indicated.
      Parameters:
      path - the file for which to get splits
      options - configuration for the process of dividing the file
      Returns:
      an iterator generating the data splits over the target
      Throws:
      IOException - if an I/O error occurs
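
      As a rough picture of what a data split is, the sketch below divides a file of known length into contiguous byte ranges. It is an assumption-laden illustration, not the mechanism used by any file system here: the class SplitSketch, the fixed splitSize parameter, and the {offset, length} pair representation are all hypothetical, and the actual options offered by SplitOptions are not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of splitting; it does not use SplitIterator or SplitOptions.
final class SplitSketch {

    // Returns {offset, length} pairs that cover [0, fileLength) with no gaps or overlap.
    static List<long[]> computeSplits(long fileLength, long splitSize) {
        if (fileLength < 0 || splitSize <= 0) {
            throw new IllegalArgumentException("length must be >= 0 and split size > 0");
        }
        List<long[]> splits = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += splitSize) {
            // The final split is shorter when the length is not a multiple of the size.
            long length = Math.min(splitSize, fileLength - offset);
            splits.add(new long[] { offset, length });
        }
        return splits;
    }

    // Formats splits as "offset+length" entries separated by commas.
    static String describe(List<long[]> splits) {
        StringBuilder sb = new StringBuilder();
        for (long[] split : splits) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(split[0]).append('+').append(split[1]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(computeSplits(10, 4)));
    }
}
```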
    • listFiles

      public List<Path> listFiles(Path path) throws IOException
      Gets the contents of the specified directory.

      In cases where metadata will be accessed immediately, it is more efficient to use listDirectory(Path) and make metadata calls against the returned PathDetails objects.

      Parameters:
      path - the directory for which to get a content list
      Returns:
      all children of the specified directory
      Throws:
      IOException - if I/O errors occur or if the path identifies a file
      See Also:
      • FileSystem#listFiles(Path)
    • listFiles

      public List<Path> listFiles(Path path, DirectoryFilter filter) throws IOException
      Gets the contents of the specified directory, applying a filter.

      In cases where metadata will be accessed immediately, it is more efficient to use listDirectory(Path) and make metadata calls against the returned PathDetails objects.

      Parameters:
      path - the directory for which to get a filtered content list
      filter - a filter to apply to the contents
      Returns:
      all children of the directory which pass the filter
      Throws:
      IOException - if I/O errors occur or if the path identifies a file
      See Also:
      • FileSystem#listFiles(Path,DirectoryFilter)
    • listAllFiles

      public List<Path> listAllFiles(Path path, DirectoryFilter filter) throws IOException
      Gets the contents of the specified directory including hidden files, applying a filter.

      In cases where metadata will be accessed immediately, it is more efficient to use listDirectory(Path) and make metadata calls against the returned PathDetails objects.

      Parameters:
      path - the directory for which to get a filtered content list
      filter - a filter to apply to the contents
      Returns:
      all children of the directory which pass the filter
      Throws:
      IOException - if I/O errors occur or if the path identifies a file
      See Also:
      • FileSystem#listFiles(Path,DirectoryFilter)
    • listDirectory

      public List<PathDetails> listDirectory(Path path) throws IOException
      Gets the contents of the specified directory.
      Parameters:
      path - the directory for which to get a content list
      Returns:
      all children of the specified directory
      Throws:
      IOException - if I/O errors occur or if the path identifies a file
      See Also:
      • FileSystem#listFiles(Path)
    • listDirectory

      public List<PathDetails> listDirectory(Path path, DirectoryFilter filter) throws IOException
      Gets the contents of the specified directory, applying a filter.
      Parameters:
      path - the directory for which to get a filtered content list
      filter - a filter to apply to the contents
      Returns:
      all children of the directory which pass the filter
      Throws:
      IOException - if I/O errors occur or if the path identifies a file
      See Also:
      • FileSystem#listFiles(Path,DirectoryFilter)
    • move

      public void move(Path from, Path to) throws IOException
      Moves a file or directory from one location to another. For paths referring to different file systems, this will be a copy-then-delete operation. For paths referring to the same file system, the provider will perform a rename instead, if possible.
      Parameters:
      from - the source location
      to - the target location
      Throws:
      IOException - if I/O errors occur
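
      The rename-if-possible, otherwise copy-then-delete strategy can be sketched for single local files with java.nio.file. This is an analogy under stated assumptions, not this method's implementation: the class MoveSketch is hypothetical, it uses java.nio.file.Path, and the fallback shown handles a single file only, not a directory tree.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Stand-in built on java.nio.file; it does not call FileClient.
final class MoveSketch {

    // Attempts a rename first; falls back to copy-then-delete when a rename
    // is not possible, for example across different file stores.
    static void move(Path from, Path to) throws IOException {
        try {
            Files.move(from, to, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            Files.copy(from, to);
            Files.delete(from);
        }
    }

    // Moves a small file within one directory and checks both ends.
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("move-sketch");
            Path from = dir.resolve("source.txt");
            Path to = dir.resolve("target.txt");
            Files.write(from, "data".getBytes(StandardCharsets.UTF_8));

            move(from, to);

            boolean sourceGone = !Files.exists(from);
            String contents = new String(Files.readAllBytes(to), StandardCharsets.UTF_8);
            Files.delete(to);
            Files.delete(dir);
            return sourceGone + "," + contents;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

      A copy-then-delete move is not atomic: a failure between the two steps can leave the data in both locations.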
    • copy

      public void copy(Path from, Path to) throws IOException
      Copies a file or directory from one location to another. If the source is a directory, all its contents will be copied recursively. A directory can only be copied to another directory.
      Parameters:
      from - the source location
      to - the target location
      Throws:
      IOException - if I/O errors occur
    • matchPaths

      public List<Path> matchPaths(String pattern) throws IOException
      Finds all paths matching the specified pattern. Patterns, like paths, begin with a scheme, identifying the provider. Pattern syntax is scheme-specific; refer to a specific provider for details on supported syntax.

      In cases where metadata will be accessed immediately, it is more efficient to use listMatches(String) and make metadata calls against the returned PathDetails objects.

      Parameters:
      pattern - a scheme prefixed matching pattern
      Returns:
      all paths which matched the pattern
      Throws:
      IOException - if an I/O error occurs while resolving the pattern
      DRException - if no configured provider could be found for the scheme
    • listMatches

      public List<PathDetails> listMatches(String pattern) throws IOException
      Finds all paths matching the specified pattern. Patterns, like paths, begin with a scheme, identifying the provider. Pattern syntax is scheme-specific; refer to a specific provider for details on supported syntax.
      Parameters:
      pattern - a scheme prefixed matching pattern
      Returns:
      details for all paths which matched the pattern
      Throws:
      IOException - if an I/O error occurs while resolving the pattern
      DRException - if no configured provider could be found for the scheme
    • createTemporaryStorage

      public Path createTemporaryStorage()
      Creates a temporary storage location using default configuration. The returned path is for the storage of ephemeral data; the path may not be valid at a future point in time after the calling process terminates. This path is guaranteed to be unique and can be used either as a file or directory, as needed.

      It is the caller's responsibility to delete temporary storage when it is no longer required. Failure to do so may leave files on the local file system.

      Returns:
      a path under which temporary data can be stored
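
      Because cleanup is left to the caller, a try/finally block is a natural way to make sure temporary storage is removed even when the work fails. The sketch below shows the pattern with a local temporary directory from java.nio.file as an analogy. The class TempStorageCleanupSketch is hypothetical; with a FileClient, the corresponding calls would presumably be createTemporaryStorage() followed by delete(path, true).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Stand-in built on java.nio.file; it does not call FileClient.
final class TempStorageCleanupSketch {

    // Deletes a file or a directory tree; a missing path is not an error.
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path path : paths) {
            Files.delete(path);
        }
    }

    // Uses a temporary directory and reports whether it still exists afterwards.
    static boolean demo() {
        try {
            Path temp = Files.createTempDirectory("scratch");
            try {
                Files.write(temp.resolve("part-0.bin"), new byte[] { 1, 2, 3 });
            } finally {
                deleteRecursively(temp);   // runs even if the work above throws
            }
            return Files.exists(temp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```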
    • createTemporaryStorage

      public Path createTemporaryStorage(File indexPath, Path[] storagePaths)
      Creates a temporary storage location using the specified configuration. The returned path is for the storage of ephemeral data; the path may not be valid at a future point in time after the calling process terminates. This path is guaranteed to be unique and can be used either as a file or directory, as needed.

      Temporary storage consists of two logical spaces, the index space and the data space. The data space is defined by one or more root directories, under which data files are stored. The data space is flat; it does not reflect the hierarchical structure in paths. The index space is a single root directory, with directories in the index space corresponding to directories in temporary storage and files being indexes to data files containing the contents of temporary files. This indirection expands the amount of temporary storage available as well as spreads I/O activity across more disks.

      It is the caller's responsibility to delete temporary storage when it is no longer required. Failure to do so may leave files on the local file system.

      Parameters:
      indexPath - the root directory to use for indexing temporary files. A unique directory for this storage will be created under the root; multiple temporary locations may safely share the same value.
      storagePaths - root directories to use for storing data files. Unique files will be created under these directories as needed; multiple temporary locations may safely share the same values.
      Returns:
      a path under which temporary data can be stored
    • getFileSystemType

      public String getFileSystemType(Path path)
      Returns an identifier for the filesystem type. When the client has been instrumented to collect I/O statistics, this identifier is used as a label for the path.
      Parameters:
      path - the path
      Returns:
      an identifier for the filesystem type.
    • getFileSystemImplementation

      public Object getFileSystemImplementation(Path path)
      Returns the underlying implementation object for the file system of the given path. This is an optional method and may not be supported by all file system implementations.
      Parameters:
      path - the path
      Returns:
      an object reference of the underlying filesystem or null
    • zipRecursively

      public boolean zipRecursively(Path from, OutputStream out, boolean ignoreNonReadable) throws IOException
      Recursively zips the given input file or directory to the given stream.
      Parameters:
      from - may be a file or a directory
      out - the destination output stream. Will not be closed as part of this operation.
      ignoreNonReadable - ignore files that we don't have permission to read
      Returns:
      true if there was anything zipped (may be false if the directory is empty or no files are readable)
      Throws:
      IOException - if an error occurs
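
      The contract above (a boolean result, an output stream that stays open, and optional skipping of unreadable files) can be sketched with java.util.zip on local files. This is a hypothetical stand-in, not this method's implementation: the class ZipSketch uses java.nio.file.Path, and the entry naming (paths relative to the source directory) is an assumption of the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Stand-in built on java.util.zip and java.nio.file; it does not call FileClient.
final class ZipSketch {

    // Zips a file or directory tree to 'out' and reports whether any entry was written.
    static boolean zipRecursively(Path from, OutputStream out, boolean ignoreNonReadable)
            throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(from)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        ZipOutputStream zip = new ZipOutputStream(out);
        boolean wroteAny = false;
        for (Path file : files) {
            if (!Files.isReadable(file)) {
                if (ignoreNonReadable) {
                    continue;
                }
                throw new IOException("Cannot read " + file);
            }
            // Assumed naming: relative to the source directory, with '/' separators.
            String name = Files.isDirectory(from)
                    ? from.relativize(file).toString().replace('\\', '/')
                    : file.getFileName().toString();
            zip.putNextEntry(new ZipEntry(name));
            Files.copy(file, zip);
            zip.closeEntry();
            wroteAny = true;
        }
        zip.finish();   // completes the archive but leaves 'out' open
        return wroteAny;
    }

    // Zips an empty directory, then the same directory with one file in it.
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("zip-sketch");
            boolean emptyResult = zipRecursively(dir, new ByteArrayOutputStream(), true);
            Path file = dir.resolve("data.txt");
            Files.write(file, "hello".getBytes(StandardCharsets.UTF_8));
            boolean filledResult = zipRecursively(dir, new ByteArrayOutputStream(), true);
            Files.delete(file);
            Files.delete(dir);
            return emptyResult + "," + filledResult;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```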
    • unzip

      public void unzip(Path from, Path to) throws IOException
      Recursively unzips the given file to the given directory.
      Parameters:
      from - must be a file
      to - must not be a file
      Throws:
      IOException - if an error occurs