Class YarnClusterExecutor

  • All Implemented Interfaces:
    DistributedExecutorService

    public class YarnClusterExecutor
    extends com.pervasive.datarush.cal.dr.AbstractDRClusterExecutor
    The executor responsible for starting Dataflow worker containers within YARN. The cluster execution framework starts one executor for every phase of a dataflow job, and each executor starts and manages one container for every partition of the job. The executor runs within the application master and is invoked by the cluster execution framework as needed to manage containers.
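    The one-container-per-partition lifecycle described above can be illustrated with a minimal, self-contained sketch. It is not the DataRush implementation: the class PartitionContainerSketch, its methods startPhase and stopPhase, and the string container ids are all hypothetical stand-ins, and no YARN or DataRush API is called.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical sketch only; none of these names come from the DataRush or YARN APIs.
class PartitionContainerSketch {
    // Ids of the containers currently managed for the phase.
    private final List<String> running = new ArrayList<>();

    /** Starts one container per partition via the supplied launcher and returns the ids. */
    List<String> startPhase(int partitionCount, IntFunction<String> launcher) {
        for (int p = 0; p < partitionCount; p++) {
            running.add(launcher.apply(p));
        }
        return new ArrayList<>(running);
    }

    /** Releases every container started for the phase and returns how many were released. */
    int stopPhase() {
        int released = running.size();
        running.clear();
        return released;
    }

    public static void main(String[] args) {
        PartitionContainerSketch executor = new PartitionContainerSketch();
        // The lambda stands in for whatever actually requests a container.
        List<String> ids = executor.startPhase(3, p -> "container-" + p);
        System.out.println(ids);                  // [container-0, container-1, container-2]
        System.out.println(executor.stopPhase()); // 3
    }
}
```

    The launcher is passed in as a function so that the sketch stays independent of how containers are actually requested.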
    • Nested Class Summary

      • Nested classes/interfaces inherited from class com.pervasive.datarush.cal.dr.AbstractDRClusterExecutor

        com.pervasive.datarush.cal.dr.AbstractDRClusterExecutor.CommandServiceHelper, com.pervasive.datarush.cal.dr.AbstractDRClusterExecutor.CommandServiceInfo
    • Field Summary

      • Fields inherited from class com.pervasive.datarush.cal.dr.AbstractDRClusterExecutor

        allocationProvider, DIST_MUTEX, distExecutors, fileClient, jobSpec
    • Constructor Summary

      Constructors 
      Constructor Description
      YarnClusterExecutor(FileClient fileClient, JobSpecifier jobSpec, com.pervasive.datarush.hadoop.shims.yarn.ContainerManager containerMgr, com.pervasive.datarush.cluster.ExecutorOptions execOptions, Path cacheLocation, Path libArchiveLocation, com.pervasive.datarush.cluster.preferences.ClusterPreferences clusterPrefs, Map<String,ApplicationResource> appResources)
    • Constructor Detail

      • YarnClusterExecutor

        public YarnClusterExecutor(FileClient fileClient,
                                   JobSpecifier jobSpec,
                                   com.pervasive.datarush.hadoop.shims.yarn.ContainerManager containerMgr,
                                   com.pervasive.datarush.cluster.ExecutorOptions execOptions,
                                   Path cacheLocation,
                                   Path libArchiveLocation,
                                   com.pervasive.datarush.cluster.preferences.ClusterPreferences clusterPrefs,
                                   Map<String,ApplicationResource> appResources)
    • Method Detail

      • start

        protected List<com.pervasive.datarush.cal.dr.AbstractDRClusterExecutor.CommandServiceInfo> start()
        Specified by:
        start in class com.pervasive.datarush.cal.dr.AbstractDRClusterExecutor
      • submit

        protected <R> List<com.pervasive.datarush.cal.dr.CommandHandleInfo> submit(DistributedCallableBatch<R> command,
                                                                                   List<com.pervasive.datarush.cal.dr.AbstractDRClusterExecutor.CommandServiceHelper> services)
        Description copied from class: com.pervasive.datarush.cal.dr.AbstractDRClusterExecutor
        Overridden to perform either a parallel submit (in distributed mode) or a non-parallel submit (in pseudo-distributed mode).
        Specified by:
        submit in class com.pervasive.datarush.cal.dr.AbstractDRClusterExecutor
        Returns:
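        The difference between a parallel and a non-parallel submit, as described for submit above, can be sketched with plain java.util.concurrent. This is an illustration under assumptions, not the library's code: SubmitModeSketch, submitSequential, and submitParallel are hypothetical names, "non-parallel" is assumed to mean one service after another, and services and handles are modelled as generic values rather than CommandServiceHelper and CommandHandleInfo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch only; not the DataRush implementation.
class SubmitModeSketch {

    /** Submits to each service one after another (assumed pseudo-distributed style). */
    static <S, H> List<H> submitSequential(List<S> services, Function<S, H> submitOne) {
        List<H> handles = new ArrayList<>();
        for (S service : services) {
            handles.add(submitOne.apply(service));
        }
        return handles;
    }

    /** Submits to all services concurrently; handles are returned in service order. */
    static <S, H> List<H> submitParallel(List<S> services, Function<S, H> submitOne) {
        if (services.isEmpty()) {
            return new ArrayList<>(); // a fixed thread pool cannot have zero threads
        }
        ExecutorService pool = Executors.newFixedThreadPool(services.size());
        try {
            List<Callable<H>> tasks = new ArrayList<>();
            for (S service : services) {
                tasks.add(() -> submitOne.apply(service));
            }
            List<H> handles = new ArrayList<>();
            // invokeAll blocks until every task finishes and preserves task order.
            for (Future<H> future : pool.invokeAll(tasks)) {
                handles.add(future.get());
            }
            return handles;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while submitting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("submission failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> services = List.of("worker-0", "worker-1", "worker-2");
        Function<String, String> submitOne = s -> "handle@" + s;
        System.out.println(submitSequential(services, submitOne));
        System.out.println(submitParallel(services, submitOne));
    }
}
```

        Both variants return one handle per service in the same order, so a caller does not need to know which mode was used.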
      • getMasterResources

        protected ResourceAllocation getMasterResources()
        Overrides:
        getMasterResources in class com.pervasive.datarush.cal.dr.AbstractDRClusterExecutor
      • drfsRoots

        public Map<String,Path> drfsRoots()
        Overrides:
        drfsRoots in class com.pervasive.datarush.cal.dr.AbstractDRClusterExecutor
      • parallelStartup

        protected final com.pervasive.datarush.cal.dr.ParallelStartupHelper parallelStartup()