Class Join

  • All Implemented Interfaces:
    LogicalOperator

    public final class Join
    extends AbstractRelationalJoin
    Performs a relational equi-join on two input datasets by a specified set of keys. Depending on the value of AbstractRelationalJoin.getUseHashJoinHint() one of two procedures are used.
    1. If hash join hint is false, input data will be sorted and hash partitioned by the specified keys (if not already sorted according to upstream metadata). Once sorted and partitioned, data is is them combined in a streaming fashion. Note that if a key group contains multiple rows on both sides, then all combinations of left and right rows in the key group appear in the output for that key group. In this case, memory consumption is lower if the side with more key repetition is linked to the left input.
    2. If hash join hint is true, a full copy of the data from the right will be distributed to the cluster and loaded into memory within each node in the cluster. The left side will not be sorted or partitioned. Thus, the right side should always be small.
    NOTE: hash join hint is currently ignored if join mode is JoinMode.FULL_OUTER or JoinMode.RIGHT_OUTER.
    • Constructor Detail

      • Join

        public Join()
        Default constructor. Prior to graph compilation, the following property must be set:
      • Join

        public Join​(JoinKey[] joinKeys)
        Performs a join with the given set of join keys
        Parameters:
        joinKeys - the join keys
      • Join

        public Join​(List<JoinKey> joinKeys)
        Performs a join with the given set of join keys
        Parameters:
        joinKeys - the join keys
    • Method Detail

      • isMergeLeftAndRightKeys

        public boolean isMergeLeftAndRightKeys()
        Returns whether to merge the left and right key fields into a single field for output. If true, output keys will be the widest type of left and right types and will be named after the left. If false, left and right keys will be output independently. False by-default.
        Returns:
        whether to merge the left and right key fields into a single field for output
      • setMergeLeftAndRightKeys

        public void setMergeLeftAndRightKeys​(boolean mergeLeftAndRightKeys)
        Sets whether to merge the left and right key fields into a single field for output. If true, output keys will be the widest type of left and right types and will be named after the left. If false, left and right keys will be output independently. False by-default.
        Parameters:
        mergeLeftAndRightKeys - whether to merge the left and right key fields into a single field for output
      • getJoinMode

        public JoinMode getJoinMode()
        Returns the join mode to use when performing a join. JoinMode.INNER by default.
        Returns:
        the join mode to use when performing a join.
      • setJoinMode

        public void setJoinMode​(JoinMode joinMode)
        Sets the join mode to use when performing a join. JoinMode.INNER by default.
        Parameters:
        joinMode - the join mode to use when performing a join.