Class Phase


  • public class Phase
    extends Object
    Configures a phase of field comparisons, classifiers and a filter to use during a matching operation. A matching operation may be composed of multiple phases. A phase must contain comparisons and a filter. Classifiers are optional: a phase may contain none, one, or several.

    Comparisons are linked to classifiers by field name. Each comparison produces an output field (a score). Multiple scores are normally fed into a classifier, which produces an aggregated score. A classifier can also be configured to take the output of multiple classifiers. A filter removes records that do not meet its criteria. When a classifier is not needed, the result of a comparison may be fed directly into a filter.
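
    The name-based wiring described above can be sketched with plain Java collections. This is only an illustration, not the library's implementation: FieldWiringSketch, classifyMean, every field name, and the choice of a mean as the aggregation are hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in; none of these names belong to the documented API.
class FieldWiringSketch {

    /** Stand-in classifier: writes the mean of the named input fields to outputField. */
    public static void classifyMean(Map<String, Double> fields,
                                    List<String> inputFields,
                                    String outputField) {
        double sum = 0.0;
        for (String name : inputFields) {
            Double value = fields.get(name);
            if (value == null) {
                throw new IllegalArgumentException("No field named " + name);
            }
            sum += value;
        }
        fields.put(outputField, sum / inputFields.size());
    }

    /** Wires four comparison scores through two classifiers and then a third. */
    public static Map<String, Double> run() {
        Map<String, Double> fields = new HashMap<>();
        // Comparison results, keyed by each comparison's output field name.
        fields.put("firstNameScore", 1.0);
        fields.put("lastNameScore", 0.5);
        fields.put("streetScore", 0.25);
        fields.put("zipScore", 0.75);

        // Two classifiers each aggregate a group of comparison scores.
        classifyMean(fields, List.of("firstNameScore", "lastNameScore"), "nameScore");
        classifyMean(fields, List.of("streetScore", "zipScore"), "addressScore");

        // A third classifier takes the outputs of the first two as its inputs.
        classifyMean(fields, List.of("nameScore", "addressScore"), "overallScore");
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(run().get("overallScore")); // prints 0.625
    }
}
```

    In this sketch a filter would read "overallScore" by name. If no classifier were needed, it could read a comparison output such as "zipScore" directly.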

    The result of a phase is a set of field comparison results, classified into one or more aggregate scores, which are then used to filter record pairs. Only record pairs that meet the filter criteria are pushed to the resultant data flow.
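
    As a rough illustration of that final step, the sketch below keeps only the record pairs whose aggregate score passes a filter. The threshold criterion, the ScoredPair type, and the record ids are assumptions made for the example; the actual filter criteria depend on the configured filter type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in; none of these names belong to the documented API.
class PhaseResultSketch {

    /** Stand-in for a record pair that already carries an aggregate score. */
    public static final class ScoredPair {
        public final String leftId;
        public final String rightId;
        public final double score;

        public ScoredPair(String leftId, String rightId, double score) {
            this.leftId = leftId;
            this.rightId = rightId;
            this.score = score;
        }
    }

    /** Stand-in filter: keeps the pairs whose score is at least the threshold. */
    public static List<ScoredPair> keepMatches(List<ScoredPair> pairs, double threshold) {
        List<ScoredPair> kept = new ArrayList<>();
        for (ScoredPair pair : pairs) {
            if (pair.score >= threshold) {
                kept.add(pair);
            }
        }
        return kept;
    }

    /** Made-up sample data for the example. */
    public static List<ScoredPair> samplePairs() {
        List<ScoredPair> pairs = new ArrayList<>();
        pairs.add(new ScoredPair("L1", "R7", 0.92));
        pairs.add(new ScoredPair("L2", "R3", 0.41));
        pairs.add(new ScoredPair("L4", "R9", 0.80));
        return pairs;
    }

    public static void main(String[] args) {
        // Only pairs that pass the filter reach the output.
        for (ScoredPair pair : keepMatches(samplePairs(), 0.8)) {
            System.out.println(pair.leftId + " <-> " + pair.rightId);
        }
    }
}
```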

    • Constructor Detail

      • Phase

        public Phase​(List<Comparison> comparisons,
                     Classifier classifier,
                     Filter filter,
                     Phase.CleanupMode cleanupMode)
        Construct a phase with the given comparisons, a single classifier, a filter, and a cleanup mode.
        Parameters:
        comparisons - configuration of the comparisons to execute
        classifier - configuration of the classifier to use
        filter - configuration of the filter to use
        cleanupMode - indicates how to handle input and intermediate fields
      • Phase

        public Phase​(List<Comparison> comparisons,
                     List<Classifier> classifiers,
                     Filter filter,
                     Phase.CleanupMode cleanupMode)
        Construct a phase with the given comparisons, a list of classifiers, a filter, and a cleanup mode.
        Parameters:
        comparisons - configuration of the comparisons to execute
        classifiers - configuration of the classifiers to use
        filter - configuration of the filter to use
        cleanupMode - indicates how to handle input and intermediate fields
    • Method Detail

      • setComparisons

        public void setComparisons​(List<Comparison> comparisons)
        Set the comparisons to use in this phase. Replaces any comparisons already set.
        Parameters:
        comparisons - configuration of the comparisons to execute
      • addComparison

        public void addComparison​(Comparison comparison)
        Add a comparison to this phase.
        Parameters:
        comparison - configuration of a comparison
      • addComparison

        public void addComparison​(String leftField,
                                  String rightField,
                                  ComparisonType op,
                                  String outputField)
        Add a comparison to this phase.
        Parameters:
        leftField - left hand side field to compare
        rightField - right hand side field to compare
        op - the comparison operation to execute
        outputField - name of the field with the comparison result
      • addComparison

        public void addComparison​(String leftField,
                                  String rightField,
                                  ComparisonType op,
                                  String outputField,
                                  String propertyName,
                                  Object propertyValue)
        Add a comparison to this phase.
        Parameters:
        leftField - left hand side field to compare
        rightField - right hand side field to compare
        op - the comparison operation to execute
        outputField - name of the field with the comparison result
        propertyName - name of an implementation specific property
        propertyValue - value of an implementation specific property
      • addComparison

        public void addComparison​(String leftField,
                                  String rightField,
                                  ComparisonType op,
                                  String outputField,
                                  Map<String,​Object> properties)
        Add a comparison to this phase.
        Parameters:
        leftField - left hand side field to compare
        rightField - right hand side field to compare
        op - the comparison operation to execute
        outputField - name of the field with the comparison result
        properties - implementation specific properties
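
        The addComparison overloads above differ only in how implementation specific properties are supplied. One plausible reading, which this page does not confirm, is that the single-property form is shorthand for a one-entry map. The sketch below models that reading with a hypothetical stand-in class. OverloadSketch, ComparisonSpec, the String-typed op, and the property name "exampleProperty" are all assumptions, not part of the documented API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a phase; it only models the overload shapes.
class OverloadSketch {

    /** Hypothetical stand-in for one comparison's configuration. */
    public static final class ComparisonSpec {
        public final String leftField;
        public final String rightField;
        public final String op; // the real API takes a ComparisonType here
        public final String outputField;
        public final Map<String, Object> properties;

        ComparisonSpec(String leftField, String rightField, String op,
                       String outputField, Map<String, Object> properties) {
            this.leftField = leftField;
            this.rightField = rightField;
            this.op = op;
            this.outputField = outputField;
            this.properties = properties;
        }
    }

    private final List<ComparisonSpec> comparisons = new ArrayList<>();

    /** No properties: assumed here to mean an empty property map. */
    public void addComparison(String leftField, String rightField, String op,
                              String outputField) {
        addComparison(leftField, rightField, op, outputField,
                Collections.<String, Object>emptyMap());
    }

    /** Single property: assumed here to be shorthand for a one-entry map. */
    public void addComparison(String leftField, String rightField, String op,
                              String outputField, String propertyName, Object propertyValue) {
        Map<String, Object> single = Collections.singletonMap(propertyName, propertyValue);
        addComparison(leftField, rightField, op, outputField, single);
    }

    /** Map form: stores a defensive copy of the properties. */
    public void addComparison(String leftField, String rightField, String op,
                              String outputField, Map<String, Object> properties) {
        comparisons.add(new ComparisonSpec(leftField, rightField, op, outputField,
                new HashMap<>(properties)));
    }

    public List<ComparisonSpec> getComparisons() {
        return Collections.unmodifiableList(comparisons);
    }

    public static void main(String[] args) {
        OverloadSketch phase = new OverloadSketch();
        phase.addComparison("name", "name", "EXAMPLE_OP", "nameScore", "exampleProperty", 1);

        Map<String, Object> props = new HashMap<>();
        props.put("exampleProperty", 1);
        phase.addComparison("city", "city", "EXAMPLE_OP", "cityScore", props);

        // Under the stated assumption, both calls record the same properties.
        System.out.println(phase.getComparisons().get(0).properties
                .equals(phase.getComparisons().get(1).properties)); // prints true
    }
}
```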
      • getComparisons

        public List<Comparison> getComparisons()
        Get the list of comparisons configured for this phase.
        Returns:
        list of comparisons configured for this phase
      • getClassifiers

        public List<Classifier> getClassifiers()
        Get the list of classifiers configured for this phase.
        Returns:
        list of classifiers configured for this phase
      • addClassifiers

        public void addClassifiers​(Classifier... classifier)
        Add one or more classifiers to this phase.
        Parameters:
        classifier - configurations of the classifiers to add
      • setClassifiers

        public void setClassifiers​(List<Classifier> classifiers)
        Set the classifiers for this phase. Replaces any classifiers already set.
        Parameters:
        classifiers - classifiers for this phase
      • addClassifier

        public void addClassifier​(ClassifierType type,
                                  String[] fieldNames)
        Add a classifier to this phase using the default output field name.
        Parameters:
        type - classifier type
        fieldNames - names of input fields
      • addClassifier

        public void addClassifier​(ClassifierType type,
                                  String[] fieldNames,
                                  String outputFieldName)
        Add a classifier to this phase.
        Parameters:
        type - classifier type
        fieldNames - names of input fields
        outputFieldName - name of the field with the classifier result
      • addClassifier

        public void addClassifier​(ClassifierType type,
                                  String[] fieldNames,
                                  String propertyName,
                                  Object propertyValue)
        Add a classifier to this phase.
        Parameters:
        type - classifier type
        fieldNames - names of input fields
        propertyName - name of an implementation specific property
        propertyValue - value of an implementation specific property
      • addClassifier

        public void addClassifier​(ClassifierType type,
                                  String[] fieldNames,
                                  Map<String,​Object> properties)
        Add a classifier to this phase.
        Parameters:
        type - classifier type
        fieldNames - names of input fields
        properties - implementation specific properties
      • addClassifier

        public void addClassifier​(ClassifierType type,
                                  String[] fieldNames,
                                  String outputFieldName,
                                  String propertyName,
                                  Object propertyValue)
        Add a classifier to this phase.
        Parameters:
        type - classifier type
        fieldNames - names of input fields
        outputFieldName - name of the field with the classifier result
        propertyName - name of an implementation specific property
        propertyValue - value of an implementation specific property
      • addClassifier

        public void addClassifier​(ClassifierType type,
                                  String[] fieldNames,
                                  String outputFieldName,
                                  Map<String,​Object> properties)
        Add a classifier to this phase.
        Parameters:
        type - classifier type
        fieldNames - names of input fields
        outputFieldName - name of the field with the classifier result
        properties - implementation specific properties
      • getFilter

        public Filter getFilter()
        Get the filter configured for this phase.
        Returns:
        configured filter
      • setFilter

        public void setFilter​(Filter filter)
        Set the filter to use for this phase.
        Parameters:
        filter - configured filter
      • setFilter

        public void setFilter​(FilterType type,
                              String fieldName,
                              Map<String,​Object> properties)
        Set the filter to use for this phase.
        Parameters:
        type - filter type
        fieldName - input field name
        properties - implementation specific properties
      • setFilter

        public void setFilter​(FilterType type,
                              String fieldName,
                              String propertyName,
                              Object propertyValue)
        Set the filter to use for this phase.
        Parameters:
        type - filter type
        fieldName - input field name
        propertyName - name of an implementation specific property
        propertyValue - value of an implementation specific property
      • setFilter

        public void setFilter​(FilterType type,
                              String propertyName,
                              Object propertyValue)
        Set the filter to use for this phase. No input field name is given, so the filter reads the field with the default output field name.
        Parameters:
        type - filter type
        propertyName - name of an implementation specific property
        propertyValue - value of an implementation specific property
      • getCleanupMode

        public Phase.CleanupMode getCleanupMode()
        Gets the cleanup mode configured for this phase.
        Returns:
        the configured post-phase cleanup mode
      • setCleanupMode

        public void setCleanupMode​(Phase.CleanupMode cleanupMode)
        Sets the cleanup mode to apply after a phase completes.
        Parameters:
        cleanupMode - the post-phase cleanup action