Class TypeUtil


  • public class TypeUtil
    extends Object
    Various utilities for manipulating token types.
    • Method Detail

      • getRecordType

        public static <T extends ScalarTyped & Named> RecordTokenType getRecordType(T... fields)
        Computes the record type which a composite of the specified fields would have.
        Parameters:
        fields - the objects representing the fields of the composite
        Returns:
        the composite's record type
      • validateFieldSelection

        public static final RecordTokenType validateFieldSelection​(String propertyName,
                                                                   RecordTokenType source,
                                                                   String... selected)
        Validates a set of field names against the source type, computing the resulting type.
        Parameters:
        propertyName - the property name to use in the validation failure exception
        source - the record type to check
        selected - the field names to validate, in the desired ordering for the new type
        Returns:
        the type filtered by the selected fields
        Throws:
        com.pervasive.datarush.graphs.physical.InvalidPropertyValueException - if one or more of the specified fields do not exist in the source type
      • validateRequiredFields

        public static final void validateRequiredFields​(String propertyName,
                                                        RecordTokenType source,
                                                        RecordTokenType required)
        Validates a set of fields against the source type, ensuring that the required field names are present and of the same type.
        Parameters:
        propertyName - the property name to use in the validation failure exception
        source - the record type to check
        required - the fields to validate
        Throws:
        com.pervasive.datarush.graphs.physical.InvalidPropertyValueException - if one or more of the specified fields do not exist in the source type
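        The check described for validateRequiredFields can be sketched without the library. This is a minimal model, not the DataRush implementation: a LinkedHashMap from field name to type name stands in for RecordTokenType, types are compared by string equality (the real comparison may be more permissive), and IllegalArgumentException stands in for InvalidPropertyValueException.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only: a LinkedHashMap (field name -> type name)
// stands in for RecordTokenType. This is not the DataRush implementation.
class ValidateRequiredSketch {
    static void validateRequiredFields(String propertyName,
                                       Map<String, String> source,
                                       Map<String, String> required) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String> field : required.entrySet()) {
            String actual = source.get(field.getKey());
            if (actual == null) {
                problems.add(field.getKey() + " is missing");
            } else if (!actual.equals(field.getValue())) {
                problems.add(field.getKey() + " has type " + actual
                        + " but " + field.getValue() + " is required");
            }
        }
        if (!problems.isEmpty()) {
            // Assumption: stands in for InvalidPropertyValueException.
            throw new IllegalArgumentException(
                    "Invalid value for property '" + propertyName + "': " + problems);
        }
    }

    public static void main(String[] args) {
        Map<String, String> source = new LinkedHashMap<>();
        source.put("id", "int");
        source.put("name", "string");

        Map<String, String> required = new LinkedHashMap<>();
        required.put("id", "int");
        validateRequiredFields("keys", source, required); // passes silently

        required.put("name", "double"); // present, but with a different type
        try {
            validateRequiredFields("keys", source, required);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```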
      • select

        public static final RecordTokenType select​(RecordTokenType source,
                                                   String... selected)
        Creates a new record type containing only fields from the source type which match one of the specified field names. The order of the fields in the new type matches the order in which the names were specified.

        The specified fields must be valid names in the source type.

        Parameters:
        source - the record type to filter
        selected - the field names to keep in the new type, in the desired ordering for the new type
        Returns:
        the resulting filtered record type
        Throws:
        InvalidFieldException - if one or more of the specified fields do not exist in the source type
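        The ordering and validation rules of select can be illustrated with a small stand-in model. The sketch below is not the DataRush implementation: a LinkedHashMap from field name to type name plays the role of RecordTokenType, and IllegalArgumentException stands in for InvalidFieldException.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative model only: a LinkedHashMap (field name -> type name)
// stands in for RecordTokenType. This is not the DataRush implementation.
class SelectSketch {
    static Map<String, String> select(Map<String, String> source, String... selected) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String name : selected) {
            if (!source.containsKey(name)) {
                // Assumption: stands in for InvalidFieldException.
                throw new IllegalArgumentException("No such field: " + name);
            }
            // Fields are added in the order the names were given.
            result.put(name, source.get(name));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> source = new LinkedHashMap<>();
        source.put("id", "int");
        source.put("name", "string");
        source.put("price", "double");
        System.out.println(select(source, "price", "id")); // prints {price=double, id=int}
    }
}
```

        Note that the result follows the order of the requested names, not the order of the source.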
      • toType

        public static final RecordTokenType toType​(DataRepresentation representation,
                                                   Namespace<ScalarTokenType> namespace)
        Converts from a list of fields to a record token type.
        Parameters:
        representation - the representation
        namespace - the list of fields
        Returns:
        a newly constructed type
      • select

        public static final RecordTokenType select​(RecordTokenType source,
                                                   List<String> selected)
        Creates a new record type containing only fields from the source type which match one of the specified field names. The order of the fields in the new type matches the order in which the names were specified.

        The specified fields must be valid names in the source type.

        Parameters:
        source - the record type to filter
        selected - the field names to keep in the new type, in the desired ordering for the new type
        Returns:
        the resulting filtered record type
        Throws:
        InvalidFieldException - if one or more of the specified fields do not exist in the source type
      • retain

        public static final RecordTokenType retain​(RecordTokenType source,
                                                   String... retained)
        Creates a new record type containing only fields from the source type which match one of the specified field names. The order of the fields is unchanged; fields remain in the same relative order as in the source.

        The specified fields do not need to be valid names in the source type.

        Parameters:
        source - the record type to filter
        retained - the field names to keep in the new type
        Returns:
        the resulting filtered record type
      • retain

        public static final RecordTokenType retain​(RecordTokenType source,
                                                   List<String> retained)
        Creates a new record type containing only fields from the source type which match one of the specified field names. The order of the fields is unchanged; fields remain in the same relative order as in the source.

        The specified fields do not need to be valid names in the source type.

        Parameters:
        source - the record type to filter
        retained - the field names to keep in the new type
        Returns:
        the resulting filtered record type
      • remove

        public static final RecordTokenType remove​(RecordTokenType source,
                                                   String... removed)
        Creates a new record type containing all fields from the source type except those matching one of the specified field names. The order of the fields is unchanged; fields remain in the same relative order as in the source.

        The specified fields do not need to be valid names in the source type.

        Parameters:
        source - the record type to filter
        removed - the field names to remove from the new type
        Returns:
        the resulting filtered record type
      • remove

        public static final RecordTokenType remove​(RecordTokenType source,
                                                   List<String> removed)
        Creates a new record type containing all fields from the source type except those matching one of the specified field names. The order of the fields is unchanged; fields remain in the same relative order as in the source.

        The specified fields do not need to be valid names in the source type.

        Parameters:
        source - the record type to filter
        removed - the field names to remove from the new type
        Returns:
        the resulting filtered record type
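        Unlike select, retain and remove keep the source ordering and tolerate names that do not exist in the source. A minimal sketch of that behavior follows, again using a LinkedHashMap (field name to type name) as a hypothetical stand-in for RecordTokenType rather than the actual library types.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Illustrative model only: a LinkedHashMap (field name -> type name)
// stands in for RecordTokenType. This is not the DataRush implementation.
class RetainRemoveSketch {
    static Map<String, String> retain(Map<String, String> source, String... retained) {
        return filter(source, new HashSet<>(Arrays.asList(retained)), true);
    }

    static Map<String, String> remove(Map<String, String> source, String... removed) {
        return filter(source, new HashSet<>(Arrays.asList(removed)), false);
    }

    // Walks the source in its own order, so relative ordering is preserved.
    // Names that are not in the source never match and are simply ignored.
    private static Map<String, String> filter(Map<String, String> source,
                                              Set<String> names,
                                              boolean keepMatches) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : source.entrySet()) {
            if (names.contains(field.getKey()) == keepMatches) {
                result.put(field.getKey(), field.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> source = new LinkedHashMap<>();
        source.put("id", "int");
        source.put("name", "string");
        source.put("price", "double");
        System.out.println(retain(source, "price", "id", "missing")); // prints {id=int, price=double}
        System.out.println(remove(source, "name", "missing"));        // prints {id=int, price=double}
    }
}
```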
      • reorderAndRename

        public static final RecordTokenType reorderAndRename​(RecordTokenType source,
                                                             String[] sourceNames,
                                                             String[] targetNames)
        Utility for the renaming/reordering of a given record type. Reorders the source record type to match the order specified in sourceNames. Those fields are then renamed to the corresponding ordinal counterpart in targetNames.
        Parameters:
        source - source record type
        sourceNames - list of fields in source ordered to the desired output
        targetNames - list of names the corresponding ordinal source field name will be changed to
        Returns:
        the remapped type
      • rename

        public static final RecordTokenType rename​(RecordTokenType source,
                                                   Map<String,​String> sourceToTargetNameMap)
        Utility for renaming fields in the source type using the given mapping from source names to target names. The returned type will contain the same number of fields in the original order. Only the names provided in the given map will be changed.
        Parameters:
        source - source token type
        sourceToTargetNameMap - mapping from old names to new names
        Returns:
        new type with names changed
      • rename

        public static final RecordTokenType rename​(RecordTokenType source,
                                                   String[] sourceNames,
                                                   String[] targetNames)
        Utility for renaming fields in the source type using the given source and target field names. The returned type will contain the same number of fields in the original order. Only the names provided in sourceNames will be changed. The sourceNames and targetNames arrays should contain the same number of entries and are order dependent: each source name is replaced by the target name at the same index.
        Parameters:
        source - source token type
        sourceNames - list of field names in the source
        targetNames - list of names to substitute for the source names
        Returns:
        new type with names changed
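        The positional pairing of sourceNames and targetNames can be sketched as follows. This is a simplified model and not the library code: a LinkedHashMap (field name to type name) stands in for RecordTokenType, source names that do not occur in the type are ignored, and mismatched array lengths are rejected with IllegalArgumentException. All three choices are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative model only: a LinkedHashMap (field name -> type name)
// stands in for RecordTokenType. This is not the DataRush implementation.
class RenameSketch {
    static Map<String, String> rename(Map<String, String> source,
                                      String[] sourceNames,
                                      String[] targetNames) {
        if (sourceNames.length != targetNames.length) {
            throw new IllegalArgumentException("Name arrays must have equal length");
        }
        // Pair the names by index: sourceNames[i] becomes targetNames[i].
        Map<String, String> mapping = new HashMap<>();
        for (int i = 0; i < sourceNames.length; i++) {
            mapping.put(sourceNames[i], targetNames[i]);
        }
        // Walk the source in order; unmapped fields keep their names.
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : source.entrySet()) {
            String newName = mapping.getOrDefault(field.getKey(), field.getKey());
            result.put(newName, field.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> source = new LinkedHashMap<>();
        source.put("id", "int");
        source.put("name", "string");
        source.put("price", "double");
        System.out.println(rename(source,
                new String[] {"id", "price"},
                new String[] {"key", "cost"})); // prints {key=int, name=string, cost=double}
    }
}
```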
      • wrap

        public static RecordTokenType wrap​(String name,
                                           ScalarTokenType type)
        Creates a new record type with a single named field of the given type.
        Parameters:
        name - the name for the field
        type - the type of the field data
        Returns:
        a new record type with a single field
      • wrap

        public static RecordTokenType wrap​(String prefix,
                                           List<? extends ScalarTokenType> types)
        Creates a new record schema with the specified field types. Each field is given a name of prefix + i, where i is the field's 0-based position in the input list.
        Parameters:
        prefix - the field name prefix
        types - the types of field data
        Returns:
        a new record type with fields typed as specified
      • wrap

        public static RecordTokenType wrap​(List<? extends ScalarTokenType> types)
        Creates a new record type with the specified field types. Each field is given a name of "field" + i, where i is the field's 0-based position in the input list.
        Parameters:
        types - the types of field data
        Returns:
        a new record type with fields typed as specified
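        The naming scheme used by the two list-based wrap overloads can be sketched like this. The model is hypothetical: type names are plain strings and a LinkedHashMap (field name to type name) stands in for RecordTokenType.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only: a LinkedHashMap (field name -> type name)
// stands in for RecordTokenType. This is not the DataRush implementation.
class WrapSketch {
    // Field i is named prefix + i, with i counted from 0.
    static Map<String, String> wrap(String prefix, List<String> types) {
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < types.size(); i++) {
            result.put(prefix + i, types.get(i));
        }
        return result;
    }

    // Without an explicit prefix, the documented default name is "field" + i.
    static Map<String, String> wrap(List<String> types) {
        return wrap("field", types);
    }

    public static void main(String[] args) {
        List<String> types = Arrays.asList("int", "string");
        System.out.println(wrap("col", types)); // prints {col0=int, col1=string}
        System.out.println(wrap(types));        // prints {field0=int, field1=string}
    }
}
```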
      • mergeRepresentations

        public static DataRepresentation mergeRepresentations​(TokenType... types)
        Returns the overall representation to be used when combining types. If any of the types are sparse, the result is sparse.
        Parameters:
        types - List of types
        Returns:
        The overall representation
      • merge

        public static RecordTokenType merge​(RecordTokenType... types)
        Merges the specified record types into a new one, handling name collisions by renaming. The nth instance of a field name (for n greater than 1) will have "_n" appended to it. If there is more than one primary key, the first primary key is chosen.

        The following conditions will hold with respect to the ordering of fields in the result:

        • All fields from a record type will be before any field from a record type later in the input list.
        • All fields from a record type will preserve their relative ordering.
        As an example, consider two record types - "A" which is ordered {"a", "c"} and "B" which is ordered {"b", "c"}. The resulting record type will be {"a", "c", "b", "c_2"}; "c" conflicts, so is renamed. The type associated with "c" is the one from type "A", with "c_2" the one from type "B".

        For a destructive merge which overwrites fields in collision, use overlay(RecordTokenType...) instead.

        Parameters:
        types - the record types to merge
        Returns:
        a new record type representing the merge of the input record types
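        The renaming rule can be reproduced with a small model of the "A" and "B" example above. This sketch is not the library implementation: a LinkedHashMap (field name to type name) stands in for RecordTokenType, primary keys are not modeled, and the case where a generated name such as "c_2" already exists in an input is not handled.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative model only: a LinkedHashMap (field name -> type name)
// stands in for RecordTokenType. This is not the DataRush implementation.
class MergeSketch {
    @SafeVarargs
    static Map<String, String> merge(Map<String, String>... types) {
        Map<String, String> result = new LinkedHashMap<>();
        Map<String, Integer> occurrences = new HashMap<>();
        for (Map<String, String> type : types) {
            for (Map.Entry<String, String> field : type.entrySet()) {
                // Count how many times this name has been seen so far.
                int n = occurrences.merge(field.getKey(), 1, Integer::sum);
                // The first instance keeps its name; the nth gets "_n" appended.
                String name = (n == 1) ? field.getKey() : field.getKey() + "_" + n;
                result.put(name, field.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> a = new LinkedHashMap<>();
        a.put("a", "int");
        a.put("c", "string");
        Map<String, String> b = new LinkedHashMap<>();
        b.put("b", "double");
        b.put("c", "long");
        System.out.println(merge(a, b)); // prints {a=int, c=string, b=double, c_2=long}
    }
}
```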
      • overlay

        public static RecordTokenType overlay​(RecordTokenType... types)
        Merges the specified record types into a new record type, handling name collisions with a last-one-wins mechanism. Last is defined as the rightmost record type in the input list containing the name in conflict.

        The following conditions will hold with respect to the ordering of fields in the result:

        • All fields from a record type will be before any field from a record type later in the input list.
        • All fields from a record type will preserve their relative ordering except those colliding with a field from an earlier record type. Those fields always occur before other fields in the record type, in an ordering consistent with the first record type containing each.
        As an example, consider two record types - "A" which is ordered {"a", "c"} and "B" which is ordered {"b", "c"}. The resulting record type will be {"a", "c", "b"}; "c" conflicts, so it appears in an order consistent with "A", the first record type containing it. Note however, the type associated with "c" will be the one from record type "B"; the type associated with "c" in "A" is lost.

        For a non-destructive merge which doesn't replace collisions, use merge(RecordTokenType...) instead.

        Parameters:
        types - the record types to merge
        Returns:
        a new record type representing the merge of the input record types
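        The last-one-wins rule, together with the ordering guarantee, maps naturally onto an insertion-ordered map: re-inserting an existing key replaces its value but leaves its position unchanged. The sketch below uses that property of java.util.LinkedHashMap as a hypothetical stand-in for RecordTokenType; it is not the library implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative model only: a LinkedHashMap (field name -> type name)
// stands in for RecordTokenType. This is not the DataRush implementation.
class OverlaySketch {
    @SafeVarargs
    static Map<String, String> overlay(Map<String, String>... types) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map<String, String> type : types) {
            for (Map.Entry<String, String> field : type.entrySet()) {
                // On a collision the position from the first occurrence is
                // kept, while the type is overwritten by the later input.
                result.put(field.getKey(), field.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> a = new LinkedHashMap<>();
        a.put("a", "int");
        a.put("c", "string");
        Map<String, String> b = new LinkedHashMap<>();
        b.put("b", "double");
        b.put("c", "long");
        System.out.println(overlay(a, b)); // prints {a=int, c=long, b=double}
    }
}
```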
      • strictOverlay

        public static RecordTokenType strictOverlay​(boolean dropUnique,
                                                    RecordTokenType... types)
                                             throws InvalidFieldException
        Merges the specified record types into a new record type, allowing name collisions only if isAssignableFrom() is true for the types. Otherwise null will be returned.

        The following conditions will hold with respect to the ordering of fields in the result:

        • All fields from a record type will be before any field from a record type later in the input list.
        • All fields from a record type will preserve their relative ordering except those colliding with a field from an earlier record type. Those fields always occur before other fields in the record type, in an ordering consistent with the first record type containing each.
        As an example, consider two record types - "A" which is ordered {"a", "c"} and "B" which is ordered {"b", "c"}. The resulting record type will be {"a", "c", "b"}; "c" conflicts, so it appears in an order consistent with "A", the first record type containing it. Note however, the type associated with "c" will be the widest of the types associated with "c" in "A" and "B".

        For a non-destructive merge which doesn't replace collisions, use merge(RecordTokenType...) instead.

        Parameters:
        dropUnique - if true will not include fields not present in all records
        types - the record types to merge
        Returns:
        a new record type representing the merge of the input record types
        Throws:
        InvalidFieldException
      • mergeTypes

        public static RecordTokenType mergeTypes​(TokenType... types)
        Merges the specified types into a single record type, handling name collisions by renaming. If there is more than one primary key, the first primary key is chosen.

        The result is equivalent to calling merge(RecordTokenType...), passing each record type straight through and replacing all scalar flows in the input with wrap("input"+i, type). Refer to merge(RecordTokenType...) for specific details on the merged result.

        Parameters:
        types - the types to merge
        Returns:
        a new record type representing the merge of the input types
      • matchFieldNames

        public static final RecordTokenType matchFieldNames​(RecordTokenType source,
                                                            RecordTokenType match)
        Utility for renaming the fields of a given record type to match that of another. A new record type is created with the fields of source but the field names of match.
        Parameters:
        source - source record type
        match - record type whose field names should be matched
        Returns:
        the resulting type
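        One plausible reading of matchFieldNames is a positional rename: the i-th field of source keeps its type but takes the name of the i-th field of match. The sketch below follows that reading. Positional pairing, the requirement that both types have the same number of fields, and the use of a LinkedHashMap (field name to type name) in place of RecordTokenType are all assumptions, not documented behavior.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative model only: a LinkedHashMap (field name -> type name)
// stands in for RecordTokenType. This is not the DataRush implementation.
class MatchFieldNamesSketch {
    static Map<String, String> matchFieldNames(Map<String, String> source,
                                               Map<String, String> match) {
        if (source.size() != match.size()) {
            // Assumption: the real method's handling of this case is not documented.
            throw new IllegalArgumentException("Field counts differ");
        }
        Iterator<String> names = match.keySet().iterator();
        Map<String, String> result = new LinkedHashMap<>();
        for (String type : source.values()) {
            // Types come from source, names come from match, paired by position.
            result.put(names.next(), type);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> source = new LinkedHashMap<>();
        source.put("x", "int");
        source.put("y", "double");
        Map<String, String> match = new LinkedHashMap<>();
        match.put("id", "long");
        match.put("score", "float");
        System.out.println(matchFieldNames(source, match)); // prints {id=int, score=double}
    }
}
```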
      • homogeneousRecord

        public static RecordTokenType homogeneousRecord​(int size,
                                                        ScalarTokenType fieldType,
                                                        String fieldBase)
        Constructs a record type descriptor in which all fields have the same type. Field names are computed as fieldBase + index where index is the field's index in the record type.
        Parameters:
        size - Number of desired fields
        fieldType - Scalar type of all fields
        fieldBase - Base field name
        Returns:
        Homogeneous record type
      • fromJSON

        public static TokenType fromJSON​(String json)
        Parses a JSON description of a TokenType. This method acts as an inverse to toJSON(TokenType); for any type, it will always be the case that:

        type.equals(TypeUtil.fromJSON(TypeUtil.toJSON(type)))

        Parameters:
        json - the JSON format of a type.
        Returns:
        the described type
      • toJSON

        public static String toJSON​(TokenType type)
        Generates the JSON description of the specified type.
        Parameters:
        type - the type for which to generate a description
        Returns:
        the type description
      • widestType

        public static ScalarTokenType widestType​(ScalarTokenType... types)
        Determines the widest of the specified types, that is, the scalar token type T for which T.isAssignableFrom() is true for all of the specified types.
        Parameters:
        types - the scalar types to analyze
        Returns:
        the widest of the types. If there is no such type or no types are specified, null.
        See Also:
        TokenType.isAssignableFrom(TokenType)
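        The search for a widest type can be sketched with a toy type system. Everything here is hypothetical: the Scalar enum and its widening rules (INT to LONG to DOUBLE, with STRING assignable only from itself) are invented for illustration and do not reflect the actual assignability rules of ScalarTokenType. The sketch also looks for the widest type only among the arguments themselves.

```java
// Illustrative model only: the Scalar enum and its widening rules are
// hypothetical and do not describe the real ScalarTokenType hierarchy.
class WidestTypeSketch {
    enum Scalar {
        INT, LONG, DOUBLE, STRING;

        boolean isAssignableFrom(Scalar other) {
            if (this == other) {
                return true;
            }
            if (this == STRING || other == STRING) {
                return false;
            }
            // Numeric widening: a later numeric constant accepts an earlier one.
            return other.ordinal() < this.ordinal();
        }
    }

    // Returns the argument that is assignable from every argument, or null.
    static Scalar widestType(Scalar... types) {
        for (Scalar candidate : types) {
            boolean acceptsAll = true;
            for (Scalar other : types) {
                if (!candidate.isAssignableFrom(other)) {
                    acceptsAll = false;
                    break;
                }
            }
            if (acceptsAll) {
                return candidate;
            }
        }
        return null; // no such type, or no types were given
    }

    public static void main(String[] args) {
        System.out.println(widestType(Scalar.INT, Scalar.DOUBLE, Scalar.LONG)); // prints DOUBLE
        System.out.println(widestType(Scalar.INT, Scalar.STRING));              // prints null
        System.out.println(widestType());                                       // prints null
    }
}
```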
      • getTypes

        public static ScalarTokenType[] getTypes​(RecordTokenType type)
        Returns the types of the fields of this record type.
        Parameters:
        type - the record type
        Returns:
        the types of the fields of this record type.
      • valueOf

        public static ScalarTokenType valueOf​(String type)
        Gets the named scalar type.
        Parameters:
        type - the type to get
        Returns:
        the identified type
      • widestType

        public static RecordTokenType widestType​(RecordTokenType... types)
        Calculates the record type for which isAssignableFrom() is true for all of the specified types. Such a type can only be found if all record types contain the same number of fields and a widest scalar type can be found for each field.
        Parameters:
        types - the record types to analyze
        Returns:
        a record type that can be assigned from any of the input types. If there is no such type or no types are specified, null.
        See Also:
        TokenType.isAssignableFrom(TokenType), widestType(ScalarTokenType...)
      • widestNamedType

        public static RecordTokenType widestNamedType​(RecordTokenType... types)
        Calculates the record type for which isAssignableFrom() is true for all of the specified types matched by name. Such a type can only be found if all record types contain the same number of named fields and a widest scalar type can be found for each named field.
        Parameters:
        types - the record types to analyze
        Returns:
        a record type that can be assigned from any of the input types. If there is no such type or no types are specified, null.
        See Also:
        TokenType.isAssignableFrom(TokenType), widestType(ScalarTokenType...)
      • annotate

        public static Field annotate​(Field field,
                                     String propertyName,
                                     String propertyValue)
        Returns a new field object with the same type and name as the original, but with the given property set to the given value.
        Parameters:
        field - The field to annotate
        propertyName - The name of the annotation
        propertyValue - The value of the annotation
        Returns:
        a new field with the annotation as specified
      • primaryKey

        public static Field primaryKey​(Field field,
                                       boolean primaryKey)
        Returns a new field object with the same type and name as the original, but with the primaryKey flag set to the given value.
        Parameters:
        field - The original field
        primaryKey - The value for the primaryKey flag
        Returns:
        a new field with the primaryKey flag set to the specified value
      • nonUnique

        public static Field nonUnique​(Field field)
        Returns a new field object with the same type and name as the original, but with the primaryKey flag set to false.
        Parameters:
        field - The original field
        Returns:
        a new field with the primaryKey flag set to false
      • nonUnique

        public static RecordTokenType nonUnique​(RecordTokenType type)
        Returns a RecordTokenType equivalent to the original, but with all primaryKey flags set to false.
        Parameters:
        type - the original type
        Returns:
        a new type, with uniqueness constraints removed
      • annotate

        public static RecordTokenType annotate​(RecordTokenType type,
                                               String propertyName,
                                               String propertyValue)
        Returns a new RecordTokenType with the given annotation applied to all of its fields.
        Parameters:
        type - The type to annotate
        propertyName - The name of the annotation
        propertyValue - The value of the annotation
        Returns:
        a new type with the annotation as specified
      • deriveSchema

        public static RecordTokenType deriveSchema​(ScalarTyped... columns)
        Builds a default record schema using the types of the specified columnar objects. Fields will be assigned default names using the pattern "field0", "field1", ..., "fieldN".
        Parameters:
        columns - objects describing the columns. These objects implicitly provide the column data type.
        Returns:
        a schema which would be appropriate for the specified columns
      • addSourceInfoFields

        public static RecordTokenType addSourceInfoFields​(RecordTokenType type)
        Constructs the output type used when tagging with source information fields. There are three fields added:
        • sourcePath, a string naming the original source file from which the record originated.
        • splitOffset, a long providing the starting byte offset of the parse split in the file.
        • recordOffset, a long providing the starting offset for the record within the parse split.
        These fields will always be the first fields of the result type. However, they will be renamed as necessary to resolve collisions; fields in the source type will never be renamed.
        Parameters:
        type - the original source type to be extended with source information
        Returns:
        the expected source-tagged output type
      • mutating

        public static RecordTokenType mutating​(RecordTokenType type,
                                               Collection<String> modifiedFields)
        Computes the resulting type assuming that the specified fields may be mutated. Fields that are not modified will have domain and custom metadata preserved. Those that are modified will have name and type only preserved.

        This method assumes that both the underlying schema and the relative ordering of records in the flow are unchanged.

        Parameters:
        type - the original type
        modifiedFields - the fields that may be modified
        Returns:
        the type after changing field values
      • getDomainValuesAsStrings

        public static List<String> getDomainValuesAsStrings​(FieldDomain domain)
        Returns the values of the domain as strings.
        Parameters:
        domain - the domain
        Returns:
        the values of the domain as strings
      • hasDomainValues

        public static boolean hasDomainValues​(RecordTokenType type,
                                              String field)
        Returns whether the given field has domain values defined.
        Parameters:
        type - the type
        field - the name of the field
        Returns:
        whether the given field has domain values defined.
        Throws:
        InvalidFieldException - if the field is not defined
      • hasDomainValues

        public static boolean hasDomainValues​(RecordTokenType type)
        Returns whether all of the fields in the given type have domain values.
        Parameters:
        type - the type
        Returns:
        whether all of the fields in the given type have domain values defined.
        Throws:
        InvalidFieldException - if the field is not defined
      • withDomainValues

        public static RecordTokenType withDomainValues​(RecordTokenType type,
                                                       String fieldName,
                                                       Set<String> discovered)
        Returns a new RecordTokenType with domain values set to the discovered values.
        Parameters:
        type - the original type
        fieldName - the field to update
        discovered - the discovered values
        Returns:
        a new RecordTokenType
      • mergeDomain

        public static FieldDomain mergeDomain​(FieldDomain domain1,
                                              FieldDomain domain2)
        Merges two domains. The result will have:
        1. lowerBound equal to the min of the two lower bounds or unspecified lower bound if either is unspecified
        2. upperBound equal to the max of the two upper bounds or unspecified upper bound if either is unspecified
        3. values equal to the union of the two sets of values or unspecified if either is unspecified
        Parameters:
        domain1 - the first domain
        domain2 - the second domain
        Returns:
        the combined domain
        Throws:
        IllegalArgumentException - if there is no common base class between the types of the two domains
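        The three merge rules listed above can be sketched with a simplified domain model. The Domain class below is hypothetical and is not the FieldDomain API: bounds are Integer values, domain values are strings, and null represents "unspecified". The IllegalArgumentException case for incompatible domain types is not modeled.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Illustrative model only: Domain is a hypothetical stand-in for
// FieldDomain, with null meaning "unspecified".
class MergeDomainSketch {
    static final class Domain {
        final Integer lowerBound;
        final Integer upperBound;
        final Set<String> values;

        Domain(Integer lowerBound, Integer upperBound, Set<String> values) {
            this.lowerBound = lowerBound;
            this.upperBound = upperBound;
            this.values = values;
        }
    }

    static Domain mergeDomain(Domain d1, Domain d2) {
        // 1. Lower bound: min of the two, or unspecified if either is.
        Integer lower = null;
        if (d1.lowerBound != null && d2.lowerBound != null) {
            lower = Math.min(d1.lowerBound, d2.lowerBound);
        }
        // 2. Upper bound: max of the two, or unspecified if either is.
        Integer upper = null;
        if (d1.upperBound != null && d2.upperBound != null) {
            upper = Math.max(d1.upperBound, d2.upperBound);
        }
        // 3. Values: union of the two sets, or unspecified if either is.
        Set<String> values = null;
        if (d1.values != null && d2.values != null) {
            values = new TreeSet<>(d1.values);
            values.addAll(d2.values);
        }
        return new Domain(lower, upper, values);
    }

    public static void main(String[] args) {
        Domain a = new Domain(0, 10, new TreeSet<>(Arrays.asList("x", "y")));
        Domain b = new Domain(5, null, new TreeSet<>(Arrays.asList("y", "z")));
        Domain merged = mergeDomain(a, b);
        System.out.println(merged.lowerBound); // prints 0
        System.out.println(merged.upperBound); // prints null
        System.out.println(merged.values);     // prints [x, y, z]
    }
}
```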