Class TextRecord

  • All Implemented Interfaces:
    RecordTextSchema<TextDataType>, TextSchema

    public class TextRecord
    extends Object
    A definition of a variable-width record type in a text file. Because fields are not necessarily fixed-size, additional information must be provided to a structured text reader or writer so that fields can be identified correctly.
    • Field Detail

      • DEFAULT_AUTODISCOVER

        public static final TextRecordDiscoverer DEFAULT_AUTODISCOVER
        The default schema discoverer used for reading delimited text. This is the same discoverer as the one obtained by creating a new PatternBasedDiscovery.
      • TEXT_FIELD_DISCOVER

        public static final TextRecordDiscoverer TEXT_FIELD_DISCOVER
        A schema discoverer which treats all fields as raw text. When used with ReadDelimitedText, the resulting schema identifies the field values without interpreting them.

        The resulting output type can be used as input to ParseTextFields.

    • Constructor Detail

      • TextRecord

        public TextRecord()
        Defines an empty record. Default text conversion behavior is used.

        An empty record definition is not meaningful; at least one field must be defined before the schema is used.

        See Also:
        TextConversionDefaults
      • TextRecord

        public TextRecord​(TextConversionDefaults defaults)
        Defines an empty record with the specified text conversion behavior.

        An empty record definition is not meaningful; at least one field must be defined before the schema is used.

        Parameters:
        defaults - text conversion behavior to apply by default to the schema
      • TextRecord

        public TextRecord​(TextConversionDefaults defaults,
                          Map<String,​TextDataType> fieldTypes)
        Defines a record with the specified text conversion behavior and fields. The field order is the iteration order of the provided field mapping.
        Parameters:
        defaults - default text conversion behavior for the schema
        fieldTypes - the fields to define in the record schema, mapped from field name to field type
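
        Because the field order follows the iteration order of the map, an insertion-ordered map such as java.util.LinkedHashMap gives a predictable field layout, while a plain HashMap does not guarantee one. The sketch below illustrates this using JDK classes only. The class name FieldOrderSketch is hypothetical, and the String values are placeholders standing in for TextDataType instances.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FieldOrderSketch {
    // Builds a field mapping whose iteration order matches the intended column order.
    // The String values are placeholders for TextDataType instances (assumption).
    static Map<String, String> orderedFields() {
        Map<String, String> fieldTypes = new LinkedHashMap<>();
        fieldTypes.put("id", "int");
        fieldTypes.put("name", "string");
        fieldTypes.put("balance", "double");
        return fieldTypes;
    }

    // Returns the field names in the order that iterating the map would produce.
    static List<String> fieldOrder(Map<String, String> fieldTypes) {
        return new ArrayList<>(fieldTypes.keySet());
    }

    public static void main(String[] args) {
        System.out.println(fieldOrder(orderedFields()));
    }
}
```

        A map built this way could then be passed as the fieldTypes argument, with real TextDataType values in place of the placeholder strings.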
    • Method Detail

      • convert

        public static final TextRecord convert​(RecordTokenType type)
        Generates a text file schema from a token type. The names and order of fields in the resulting schema are the same as in the record token type. Default conversion for each token type will be used.
        Parameters:
        type - the template type for which to create a text schema
        Returns:
        a new schema which maps to the specified token type
      • convert

        public static final TextRecord convert​(RecordTokenType type,
                                               TextTypes.StringConversion behavior)
        Generates a text file schema from a token type, using the specified string conversion default behavior. The names and order of fields in the resulting schema are the same as in the record token type. Default conversion for each token type will be used, with strings following the requested conversion behavior.
        Parameters:
        type - the template type for which to create a text schema
        behavior - the schema-wide default behavior for string conversion
        Returns:
        a new schema which maps to the specified token type
      • convert

        public static final TextRecord convert​(RecordTokenType type,
                                               TextConversionDefaults defaults)
        Generates a text file schema from a token type, using the specified schema-wide default behavior. The names and order of fields in the resulting schema are the same as in the record token type. Default conversion for each token type will be used, following the requested behavior where applicable.
        Parameters:
        type - the template type for which to create a text schema
        defaults - the schema-wide default settings for text conversion
        Returns:
        a new schema which maps to the specified token type
      • extendDefault

        public static TextRecordDiscoverer extendDefault​(List<TypePattern> additions)
        Creates a new discoverer which extends the default patterns. The specified patterns take precedence over the default patterns; this is equivalent to prepending the additions to PatternBasedDiscovery.DEFAULT_PATTERNS.
        Parameters:
        additions - additional patterns to use in determining field types
        Returns:
        a new discoverer for automatically generating a schema for a delimited text file
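
        The effect of prepending can be sketched with a small stand-alone model. This is not the library's implementation: NamedPattern is a hypothetical stand-in for TypePattern, the two default patterns are illustrative only, and the model assumes that the first matching pattern determines the field type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class PatternPrecedenceSketch {
    // Hypothetical stand-in for TypePattern: a regular expression paired with a type name.
    static final class NamedPattern {
        final Pattern regex;
        final String typeName;

        NamedPattern(String regex, String typeName) {
            this.regex = Pattern.compile(regex);
            this.typeName = typeName;
        }
    }

    // Illustrative defaults only; not the contents of PatternBasedDiscovery.DEFAULT_PATTERNS.
    static final List<NamedPattern> DEFAULTS = Arrays.asList(
            new NamedPattern("-?\\d+", "int"),
            new NamedPattern(".*", "string"));

    // Prepends the additions so that they are tried before the defaults.
    static List<NamedPattern> extend(List<NamedPattern> additions) {
        List<NamedPattern> combined = new ArrayList<>(additions);
        combined.addAll(DEFAULTS);
        return combined;
    }

    // Returns the type name of the first pattern matching the whole value (assumed rule).
    static String discover(List<NamedPattern> patterns, String value) {
        for (NamedPattern p : patterns) {
            if (p.regex.matcher(value).matches()) {
                return p.typeName;
            }
        }
        return "unknown";
    }

    public static void main(String[] args) {
        List<NamedPattern> extended = extend(Arrays.asList(new NamedPattern("\\d{8}", "date")));
        System.out.println(discover(DEFAULTS, "20240131"));
        System.out.println(discover(extended, "20240131"));
    }
}
```

        In this model an eight-digit value is classified by the added pattern once it is prepended, while values the addition does not match still fall through to the defaults.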
      • getDefaults

        public final TextConversionDefaults getDefaults()
        Description copied from interface: RecordTextSchema
        Gets the default conversion behaviors for fields in the schema which do not explicitly define any. These include settings such as the null indicator for values and whether string values are trimmed or preserved as is.
        Specified by:
        getDefaults in interface RecordTextSchema<T extends TextDataType>
        Returns:
        the default behavior for the schema
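
        As a rough illustration of how schema-wide defaults might apply to a field that defines no behavior of its own, the sketch below models a null indicator and a trimming flag. The Defaults class and its members are hypothetical stand-ins for TextConversionDefaults, and comparing against the null indicator after trimming is an assumption, not documented behavior.

```java
final class ConversionDefaultsSketch {
    // Hypothetical stand-in for TextConversionDefaults; member names are assumptions.
    static final class Defaults {
        final String nullIndicator;
        final boolean trim;

        Defaults(String nullIndicator, boolean trim) {
            this.nullIndicator = nullIndicator;
            this.trim = trim;
        }
    }

    // Converts raw field text to a string value using the schema-wide defaults.
    // Returns null when the (optionally trimmed) text equals the null indicator.
    static String parseString(String raw, Defaults defaults) {
        String text = defaults.trim ? raw.trim() : raw;
        return text.equals(defaults.nullIndicator) ? null : text;
    }

    public static void main(String[] args) {
        Defaults trimming = new Defaults("", true);
        System.out.println(parseString("  abc ", trimming));
        System.out.println(parseString("   ", trimming));
    }
}
```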
      • getTokenType

        public final RecordTokenType getTokenType()
        Description copied from interface: RecordTextSchema
        Gets the type of the token representation of the text record. The fields of the resulting type are in the same order and have the same names as the fields in the schema.
        Specified by:
        getTokenType in interface RecordTextSchema<T extends TextDataType>
        Returns:
        the token type representing the schema
      • getFieldNames

        public final List<String> getFieldNames()
        Description copied from interface: RecordTextSchema
        Gets the names of the fields defined in the record. The names are returned in the order they were defined.
        Specified by:
        getFieldNames in interface RecordTextSchema<T extends TextDataType>
        Returns:
        a list of defined field names
      • isFieldDefined

        public final boolean isFieldDefined​(String name)
        Description copied from interface: RecordTextSchema
        Indicates whether a field with the given name is already defined.
        Specified by:
        isFieldDefined in interface RecordTextSchema<T extends TextDataType>
        Parameters:
        name - the field to check
        Returns:
        true if a field with the name is defined in the schema, false otherwise.
      • defineField

        public final void defineField​(String name,
                                      T type)
        Defines a new field in the record with the specified name and type.
        Parameters:
        name - the name associated with the field. The name must be unique amongst all defined fields.
        type - specifies how to convert between text and token values
      • defineField

        public final void defineField​(String name,
                                      T type,
                                      FieldDomain domain)
        Defines a new field in the record with the specified name, type, and domain.
        Parameters:
        name - the name associated with the field. The name must be unique amongst all defined fields.
        type - specifies how to convert between text and token values
        domain - the domain of values associated with the field
      • getFieldType

        public final T getFieldType​(String name)
        Description copied from interface: RecordTextSchema
        Gets the type of the specified field in the record.
        Specified by:
        getFieldType in interface RecordTextSchema<T extends TextDataType>
        Parameters:
        name - the field for which to fetch the type
        Returns:
        the text type of the named field
      • setFieldType

        public final void setFieldType​(String name,
                                       T newType)
        Modifies the value conversion scheme for an existing field.
        Parameters:
        name - the field to modify. This field must exist in the schema.
        newType - specifies how to convert between text and token values
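
        The contracts stated for defineField, setFieldType, getFieldNames, and isFieldDefined can be summarized in a small stand-alone model. This is a sketch, not the library's implementation: SchemaContractSketch is a hypothetical class, String values stand in for TextDataType, and the use of IllegalArgumentException for violations is an assumption, since the exceptions thrown are not documented here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SchemaContractSketch {
    // Insertion-ordered map, so names come back in the order they were defined.
    private final Map<String, String> fields = new LinkedHashMap<>();

    // Defines a new field; the name must not already be in use.
    void defineField(String name, String type) {
        if (fields.containsKey(name)) {
            throw new IllegalArgumentException("Field already defined: " + name);
        }
        fields.put(name, type);
    }

    // Changes the type of an existing field; the field keeps its position.
    void setFieldType(String name, String newType) {
        if (!fields.containsKey(name)) {
            throw new IllegalArgumentException("No such field: " + name);
        }
        fields.put(name, newType);
    }

    boolean isFieldDefined(String name) {
        return fields.containsKey(name);
    }

    String getFieldType(String name) {
        return fields.get(name);
    }

    List<String> getFieldNames() {
        return new ArrayList<>(fields.keySet());
    }

    public static void main(String[] args) {
        SchemaContractSketch schema = new SchemaContractSketch();
        schema.defineField("id", "int");
        schema.defineField("name", "string");
        schema.setFieldType("id", "long");
        System.out.println(schema.getFieldNames() + " id:" + schema.getFieldType("id"));
    }
}
```

        Checking isFieldDefined before calling defineField or setFieldType is one way to avoid violating either precondition.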
      • createFieldParser

        public final TokenParser createFieldParser​(String fieldName)
        Description copied from interface: RecordTextSchema
        Creates a new parser for values of the specified field.
        Specified by:
        createFieldParser in interface RecordTextSchema<T extends TextDataType>
        Parameters:
        fieldName - the field for which to obtain a parser
        Returns:
        a parser for converting text values to token values for the field
      • createFieldFormatter

        public final TokenFormatter createFieldFormatter​(String fieldName)
        Description copied from interface: RecordTextSchema
        Creates a new formatter for values of the specified field.
        Specified by:
        createFieldFormatter in interface RecordTextSchema<T extends TextDataType>
        Parameters:
        fieldName - the field for which to obtain a formatter
        Returns:
        a formatter for converting token values to text values for the field