Class TextRecord

java.lang.Object
com.pervasive.datarush.schema.TextRecord
All Implemented Interfaces:
RecordTextSchema<TextDataType>, TextSchema

public class TextRecord extends Object
A definition of a variable-width record type in a text file. Because fields are not necessarily fixed-size, additional information must be provided to a structured text reader or writer so that fields can be identified correctly.
  • Field Details

    • DEFAULT_AUTODISCOVER

      public static final TextRecordDiscoverer DEFAULT_AUTODISCOVER
      The default schema discoverer used for reading delimited text. It is equivalent to the discoverer obtained by creating a new PatternBasedDiscovery.
    • TEXT_FIELD_DISCOVER

      public static final TextRecordDiscoverer TEXT_FIELD_DISCOVER
      A schema discoverer which treats all fields as raw text. When used with ReadDelimitedText, the resulting schema identifies the field values without interpreting them.

      The resulting output type is suitable for use as input to ParseTextFields.

    • textFieldsByName

      protected final Map<String,TextField<TextDataType>> textFieldsByName
  • Constructor Details

    • TextRecord

      public TextRecord()
      Defines an empty record. Default text conversion behavior is used.

      An empty record definition is not meaningful; at least one field must be defined before the schema is used.

    • TextRecord

      public TextRecord(TextConversionDefaults defaults)
      Defines an empty record with the specified text conversion behavior.

      An empty record definition is not meaningful; at least one field must be defined before the schema is used.

      Parameters:
      defaults - text conversion behavior to apply by default to the schema
    • TextRecord

      public TextRecord(TextConversionDefaults defaults, Map<String,TextDataType> fieldTypes)
      Defines a record with the specified text conversion behavior and fields. The field order is the iteration order of the provided field mapping.
      Parameters:
      defaults - default text conversion behavior for the schema
      fieldTypes - the fields to define in the record schema, as a mapping from field name to text type
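      Because the field order follows the iteration order of the map, the choice of Map implementation matters: a LinkedHashMap iterates in insertion order, whereas a plain HashMap gives no ordering guarantee. The sketch below illustrates this with JDK classes only. The class FieldOrderSketch, its methods, and the placeholder type strings are hypothetical stand-ins for illustration, not part of the DataRush API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-JDK illustration; no DataRush classes are used here.
class FieldOrderSketch {

    // Returns the field names in the map's iteration order, which is the
    // order the constructor documentation says the schema fields will take.
    static List<String> fieldOrder(Map<String, String> fieldTypes) {
        return new ArrayList<>(fieldTypes.keySet());
    }

    // Builds a mapping whose iteration order is its insertion order.
    static Map<String, String> sampleFields() {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", "int");         // type names are placeholder strings
        fields.put("name", "string");
        fields.put("created", "date");
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(fieldOrder(sampleFields())); // prints [id, name, created]
    }
}
```

      With the real constructor, the map values would be TextDataType instances rather than strings; the ordering consideration is the same.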
  • Method Details

    • convert

      public static final TextRecord convert(RecordTokenType type)
      Generates a text file schema from a token type. The names and order of fields in the resulting schema are the same as in the record token type. Default conversion for each token type will be used.
      Parameters:
      type - the template type for which to create a text schema
      Returns:
      a new schema which maps to the specified token type
    • convert

      public static final TextRecord convert(RecordTokenType type, TextTypes.StringConversion behavior)
      Generates a text file schema from a token type, using the specified string conversion default behavior. The names and order of fields in the resulting schema are the same as in the record token type. Default conversion for each token type will be used, with strings following the requested conversion behavior.
      Parameters:
      type - the template type for which to create a text schema
      behavior - the schema-wide default behavior for string conversion
      Returns:
      a new schema which maps to the specified token type
    • getTextFields

      public List<TextField<TextDataType>> getTextFields()
    • convert

      public static final TextRecord convert(RecordTokenType type, TextConversionDefaults defaults)
      Generates a text file schema from a token type, using the specified schema-wide default behavior. The names and order of fields in the resulting schema are the same as in the record token type. Default conversion for each token type will be used, following the requested behavior where applicable.
      Parameters:
      type - the template type for which to create a text schema
      defaults - the schema-wide default settings for text conversion
      Returns:
      a new schema which maps to the specified token type
    • extendDefault

      public static TextRecordDiscoverer extendDefault(List<TypePattern> additions)
      Creates a new discoverer which extends the default patterns. The specified patterns take precedence over the default patterns; this is equivalent to prepending the additions to PatternBasedDiscovery.DEFAULT_PATTERNS.
      Parameters:
      additions - additional patterns to use in determining field types
      Returns:
      a new discoverer for automatically generating a schema for a delimited text file
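      The precedence rule described for extendDefault can be pictured as an ordered list in which earlier entries are consulted first. The following is a minimal sketch of that idea using java.util.regex only. The Rule class, the sample regular expressions, the type names, and the first-match-wins lookup are all assumptions made for illustration; they do not reproduce TypePattern or the actual matching logic of PatternBasedDiscovery.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Plain-JDK model of "additions take precedence over defaults".
class PatternPrecedenceSketch {

    // Hypothetical stand-in for a type pattern: a regex plus the type name it implies.
    static final class Rule {
        final Pattern pattern;
        final String typeName;

        Rule(String regex, String typeName) {
            this.pattern = Pattern.compile(regex);
            this.typeName = typeName;
        }
    }

    // Prepends the additions so they are consulted before the defaults.
    static List<Rule> extend(List<Rule> defaults, List<Rule> additions) {
        List<Rule> combined = new ArrayList<>(additions);
        combined.addAll(defaults);
        return combined;
    }

    // The first rule matching the whole value wins (an assumption of this sketch);
    // values matching no rule fall back to "string".
    static String classify(List<Rule> rules, String value) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(value).matches()) {
                return rule.typeName;
            }
        }
        return "string";
    }

    // Illustrative defaults; the real default patterns are not shown in this page.
    static List<Rule> sampleDefaults() {
        List<Rule> defaults = new ArrayList<>();
        defaults.add(new Rule("-?\\d+", "int"));
        defaults.add(new Rule("-?\\d+\\.\\d+", "double"));
        return defaults;
    }

    // An illustrative addition that overlaps with the "int" default.
    static List<Rule> sampleAdditions() {
        List<Rule> additions = new ArrayList<>();
        additions.add(new Rule("\\d{5}", "zipcode"));
        return additions;
    }

    public static void main(String[] args) {
        List<Rule> extended = extend(sampleDefaults(), sampleAdditions());
        System.out.println(classify(sampleDefaults(), "90210")); // prints int
        System.out.println(classify(extended, "90210"));         // prints zipcode
        System.out.println(classify(extended, "42"));            // prints int
    }
}
```

      In this model, a value matching both an added pattern and a default pattern is assigned the added pattern's type, while values the additions do not cover are still handled by the defaults.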
    • getDefaults

      public final TextConversionDefaults getDefaults()
      Description copied from interface: RecordTextSchema
      Gets the default conversion behaviors for fields in the schema which do not explicitly define any. These include settings such as the null indicator for values and whether string values are trimmed or preserved as is.
      Specified by:
      getDefaults in interface RecordTextSchema<T extends TextDataType>
      Returns:
      the default behavior for the schema
    • getTokenType

      public final RecordTokenType getTokenType()
      Description copied from interface: RecordTextSchema
      Gets the type of the token representation of the text record. The fields of the resulting type are in the same order and have the same names as the fields in the schema.
      Specified by:
      getTokenType in interface RecordTextSchema<T extends TextDataType>
      Returns:
      the token type representing the schema
    • getFieldCount

      public final int getFieldCount()
      Description copied from interface: RecordTextSchema
      Gets the number of fields defined in the record.
      Specified by:
      getFieldCount in interface RecordTextSchema<T extends TextDataType>
      Returns:
      the number of defined fields
    • getFieldNames

      public final List<String> getFieldNames()
      Description copied from interface: RecordTextSchema
      Gets the names of the fields defined in the record. The names are returned in the order they were defined.
      Specified by:
      getFieldNames in interface RecordTextSchema<T extends TextDataType>
      Returns:
      a list of defined field names
    • isFieldDefined

      public final boolean isFieldDefined(String name)
      Description copied from interface: RecordTextSchema
      Indicates whether a field with the given name is already defined.
      Specified by:
      isFieldDefined in interface RecordTextSchema<T extends TextDataType>
      Parameters:
      name - the field to check
      Returns:
      true if a field with the name is defined in the schema, false otherwise.
    • defineField

      public final void defineField(String name, TextDataType type)
      Defines a new field in the record with the specified name and type.
      Parameters:
      name - the name associated with the field. The name must be unique amongst all defined fields.
      type - specifies how to convert between text and token values
    • defineField

      public final void defineField(String name, TextDataType type, FieldDomain domain)
      Defines a new field in the record with the specified name, type, and domain.
      Parameters:
      name - the name associated with the field. The name must be unique amongst all defined fields.
      type - specifies how to convert between text and token values
      domain - the domain of values for the field
    • getFieldType

      public final TextDataType getFieldType(String name)
      Description copied from interface: RecordTextSchema
      Gets the type of the specified field in the record.
      Specified by:
      getFieldType in interface RecordTextSchema<T extends TextDataType>
      Parameters:
      name - the field for which to fetch the type
      Returns:
      the text type of the named field
    • getFieldDomain

      public final FieldDomain getFieldDomain(String name)
      Description copied from interface: RecordTextSchema
      Gets the domain of the specified field in the record.
      Specified by:
      getFieldDomain in interface RecordTextSchema<T extends TextDataType>
      Parameters:
      name - the field for which to fetch the domain
      Returns:
      the domain of the named field
    • setFieldType

      public final void setFieldType(String name, TextDataType newType)
      Modifies the value conversion scheme for an existing field.
      Parameters:
      name - the field to modify. This field must exist in the schema.
      newType - specifies how to convert between text and token values
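      The field-management contract described above (unique names on defineField, an existing field required by setFieldType, names reported in definition order) can be modeled with a small plain-JDK class. SchemaSketch below is a hypothetical stand-in, not the TextRecord implementation: types are placeholder strings, and throwing IllegalArgumentException on a violated precondition is an assumption, since this page does not state how TextRecord reports such errors.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a record schema; not the DataRush TextRecord class.
class SchemaSketch {

    // LinkedHashMap keeps entries in the order the fields were defined.
    private final Map<String, String> fields = new LinkedHashMap<>();

    // Adds a field; names must be unique. The exception type is an assumption.
    void defineField(String name, String type) {
        if (fields.containsKey(name)) {
            throw new IllegalArgumentException("Field already defined: " + name);
        }
        fields.put(name, type);
    }

    // Changes the type of an existing field. The exception type is an assumption.
    void setFieldType(String name, String newType) {
        if (!fields.containsKey(name)) {
            throw new IllegalArgumentException("No such field: " + name);
        }
        // Replacing the value of an existing key keeps its original position.
        fields.put(name, newType);
    }

    boolean isFieldDefined(String name) {
        return fields.containsKey(name);
    }

    int getFieldCount() {
        return fields.size();
    }

    // Names are returned in definition order.
    List<String> getFieldNames() {
        return new ArrayList<>(fields.keySet());
    }

    String getFieldType(String name) {
        return fields.get(name);
    }

    public static void main(String[] args) {
        SchemaSketch schema = new SchemaSketch();
        schema.defineField("id", "int");
        schema.defineField("name", "string");
        schema.setFieldType("id", "long");
        System.out.println(schema.getFieldNames());     // prints [id, name]
        System.out.println(schema.getFieldType("id"));  // prints long
    }
}
```

      Checking isFieldDefined before calling defineField or setFieldType is one way a caller could avoid relying on whatever error behavior the real class has.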
    • createFieldParser

      public final TokenParser createFieldParser(String fieldName)
      Description copied from interface: RecordTextSchema
      Creates a new parser for values of the specified field.
      Specified by:
      createFieldParser in interface RecordTextSchema<T extends TextDataType>
      Parameters:
      fieldName - the field for which to obtain a parser
      Returns:
      a parser for converting text values to token values for the field
    • createFieldFormatter

      public final TokenFormatter createFieldFormatter(String fieldName)
      Description copied from interface: RecordTextSchema
      Creates a new formatter for values of the specified field.
      Specified by:
      createFieldFormatter in interface RecordTextSchema<T extends TextDataType>
      Parameters:
      fieldName - the field for which to obtain a formatter
      Returns:
      a formatter for converting token values to text values for the field