public class AvroSchemaUtils extends Object
Modifier and Type | Method and Description |
---|---|
static String | cleanseName(String fieldName): Cleanses the specified name so it is a valid field name in Avro. |
static RecordTokenType | determineType(org.apache.avro.Schema schema): Maps an Avro schema to a DataRush record type. |
static org.apache.avro.Schema | generateSchema(RecordTokenType type): Creates an Avro schema from the given DataRush record type. |
static boolean | isNullable(org.apache.avro.Schema schema): Indicates whether the specified Avro schema supports setting null values. |
static boolean | isWritable(ScalarTokenType type, org.apache.avro.Schema schema): Indicates whether the specified DataRush type can be encoded in the given schema. |
public static String cleanseName(String fieldName)
Parameters:
fieldName
- the name to cleanse
public static org.apache.avro.Schema generateSchema(RecordTokenType type)
The generated schema is an Avro RECORD whose fields appear in the same order as in the record and have the same names. Field names are cleansed into valid Avro field names using cleanseName(String). If this cleansing results in a name collision, an error is raised.
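The cleansing and collision check described above can be illustrated with a minimal, self-contained sketch. It does not call the library. The class `NameCleansingSketch`, its methods, and the replacement rule (invalid characters become `_`, a leading digit gets a `_` prefix) are assumptions made for illustration; this page does not state the rule that cleanseName(String) actually applies, nor the exception type raised on a collision. The only fact relied on is the Avro naming rule: a name starts with a letter or underscore and continues with letters, digits, or underscores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NameCleansingSketch {

    // Hypothetical stand-in for cleanseName(String); the real replacement rule may differ.
    static String cleanse(String fieldName) {
        if (fieldName.isEmpty()) {
            return "_";
        }
        StringBuilder sb = new StringBuilder(fieldName.length() + 1);
        for (int i = 0; i < fieldName.length(); i++) {
            char c = fieldName.charAt(i);
            boolean letterOrUnderscore =
                (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
            boolean digit = c >= '0' && c <= '9';
            if (letterOrUnderscore || (digit && i > 0)) {
                sb.append(c);               // already valid at this position
            } else if (digit) {
                sb.append('_').append(c);   // a leading digit is not allowed: prefix it
            } else {
                sb.append('_');             // assumed rule: replace any other character
            }
        }
        return sb.toString();
    }

    // Cleanses every name in order and fails when two source names collapse to the same result.
    static List<String> cleanseAll(List<String> fieldNames) {
        Map<String, String> seen = new HashMap<>();
        List<String> result = new ArrayList<>();
        for (String original : fieldNames) {
            String cleansed = cleanse(original);
            String previous = seen.put(cleansed, original);
            if (previous != null) {
                throw new IllegalArgumentException(
                    "Fields '" + previous + "' and '" + original
                        + "' both cleanse to '" + cleansed + "'");
            }
            result.add(cleansed);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [order_id, _2nd_price, total]
        System.out.println(cleanseAll(Arrays.asList("order id", "2nd_price", "total")));
    }
}
```

Under this assumed rule, a record with fields named `a-b` and `a b` would be rejected, because both cleanse to `a_b`. Renaming one of the source fields before generating the schema avoids the error.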
Each field in the generated schema will have a UNION type including NULL and the appropriate Avro schema type
based on the input type as listed below:
… domain on the source field. If no domain is specified, it is mapped to the STRING primitive type. If a domain is specified, it is mapped to an ENUM having the same set of symbols as the domain.
… DateValued#asEpochDays().
… TimeValued#asDayMillis().
… TimestampValued.
Parameters:
type
- the type for which to generate a schema
public static boolean isWritable(ScalarTokenType type, org.apache.avro.Schema schema)
Parameters:
type
- the field type to check
schema
- the target schema for the field
Returns:
true if the target schema permits values of the specified type to be written (excluding consideration of null values), false otherwise.
public static RecordTokenType determineType(org.apache.avro.Schema schema)
The provided schema is converted to a record type whose fields have the same names and appear in the same order. If the schema is not of RECORD type, it is treated as if it were a single field named "field0" in a record.
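The "field0" rule can be shown with a small sketch that models a schema without using the Avro library. The class `Field0Sketch`, its nested `SketchSchema` type, and the method `toRecordLayout` are hypothetical names introduced only for this illustration. Type names are plain strings standing in for Avro types.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class Field0Sketch {

    // Hypothetical, simplified stand-in for an Avro schema:
    // a type name, plus ordered fields when the type is RECORD.
    static final class SketchSchema {
        final String type;
        final Map<String, SketchSchema> fields = new LinkedHashMap<>();

        SketchSchema(String type) {
            this.type = type;
        }

        SketchSchema field(String name, String fieldType) {
            fields.put(name, new SketchSchema(fieldType));
            return this;
        }
    }

    // Returns field name -> type name, preserving field order.
    // A non-RECORD schema is wrapped as a single field named "field0".
    static Map<String, String> toRecordLayout(SketchSchema schema) {
        Map<String, String> layout = new LinkedHashMap<>();
        if ("RECORD".equals(schema.type)) {
            schema.fields.forEach((name, fieldSchema) -> layout.put(name, fieldSchema.type));
        } else {
            layout.put("field0", schema.type);
        }
        return layout;
    }

    public static void main(String[] args) {
        SketchSchema record = new SketchSchema("RECORD")
            .field("id", "LONG")
            .field("name", "STRING");
        System.out.println(toRecordLayout(record));                     // prints {id=LONG, name=STRING}
        System.out.println(toRecordLayout(new SketchSchema("DOUBLE"))); // prints {field0=DOUBLE}
    }
}
```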
Fields with primitive Avro types are mapped to DataRush as indicated in the table below:
Source Avro Type | Target DataRush Type |
---|---|
BOOLEAN | BOOLEAN |
BYTES | BINARY |
DOUBLE | DOUBLE |
FIXED | BINARY |
FLOAT | FLOAT |
LONG | LONG |
INT | INT |
STRING | STRING |
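As a quick reference, the table above can be written as a lookup. This is only a sketch that restates the table. The class `PrimitiveMappingSketch` and the method `targetType` are hypothetical, and the type names are plain strings rather than the library's actual type objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PrimitiveMappingSketch {

    // Source Avro type -> target DataRush type, in the order listed in the table.
    static final Map<String, String> AVRO_TO_DATARUSH = new LinkedHashMap<>();

    static {
        AVRO_TO_DATARUSH.put("BOOLEAN", "BOOLEAN");
        AVRO_TO_DATARUSH.put("BYTES", "BINARY");
        AVRO_TO_DATARUSH.put("DOUBLE", "DOUBLE");
        AVRO_TO_DATARUSH.put("FIXED", "BINARY");
        AVRO_TO_DATARUSH.put("FLOAT", "FLOAT");
        AVRO_TO_DATARUSH.put("LONG", "LONG");
        AVRO_TO_DATARUSH.put("INT", "INT");
        AVRO_TO_DATARUSH.put("STRING", "STRING");
    }

    // Looks up the target type; types outside the table are rejected here
    // because this sketch covers only the types the table lists.
    static String targetType(String avroType) {
        String target = AVRO_TO_DATARUSH.get(avroType);
        if (target == null) {
            throw new IllegalArgumentException("No mapping in the table for Avro type " + avroType);
        }
        return target;
    }

    public static void main(String[] args) {
        System.out.println(targetType("BYTES")); // prints BINARY
        System.out.println(targetType("FIXED")); // prints BINARY
    }
}
```

BYTES and FIXED both map to BINARY, so the table cannot be inverted uniquely: a BINARY field alone does not tell you which of the two Avro types it came from.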
For complex Avro datatypes, the mapping to DataRush is as follows:
… scalar type. Nested records are not currently allowed except for the Avro RECORD representations of DataRush DATE, TIME, and TIMESTAMP types as described in the WriteAvro operator.
… domain to the enumerated list of symbols.
Parameters:
schema
- the schema for which to determine the equivalent record type
Throws:
DRException
- if the schema cannot be converted to a record type
public static boolean isNullable(org.apache.avro.Schema schema)
Parameters:
schema
- the schema to check
Returns:
true if a null value can be written to the schema, false otherwise.
Copyright © 2024 Actian Corporation. All rights reserved.