public abstract class KeyValueOperator extends KeyOperator
A DataRush field can be mapped to a selected HBase cell, or it can be mapped to all cells in a column family as a sub-table:
1. mapCell(java.lang.String, java.lang.String, java.lang.String)
- Map a selected cell within a column family as a field. Cells within a family can be of heterogeneous types, and only the mapped cells are accessed. All mapped cell fields are placed together in a single record. Optional row key and time key fields can be specified to uniquely identify each DataRush record.
2. mapFamily(java.lang.String, java.lang.String)
- Map all cells within a column family as a sub-table of fields. All cells within the family are of a homogeneous type, and all cells are accessed. Each cell within the family is contained in an individual record. Optional row key, qualifier key, and time key fields can be specified to uniquely identify each DataRush record.
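The difference between the two record layouts described above can be sketched with plain Java collections. This is only an illustrative model, not the DataRush implementation: the class `MappingLayoutSketch`, its methods, and the sample qualifiers and field names are all hypothetical, and cell values are shown as strings for brevity even though real HBase cells hold bytes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MappingLayoutSketch {

    /** Cell-as-field layout: the mapped qualifiers of one family become fields of a single record. */
    public static Map<String, String> mapCells(Map<String, String> familyCells,
                                               Map<String, String> qualifierToField) {
        Map<String, String> record = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : qualifierToField.entrySet()) {
            // Only the mapped cells are read; unmapped qualifiers are never touched.
            if (familyCells.containsKey(e.getKey())) {
                record.put(e.getValue(), familyCells.get(e.getKey()));
            }
        }
        return record;
    }

    /** Family-as-sub-table layout: every cell in the family becomes its own record. */
    public static List<Map<String, String>> mapFamily(Map<String, String> familyCells,
                                                      String qualifierField,
                                                      String valueField) {
        List<Map<String, String>> records = new ArrayList<>();
        for (Map.Entry<String, String> e : familyCells.entrySet()) {
            Map<String, String> record = new LinkedHashMap<>();
            record.put(qualifierField, e.getKey());   // qualifier key field identifies the cell
            record.put(valueField, e.getValue());     // one homogeneous value field per record
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        // Hypothetical contents of one column family in one HBase row.
        Map<String, String> cells = new LinkedHashMap<>();
        cells.put("name", "Ada");
        cells.put("city", "London");
        cells.put("notes", "unmapped");

        // Hypothetical qualifier-to-field mapping for the cell-as-field layout.
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("name", "customerName");
        mapping.put("city", "customerCity");

        System.out.println(mapCells(cells, mapping));
        System.out.println(mapFamily(cells, "qualifier", "value"));
    }
}
```

In this model the first call yields one record with two fields, while the second yields three records, one per cell, each carrying the qualifier as a key field.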
Mapping cells such that each cell contains a single field will serialize/deserialize the following DataRush data types using common HBase formats:
An HBase cell can also be mapped as a record of fields:
1. mapCellRecord(java.lang.String, java.lang.String, java.util.Map<java.lang.String, java.lang.String>)
- Map a selected cell within a column family as a record. Cells within a family can be of heterogeneous types, and only the mapped cells are accessed. All mapped cell fields are placed together in a single record. Optional row key and time key fields can be specified to uniquely identify each DataRush record.
2. mapFamilyRecord(java.lang.String, java.util.Map<java.lang.String, java.lang.String>)
- Map all cells within a column family as a sub-table of records. All cells within the family are of a homogeneous type, and all cells are accessed. Each cell within the family is contained in an individual record. Optional row key, qualifier key, and time key fields can be specified to uniquely identify each DataRush record.
Mapping a single HBase cell as a record of fields allows multiple fields to be packed into a single cell, which greatly increases I/O performance at the expense of reduced version granularity. All fields packed together in a single cell are versioned together, and therefore all fields must be present when writing. The default DataRush serialization is used exclusively in this case.
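The trade-off can be illustrated with a small sketch that packs three fields into one byte array, the way a single cell value would hold a whole record. The byte layout here is an assumption chosen for illustration and is not the DataRush serialization format; the class `PackedCellSketch` and its field names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class PackedCellSketch {

    /** Packs every field of the record into one cell value; no field can be omitted. */
    public static byte[] pack(String name, int age, double balance) {
        if (name == null) {
            // All fields share one cell and one version, so a partial write is not possible.
            throw new IllegalArgumentException("all fields must be present when writing");
        }
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + nameBytes.length + 4 + 8);
        buf.putInt(nameBytes.length).put(nameBytes).putInt(age).putDouble(balance);
        return buf.array();
    }

    /** Unpacks the fields in the same order they were written. */
    public static Object[] unpack(byte[] cell) {
        ByteBuffer buf = ByteBuffer.wrap(cell);
        byte[] nameBytes = new byte[buf.getInt()];
        buf.get(nameBytes);
        String name = new String(nameBytes, StandardCharsets.UTF_8);
        int age = buf.getInt();
        double balance = buf.getDouble();
        return new Object[] { name, age, balance };
    }

    public static void main(String[] args) {
        byte[] cell = pack("Ada", 36, 12.5);
        Object[] fields = unpack(cell);
        System.out.println(fields[0] + " " + fields[1] + " " + fields[2]);
    }
}
```

Because the three values travel as one cell value, one read or write moves the whole record, but updating only `age` would still require rewriting `name` and `balance` under the same version.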
The DataRush schema is persisted in HBase for any cell that DataRush writes to.
If a DataRush schema exists:
If a DataRush schema does not exist:
WriteHBase, ReadHBase, DeleteHBase
catalogTableName, cellSchemaFamilyName, keySchemaFamilyName, statsFamilyName
Constructor and Description |
---|
KeyValueOperator() |
Modifier and Type | Method and Description |
---|---|
Map<String,Map<String,Map<String,String>>> | getCellFieldMap() - Get the HBase cell to field mapping. |
Map<String,Map<String,String>> | getFamilyFieldMap() - Get the column family to field mapping. |
String | getHCatalogDatabase() - Get the HCatalog database from which to retrieve a schema and mapping. |
List<String> | getHCatalogFields() - Get the HCatalog fields to read or write. |
String | getHCatalogTable() - Get the HCatalog table from which to retrieve a schema and mapping. |
void | mapCell(String familyName, String qualifier, List<String> schemaNames, List<String> fieldNames) - Deprecated. |
void | mapCell(String familyName, String qualifier, String fieldName) - Map Cell as a Field. |
void | mapCellRecord(String familyName, String qualifier, Map<String,String> fieldMap) - Map Cell as a Record. |
void | mapFamily(String familyName, List<String> schemaNames, List<String> fieldNames) - Deprecated. |
void | mapFamily(String familyName, String fieldName) - Map family as a sub-table of fields. |
void | mapFamilyRecord(String familyName, Map<String,String> fieldMap) - Map family as a sub-table of records. |
protected com.pervasive.datarush.hadoop.shims.hbase.TableSchema | mapFromHCatalog(MetadataContext ctx) - Load a mapping from an existing HCatalog table. |
protected void | mapToHCatalog(MetadataContext ctx, RecordPort input) - Write a mapping to a new HCatalog table. |
protected boolean | schemaSupportedByHCatalog(MetadataContext ctx) - Determines if the current schema can be written to HCatalog. |
void | setCellFieldMap(Map<String,Map<String,Map<String,String>>> cellFieldMap) - Set the HBase cell to field mapping. |
void | setFamilyFieldMap(Map<String,Map<String,String>> familyFieldMap) - Set the column family to field mapping. |
void | setHCatalogDatabase(String database) - Set the HCatalog database from which to retrieve a schema and mapping. |
void | setHCatalogFields(List<String> fields) - Set the HCatalog fields to read or write. |
void | setHCatalogFields(String... fields) - Set the HCatalog fields to read or write. |
void | setHCatalogTable(String table) - Set the HCatalog table from which to retrieve a schema and mapping. |
protected boolean | tableExistsInHCatalog(MetadataContext ctx) - Determines if the currently selected HCatalog table already exists. |
addFamily, effectiveConfiguration, getConfiguration, getFamilies, getFilesystem, getHiveMetastore, getQualifierFieldMap, getRootDirectory, getRowFieldMap, getTableName, getTimeFieldName, getZookeeperParentZNode, getZookeeperPort, getZookeeperQuorum, mapQualifier, mapRow, mapRowRecord, setConfiguration, setFamilies, setFilesystem, setHiveMetastore, setQualifierFieldMap, setRootDirectory, setRowFieldMap, setTableName, setTimeFieldName, setZookeeperParentZNode, setZookeeperPort, setZookeeperQuorum
compose
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
public Map<String,Map<String,String>> getFamilyFieldMap()

public void setFamilyFieldMap(Map<String,Map<String,String>> familyFieldMap)
familyFieldMap - column family field mapping

public Map<String,Map<String,Map<String,String>>> getCellFieldMap()

public void setCellFieldMap(Map<String,Map<String,Map<String,String>>> cellFieldMap)
cellFieldMap - cell to field mapping

public void mapCellRecord(String familyName, String qualifier, Map<String,String> fieldMap)
familyName - the name of the HBase column family.
qualifier - the HBase column qualifier identifying the cell to be mapped as a Record.
fieldMap - cell record schema name to DataRush field name map.

public void mapCell(String familyName, String qualifier, String fieldName)
familyName - the name of the HBase column family.
qualifier - the HBase column qualifier identifying the cell to be mapped as a Field.
fieldName - the field name to be mapped to the specified cell qualifier.

@Deprecated public void mapCell(String familyName, String qualifier, List<String> schemaNames, List<String> fieldNames)

public void mapFamilyRecord(String familyName, Map<String,String> fieldMap)
familyName - the name of the HBase column family.
fieldMap - cell record schema name to DataRush field name map.

public void mapFamily(String familyName, String fieldName)
familyName - the name of the HBase column family.
fieldName - the field name to be mapped to the specified family.

@Deprecated public void mapFamily(String familyName, List<String> schemaNames, List<String> fieldNames)
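The nested generic types of the cell and family field maps can be hard to read from the signatures alone. The sketch below builds maps of the same shape with plain Java collections. The nesting is an assumption inferred from the parameter descriptions above (family name, then qualifier, then schema name to DataRush field name); the class `FieldMapShapeSketch`, its helper methods, and the sample names are hypothetical and do not show how DataRush populates these maps internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldMapShapeSketch {

    /** Adds one cell mapping: family -> qualifier -> (schema name -> field name). Assumed nesting. */
    public static void addCellRecord(Map<String, Map<String, Map<String, String>>> cellFieldMap,
                                     String familyName,
                                     String qualifier,
                                     Map<String, String> fieldMap) {
        cellFieldMap.computeIfAbsent(familyName, k -> new LinkedHashMap<>())
                    .put(qualifier, new LinkedHashMap<>(fieldMap));
    }

    /** Adds one family mapping: family -> (schema name -> field name). Assumed nesting. */
    public static void addFamilyRecord(Map<String, Map<String, String>> familyFieldMap,
                                       String familyName,
                                       Map<String, String> fieldMap) {
        familyFieldMap.put(familyName, new LinkedHashMap<>(fieldMap));
    }

    public static void main(String[] args) {
        // Same shape as the argument of setCellFieldMap.
        Map<String, Map<String, Map<String, String>>> cellFieldMap = new LinkedHashMap<>();
        Map<String, String> addressFields = new LinkedHashMap<>();
        addressFields.put("street", "homeStreet");   // hypothetical schema and field names
        addressFields.put("zip", "homeZip");
        addCellRecord(cellFieldMap, "address", "home", addressFields);

        // Same shape as the argument of setFamilyFieldMap.
        Map<String, Map<String, String>> familyFieldMap = new LinkedHashMap<>();
        Map<String, String> eventFields = new LinkedHashMap<>();
        eventFields.put("type", "eventType");
        addFamilyRecord(familyFieldMap, "events", eventFields);

        System.out.println(cellFieldMap);
        System.out.println(familyFieldMap);
    }
}
```

Under this assumed shape, a cell is addressed by two keys (family and qualifier) while a family mapping needs only one, which is why the cell map has one more level of nesting.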
public String getHCatalogDatabase()

public void setHCatalogDatabase(String database)
database - the HCatalog database

public String getHCatalogTable()

public void setHCatalogTable(String table)
table - the HCatalog table

public List<String> getHCatalogFields()

public void setHCatalogFields(List<String> fields)
fields - a list of field names

public void setHCatalogFields(String... fields)
fields - a list of field names

protected com.pervasive.datarush.hadoop.shims.hbase.TableSchema mapFromHCatalog(MetadataContext ctx)
ctx -

protected void mapToHCatalog(MetadataContext ctx, RecordPort input)
ctx -

protected boolean tableExistsInHCatalog(MetadataContext ctx)
ctx -
true if the table exists, false otherwise

protected boolean schemaSupportedByHCatalog(MetadataContext ctx)
A schema is supported by HCatalog if its only mappings are cells to fields.
ctx -
true
Copyright © 2024 Actian Corporation. All rights reserved.