- All Implemented Interfaces:
LogicalOperator
public class ReadHBase extends KeyValueOperator
Read a result set from HBase. The default read behavior returns the latest version of each retrieved cell. The versionCount property controls the number of versions returned for each cell. Optional row key, qualifier key (when mapping a family as a sub-table), and time key (when versionCount > 1) fields can be specified to uniquely identify each DataRush record.
Cell versions can also be filtered by a time range. The startTime and endTime properties specify the time range.
Each partition reads its assigned regions in row key ascending, qualifier key ascending, time key descending order. If cells from multiple families are mapped, they are joined in the following manner:
- Cells mapped via mapFamily() and mapFamilyRecord() are joined using the row key, qualifier key, and time key fields.
- The result of the first step is then joined with cells mapped via mapCell() and mapCellRecord() using the row key and time key fields.
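The read order described above (row key ascending, qualifier key ascending, time key descending) can be sketched as a comparator. This is only an illustration of the ordering, not the operator's implementation: `ReadOrderSketch` and `CellKey` are hypothetical names, and representing keys as `String` and `long` is an assumption made for readability (HBase itself stores keys as bytes).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; none of these types belong to DataRush or HBase.
final class ReadOrderSketch {

    // Stand-in for the key fields of one retrieved cell version.
    static final class CellKey {
        final String rowKey;
        final String qualifierKey;
        final long timeKey; // cell timestamp in milliseconds (assumed representation)

        CellKey(String rowKey, String qualifierKey, long timeKey) {
            this.rowKey = rowKey;
            this.qualifierKey = qualifierKey;
            this.timeKey = timeKey;
        }

        @Override
        public String toString() {
            return rowKey + "/" + qualifierKey + "@" + timeKey;
        }
    }

    // Row key ascending, then qualifier key ascending, then time key descending.
    static final Comparator<CellKey> READ_ORDER = (a, b) -> {
        int c = a.rowKey.compareTo(b.rowKey);
        if (c != 0) {
            return c;
        }
        c = a.qualifierKey.compareTo(b.qualifierKey);
        if (c != 0) {
            return c;
        }
        return Long.compare(b.timeKey, a.timeKey); // newer versions first
    };

    // Returns a sorted copy; the input list is left unchanged.
    static List<CellKey> inReadOrder(List<CellKey> cells) {
        List<CellKey> sorted = new ArrayList<CellKey>(cells);
        sorted.sort(READ_ORDER);
        return sorted;
    }

    public static void main(String[] args) {
        List<CellKey> cells = Arrays.asList(
                new CellKey("r2", "a", 1L),
                new CellKey("r1", "b", 5L),
                new CellKey("r1", "a", 1L),
                new CellKey("r1", "a", 9L));
        System.out.println(inReadOrder(cells));
    }
}
```

Within one row and qualifier, the newest version comes first, which matches the default behavior of returning the latest version when versionCount is 1.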
- See Also:
WriteHBase, DeleteHBase
-
-
Field Summary
-
Fields inherited from class com.pervasive.datarush.hbase.KeyOperator
catalogTableName, cellSchemaFamilyName, keySchemaFamilyName, statsFamilyName
-
-
Constructor Summary
Constructor | Description
ReadHBase() |
-
Method Summary
Modifier and Type | Method | Description
void | compose(CompositionContext ctx) | Compose the body of this operator.
Date | getEndTime() | Get time range filter end time.
boolean | getOmitHBaseScan() | Get omit HBase Scan state.
boolean | getOmitHFileReader() | Get omit HFile Reader state.
RecordPort | getOutput() |
Date | getStartTime() | Get time range filter start time.
long | getVersionCount() | Get cell version count.
void | setEndTime(Date endTime) | Set time range filter end time.
void | setOmitHBaseScan(boolean omitHBaseScan) | Set omit HBase Scan state.
void | setOmitHFileReader(boolean omitHFileReader) | Set omit HFile Reader state.
void | setStartTime(Date startTime) | Set time range filter start time.
void | setVersionCount(long versionCount) | Set cell version count.
-
Methods inherited from class com.pervasive.datarush.hbase.KeyValueOperator
getCellFieldMap, getFamilyFieldMap, getHCatalogDatabase, getHCatalogFields, getHCatalogTable, mapCell, mapCell, mapCellRecord, mapFamily, mapFamily, mapFamilyRecord, mapFromHCatalog, mapToHCatalog, schemaSupportedByHCatalog, setCellFieldMap, setFamilyFieldMap, setHCatalogDatabase, setHCatalogFields, setHCatalogFields, setHCatalogTable, tableExistsInHCatalog
-
Methods inherited from class com.pervasive.datarush.hbase.KeyOperator
addFamily, effectiveConfiguration, getConfiguration, getFamilies, getFilesystem, getHiveMetastore, getQualifierFieldMap, getRootDirectory, getRowFieldMap, getTableName, getTimeFieldName, getZookeeperParentZNode, getZookeeperPort, getZookeeperQuorum, mapQualifier, mapRow, mapRowRecord, setConfiguration, setFamilies, setFilesystem, setHiveMetastore, setQualifierFieldMap, setRootDirectory, setRowFieldMap, setTableName, setTimeFieldName, setZookeeperParentZNode, setZookeeperPort, setZookeeperQuorum
-
Methods inherited from class com.pervasive.datarush.operators.AbstractLogicalOperator
disableParallelism, getInputPorts, getOutputPorts, newInput, newInput, newOutput, newRecordInput, newRecordInput, newRecordOutput, notifyError
-
Method Detail
-
getStartTime
public Date getStartTime()
Get time range filter start time.
- Returns:
start time.
-
getEndTime
public Date getEndTime()
Get time range filter end time.
- Returns:
end time.
-
setStartTime
public void setStartTime(Date startTime)
Set time range filter start time.
- Parameters:
startTime - time range start time, inclusive.
-
setEndTime
public void setEndTime(Date endTime)
Set time range filter end time.
- Parameters:
endTime - time range end time, exclusive.
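Because the start time is inclusive and the end time is exclusive, the filter behaves like a half-open interval [startTime, endTime). The following is a minimal sketch of that check only; `TimeRangeSketch` and `inRange` are hypothetical names, and the sketch assumes both bounds are set (the behavior when a bound is left unset is not described here).

```java
import java.util.Date;

// Hypothetical sketch of the half-open time range; not part of the DataRush API.
final class TimeRangeSketch {

    // True when cellTime falls in [startTime, endTime):
    // the start bound is inclusive, the end bound is exclusive.
    static boolean inRange(Date cellTime, Date startTime, Date endTime) {
        return !cellTime.before(startTime) && cellTime.before(endTime);
    }

    public static void main(String[] args) {
        Date start = new Date(1000L);
        Date end = new Date(2000L);
        System.out.println(inRange(new Date(1000L), start, end)); // at the start bound
        System.out.println(inRange(new Date(2000L), start, end)); // at the end bound
    }
}
```

A cell stamped exactly at the start time passes the check, while a cell stamped exactly at the end time does not, so consecutive ranges can share a boundary without returning the same cell version twice.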
-
getVersionCount
public long getVersionCount()
Get cell version count.
- Returns:
the maximum number of versions to return for each cell.
-
setVersionCount
public void setVersionCount(long versionCount)
Set cell version count.
- Parameters:
versionCount - the maximum number of versions to return for each cell. Defaults to 1.
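The effect of the version count on a single cell can be sketched as follows. The sketch assumes that the newest versions are the ones kept, which is consistent with the default of 1 returning the latest version but is not stated explicitly for larger counts; `VersionCountSketch` and `newestVersions` are hypothetical names, and timestamps are modeled as plain `long` values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a per-cell version limit; not part of the DataRush API.
final class VersionCountSketch {

    // Returns at most versionCount timestamps of one cell, newest first.
    // Assumption: the newest versions are the ones retained.
    static List<Long> newestVersions(List<Long> timestamps, long versionCount) {
        List<Long> sorted = new ArrayList<Long>(timestamps);
        Collections.sort(sorted, Collections.reverseOrder());
        int keep = (int) Math.min((long) sorted.size(), Math.max(0L, versionCount));
        return new ArrayList<Long>(sorted.subList(0, keep));
    }

    public static void main(String[] args) {
        List<Long> versions = Arrays.asList(10L, 30L, 20L);
        System.out.println(newestVersions(versions, 1L));
        System.out.println(newestVersions(versions, 2L));
    }
}
```

When the count exceeds the number of stored versions, every available version is returned, which is why the property is described as a maximum.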
-
getOmitHBaseScan
public boolean getOmitHBaseScan()
Get omit HBase Scan state.
- Returns:
boolean indicating whether to omit HBase Scan from the result set.
-
setOmitHBaseScan
public void setOmitHBaseScan(boolean omitHBaseScan)
Set omit HBase Scan state. When HBase Scan is omitted, the result set does not include data in the HBase memstore; it consists only of data stored in HFiles, retrieved exclusively via the HFile Reader.
- Parameters:
omitHBaseScan - boolean indicating whether to omit HBase Scan from the result set. Defaults to false.
-
getOmitHFileReader
public boolean getOmitHFileReader()
Get omit HFile Reader state.
- Returns:
boolean indicating whether to omit HFile Reader from the result set.
-
setOmitHFileReader
public void setOmitHFileReader(boolean omitHFileReader)
Set omit HFile Reader state. When the HFile Reader is omitted, HFiles are not accessed directly; the result set, consisting of both memstore and HFile data, is retrieved exclusively via the HBase Scan API.
- Parameters:
omitHFileReader - boolean indicating whether to omit HFile Reader from the result set. Defaults to false.
-
compose
public void compose(CompositionContext ctx)
Description copied from class: CompositeOperator
Compose the body of this operator. Implementations should do the following:
- Perform any validation of configuration, input types, etc.
- Instantiate and configure sub-operators, adding them to the provided context via the method OperatorComposable.add(O).
- Create necessary connections via the method OperatorComposable.connect(P, P). This includes connections from the composite's input ports to sub-operators, connections between sub-operators, and connections from sub-operators' output ports to the composite's output ports.
- Specified by:
compose in class CompositeOperator
- Parameters:
ctx - the context
-
getOutput
public RecordPort getOutput()