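1. Why region location is needed
HBase uses a master/slave architecture. By default it relies on ZooKeeper (zk) election for HMaster high availability: the candidates watch an ephemeral node and, much like acquiring a distributed lock, race to create it; the one that succeeds becomes the new master.
One HMaster usually manages multiple HRegionServers, and each HRegionServer can host multiple HRegions. Note that a table's data initially lives in a single HRegion. As the table grows, split operations occur, after which the table's data may well be spread over HRegions on different HRegionServers, with each HRegion storing one contiguous range of the table's rows. So when we query HBase with a tableName and a rowkey, how does it work out which HRegion holds the data for that rowkey? Below we walk through the source code, using a get on a single rowkey as the example (other operations such as scan use the same region-lookup code).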
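Because each HRegion holds one contiguous range of rows, finding the region for a rowkey amounts to finding the greatest region start key that is not larger than the rowkey. The following is a minimal sketch of that idea only, not HBase's implementation: the class, the method names and the region names are made up for illustration, and String comparison stands in for HBase's byte-wise comparison (the two agree for plain ASCII keys).

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: the regions of one table indexed by their start row key. */
class RegionLookupSketch {
    // start row key -> name of the region whose range begins at that key
    private final TreeMap<String, String> regionsByStartKey = new TreeMap<>();

    void addRegion(String startKey, String regionName) {
        regionsByStartKey.put(startKey, regionName);
    }

    /** Returns the region whose range contains rowKey, or null if there is none. */
    String findRegion(String rowKey) {
        // floorEntry gives the entry with the greatest start key <= rowKey
        Map.Entry<String, String> entry = regionsByStartKey.floorEntry(rowKey);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        RegionLookupSketch table = new RegionLookupSketch();
        table.addRegion("", "region-1");  // the first region starts at the empty key
        table.addRegion("M", "region-2"); // assumed to come from a split at "M"
        table.addRegion("T", "region-3"); // assumed to come from a split at "T"
        System.out.println(table.findRegion("TTTTT1201701011200222"));
    }
}
```

The rest of the article is about where the real client gets this start-key information from, namely the hbase:meta table and a client-side cache.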
2. How a region is located
Let's start with a simple unit test:
@Test
public void readFromHbase() throws IOException {
  HBaseSource hbaseSource = HBaseSource.getHbaseSource(appName);
  hbaseSource.openConnection();
  Result rs = hbaseSource.searchDataByGet("TABLE", "TTTTT1201701011200222");
  for (KeyValue kv : rs.raw()) {
    System.out.println("row:" + new String(kv.getRow()));
    System.out.println("qualifier-value:" + new String(kv.getQualifier()) + ";" + new String(kv.getValue()));
  }
}
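The searchDataByGet call here simply queries data by tableName and rowkey. Below we analyze how the data is fetched from these two conditions.
2.1 First, one thing to note when the Table instance is initialized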
private void finishSetup() throws IOException {
  if (connConfiguration == null) {
    connConfiguration = new ConnectionConfiguration(configuration);
  }
  this.operationTimeout = tableName.isSystemTable() ?
      connConfiguration.getMetaOperationTimeout() : connConfiguration.getOperationTimeout();
  this.rpcTimeout = configuration.getInt(HConstants.HBASE_RPC_TIMEOUT_KEY,
      HConstants.DEFAULT_HBASE_RPC_TIMEOUT);
  this.scannerCaching = connConfiguration.getScannerCaching();
  this.scannerMaxResultSize = connConfiguration.getScannerMaxResultSize();
  if (this.rpcCallerFactory == null) {
    this.rpcCallerFactory = connection.getNewRpcRetryingCallerFactory(configuration);
  }
  if (this.rpcControllerFactory == null) {
    this.rpcControllerFactory = RpcControllerFactory.instantiate(configuration);
  }
  // puts need to track errors globally due to how the APIs currently work.
  multiAp = this.connection.getAsyncProcess();
  this.closed = false;
  this.locator = new HRegionLocator(tableName, connection);
}
The last line constructs an HRegionLocator object; connection can be thought of as a connector to the cluster.
public HRegionLocator(TableName tableName, ClusterConnection connection) {
  this.connection = connection;
  this.tableName = tableName;
}
2.2 Executing the get
The get in our code calls the following method in HTable:
@Override
public Result get(final Get get) throws IOException {
  return get(get, get.isCheckExistenceOnly());
}
which in turn calls:
private Result get(Get get, final boolean checkExistenceOnly) throws IOException {
  // if we are changing settings to the get, clone it.
  if (get.isCheckExistenceOnly() != checkExistenceOnly || get.getConsistency() == null) {
    get = ReflectionUtils.newInstance(get.getClass(), get);
    get.setCheckExistenceOnly(checkExistenceOnly);
    if (get.getConsistency() == null) {
      get.setConsistency(defaultConsistency);
    }
  }
  // HBase uses strong consistency by default
  if (get.getConsistency() == Consistency.STRONG) {
    // Good old call.
    // The anonymous class below can only capture a final variable
    final Get getReq = get;
    // Build a RegionServerCallable<Result>; its prepare method is invoked
    // by the retrying caller before call
    RegionServerCallable<Result> callable = new RegionServerCallable<Result>(this.connection,
        getName(), get.getRow()) {
      @Override
      public Result call(int callTimeout) throws IOException {
        ClientProtos.GetRequest request =
            RequestConverter.buildGetRequest(getLocation().getRegionInfo().getRegionName(), getReq);
        PayloadCarryingRpcController controller = rpcControllerFactory.newController();
        controller.setPriority(tableName);
        controller.setCallTimeout(callTimeout);
        try {
          ClientProtos.GetResponse response = getStub().get(controller, request);
          if (response == null) return null;
          return ProtobufUtil.toResult(response.getResult());
        } catch (ServiceException se) {
          throw ProtobufUtil.getRemoteException(se);
        }
      }
    };
    return rpcCallerFactory.<Result>newCaller(rpcTimeout).callWithRetries(callable,
        this.operationTimeout);
  }
  // Call that takes into account the replica
  RpcRetryingCallerWithReadReplicas callable = new RpcRetryingCallerWithReadReplicas(
      rpcControllerFactory, tableName, this.connection, get, pool,
      connConfiguration.getRetriesNumber(),
      operationTimeout,
      connConfiguration.getPrimaryCallTimeoutMicroSecond());
  return callable.call();
}
The RegionServerCallable constructor stores the connection, the table name and the row:
public RegionServerCallable(Connection connection, TableName tableName, byte [] row) {
  this.connection = connection;
  this.tableName = tableName;
  this.row = row;
}
/**
* Prepare for connection to the server hosting region with row from tablename. Does lookup
* to find region location and hosting server.
* @param reload Set this to true if connection should re-find the region
* @throws IOException e
*/
@Override
public void prepare(final boolean reload) throws IOException {
  // First obtain a RegionLocator for the table name
  try (RegionLocator regionLocator = connection.getRegionLocator(tableName)) {
    // Look up the HRegion location from the row; reload decides whether the cache is bypassed
    this.location = regionLocator.getRegionLocation(row, reload);
  }
  if (this.location == null) {
    throw new IOException("Failed to find location, tableName=" + tableName +
        ", row=" + Bytes.toString(row) + ", reload=" + reload);
  }
  // With the location known, set up the RPC stub that will be used to fetch the data
  setStub(getConnection().getClient(this.location.getServerName()));
}
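The interplay between prepare (locate the region) and call (send the request) can be sketched as a small retry loop. This is a simplified illustration, not the HBase retrying caller: the interface and method names below are hypothetical, the pause between attempts is left out, and it assumes the cache may be used on the first attempt while later attempts force a reload.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical, simplified sketch of the prepare-then-call retry pattern. */
class RetryingCallSketch {

    /** Stand-in for a region-server callable. */
    interface LocatingCallable<T> {
        void prepare(boolean reload) throws IOException; // find the region location
        T call() throws IOException;                      // send the actual request
    }

    /** Runs prepare and call, retrying on IOException up to maxTries attempts. */
    static <T> T callWithRetries(LocatingCallable<T> callable, int maxTries) {
        if (maxTries < 1) {
            throw new IllegalArgumentException("maxTries must be at least 1");
        }
        IOException last = null;
        for (int tries = 0; tries < maxTries; tries++) {
            try {
                // Assumption: first attempt may use the cache, retries reload the location
                callable.prepare(tries != 0);
                return callable.call();
            } catch (IOException e) {
                last = e; // remember the failure and try again
            }
        }
        // Every attempt failed, so last is non-null here
        throw new UncheckedIOException(last);
    }
}
```

With this shape, a stale cached location costs one failed attempt: the retry passes reload = true, which corresponds to the relocateRegion branch shown further down.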
The getRegionLocator call above simply instantiates an HRegionLocator:
@Override
public RegionLocator getRegionLocator(TableName tableName) throws IOException {
  return new HRegionLocator(tableName, this);
}
Now we finally reach the main topic: how the matching region is found.
@Override
public HRegionLocation getRegionLocation(final byte [] row, boolean reload)
    throws IOException {
  // tableName was already set when the locator was constructed
  return connection.getRegionLocation(tableName, row, reload);
}
@Override
public HRegionLocation getRegionLocation(final TableName tableName,
    final byte [] row, boolean reload)
    throws IOException {
  // reload == true bypasses the cache, false uses the cache
  return reload ? relocateRegion(tableName, row) : locateRegion(tableName, row);
}
We follow the case that uses the cache, i.e. the call to:
@Override
public HRegionLocation locateRegion(
    final TableName tableName, final byte[] row) throws IOException {
  RegionLocations locations = locateRegion(tableName, row, true, true);
  return locations == null ? null : locations.getRegionLocation();
}
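A note on the RegionLocations class returned here: it has a member variable that maintains an array mapping index (the replicaId) to HRegionLocation, and the index used here is 0.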
private final HRegionLocation[] locations; // replicaId -> HRegionLocation.
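As a rough illustration of that replicaId -> location array, here is a tiny sketch. It is not the real RegionLocations class: the class name, the use of plain server-name strings and the primary() helper are assumptions made for brevity.

```java
/** Hypothetical sketch of an array indexed by replica id. */
class ReplicaLocationsSketch {
    static final int DEFAULT_REPLICA_ID = 0; // assumed id of the primary replica

    // replicaId -> server hosting that replica (null if unknown)
    private final String[] locations;

    ReplicaLocationsSketch(String... locations) {
        this.locations = locations.clone();
    }

    /** Returns the location for the given replica, or null when out of range or unknown. */
    String getRegionLocation(int replicaId) {
        if (replicaId < 0 || replicaId >= locations.length) {
            return null;
        }
        return locations[replicaId];
    }

    /** Convenience accessor for index 0, the replica a plain get would use in this sketch. */
    String primary() {
        return getRegionLocation(DEFAULT_REPLICA_ID);
    }
}
```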
HRegionLocation, in turn, is where the mapping from table and rowkey to a region is actually held:
public class HRegionLocation implements Comparable<HRegionLocation> {
  private final HRegionInfo regionInfo; // info about the region that holds the row
  private final ServerName serverName;  // name of the server hosting the region
  private final long seqNum;            // sequence number recorded for this location
  // ... rest omitted
}
Continuing down the call chain:
@Override
public RegionLocations locateRegion(final TableName tableName,
    final byte [] row, boolean useCache, boolean retry)
    throws IOException {
  return locateRegion(tableName, row, useCache, retry, RegionReplicaUtil.DEFAULT_REPLICA_ID);
}
which calls:
@Override
public RegionLocations locateRegion(final TableName tableName,
    final byte [] row, boolean useCache, boolean retry, int replicaId)
    throws IOException {
  if (this.closed) throw new IOException(toString() + " closed");
  if (tableName == null || tableName.getName().length == 0) {
    throw new IllegalArgumentException(
        "table name cannot be null or zero length");
  }
  // The requested table is hbase:meta itself
  if (tableName.equals(TableName.META_TABLE_NAME)) {
    return locateMeta(tableName, useCache, replicaId);
  } else {
    // Not the meta table: locateRegionInMeta() checks the cache and, on a miss,
    // goes to the meta RS to locate the region
    // Region not in the cache - have to go to the meta RS
    return locateRegionInMeta(tableName, row, useCache, retry, replicaId);
  }
}
Locating the region:
private RegionLocations locateRegionInMeta(TableName tableName, byte[] row,
    boolean useCache, boolean retry, int replicaId) throws IOException {
  // If we are supposed to be using the cache, look in the cache to see if
  // we already have the region.
  // When the cache is enabled, check it first (how the cache lookup works is left for later)
  if (useCache) {
    RegionLocations locations = getCachedLocation(tableName, row);
    if (locations != null && locations.getRegionLocation(replicaId) != null) {
      return locations;
    }
  }
  // Nothing in the cache: build a meta key ourselves
  // build the key of the meta region we should be looking for.
  // the extra 9's on the end are necessary to allow "exact" matches
  // without knowing the precise region names.
  byte[] metaKey = HRegionInfo.createRegionName(tableName, row, HConstants.NINES, false);
  // Build a Scan object: locating the region for a get turns into a scan of hbase:meta
  Scan s = new Scan();
  s.setReversed(true);
  // The scan starts at the metaKey built above
  s.setStartRow(metaKey);
  s.setSmall(true);
  s.setCaching(1);
  if (this.useMetaReplicas) {
    s.setConsistency(Consistency.TIMELINE);
  }
  // Retry count; the default is 35, which seems like rather a lot
  int localNumRetries = (retry ? numTries : 1);
  for (int tries = 0; true; tries++) {
    if (tries >= localNumRetries) {
      throw new NoServerForRegionException("Unable to find region for "
          + Bytes.toStringBinary(row) + " in " + tableName +
          " after " + localNumRetries + " tries.");
    }
    // Check the cache again on every attempt, since another thread may well
    // have populated it while we were retrying
    if (useCache) {
      RegionLocations locations = getCachedLocation(tableName, row);
      if (locations != null && locations.getRegionLocation(replicaId) != null) {
        return locations;
      }
    } else {
      // If we are not supposed to be using the cache, delete any existing cached location
      // so it won't interfere.
      // Without the cache, clear the cached entry on each attempt so it cannot
      // interfere with the lookup
      metaCache.clearCache(tableName, row);
    }
    // Query the meta region
    try {
      Result regionInfoRow = null;
      ReversedClientScanner rcs = null;
      try {
        // Build a ReversedClientScanner. The call chain below is deep, but in the end it
        // still queries the hbase:meta table for the table/rowkey -> region mapping;
        // note the third constructor argument
        rcs = new ClientSmallReversedScanner(conf, s, TableName.META_TABLE_NAME, this,
            rpcCallerFactory, rpcControllerFactory, getMetaLookupPool(), 0);
        // Fetch the next row from the scanner
        regionInfoRow = rcs.next();
      } finally {
        if (rcs != null) {
          rcs.close();
        }
      }
      if (regionInfoRow == null) {
        throw new TableNotFoundException(tableName);
      }
      // Convert the Result returned by the query into RegionLocations
      // convert the row result into the HRegionLocation we need!
      RegionLocations locations = MetaTableAccessor.getRegionLocations(regionInfoRow);
      if (locations == null || locations.getRegionLocation(replicaId) == null) {
        throw new IOException("HRegionInfo was null in " +
            tableName + ", row=" + regionInfoRow);
      }
      HRegionInfo regionInfo = locations.getRegionLocation(replicaId).getRegionInfo();
      // No region info found at this array index
      if (regionInfo == null) {
        throw new IOException("HRegionInfo was null or empty in " +
            TableName.META_TABLE_NAME + ", row=" + regionInfoRow);
      }
      // The table found differs from the requested table
      // possible we got a region of a different table...
      if (!regionInfo.getTable().equals(tableName)) {
        throw new TableNotFoundException(
            "Table '" + tableName + "' was not found, got: " +
            regionInfo.getTable() + ".");
      }
      // The region found has been split (it is a split parent)
      if (regionInfo.isSplit()) {
        throw new RegionOfflineException("the only available region for" +
            " the required row is a split parent," +
            " the daughters should be online soon: " +
            regionInfo.getRegionNameAsString());
      }
      // The region found is offline
      if (regionInfo.isOffline()) {
        throw new RegionOfflineException("the region is offline, could" +
            " be caused by a disable table call: " +
            regionInfo.getRegionNameAsString());
      }
      ServerName serverName = locations.getRegionLocation(replicaId).getServerName();
      if (serverName == null) {
        throw new NoServerForRegionException("No server address listed " +
            "in " + TableName.META_TABLE_NAME + " for region " +
            regionInfo.getRegionNameAsString() + " containing row " +
            Bytes.toStringBinary(row));
      }
      // The HRegionServer found is dead
      if (isDeadServer(serverName)) {
        throw new RegionServerStoppedException("hbase:meta says the region " +
            regionInfo.getRegionNameAsString() + " is managed by the server " + serverName +
            ", but it is dead.");
      }
      // All checks passed: cache the tableName/rowkey -> HRegion mapping
      // Instantiate the location
      cacheLocation(tableName, locations);
      return locations;
    } catch (TableNotFoundException e) {
      // if we got this error, probably means the table just plain doesn't
      // exist. rethrow the error immediately. this should always be coming
      // from the HTable constructor.
      throw e;
    } catch (IOException e) {
      ExceptionUtil.rethrowIfInterrupt(e);
      if (e instanceof RemoteException) {
        e = ((RemoteException)e).unwrapRemoteException();
      }
      if (tries < localNumRetries - 1) {
        if (LOG.isDebugEnabled()) {
          LOG.debug("locateRegionInMeta parentTable=" +
              TableName.META_TABLE_NAME + ", metaLocation=" +
              ", attempt=" + tries + " of " +
              localNumRetries + " failed; retrying after sleep of " +
              ConnectionUtils.getPauseTime(this.pause, tries) + " because: " + e.getMessage());
        }
      } else {
        throw e;
      }
      // Only relocate the parent region if necessary
      if (!(e instanceof RegionOfflineException ||
          e instanceof NoServerForRegionException)) {
        relocateRegion(TableName.META_TABLE_NAME, metaKey, replicaId);
      }
    }
    try {
      Thread.sleep(ConnectionUtils.getPauseTime(this.pause, tries));
    } catch (InterruptedException e) {
      throw new InterruptedIOException("Giving up trying to location region in " +
          "meta: thread is interrupted.");
    }
  }
}
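To see why the meta key ends in a run of 9s and why the scan is reversed, the lookup above can be imitated with a sorted map. This is a sketch under stated assumptions, not HBase code: meta rows are modelled as strings of the form "table,startKey,regionId", the NINES constant is my stand-in for HConstants.NINES, the region ids and server names are invented, and a reversed scan with caching 1 is emulated with TreeMap.floorEntry.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a reversed hbase:meta lookup using a sorted map. */
class MetaLookupSketch {
    static final String NINES = "99999999999999"; // assumed stand-in for HConstants.NINES

    // Simplified meta table: "table,startKey,regionId" -> hosting server
    private final TreeMap<String, String> meta = new TreeMap<>();

    void addRegion(String table, String startKey, String regionId, String server) {
        meta.put(table + "," + startKey + "," + regionId, server);
    }

    /** Returns the server for the region containing row, or null if the table is unknown. */
    String locate(String table, String row) {
        // The trailing 9s make the key sort after any real region name with this start key
        String metaKey = table + "," + row + "," + NINES;
        // Reversed scan, one row: the greatest meta row that is <= metaKey
        Map.Entry<String, String> entry = meta.floorEntry(metaKey);
        if (entry == null || !entry.getKey().startsWith(table + ",")) {
            return null; // nothing found, or we landed on a region of a different table
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        MetaLookupSketch meta = new MetaLookupSketch();
        meta.addRegion("TABLE", "", "1483200000000", "rs-1");
        meta.addRegion("TABLE", "M", "1483200000001", "rs-2");
        meta.addRegion("TABLE", "T", "1483200000002", "rs-3");
        System.out.println(meta.locate("TABLE", "TTTTT1201701011200222"));
    }
}
```

The startsWith check plays the role of the "got a region of a different table" branch in locateRegionInMeta: a reversed scan always returns some earlier row if one exists, so the caller has to confirm it belongs to the requested table.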
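At this point the analysis of how a tableName and rowkey are mapped to the region that stores them is largely complete. Many branches were not covered, such as reading from and writing to the cache, and what happens when tableName is the meta table itself. To summarize:
1. If tableName is not the meta table, the client still ends up querying the meta table for the mapping between that table and its regions, and from it finds the region's location.
2. Once the location is found, an RPC request is built to the server hosting that region to fetch the data.
3. The mapping from tablename and rowkey to region location is cached, so the next request reads it straight from the cache.
4. If an HRegionServer goes down and the address cached on the client becomes unusable, the client locates the region again, going back to zk to start the lookup, and caches the new result on the client.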
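Points 3 and 4 can be sketched as a small client-side cache in front of a lookup function. This is an illustration only: the class, its fields and the metaLookup function are hypothetical, and the cache is keyed by the exact table and row for simplicity, whereas a real client cache would be organised by region key ranges.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of a client-side region location cache. */
class LocationCacheSketch {
    private final Map<String, String> cache = new HashMap<>(); // "table,row" -> server
    private final Function<String, String> metaLookup;          // assumed meta query
    int metaLookups = 0;                                        // counts cache misses

    LocationCacheSketch(Function<String, String> metaLookup) {
        this.metaLookup = metaLookup;
    }

    /** Locates the server for a row; useCache = false forces a fresh lookup. */
    String locate(String table, String row, boolean useCache) {
        String key = table + "," + row;
        if (useCache) {
            String cached = cache.get(key);
            if (cached != null) {
                return cached; // cache hit: no meta query needed
            }
        } else {
            cache.remove(key); // drop the stale entry so it cannot interfere
        }
        metaLookups++;
        String server = metaLookup.apply(key);
        if (server != null) {
            cache.put(key, server); // remember the answer for the next request
        }
        return server;
    }
}
```

A caller that hits a dead server would retry with useCache = false, which replaces the stale entry with whatever the lookup function returns at that moment.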