RegionServers fail to start after migrating an HBase snapshot to a new cluster: "Failed open of region"

Error log excerpt:

2018-03-12 17:05:29,608 ERROR [RS_OPEN_REGION-our_ambari_clustergn-a05044c6-core-1-003:16020-15] handler.OpenRegionHandler: Failed open of region=market:KYLIN_YEDCQ82BF3,16F87C792E626990D57DDABF161A3B4E,1519847061679.736e53b17b220aed9aa9233ddffd952a., starting to roll back the global memstore size.

【Background】Both the old and the new cluster run HBase 1.1.2; the old cluster uses Hadoop 2.7.1 and the new cluster uses Hadoop 2.7.3.

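What I did before the error appeared:
I migrated this table's snapshot from the old cluster to the new one with copy snapshot, created an identical table on the new cluster, and then restored that table from the snapshot.
All the other tables were copied over with hadoop distcp; after copying, I only had to repair the metadata and assign the table's regions to the appropriate RegionServers, and they worked fine.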
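
For reference, the snapshot route described above might look roughly like the sketch below. It is a dry run that only prints commands. The snapshot name and the target root `hdfs://tony_hdfs_ha/apps/hbase/data` come from the transcripts in this post; reading "copy snapshot" as the `ExportSnapshot` tool, reading "restore" as `restore_snapshot` on a disabled table, and the mapper count of 4 are all my assumptions, and the function names are made up for illustration. The target URI must of course be resolvable from wherever the export is run.

```shell
#!/usr/bin/env bash
# Dry run: prints the commands instead of executing them.

# Print the ExportSnapshot command that copies a snapshot to another cluster.
# $1 = snapshot name, $2 = HBase root dir on the target cluster, $3 = mapper count
export_snapshot_cmd() {
  printf 'hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot %s -copy-to %s -mappers %s\n' "$1" "$2" "$3"
}

# Print the hbase shell statements that restore an existing table from a snapshot.
# $1 = table name, $2 = snapshot name
restore_snapshot_cmds() {
  printf "disable '%s'\n" "$1"
  printf "restore_snapshot '%s'\n" "$2"
  printf "enable '%s'\n" "$1"
}

export_snapshot_cmd KYLIN_YEDCQ82BF3_snapshot_20180302 hdfs://tony_hdfs_ha/apps/hbase/data 4
restore_snapshot_cmds KYLIN_YEDCQ82BF3 KYLIN_YEDCQ82BF3_snapshot_20180302
```

Printing the commands first makes it easy to review them before pasting them into a terminal or into `hbase shell`.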

Solution:

【1】Delete the table's znode in ZooKeeper on the new cluster, 【2】remove the table's data from HDFS on the new cluster, 【3】restart all RegionServers on the new cluster


【1】Delete the table's znode in ZooKeeper on the new cluster

[zk: localhost:2181(CONNECTED) 2] ls /hbase/table
[ksai:usertb, hbase:meta, ksai:wps_pc_active_user_domain_info, KYLIN_YEDCQ82BF3, hbase:namespace, ksai:weekly-installed-android-apps, ksai:wps_android_active_user_domain_info, ksai:test_zz]
[zk: localhost:2181(CONNECTED) 4] get /hbase/table/KYLIN_YEDCQ82BF3
�master:16000ڐ����APBUF
cZxid = 0x3000b1907
ctime = Fri Mar 16 11:31:41 CST 2018
mZxid = 0x3000b7432
mtime = Fri Mar 16 15:17:38 CST 2018
pZxid = 0x3000b1907
cversion = 0
dataVersion = 12
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 31
numChildren = 0
[zk: localhost:2181(CONNECTED) 5] rmr /hbase/table/KYLIN_YEDCQ82BF3
[zk: localhost:2181(CONNECTED) 6] ls /hbase/table/KYLIN_YEDCQ82BF3 
Node does not exist: /hbase/table/KYLIN_YEDCQ82BF3
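
A note on the `rmr` used above: it is the recursive delete of the ZooKeeper 3.4.x CLI, and from ZooKeeper 3.5 on it is deprecated in favor of `deleteall`. A small helper could pick the right statement, assuming you pass the version string in by hand; the function name `zk_delete_cmd` and the example version `3.4.6` are made up for illustration and are not taken from my cluster.

```shell
#!/usr/bin/env bash
# zk_delete_cmd VERSION ZNODE
# Prints the zkCli statement for recursively deleting ZNODE.
# Assumption: VERSION looks like "3.4.6"; only major and minor are inspected.
zk_delete_cmd() {
  local version="$1" znode="$2"
  local major minor
  major="${version%%.*}"                       # text before the first dot
  minor="${version#*.}"; minor="${minor%%.*}"  # text between first and second dot
  if [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 5 ]; }; then
    printf 'deleteall %s\n' "$znode"
  else
    printf 'rmr %s\n' "$znode"
  fi
}

# Example with a hypothetical 3.4.x installation; prints an rmr statement.
zk_delete_cmd 3.4.6 /hbase/table/KYLIN_YEDCQ82BF3
```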

【2】Remove the table's data from HDFS on the new cluster
[hdfs@our_ambari_clustergn-a05044c6-master-1-001 root]$ hdfs dfs -ls -R /apps/hbase/ | grep --color KYLIN_YEDCQ82BF3
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:09 /apps/hbase/data/.hbase-snapshot/KYLIN_YEDCQ82BF3_snapshot_20180302
-rw-r--r--   3 hbase hdfs         65 2018-03-12 15:09 /apps/hbase/data/.hbase-snapshot/KYLIN_YEDCQ82BF3_snapshot_20180302/.snapshotinfo
-rw-r--r--   3 hbase hdfs        863 2018-03-12 15:09 /apps/hbase/data/.hbase-snapshot/KYLIN_YEDCQ82BF3_snapshot_20180302/data.manifest
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:10 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1/.links-9cdedadf1d1e4d36ac92b8f2b7a79432
-rw-r--r--   3 hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1/.links-9cdedadf1d1e4d36ac92b8f2b7a79432/ea63ee219b00bf26b8ecdefcf244738f.KYLIN_YEDCQ82BF3
-rw-r--r--   3 hbase hdfs   20061969 2018-03-12 15:10 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1/9cdedadf1d1e4d36ac92b8f2b7a79432
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tabledesc
-rw-r--r--   3 hbase hdfs        767 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tabledesc/.tableinfo.0000000005
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tmp
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f
-rw-r--r--   3 hbase hdfs         51 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/.regioninfo
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1
-rw-r--r--   3 hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1/KYLIN_YEDCQ82BF3=ea63ee219b00bf26b8ecdefcf244738f-9cdedadf1d1e4d36ac92b8f2b7a79432

# This was the only snapshot under the snapshot directory, so to save effort I simply deleted the parent directory itself

[hdfs@our_ambari_clustergn-a05044c6-master-1-001 root]$ hdfs dfs -rm -R /apps/hbase/data/.hbase-snapshot

18/03/16 17:23:12 INFO fs.TrashPolicyDefault: Moved: 'hdfs://tony_hdfs_ha/apps/hbase/data/.hbase-snapshot' to trash at: hdfs://tony_hdfs_ha/user/hdfs/.Trash/Current/apps/hbase/data/.hbase-snapshot



# List the remaining HDFS directories containing the table name, then delete them
[hdfs@our_ambari_clustergn-a05044c6-master-1-001 root]$ hdfs dfs -ls -R /apps/hbase/ | grep --color KYLIN_YEDCQ82BF3
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:10 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1/.links-9cdedadf1d1e4d36ac92b8f2b7a79432
-rw-r--r--   3 hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1/.links-9cdedadf1d1e4d36ac92b8f2b7a79432/ea63ee219b00bf26b8ecdefcf244738f.KYLIN_YEDCQ82BF3
-rw-r--r--   3 hbase hdfs   20061969 2018-03-12 15:10 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1/9cdedadf1d1e4d36ac92b8f2b7a79432
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tabledesc
-rw-r--r--   3 hbase hdfs        767 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tabledesc/.tableinfo.0000000005
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tmp
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f
-rw-r--r--   3 hbase hdfs         51 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/.regioninfo
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1
-rw-r--r--   3 hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1/KYLIN_YEDCQ82BF3=ea63ee219b00bf26b8ecdefcf244738f-9cdedadf1d1e4d36ac92b8f2b7a79432


# Continue deleting the HDFS directories containing the table name

[hdfs@our_ambari_clustergn-a05044c6-master-1-001 root]$ hdfs dfs -rm -R /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3
18/03/16 17:23:36 INFO fs.TrashPolicyDefault: Moved: 'hdfs://tony_hdfs_ha/apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3' to trash at: hdfs://tony_hdfs_ha/user/hdfs/.Trash/Current/apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3
# List the remaining HDFS directories containing the table name again, then delete them
[hdfs@our_ambari_clustergn-a05044c6-master-1-001 root]$ hdfs dfs -ls -R /apps/hbase/ | grep --color KYLIN_YEDCQ82BF3
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tabledesc
-rw-r--r--   3 hbase hdfs        767 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tabledesc/.tableinfo.0000000005
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tmp
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f
-rw-r--r--   3 hbase hdfs         51 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/.regioninfo
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1
-rw-r--r--   3 hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f/F1/KYLIN_YEDCQ82BF3=ea63ee219b00bf26b8ecdefcf244738f-9cdedadf1d1e4d36ac92b8f2b7a79432


# Continue deleting the HDFS directories containing the table name
[hdfs@our_ambari_clustergn-a05044c6-master-1-001 root]$ hdfs dfs -rm -R /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3
18/03/16 17:23:58 INFO fs.TrashPolicyDefault: Moved: 'hdfs://tony_hdfs_ha/apps/hbase/data/data/default/KYLIN_YEDCQ82BF3' to trash at: hdfs://tony_hdfs_ha/user/hdfs/.Trash/Current/apps/hbase/data/data/default/KYLIN_YEDCQ82BF3
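
Because a recursive delete also removes everything beneath a directory, only the topmost matching paths in each listing need their own `hdfs dfs -rm -R`. The sketch below (the function name `top_level_paths` is my own) reads `hdfs dfs -ls -R` output on stdin and prints just those paths. It assumes the path is the eighth whitespace-separated field, contains no spaces, and that parents are listed before their children, as in the listings above.

```shell
#!/usr/bin/env bash
# Reduce `hdfs dfs -ls -R` output to the topmost paths only.
top_level_paths() {
  awk '
    {
      path = $8
      # Skip entries that live underneath the most recently printed path.
      if (last != "" && index(path, last "/") == 1) next
      print path
      last = path
    }'
}

# Demo on a few lines copied from the first listing in this post.
top_level_paths <<'EOF'
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:09 /apps/hbase/data/.hbase-snapshot/KYLIN_YEDCQ82BF3_snapshot_20180302
-rw-r--r--   3 hbase hdfs         65 2018-03-12 15:09 /apps/hbase/data/.hbase-snapshot/KYLIN_YEDCQ82BF3_snapshot_20180302/.snapshotinfo
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:10 /apps/hbase/data/archive/data/default/KYLIN_YEDCQ82BF3/ea63ee219b00bf26b8ecdefcf244738f
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3
drwxr-xr-x   - hbase hdfs          0 2018-03-12 15:17 /apps/hbase/data/data/default/KYLIN_YEDCQ82BF3/.tabledesc
EOF
```

On a real cluster the function would sit at the end of the pipeline used above, for example `hdfs dfs -ls -R /apps/hbase/ | grep KYLIN_YEDCQ82BF3 | top_level_paths`, and the resulting paths can be reviewed before deleting anything.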


【3】Restart all RegionServers on the new cluster

# After restarting the HBase RegionServers from Ambari, the table is no longer listed in ZooKeeper
[zk: localhost:2181(CONNECTED) 11] ls /hbase/table
[ksai:usertb, hbase:meta, ksai:wps_pc_active_user_domain_info, hbase:namespace, ksai:weekly-installed-android-apps, ksai:wps_android_active_user_domain_info, ksai:test_zz]

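# The table was still visible in hbase shell, though, so I restarted the whole HBase cluster; after that it no longer showed up in hbase shell, and every RegionServer started successfully.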


【Follow-up】If you have any thoughts after reading this, leave me a comment and let's discuss what is going on here.
