A previous post already covered how HA works; this one focuses on hands-on HA operations. HA theory: https://blog.csdn.net/czz1141979570/article/details/104856251
NameNode failover:
The normal state before the switch: hadoop101 is active, hadoop102 is standby.
Now intervene manually by killing the active NameNode with kill -9:
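A minimal sketch of that intervention, assuming the NameNode runs as the hadoop user on hadoop101 (the PID placeholder is hypothetical; look it up with jps on your own cluster):

[hadoop@hadoop101 ~]$ jps | grep NameNode
[hadoop@hadoop101 ~]$ kill -9 <NameNode_PID>

With automatic failover (ZKFC) enabled, hadoop102 should transition from standby to active shortly afterwards.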
The test succeeds; now start the NameNode on hadoop101 again.
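A sketch of the restart, using the Hadoop 2.x daemon script under $HADOOP_HOME/sbin (the restarted NameNode rejoins the pair as standby):

[hadoop@hadoop101 sbin]$ ./hadoop-daemon.sh start namenode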
hdfs haadmin -getServiceState:
In other words, it returns the HA state of the NameNode identified by the given service ID:
[hadoop@hadoop101 sbin]$ hdfs haadmin -getServiceState nn1
20/03/19 19:16:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
[hadoop@hadoop102 ~]$ hdfs haadmin -getServiceState nn2
20/03/19 19:25:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
Of course, you can also query the state of every NameNode from a single machine, as the sketch after the next paragraph shows.
This subcommand is mainly useful for shell scripts that monitor the running state of nn1 and nn2; a dedicated post on that will follow.
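As a preview, a minimal monitoring sketch, assuming the service IDs nn1 and nn2 from hdfs-site.xml and that hdfs is on the PATH (newer releases, 2.9+/3.x, also offer hdfs haadmin -getAllServiceState for the same purpose):

#!/bin/bash
# Query each NameNode's HA state from a single machine;
# warnings on stderr are discarded, and an empty result means the NN is unreachable.
for id in nn1 nn2; do
    state=$(hdfs haadmin -getServiceState "$id" 2>/dev/null)
    echo "$id: ${state:-unreachable}"
done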
hdfs getconf (get config values from configuration):
In other words, given a configuration key it returns the corresponding value, read dynamically from the XML configuration of the running cluster. This subcommand is also mainly used for monitoring and will be covered later; first, here is how to use it.
[hadoop@hadoop101 sbin]$ hdfs getconf
20/03/19 19:31:13 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
hdfs getconf is utility for getting configuration information from the config file.

hadoop getconf
[-namenodes] gets list of namenodes in the cluster.
[-secondaryNameNodes] gets list of secondary namenodes in the cluster.
[-backupNodes] gets list of backup nodes in the cluster.
[-includeFile] gets the include file path that defines the datanodes that can join the cluster.
[-excludeFile] gets the exclude file path that defines the datanodes that need to decommissioned.
[-nnRpcAddresses] gets the namenode rpc addresses
[-confKey [key]] gets a specific key from the configuration
For example:
[hadoop@hadoop101 sbin]$ hdfs getconf -confKey dfs.nameservices
20/03/19 19:33:32 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
liuyi
[hadoop@hadoop101 sbin]$ hdfs getconf -confKey dfs.blocksize
20/03/19 19:33:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
134217728
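(134217728 bytes is the default 128 MB block size.) Another handy query is -namenodes; on this HA pair it should list both NameNode hosts. The output below is sketched from the cluster layout above rather than captured verbatim:

[hadoop@hadoop101 sbin]$ hdfs getconf -namenodes
hadoop101 hadoop102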
hdfs fsck:
HDFS provides the fsck command for checking the health of files and directories on HDFS and for retrieving a file's block and location information.
The main options:
-move: move corrupted files to /lost+found
-delete: delete corrupted files. Caution: use with care!
-openforwrite: print files that are open for write while being checked
-list-corruptfileblocks: print corrupt blocks and the files they belong to
-files: print the files being checked
-blocks: print a detailed block report (must be used together with -files)
-locations: print each block's locations (must be used together with -files)
-racks: print the rack of each block location (must be used together with -files)
For example, to see exactly how the blocks of a given HDFS file are distributed, run:
hadoop fsck /your_file_path -files -blocks -locations -racks
Here is a sample run on the test cluster:
[hadoop@hadoop101 sbin]$ hdfs fsck /
Connecting to namenode via http://hadoop102:50070/fsck?ugi=hadoop&path=%2F
FSCK started by hadoop (auth:SIMPLE) from /192.168.1.101 for path / at Thu Mar 19 19:59:42 CST 2020
Status: HEALTHY
Total size: 0 B
Total dirs: 7
Total files: 0
Total symlinks: 0
Total blocks (validated): 0
Minimally replicated blocks: 0
Over-replicated blocks: 0
Under-replicated blocks: 0
Mis-replicated blocks: 0
Default replication factor: 3
Average block replication: 0.0
Corrupt blocks: 0
Missing replicas: 0
Number of data-nodes: 2
Number of racks: 1
FSCK ended at Thu Mar 19 19:59:42 CST 2020 in 3 milliseconds

Focus mainly on the highlighted fields, i.e. the health indicators such as Status, Corrupt blocks, and Missing replicas.
This is only a brief introduction; a later post will walk through fsck-based failure recovery step by step.