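Our departments recently merged, and the clusters of the two departments need to be synced together. Naturally we use distcp. Because the two clusters run different versions, going through hdfs may cause problems, so the transfer goes over the HTTP port. Both clusters are configured with HA, so there is no way to know in advance which NameNode is active at any given time; the active node has to be looked up before every transfer.
The approach is to fetch the cluster information through JMX.
Parse the returned JSON, leave the loop as soon as the active node is found, and start transferring the data.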
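#!/bin/bash
# [[ ]] is a bash feature, so the script needs bash rather than plain sh.
# Candidate NameNodes of the HA pair
namenodes="
192.168.2.103
192.168.2.101
"
activeNode=""
for nn in $namenodes
do
    echo "node: $nn"
    # Query the NameNodeStatus bean over JMX; -s hides curl's progress meter
    status=$(curl -s "http://${nn}:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus")
    rc=$?   # save curl's exit code before another command overwrites $?
    if [[ $rc -eq 0 ]]
    then
        # -r prints the raw string, so no sed is needed to strip the quotes
        state=$(echo "$status" | jq -r '.beans[0].State')
        echo "state: ${state}"
        if [[ "active" == "${state}" ]]
        then
            activeNode=$nn
            break
        fi
    elif [[ $rc -eq 7 ]]
    then
        # curl exit code 7 means "failed to connect to host"
        echo "Error: cannot connect to host"
    else
        echo "Error $rc"
    fi
done
echo "active: ${activeNode}"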
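Once the active node is known, it can be plugged into the distcp source URI. The following is only a sketch of that step: the post says the transfer goes over the HTTP port but does not say which scheme was used, so `webhdfs://` is an assumption here (older clusters may need `hftp://` instead), and the function name `build_distcp_cmd`, the path `/data/logs` and the target `hdfs://nameservice1` are hypothetical. The function only prints the command, so it can be checked before anything is actually run.

```shell
#!/bin/bash
# Build (but do not run) a distcp command for the given active NameNode.
# $1 = active NameNode host, $2 = source path, $3 = destination URI.
# The webhdfs scheme and port 50070 are assumptions for illustration.
build_distcp_cmd() {
    if [ -z "$1" ]; then
        echo "Error: no active NameNode found" >&2
        return 1
    fi
    echo "hadoop distcp webhdfs://${1}:50070${2} ${3}"
}

# Example with hypothetical paths: prints the command line
build_distcp_cmd "192.168.2.103" "/data/logs" "hdfs://nameservice1/data/logs"
```

Refusing to build a command when no node was found avoids starting a transfer with an empty host name if both NameNodes were unreachable or in standby.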
Format of the returned JSON:
{
  "beans": [
    {
      "name": "Hadoop:service=NameNode,name=NameNodeStatus",
      "modelerType": "org.apache.hadoop.hdfs.server.namenode.NameNode",
      "SecurityEnabled": false,
      "NNRole": "NameNode",
      "HostAndPort": "cdh01:8020",
      "LastHATransitionTime": 0,
      "State": "active"
    }
  ]
}
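Given this structure, the state lives at `.beans[0].State`. As a small sketch, the extraction can be wrapped in a helper that is testable without a live cluster. The function name `extract_state` is hypothetical, and the `sed` fallback for machines without `jq` is an addition that assumes `State` is a plain quoted string, as in the sample above.

```shell
#!/bin/bash
# Read a NameNodeStatus JMX response on stdin and print the value of "State".
# Prints nothing if the field is missing.
extract_state() {
    if command -v jq >/dev/null 2>&1; then
        # "// empty" suppresses the literal "null" when the field is absent
        jq -r '.beans[0].State // empty'
    else
        # Fallback without jq: match "State": "<value>" on any line
        sed -n 's/.*"State"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
    fi
}

# Example with a trimmed-down copy of the sample response
sample='{"beans":[{"NNRole":"NameNode","HostAndPort":"cdh01:8020","State":"active"}]}'
printf '%s\n' "$sample" | extract_state
```

Returning an empty string instead of `null` makes the comparison against "active" in the calling script fail cleanly when the response is not what was expected.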