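Pitfalls when setting up a RabbitMQ high-availability cluster
While setting up a RabbitMQ cluster, running rabbitmqctl join_cluster rabbit@rabbit-node1 failed with the following error: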
Clustering node rabbit@slave1 with rabbit@rabbit-node1
Error: unable to perform an operation on node 'rabbit@rabbit-node1'. Please see diagnostics information and suggestions below.

Most common reasons for this are:

 * Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
 * CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
 * Target node is not running

In addition to the diagnostics info below:

 * See the CLI, clustering and networking guides on http://rabbitmq.com/documentation.html to learn more
 * Consult server logs on node rabbit@rabbit-node1

DIAGNOSTICS

attempted to contact: ['rabbit@rabbit-node1']

rabbit@rabbit-node1:
  * connected to epmd (port 4369) on rabbit-node1
  * epmd reports node 'rabbit' uses port 25672 for inter-node and CLI tool traffic
  * TCP connection succeeded but Erlang distribution failed
  * Hostname mismatch: node "rabbit@master" believes its host is different. Please ensure that hostnames resolve the same way locally and on "rabbit@master"

Current node details:
 * node name: 'rabbitmqcli-14907-rabbit@slave1'
 * effective user's home directory: /var/lib/rabbitmq
 * Erlang cookie hash: N9VmcKjlLemcjmGbsPIdkw==
Locating the problem
The error output contains this hint:
Most common reasons for this are:
- Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
- CLI tool fails to authenticate with the server (e.g. due to CLI tool’s Erlang cookie not matching that of the server)
- Target node is not running
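- 1. Check the firewall and network connectivity: the firewall is disabled, and the 3 machines can ping each other by hostname.
- 2. Check the cookie file: I installed RabbitMQ from the rpm package, so the cookie file is at /var/lib/rabbitmq/.erlang.cookie. The .erlang.cookie file is identical on all 3 machines, and its permissions are 400 on each.
- 3. Check the rabbitmq-server status on the rabbit-node1 node: the target node is running normally.
- None of these is the cause, so read further down the output: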
Hostname mismatch: node “rabbit@master” believes its host is
different. Please ensure that hostnames resolve the same way locally
and on “rabbit@master”
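Two of the checks above can be scripted. The following is a minimal sketch, not an official tool: the helper names node_host, same_node_host and cookie_fingerprint are made up for illustration. It compares the host part of the node name that was requested with the one the peer reported, and prints a checksum of a cookie file so the cookies can be compared across machines without displaying them.

```shell
# Sketch only: these helper names are hypothetical, not part of RabbitMQ.

# Print the host part of an Erlang node name, e.g. rabbit@master -> master.
node_host() {
    printf '%s\n' "${1#*@}"
}

# Return 0 when two node names share the same host part, 1 otherwise.
same_node_host() {
    [ "$(node_host "$1")" = "$(node_host "$2")" ]
}

# Print a checksum (CRC and byte count) of a cookie file read from the
# given path, so the value can be compared between machines.
cookie_fingerprint() {
    cksum < "$1"
}
```

With the names from the error above, same_node_host rabbit@rabbit-node1 rabbit@master returns non-zero, which is exactly the mismatch being reported: the node was addressed as rabbit-node1 but still calls itself master.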
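- So I edited the rabbitmq-env.conf configuration file (default RabbitMQ path: /etc/rabbitmq/rabbitmq-env.conf).
- On every machine in the cluster, run vim /etc/rabbitmq/rabbitmq-env.conf (the file does not exist by default and has to be created manually) and add the following configuration: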
[root@master rabbitmq]# vim /etc/rabbitmq/rabbitmq-env.conf
RABBITMQ_NODENAME=rabbit@rabbit-node1
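- The part after rabbit@ is the hostname configured in the hosts file on each machine of the cluster, so each node uses its own hostname. For example: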
192.168.72.127 rabbit-node1
192.168.72.128 rabbit-node2
192.168.72.129 rabbit-node3
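As a rough illustration of how the node name follows from the hosts entries, the sketch below looks up an IP address in a hosts-format file and prints the matching RABBITMQ_NODENAME line. The function name env_line_for_ip is hypothetical, and the sketch assumes the first hostname on the matching line is the one to use.

```shell
# Sketch only: env_line_for_ip is a hypothetical helper.
# Look up an IP in a hosts-format file and print the RABBITMQ_NODENAME
# line for rabbitmq-env.conf. Exits non-zero if the IP is not listed.
# Usage: env_line_for_ip 192.168.72.128 /etc/hosts
env_line_for_ip() {
    awk -v ip="$1" '
        # $1 is the address, $2 the first hostname on the line.
        $1 == ip { print "RABBITMQ_NODENAME=rabbit@" $2; found = 1; exit }
        END { exit found ? 0 : 1 }
    ' "$2"
}
```

Review the printed line before writing it to /etc/rabbitmq/rabbitmq-env.conf on the machine that owns that IP address.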
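Once every machine is configured, run ps -aux|grep mq to list all RabbitMQ processes, then kill them all with kill -9 so that the service can be restarted under the new node name.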
[root@slave2 home]# ps -aux|grep mq
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
rabbitmq 8106 0.0 0.2 10968 560 ? S 02:33 0:00 /usr/lib64/erlang/erts-10.2.1/bin/epmd -daemon
root 24738 0.0 0.0 108700 136 pts/1 S 05:21 0:00 /bin/sh /etc/init.d/rabbitmq-server start
root 24810 0.0 0.2 108208 472 pts/1 S 05:21 0:00 /bin/bash -c ulimit -S -c 0 >/dev/null 2>&1 ; /usr/sbin/rabbitmq-server
root 24812 0.0 0.2 130732 524 pts/1 S 05:21 0:00 /sbin/runuser -s /bin/sh -- rabbitmq /usr/lib/rabbitmq/bin/rabbitmq-server
rabbitmq 24842 0.0 0.2 106108 484 pts/1 S 05:21 0:00 sh /usr/lib/rabbitmq/bin/rabbitmq-server
rabbitmq 25010 0.4 18.4 1813924 41988 pts/1 Sl 05:21 0:19 /usr/lib64/erlang/erts-10.2.1/bin/beam.smp -W w -A 64 -MBas ageffcbf -MHas ageffcbf -MBlmbcs 512 -MHlmbcs 512 -MMmcs 30 -P 1048576 -t 5000000 -stbt db -zdbbl 128000 -K true -B i -- -root /usr/lib64/erlang -progname erl -- -home /var/lib/rabbitmq -- -pa /usr/lib/rabbitmq/lib/rabbitmq_server-3.7.9/ebin -noshell -noinput -s rabbit boot -sname rabbit@slave2 -boot start_sasl -kernel inet_default_connect_options [{nodelay,true}] -sasl errlog_type error -sasl sasl_error_logger false -rabbit lager_log_root "/var/log/rabbitmq" -rabbit lager_default_file "/var/log/rabbitmq/rabbit@slave2.log" -rabbit lager_upgrade_file "/var/log/rabbitmq/rabbit@slave2_upgrade.log" -rabbit enabled_plugins_file "/etc/rabbitmq/enabled_plugins" -rabbit plugins_dir "/usr/lib/rabbitmq/plugins:/usr/lib/rabbitmq/lib/rabbitmq_server-3.7.9/plugins" -rabbit plugins_expand_dir "/var/lib/rabbitmq/mnesia/rabbit@slave2-plugins-expand" -os_mon start_cpu_sup false -os_mon start_disksup false -os_mon start_memsup false -mnesia dir "/var/lib/rabbitmq/mnesia/rabbit@slave2" -kernel inet_dist_listen_min 25672 -kernel inet_dist_listen_max 25672
rabbitmq 25108 0.0 0.1 4064 388 ? Ss 05:21 0:00 erl_child_setup 1024
rabbitmq 25135 0.0 0.1 10800 448 ? Ss 05:21 0:00 inet_gethost 4
rabbitmq 25136 0.0 0.3 17128 696 ? S 05:21 0:00 inet_gethost 4
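The process listing above can be picked apart with a little awk. This is only a sketch, and both helper names are hypothetical. rabbit_pids prints the PIDs of processes owned by the rabbitmq user; it does not cover the root-owned wrapper processes that also appear in the listing. running_sname prints the node name the running broker was started with (the value after -sname), which here is still the old rabbit@slave2.

```shell
# Sketch only: both helpers are hypothetical and read `ps aux`-style
# text on standard input.

# Print the PID (second column) of every process owned by user rabbitmq.
rabbit_pids() {
    awk '$1 == "rabbitmq" { print $2 }'
}

# Print the word that follows -sname on the beam.smp command line,
# i.e. the node name the running broker is using.
running_sname() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "-sname") { print $(i + 1); exit }
    }'
}
```

For example, ps aux | running_sname shows whether the broker is still running under the old node name before it is killed and restarted.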
Restart the RabbitMQ service and join the cluster
[root@slave1 rabbitmq]# service rabbitmq-server start
Starting rabbitmq-server: SUCCESS
rabbitmq-server.
[root@slave1 rabbitmq]# rabbitmqctl stop_app
Stopping rabbit application on node rabbit@rabbit-node2 ...
[root@slave1 rabbitmq]# rabbitmqctl join_cluster rabbit@rabbit-node1
Clustering node rabbit@rabbit-node2 with rabbit@rabbit-node1
[root@slave1 rabbitmq]# rabbitmqctl start_app
Starting node rabbit@rabbit-node2 ...
completed with 3 plugins.
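The three rabbitmqctl commands above can be wrapped so that a failure at any step stops the sequence. This is a minimal sketch: join_cluster_steps and the RABBITMQCTL override variable are hypothetical names introduced here, and only the three subcommands already shown in the session are used.

```shell
# Sketch only: join_cluster_steps and RABBITMQCTL are hypothetical names.
# Runs stop_app, join_cluster and start_app in order and stops at the
# first failure. Set RABBITMQCTL="echo rabbitmqctl" for a dry run that
# only prints the commands instead of executing them.
join_cluster_steps() {
    seed=$1
    ctl=${RABBITMQCTL:-rabbitmqctl}
    # $ctl is deliberately unquoted so "echo rabbitmqctl" splits into words.
    $ctl stop_app &&
    $ctl join_cluster "$seed" &&
    $ctl start_app
}
```

On rabbit-node2 and rabbit-node3 this would be called as join_cluster_steps rabbit@rabbit-node1, after the service has been started under the new node name.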
Check the cluster status
[root@slave1 ~]# rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit-node2 ...
[{nodes,[{disc,['rabbit@rabbit-node1','rabbit@rabbit-node2',
'rabbit@rabbit-node3']}]},
{running_nodes,['rabbit@rabbit-node3','rabbit@rabbit-node1',
'rabbit@rabbit-node2']},
{cluster_name,<<"rabbit@slave2">>},
{partitions,[]},
{alarms,[{'rabbit@rabbit-node3',[]},
{'rabbit@rabbit-node1',[]},
{'rabbit@rabbit-node2',[]}]}]
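To check the result from a script rather than by eye, the running_nodes entry can be pulled out of the output. The sketch below assumes the Erlang-term format shown above (RabbitMQ 3.7.9 in this post); other versions may print a different format, in which case it will not match. The function name running_nodes is hypothetical.

```shell
# Sketch only: running_nodes is a hypothetical helper that assumes the
# Erlang-term output format shown above.
# Reads cluster_status output on stdin and prints one running node per line.
running_nodes() {
    # Join the wrapped lines into one, then terminate it with a newline.
    { tr '\n' ' '; echo; } |
    # Keep only the list inside {running_nodes,[ ... ]}.
    sed -n 's/.*{running_nodes,\[\([^]]*\)]}.*/\1/p' |
    # Drop spaces and single quotes, then split on commas.
    tr -d " '" |
    tr ',' '\n'
}
```

For example, rabbitmqctl cluster_status | running_nodes | grep -c . prints the number of running nodes, which should be 3 for the cluster in this post.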
As the output shows, rabbit-node2 and rabbit-node3 have joined the cluster successfully, and the problem is solved.