docker hang 異常處置

1 consul maint維護模式 /home/qboxserver/consul/current/bin/consul maint -enable
2、嘗試重啓 docker
3、2成功→4; 2失敗→ 5
4、重啓cadvisor 和 mesos-agent
5、重啓機器,照機器重啓的步驟恢復
6、supervisorctl restart dockerd
7、supervisorctl restart boots-cadvisor mesos-agent

退出維護模式: /home/qboxserver/consul/current/bin/consul maint -disable

該腳本啓動方式爲screen
SESSION_NAME="kill_hulk_app"; screen -ls | grep "${SESSION_NAME}" > /dev/null; [ "$?" != "0" ] && screen -d -m -S "${SESSION_NAME}" bash /root/kill_hulk_app.sh; screen -ls; echo
處理腳本
sed -i '10s/20/17/g' /root/kill_hulk_app.sh &&cat /root/kill_hulk_app.sh
#清理對應screen內進程pid
screen -ls | grep kill_hulk_app | cut -d. -f1 |xargs kill
#重新啓動腳本
SESSION_NAME="kill_hulk_app"; screen -ls | grep "${SESSION_NAME}" > /dev/null; [ "$?" != "0" ] && screen -d -m -S "${SESSION_NAME}" bash /root/kill_hulk_app.sh; screen -ls; echo && screen -ls | grep kill_hulk_app | cut -d. -f1
腳本內容:

cat kill_hulk_app.sh
#!/bin/bash

function log() {
    echo $(date +"%Y-%m-%d %H:%M:%S"): "$@" | tee -a kill_hulk_app.log
}

while true; do
    log "start"

    HULKS=$(docker ps -s --format '{{.ID}} {{.Size}}' | awk -F ' ' '$3=="GB" && $2 > 17 {print $1}')
    log "find hulks: " ${HULKS}
    for HULK in ${HULKS}; do
        log "container id to stop: " ${HULK}
    log $(docker inspect ${HULK} | grep instance_id)
    log $(docker ps -s --format '{{.ID}} {{.Size}}' | grep ${HULK})
        if [ "${HULK}" != "" ]; then
            log "stop && rm : " ${HULK}
            docker stop ${HULK} && docker rm ${HULK}
            log $?
        fi
    done

    log "sleep 300s"
    sleep 300
done
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章