AWS Publishes Best Practices Guide for Operational Dashboards

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/www.infoq.com\/aws\/","title":null,"type":null},"content":[{"type":"text","text":"AWS"}],"marks":[{"type":"underline"}]},{"type":"text","text":" recently added a "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/builders-library\/building-dashboards-for-operational-visibility\/","title":null,"type":null},"content":[{"type":"text","text":"best practices guide"}],"marks":[{"type":"underline"}]},{"type":"text","text":" on building dashboards for operational visibility to the "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/builders-library\/","title":null,"type":null},"content":[{"type":"text","text":"Amazon"}],"marks":[{"type":"underline"}]},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/builders-library\/","title":null,"type":null},"content":[{"type":"text","text":" Builders' Library"}],"marks":[{"type":"underline"}]},{"type":"text","text":". The guide describes in detail the types of dashboards in use at Amazon and discusses design best practices for creating them."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"AWS principal engineer "},{"type":"link","attrs":{"href":"https:\/\/www.linkedin.com\/in\/joshea","title":null,"type":null},"content":[{"type":"text","text":"John 
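O'Shea"}],"marks":[{"type":"underline"}]},{"type":"text","text":" authored this new addition to the Builders' Library. O'Shea notes that dashboards are the mechanism through which AWS stays informed about the state of its services, giving users a view of how the systems are running. O'Shea also cautions, however, that “we have found that any operational process that depends on people manually checking dashboards will fail through human error, no matter how often the dashboards are checked.” To address this, they focus on automated alarms that evaluate the most important data their systems emit. In some cases, an alarm triggers an automated remediation workflow."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Amazon also uses dashboards during on-call events, where operators rely on them to locate and isolate problems. One major use case O'Shea describes is the weekly operational review meeting, whose attendees include senior leaders, managers, and senior engineers. The meeting uses a tool known as the “"},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/blogs\/opensource\/the-wheel\/","title":null,"type":null},"content":[{"type":"text","text":"wheel of fortune"}],"marks":[{"type":"underline"}]},{"type":"text","text":"” to pick one team's dashboard at random, which then frames a discussion of customer experience and SLOs."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To keep dashboards consistent and useful, Amazon has established a set of common design principles to follow, along with ways of measuring their effect so the principles can be refined. One such measure is how quickly a new operator can understand and use a dashboard. This metric-driven approach fits well with the techniques and strategies that "},{"type":"link","attrs":{"href":"https:\/\/www.linkedin.com\/in\/camille-fournier-9011812\/","title":null,"type":null},"content":[{"type":"text","text":"Camille 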
Fournier"}],"marks":[{"type":"underline"}]},{"type":"text","text":" shared in a recent "},{"type":"link","attrs":{"href":"https:\/\/www.infoq.com\/news\/2020\/08\/fournier-internal-platform","title":null,"type":null},"content":[{"type":"text","text":"interview with InfoQ"}],"marks":[{"type":"underline"}]},{"type":"text","text":". In that interview, she described how internal platform teams can deliver more effective products."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"One principle is to work backward from what end users expect, so that the dashboard meets their needs. O'Shea notes, “It is very easy for a dashboard creator to build a dashboard that makes complete sense to them. But such a dashboard may offer no value to its end users.” They have found that users tend to treat the graphs that render first as the most important, so the most important graphs are placed at the very top of the dashboard. For web services, these are usually the aggregate or summary availability graphs and the end-to-end latency percentile graphs."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Other design principles include:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Use one consistent time zone and display it on the dashboard;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Lay out graphs to fit the lowest display resolution that users are expected to have;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Make sure the metric period and the time interval can be adjusted;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Annotate graphs with alarm thresholds and goals;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Use alarm status, simple number, and time series graph widgets where each is appropriate."}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"O'Shea also describes the types of dashboards in use at Amazon. The most important and most widely used is the customer experience dashboard, which is designed to serve a wide range of stakeholders, from service operators to managers. It presents the overall health of the service along with a range of metrics that track current progress. The data shown can answer questions such as “How many customers are affected?” and “Which customers are affected the most?”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/72\/dc\/7237fd7b07cd106a3c54272eb59927dc.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"How the different types of dashboards provide views into different levels of the system (image source: "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/builders-library\/building-dashboards-for-operational-visibility\/","title":null,"type":null},"content":[{"type":"text","text":"Amazon"}],"marks":[{"type":"underline"}]},{"type":"text","text":")"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Dashboards should also be created at the system and service levels, providing multiple views of how systems and services are running that can be used to audit services across regions. A system-level dashboard should contain enough information to show the state of any endpoint of the system, while a service-level dashboard should drill down to every individual service instance, giving a view for pinpointing deeper problems."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The guide closes with a discussion of dashboard maintenance. O'Shea writes:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}
},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Maintaining and updating dashboards is built into our development process. During code review, before a change is completed, our developers ask, “Do any dashboards need to be updated?” We therefore empower developers to update dashboards before the change is deployed."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The guide aims to make creating and maintaining dashboards part of the culture. As Tyler Treat shared in a recent interview with InfoQ, “Culture is where a lot of this starts. We have to promote a culture of observability. If a team does not treat dashboards as a first-class concern of its systems, there is little point in building other tooling.”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The guide also encourages teams to discuss in post-mortems whether improvements to dashboards and automated alarms could have prevented the problem or surfaced it sooner. Changes to dashboards should go through the same tooling as service deployments, with version control and infrastructure as code (IaC) as core practices."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/builders-library\/building-dashboards-for-operational-visibility\/","title":null,"type":null},"content":[{"type":"text","text":"The full best practices guide"}],"marks":[{"type":"underline"}]},{"type":"text","text":" is available in the "},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/builders-library\/","title":null,"type":null},"content":[{"type":"text","text":"Amazon"}],"marks":[{"type":"underline"}]},{"type":"link","attrs":{"href":"https:\/\/aws.amazon.com\/builders-library\/","title":null,"type":null},"content":[{"type":"text","text":" Builders' Library"}],"marks":[{"type":"underline"}]},{"type":"text","text":", a collection of articles that describe how Amazon builds, maintains, and operates software."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Original article: "},{"type":"link","attrs":{"href":"https:\/\/www.infoq.com\/news\/2020\/10\/aws-dashboards\/","title":null,"type":null},"content":[{"type":"text","text":"AWS Publishes Best Practices Guide for Operational 
Dashboards"}],"marks":[{"type":"underline"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
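Several of the principles summarized in the article (most important graphs at the top, a time zone shown on the dashboard, an adjustable metric period, alarm thresholds and goals annotated on graphs, and dashboards kept in version control as code) can be illustrated with a small sketch. Everything below is an assumption made for illustration: the `Graph` class, the `build_dashboard` function, the output schema, the metric names, and the threshold values are hypothetical and are not taken from the guide or from any AWS API.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Graph:
    """One time-series graph on the dashboard (hypothetical schema)."""
    title: str
    metric: str
    statistic: str
    priority: int  # lower number = more important, placed nearer the top
    alarm_threshold: Optional[float] = None
    goal: Optional[float] = None


def build_dashboard(graphs, timezone="UTC", period_seconds=60):
    """Return a plain dict describing the dashboard, ready to commit as JSON."""
    widgets = []
    # Sort so the most important graphs occupy the first rows.
    for row, graph in enumerate(sorted(graphs, key=lambda g: g.priority)):
        annotations = []
        if graph.alarm_threshold is not None:
            annotations.append({"label": "alarm threshold", "value": graph.alarm_threshold})
        if graph.goal is not None:
            annotations.append({"label": "goal", "value": graph.goal})
        widgets.append({
            "row": row,
            # Show the time zone in every title so it is never ambiguous.
            "title": f"{graph.title} ({timezone})",
            "metric": graph.metric,
            "statistic": graph.statistic,
            "annotations": annotations,
        })
    return {"timezone": timezone, "period_seconds": period_seconds, "widgets": widgets}


# Example values only; real thresholds and goals depend on the service.
dashboard = build_dashboard([
    Graph("Instance CPU", "cpu_utilization", "avg", priority=3),
    Graph("End-to-end latency", "latency_ms", "p99", priority=2,
          alarm_threshold=500.0, goal=200.0),
    Graph("Availability", "availability_pct", "avg", priority=1,
          alarm_threshold=99.0, goal=99.9),
])

# Sorted keys keep diffs small when the file is reviewed in version control.
print(json.dumps(dashboard, indent=2, sort_keys=True))
```

Because the definition is ordinary data produced by code, a change to it can be reviewed alongside the service change that motivates it, which is the workflow the article describes.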