Cracking the “Bloody Case” Caused by Defects per Thousand Lines of Code: Is R&D Efficiency Measurement a Ruler?

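{"type":"doc","content":[
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"People often assume that software engineering metrics hand managers a ruler with which the performance of a team, or even an individual, can simply be measured off. That metaphor, however, rests on several misunderstandings about measuring R&D efficiency."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"There are two kinds of measurement: physical measurement and statistical measurement."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Physical measurement pursues extreme precision: it determines a value of the target physical quantity that is as close to exact as possible. From Qin Shi Huang's unification of weights and measures to today's ubiquitous laser rangefinders, physicists have pushed distance measurement down to the atomic scale and mass measurement down to at least 10^-27 grams."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Statistical measurement, by contrast, observes samples in order to draw an approximate but scientifically grounded inference about a target statistic. For example, parents who want to know how their child is developing can consult the national survey report on children's physical development. The measurements in it come from health authorities sampling the height and weight of children across ages and regions on a large scale, then computing statistics such as their distributions and means."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"People often quote the line commonly attributed to Peter Drucker, “you can't manage what you can't measure”, but what does measurement actually mean here? Physics is a core subject from middle school onward, whereas statistics may be one of the most frequently failed courses at university, so the intuitive understanding of measurement tends to lean toward the physical kind. Once the distinction above is drawn, it is easy to see that management is talking about statistical measurement: its requirements for “precision” and the way its results are interpreted differ fundamentally from those of physical measurement."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"When we talk about measuring R&D efficiency, we are talking about statistical measurement. This basic insight strongly affects how R&D efficiency is understood and managed, yet it is often overlooked."},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" Measurement in management, R&D efficiency measurement included, does not aim for absolute precision or for the absence of counterexamples in every situation; it aims to surface the common patterns and potential problems reflected in the data. Using statistical measurement demands more systems thinking from teams and managers. Physical measurement is simple and direct, whereas statistical measurement has to be supplemented by analysis and investigation."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Starting from a defect measurement case"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Let us use a case to understand, in concrete terms, the statistical meaning of R&D efficiency measurement and the systems thinking it requires. In the article "},{"type":"link","attrs":{"href":"https:\/\/www.infoq.cn\/article\/bd47xfxWLfFf6GfNg0U0","title":null,"type":null},"content":[{"type":"text","text":"《研發效能度量引發的血案》"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" (“The Bloody Case Caused by R&D Efficiency Measurement”), Tencent technical expert Ru Bingsheng (茹炳晟) gives a counterexample in which “defects per thousand lines of code” is used to measure code quality."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/52\/52304b5fa34d811dcb8c7db10e12a650.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"We would like to add a few points about this case:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"First, although there is no causal relationship between lines of code and code quality, this does not invalidate the “major premise of measurement”, because correlation is enough to give statistical measurement practical meaning. We all want software to have few defects, since defects are likely to affect users, hinder the product from delivering its value, and add testing and development cost. The number of defects is correlated with code size, so a complete picture of the defect situation has to consider the two together; otherwise the result is “the more you do, the more mistakes you make; the less you do, the fewer mistakes you make”."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Second, wrong values such as “we do not believe you can write high-quality code”, “we do not encourage the growing pains of technical improvement” and “we welcome mediocre programmers” get derived from defects per thousand lines of code for one root cause: a failure to understand the shades of gray inherent in statistical measurement. Without measurement, efficiency problems go undetected; but measuring with a misapplied “science and engineering mindset”, trying to draw precise conclusions from a single ruler or even cutting the foot to fit the shoe, may be more dangerous still. If a team is trapped in this mindset, switching to any other metric will be futile. The next section explains how systems thinking resolves the problem."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Finally, reducing noise in statistical measurement and designing checks and balances are very important for using metrics sensibly. Ru mentions the deep code analysis metric “development equivalent” (開發當量), which estimates workload by computing the complexity of the abstract syntax tree (AST) and so removes source-level noise such as line breaks and comments. Without checks of this kind, some people will find it hard to resist the temptation of shortcuts. Conversely, worrying that someone might “spend more time and energy in the wrong places” is no reason to give up on checks and balances altogether; that would be giving up eating for fear of choking. In practice, once gaming the system costs more than earning the same benefit through the right behavior, people are steered toward the right behavior."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Cracking the “bloody case” systematically"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"A top engineer whose code is concise, low in defects and easy to maintain is judged to be under-tested and under-worked; an engineer going through the growing pains of technical improvement is criticized for poor code quality; and the mediocre engineer gets off scot-free. This is less the original sin of the metric than the bitter fruit of a lack of systems thinking. We often worry about the “negative pull” of certain metrics, but how does that “negative pull” arise?"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Let us rethink the case above with systems thinking:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"bulletedlist","content":[
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The unremarkable engineer A has a defect rate per thousand lines of code within the safe range, but A's lines of code (or development equivalent) per requirement or per story point are abnormally high, which indicates redundant code. Defect dwell time shows that problems generally take a long time to locate and fix, so maintenance cost really is high. From a software engineering quality perspective, A's code may contain a lot of reusable logic that has not been abstracted, and the architecture has room for optimization. In code review, the average number of review rounds for A's code may also be on the high side. Taken together, engineer A may not be hounded to fix defects, but probably needs to catch up on design patterns or architecture design."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Evaluating the top engineer B requires a multi-dimensional understanding of “defects”. If failures found in subsequent testing and incidents in production are consulted as well, they can show that B's code quality really is solid and that there is no reason to order B to do more self-testing. Conversely, without data to back this up, B would have no way to argue the case to a manager who does not know the situation."}]}]},
{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"For the growing engineer C, the data reveals real shortcomings; what is wrong with that in itself? If a team's culture resists “the growing pains of technical improvement” and only covers up or hides the pain, how can anything improve? C's technical ambition needs the team's recognition and support, but the room for improvement in C's defect rate should be made visible too. That is not only being responsible for the team's output; it also gives C feedback and guidance for growth."}]}]}
]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"When measuring a complex system, it is dangerous for any single metric to be interpreted too broadly, attributed too simplistically, or applied too crudely."},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" Controlling this risk, and keeping the pull of measurement from drifting off course, takes systematic design. Data plays the role of the driver, but it is the organization that has to move; that is the essence of “data-driven” software engineering."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"A systematic method for measuring R&D efficiency"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"To let software efficiency measurement take root in suitable soil, we have built MARI (measure-analyze-review-improve, with “review” in the sense of investigation; pronounced “碼睿” in Chinese), a four-step cyclical framework that helps teams avoid pitfalls. The method is systematic not only in that multi-dimensional metrics jointly make up the measurement system, but also in that measurement is tightly coupled with the practice that follows. If measurement stops at the numbers, it is hard to avoid the dogmatism of “measuring for the sake of measuring and improving for the sake of improving”."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/7e\/7ed6b20cdcd508c62d88b3c420927f23.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The practice of measuring and improving R&D efficiency can be summarized as the closed loop in the figure above: starting from a specific need for insight, obtain objective data through measurement, locate potential problems through analysis, dig into the nature of those problems through investigation, and solve them through improvement. Advancing layer by layer in this way, cycling continuously and feeding results back into the loop, is what makes measurement responsive and actionable. The MARI method helps teams understand R&D efficiency measurement systematically and avoid crude misuse."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"This article has set out the statistical meaning of R&D efficiency measurement and the systems thinking it calls for, sharing our understanding of this underlying logic and of how to realize the value of measurement with the support of a complete framework. Several follow-up articles are in preparation; they will cover the concrete metric system and MARI practices in detail. We look forward to exchanging ideas with, and learning from, everyone who cares about R&D efficiency!"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"About the author:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Ren Jinglei (任晶磊) holds a PhD from the Department of Computer Science at Tsinghua University, is a former researcher at Microsoft Research Asia, and has been a visiting scholar at Stanford University and Carnegie Mellon University. Currently CEO of 思碼逸, Ren is building an R&D big-data platform based on deep code analysis, using data to drive software engineering and to help companies and development teams improve R&D efficiency. Ren has years of frontier research experience in program analysis and software engineering, has published multiple papers at top international conferences such as FSE and OSDI, took part in the design and implementation of Microsoft's next-generation server architecture, and is an active open-source contributor. Ren recently joined the expert group drafting the standard 《軟件研發效能度量規範》 (“Software R&D Efficiency Measurement Specification”)."}]}
]}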
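As a rough illustration of the multi-metric reading applied to engineers A, B and C in the article, the following Python sketch computes defects per thousand lines of code and then compares several metrics against the team median. It is only a sketch under stated assumptions: the field names, the sample numbers, the two extra teammates D and E, and the 1.5x threshold are all hypothetical, and plain line counts stand in for the AST-based “development equivalent”, which is not modeled here.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class EngineerMetrics:
    """Hypothetical per-engineer measurements; every field name is illustrative."""
    name: str
    defects: int                 # defects found in testing
    lines_of_code: int           # code size behind those defects
    loc_per_story_point: float   # code volume per unit of delivered work
    defect_dwell_days: float     # typical time to locate and fix a defect
    review_rounds: float         # average review rounds per change
    production_incidents: int    # incidents traced back to this code


def defects_per_kloc(m: EngineerMetrics) -> float:
    """Defects per thousand lines of code."""
    return m.defects / (m.lines_of_code / 1000)


def questions_to_investigate(m: EngineerMetrics,
                             team: list[EngineerMetrics],
                             ratio: float = 1.5) -> list[str]:
    """Return the metrics on which `m` sits well above the team median.

    The result is a list of prompts for analysis and investigation,
    not a verdict on the engineer. `ratio` is an arbitrary threshold.
    """
    checks = {
        "defects_per_kloc": defects_per_kloc,
        "loc_per_story_point": lambda e: e.loc_per_story_point,
        "defect_dwell_days": lambda e: e.defect_dwell_days,
        "review_rounds": lambda e: e.review_rounds,
    }
    flagged = []
    for label, get in checks.items():
        baseline = median(get(e) for e in team)
        if baseline > 0 and get(m) > ratio * baseline:
            flagged.append(label)
    return flagged


# Made-up sample data: A, B and C mirror the article's three engineers,
# D and E are invented teammates so that the median is meaningful.
team = [
    EngineerMetrics("A", 10, 10_000, 900, 9, 4.0, 1),
    EngineerMetrics("B", 1, 2_000, 250, 1, 1.5, 0),
    EngineerMetrics("C", 12, 4_000, 300, 3, 2.0, 1),
    EngineerMetrics("D", 5, 5_000, 320, 2, 2.0, 0),
    EngineerMetrics("E", 6, 5_000, 350, 3, 2.5, 1),
]

for e in team:
    print(f"{e.name}: {defects_per_kloc(e):.1f} defects/KLOC, "
          f"incidents={e.production_incidents}, "
          f"investigate={questions_to_investigate(e, team)}")
```

With this sample data, A looks safe on defect density alone but is flagged on code volume, dwell time and review rounds; B is flagged on nothing, and the zero production incidents back that up; C is flagged only on defect density. Comparing against a team median rather than a fixed target is one possible design choice that keeps the output a starting point for the analysis and investigation steps rather than a ranking.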