Having a dev set and metric speeds up iterations

It is very difficult to know in advance which approach will work best for a new problem. Even experienced machine learning researchers usually try many dozens of ideas before they discover something satisfactory. When building a machine learning system, I will often:

1. Start off with some idea of how to build the system.
2. Implement the idea in code.
3. Carry out an experiment that tells me how well the idea worked. (Usually my first few ideas don’t work!) Based on what I learn, go back to generate more ideas, and keep iterating.

This is an iterative process. The faster you can go around this loop, the faster you will make progress. This is why having dev/test sets and a metric is important: each time you try an idea, measuring its performance on the dev set lets you quickly decide whether you’re heading in the right direction.
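As a rough illustration of that measurement step, here is a minimal sketch of scoring two ideas on the same dev set with a single metric (accuracy). Everything in it is an assumption made for illustration: the function name `dev_accuracy`, the toy numeric "images", and the two threshold classifiers stand in for a real cat classifier and a real dev set.

```python
from typing import Callable, Sequence


def dev_accuracy(
    predict: Callable[[float], int],
    examples: Sequence[float],
    labels: Sequence[int],
) -> float:
    """Return the fraction of dev-set examples that `predict` labels correctly."""
    if len(examples) != len(labels):
        raise ValueError("examples and labels must have the same length")
    if not examples:
        raise ValueError("the dev set must not be empty")
    correct = sum(1 for x, y in zip(examples, labels) if predict(x) == y)
    return correct / len(labels)


# Hypothetical toy dev set: each number stands in for an image,
# and label 1 means "cat".
dev_examples = [0.1, 0.4, 0.6, 0.9, 0.7, 0.2, 0.8, 0.3]
dev_labels = [0, 1, 1, 1, 1, 0, 1, 1]


def baseline(x: float) -> int:
    """Current system (hypothetical): predict "cat" above a 0.5 threshold."""
    return int(x > 0.5)


def candidate(x: float) -> int:
    """New idea (hypothetical): lower the threshold to 0.25."""
    return int(x > 0.25)


baseline_acc = dev_accuracy(baseline, dev_examples, dev_labels)
candidate_acc = dev_accuracy(candidate, dev_examples, dev_labels)

print(f"baseline:  {baseline_acc:.3f}")
print(f"candidate: {candidate_acc:.3f}")
print("keep the new idea" if candidate_acc > baseline_acc else "discard the new idea")
```

Because both ideas are scored on the same examples with the same metric, the comparison takes seconds and gives a single number per idea, which is what makes fast iteration possible.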

In contrast, suppose you don’t have a specific dev set and metric. Each time your team develops a new cat classifier, you would have to incorporate it into your app and play with the app for a few hours to get a sense of whether the new classifier is an improvement. This would be incredibly slow! Also, if your team improves the classifier’s accuracy from 95.0% to 95.1%, you might not be able to detect that 0.1% improvement from playing with the app. Yet a lot of progress in your system will be made by gradually accumulating dozens of these 0.1% improvements. Having a dev set and metric lets you very quickly detect which ideas are giving you small (or large) improvements, and therefore lets you quickly decide which ideas to keep refining and which to discard.
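To make the arithmetic behind the 0.1% figure concrete, the following sketch shows how many dev-set examples such an improvement corresponds to, and what accumulating many of them does to accuracy. The dev set sizes and the count of 24 improvements are assumptions chosen for illustration, not figures from the text.

```python
def extra_correct(dev_set_size: int, gain_in_tenths_of_a_point: int) -> float:
    """Additional dev examples classified correctly for a given accuracy gain.

    The gain is expressed in tenths of a percentage point, so 1 means 0.1%.
    """
    return dev_set_size * gain_in_tenths_of_a_point / 1000


# Hypothetical dev set sizes: a 0.1% gain only shows up as whole examples
# when the dev set is large enough.
for size in (100, 1_000, 10_000):
    print(f"dev set of {size:>6}: {extra_correct(size, 1)} extra correct examples")

# Accumulating improvements, starting from the 95.0% accuracy in the text.
# Accuracy is tracked in tenths of a percentage point to keep the sums exact.
start_tenths = 950       # 95.0%
num_improvements = 24    # assumed value for "dozens" of 0.1% gains
final_accuracy = (start_tenths + num_improvements) / 10

print(f"accuracy after {num_improvements} gains of 0.1%: {final_accuracy}%")
```

In this sketch, a single 0.1% gain is one extra correct example in a dev set of 1,000, far too small to notice by hand, while two dozen such gains together move accuracy from 95.0% to 97.4%.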


Machine Learning Yearning, Chapter 10
