MPP大規模並行處理架構詳解

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"面試官:說下你知道的MPP架構的計算引擎?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"這個問題不少小夥伴在面試時都遇到過,因爲對MPP這個概念瞭解較少,不少人都卡殼了,但是我們常用的大數據計算引擎有很多都是MPP架構的,","attrs":{}},{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"color","attrs":{"color":"#3976f2","name":"user"}}],"text":"像我們熟悉的Impala、ClickHouse、Druid、Doris等都是MPP架構","attrs":{}},{"type":"text","text":"。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"採用MPP架構的很多OLAP引擎號稱:","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"億級秒開","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"。","attrs":{}}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#6a737d","name":"user"}}],"text":"本文分爲三部分講解,第一部分","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#6a737d","name":"user"}},{"type":"strong","attrs":{}}],"text":"詳解MPP架構","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#6a737d","name":"user"}}],"text":",第二部分剖析","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#6a737d","name":"user"}},{"type":"strong","attrs":{}}],"text":"MPP架構與批處理架構的異同點","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#6a737d","name":"user"}}],"text":",第三部分是","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#6a737d","name":"user"}},{"type":"strong","attrs":{}}],"text":"採用MPP架構的OLAP引擎介紹","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#6a737d","name":"user"}}],"text":"。","attrs":{}}]}],"attrs":{}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#48b378","name":"user"}}],"text":"一、MPP架構","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MPP是系統架構角度的一種服務器分類方法。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"目前商用的服務器分類大體有三種:","attrs":{}}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"SMP(對稱多處理器結構)","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"NUMA(非一致存儲訪問結構)","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"color","attrs":{"color":"#3976f2","name":"user"}},{"type":"strong","attrs":{}}],"text":"MPP(大規模並行處理結構)","attrs":{}}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們今天的主角是 MPP,因爲","attrs":{}},{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"color","attrs":{"color":"#3976f2","name":"user"}}],"text":"隨着分佈式、並行化技術成熟應用,MPP引擎逐漸表現出強大的高吞吐、低時延計算能力,有很多采用MPP架構的引擎都能達到“億級秒開”","attrs":{}},{"type":"text","text":"。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"先了解下這三種結構:","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"1. SMP","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"即對稱多處理器結構,就是指服務器的多個CPU對稱工作,無主次或從屬關係。","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"SMP服務器的主要特徵是共享","attrs":{}},{"type":"text","text":",系統中的所有資源(如CPU、內存、I/O等)都是共享的。也正是由於這種特徵,導致了SMP服務器的主要問題,即","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"擴展能力非常有限","attrs":{}},{"type":"text","text":"。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"2. NUMA","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"即非一致存儲訪問結構。這種結構就是爲了解決SMP擴展能力不足的問題,利用NUMA技術,可以把幾十個CPU組合在一臺服務器內。NUMA的基本特徵是擁有多個CPU模塊,節點之間可以通過互聯模塊進行連接和信息交互,所以,","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"每個CPU可以訪問整個系統的內存","attrs":{}},{"type":"text","text":"(這是與MPP系統的重要區別)。但是訪問的速度是不一樣的,因爲CPU訪問本地內存的速度遠遠高於系統內其他節點的內存速度,這也是非一致存儲訪問NUMA的由來。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這種結構也有一定的缺陷,由於訪問異地內存的時延遠遠超過訪問本地內存,因此,當CPU數量增加時,系統性能無法線性增加。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"3. MPP","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"即大規模並行處理結構。MPP的系統擴展和NUMA不同,","attrs":{}},{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"color","attrs":{"color":"#3976f2","name":"user"}}],"text":"MPP是由多臺SMP服務器通過一定的節點互聯網絡進行連接","attrs":{}},{"type":"text","text":",協同工作,完成相同的任務,從用戶的角度來看是一個服務器系統。每個節點只訪問自己的資源,所以是一種","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"完全無共享(Share Nothing)結構","attrs":{}},{"type":"text","text":"。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MPP結構擴展能力最強,理論可以無限擴展。由於MPP是多臺SPM服務器連接的,每個節點的CPU不能訪問另一個節點內存,所以也不存在異地訪問的問題。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"MPP架構圖:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/1e/1e623ac7f6ea6c7469d074ed7189e21f.png","alt":"MPP架構","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"MPP架構","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"每個節點內的CPU不能訪問另一個節點的內存,節點之間的信息交互是通過節點互聯網絡實現的,這個過程稱爲","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"數據重分配","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"但是MPP服務器需要一種複雜的機制來調度和平衡各個節點的負載和並行處理過程。目前,一些基於MPP技術的服務器往往通過系統級軟件(如數據庫)來屏蔽這種複雜性。舉個例子,Teradata就是基於MPP技術的一個關係數據庫軟件(這是最早採用MPP架構的數據庫),基於此數據庫來開發應用時,不管後臺服務器由多少節點組成,開發人員面對的都是同一個數據庫系統,而無需考慮如何調度其中某幾個節點的負載。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"MPP架構特徵:","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"任務並行執行;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"數據分佈式存儲(本地化);","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"分佈式計算;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"高併發,單個節點併發能力大於300用戶;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"橫向擴展,支持集羣節點的擴容;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"Shared Nothing(完全無共享)架構","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"。","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"color","attrs":{"color":"#3976f2","name":"user"}}],"text":"NUMA和MPP區別:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"二者有許多相似之處,首先NUMA和MPP都是由多個節點組成的;其次每個節點都有自己的CPU,內存,I/O等;都可以都過節點互聯機制進行信息交互。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"那它們的區別是什麼呢,首先是","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"節點互聯機制不同","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":",NUMA的節點互聯是在同一臺物理服務器內部實現的,MPP的節點互聯是在不同的SMP服務器外部通過I/O實現的。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"其次是","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"內存訪問機制不同","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":",在NUMA服務器內部,任何一個CPU都可以訪問整個系統的內存,但異地內存訪問的性能遠遠低於本地內存訪問,因此,在開發應用程序時應該儘量避免異地內存訪問。而在MPP服務器中,每個節點只訪問本地內存,不存在異地內存訪問問題。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#48b378","name":"user"}}],"text":"二、批處理架構和MPP架構","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"批處理架構(如 MapReduce)與MPP架構的異同點,以及它們各自的優缺點是什麼呢?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"color","attrs":{"color":"#3976f2","name":"user"}}],"text":"相同點:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"首先相同點,","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"批處理架構與MPP架構都是分佈式並行處理","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":",將任務並行的分散到多個服務器和節點上,在每個節點上計算完成後,將各自部分的結果彙總在一起得到最終的結果。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"color","attrs":{"color":"#3976f2","name":"user"}}],"text":"不同點:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"批處理架構和MPP架構的不同點可以舉例來說:我們執行一個任務,首先這個任務會被分成多個task執行,","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"對於MapReduce來說,這些tasks被隨機的分配在空閒的Executor上;而對於MPP架構的引擎來說,每個處理數據的task被綁定到持有該數據切片的指定Executor上","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"正是由於以上的不同,使得兩種架構有各自優勢也有各自缺陷:","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"color","attrs":{"color":"#3976f2","name":"user"}},{"type":"strong","attrs":{}}],"text":"批處理的優勢:","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"對於批處理架構來說,","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"如果某個Executor執行過慢,那麼這個Executor會慢慢分配到更少的task執行","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":",批處理架構有個","attrs":{}},{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"color","attrs":{"color":"#3976f2","name":"user"}}],"text":"推測執行","attrs":{}},{"type":"text","text":"策略,推測出某個Executor執行過慢或者有故障,則在接下來分配task時就會較少的分配給它或者直接不分配。","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"color","attrs":{"color":"#3976f2","name":"user"}},{"type":"strong","attrs":{}}],"text":"MPP的缺陷:","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"對於MPP架構來說,因爲task和Executor是綁定的,如果某個Executor執行過慢或故障,將會導致","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"整個集羣的性能就會受限於這個故障節點的執行速度","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"(所謂木桶的短板效應),所以MPP架構的最大缺陷就是——","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"短板效應","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"。另一點,集羣中的節點越多,則某個節點出現問題的概率越大,而一旦有節點出現問題,對於MPP架構來說,將導致整個集羣性能受限,所以一般實際生產中","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"MPP架構的集羣節點不易過多","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"。","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"color","attrs":{"color":"#3976f2","name":"user"}},{"type":"strong","attrs":{}}],"text":"批處理的缺陷:","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"任何事情都是有代價的,對於批處理而言,","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"會將中間結果寫入到磁盤中","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":",這嚴重限制了處理數據的性能。","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"color","attrs":{"color":"#3976f2","name":"user"}},{"type":"strong","attrs":{}}],"text":"MPP的優勢:","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"MPP架構不需要將中間數據寫入磁盤","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":",因爲一個單一的Executor只處理一個單一的task,因此可以簡單直接將數據stream到下一個執行階段。這個過程稱爲","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"pipelining","attrs":{}}],"marks":[{"type":"color","attrs":{"color":"#28ca71","name":"user"}}],"attrs":{}},{"type":"text","text":",它提供了很大的性能提升。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"color","attrs":{"color":"#3976f2","name":"user"}}],"text":"舉個例子","attrs":{}},{"type":"text","text":":要實現兩個大表的join操作,對於批處理而言,如Spark將會寫磁盤3次(第一次寫入:表1根據","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"join key","attrs":{}}],"marks":[{"type":"color","attrs":{"color":"#28ca71","name":"user"}}],"attrs":{}},{"type":"text","text":"進行shuffle;第二次寫入:表2根據","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"join key","attrs":{}}],"marks":[{"type":"color","attrs":{"color":"#28ca71","name":"user"}}],"attrs":{}},{"type":"text","text":"進行shuffle;第三次寫入:Hash表寫入磁盤), 而MPP只需要一次寫入(Hash表寫入)。","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"這是因爲MPP將mapper和reducer同時運行,而MapReduce將它們分成有依賴關係的tasks(DAG),這些task是異步執行的,因此必須通過寫入中間數據共享內存來解決數據的依賴","attrs":{}},{"type":"text","text":"。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"color","attrs":{"color":"#3976f2","name":"user"}}],"text":"批處理架構和MPP架構融合:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"兩個架構的優勢和缺陷都很明顯,並且它們有互補關係,如果我們能將二者結合起來使用,是不是就能發揮各自最大的優勢。目前批處理和MPP也確實正在逐漸走向融合,也已經有了一些設計方案,一旦技術成熟,可能會風靡大數據領域,我們拭目以待!","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#48b378","name":"user"}}],"text":"三、 MPP架構的OLAP引擎","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"採用MPP架構的OLAP引擎有很多,下面只選擇常見的幾個引擎對比下,可爲公司的技術選型提供參考。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"採用MPP架構的OLAP引擎分爲兩類,一類是自身不存儲數據,只負責計算的引擎;一類是自身既存儲數據,也負責計算的引擎","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"1)只負責計算,不負責存儲的引擎","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"1. Impala","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"Apache Impala是採用MPP架構的查詢引擎,本身不存儲任何數據,","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"直接使用內存進行計算","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":",兼顧數據倉庫,具有實時,批處理,多併發等優點。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"提供了類SQL(類Hsql)語法,在多用戶場景下也能擁有較高的響應速度和吞吐量。它是由Java和C++實現的,Java提供的查詢交互的接口和實現,C++實現了查詢引擎部分。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"Impala支持共享Hive Metastore,但沒有再使用緩慢的 Hive+MapReduce 批處理,而是通過使用與商用並行關係數據庫中類似的分佈式查詢引擎(由 Query Planner、Query Coordinator 和 Query Exec Engine 三部分組成),可以直接從 HDFS 或 HBase 中用 SELECT、JOIN 和統計函數查詢數據,從而大大降低了延遲。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"Impala經常搭配存儲引擎Kudu一起提供服務,這麼做最大的優勢是查詢比較快,並且支持數據的Update和Delete。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"2. Presto","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"Presto是一個分佈式的採用MPP架構的查詢引擎,本身並不存儲數據,但是可以接入多種數據源,並且","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"支持跨數據源的級聯查詢","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"。Presto是一個OLAP的工具,擅長對海量數據進行復雜的分析;但是對於OLTP場景,並不是Presto所擅長,所以不要把Presto當做數據庫來使用。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"Presto是一個低延遲高併發的","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"內存計算引擎","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"。需要從其他數據源獲取數據來進行運算分析,它可以連接多種數據源,包括Hive、RDBMS(Mysql、Oracle、Tidb等)、Kafka、MongoDB、Redis等。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"2)既負責計算,又負責存儲的引擎","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"1. ClickHouse","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"ClickHouse是近年來備受關注的開源","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"列式數據庫","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":",主要用於數據分析(OLAP)領域。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"它自包含了存儲和計算能力,完全自主實現了高可用,而且","attrs":{}},{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"color","attrs":{"color":"#3976f2","name":"user"}}],"text":"支持完整的SQL語法包括JOIN等","attrs":{}},{"type":"text","text":",技術上有着明顯優勢。相比於hadoop體系,以數據庫的方式來做大數據處理更加簡單易用,學習成本低且靈活度高。當前社區仍舊在迅猛發展中,並且在國內社區也非常火熱,各個大廠紛紛跟進大規模使用。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"ClickHouse在計算層做了非常細緻的工作,","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"竭盡所能榨乾硬件能力,提升查詢速度","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"。它實現了單機多核並行、分佈式計算、向量化執行與SIMD指令、代碼生成等多種重要技術。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"ClickHouse從OLAP場景需求出發,定製開發了一套全新的高效列式存儲引擎,並且","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"實現了數據有序存儲、主鍵索引、稀疏索引、數據Sharding、數據Partitioning、TTL、主備複製等豐富功能","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"。以上功能共同爲ClickHouse極速的分析性能奠定了基礎。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"2. Doris","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"Doris是百度主導的,根據Google Mesa論文和Impala項目改寫的一個大數據分析引擎,是一個","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"海量分佈式 KV 存儲系統","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":",其設計目標是支持中等規模高可用可伸縮的 KV 存儲集羣。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"Doris可以實現海量存儲,線性伸縮、平滑擴容,自動容錯、故障轉移,高併發,且運維成本低。部署規模,建議部署4-100+臺服務器。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"Doris3 的主要架構: DT(Data Transfer)負責數據導入、DS(Data Seacher)模塊負責數據查詢、DM(Data Master)模塊負責集羣元數據管理,數據則存儲在 Armor 分佈式 Key-Value 引擎中。Doris3 依賴 ZooKeeper 存儲元數據,從而其他模塊依賴 ZooKeeper 做到了無狀態,進而整個系統能夠做到無故障單點。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"3. Druid","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"Druid是一個","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"開源、分佈式、面向列式存儲的實時分析數據存儲系統","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"Druid的關鍵特性如下:","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"亞秒級的OLAP查詢分析:採用了列式存儲、倒排索引、位圖索引等關鍵技術;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"在亞秒級別內完成海量數據的過濾、聚合以及多維分析等操作;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"實時流數據分析:Druid提供了實時流數據分析,以及高效實時寫入;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"實時數據在亞秒級內的可視化;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"豐富的數據分析功能:Druid提供了友好的可視化界面;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"SQL查詢語言;","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"高可用性與高可拓展性:Druid工作節點功能單一,不相互依賴;Druid集羣在管理、容錯、災備、擴容都很容易;","attrs":{}}]}]}],"attrs":{}},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"4. TiDB","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"TiDB 是 PingCAP 公司自主設計、研發的開源","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"分佈式關係型數據庫","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":",是一款同時支持OLTP與OLAP的融合型分佈式數據庫產品。","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"TiDB 兼容 MySQL 5.7 協議和 MySQL 生態等重要特性。目標是爲用戶提供一站式 OLTP 、OLAP 、HTAP 解決方案。","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"TiDB 適合高可用、強一致要求較高、數據規模較大","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"等各種應用場景。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"5. Greenplum","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"Greenplum 是在開源的 PostgreSQL 的基礎上採用了MPP架構的性能非常強大的","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}},{"type":"strong","attrs":{}}],"text":"關係型分佈式數據庫","attrs":{}},{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"。爲了兼容Hadoop生態,又推出了HAWQ,分析引擎保留了Greenplum的高性能引擎,下層存儲不再採用本地硬盤而改用HDFS,規避本地硬盤可靠性差的問題,同時融入Hadoop生態。","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"3)常用的引擎對比","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"一張圖總結下常用的OLAP引擎對比:","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/68/68800c99bff66db2fc07b4f4f6c15149.png","alt":"常見OLAP引擎對比","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#3e3e3e","name":"user"}}],"text":"常見OLAP引擎對比","attrs":{}}]},{"type":"horizontalrule","attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章