[Configuring CDH and Managed Services] Tuning HDFS Prior to Decommissioning DataNodes

Configuring CDH and Managed Services

Tuning HDFS Prior to Decommissioning DataNodes

Required Role: Configurator, Cluster Administrator, or Full Administrator

 

When a DataNode is decommissioned, the NameNode ensures that every block from that DataNode remains available across the cluster, as dictated by the replication factor. This procedure involves copying blocks off the DataNode in small batches. When a DataNode has thousands of blocks, decommissioning can take several hours. Before decommissioning hosts with DataNodes, you should first tune HDFS:
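For context, on a plain Apache Hadoop cluster (outside Cloudera Manager) decommissioning is triggered by listing the host in the NameNode's exclude file and refreshing the node list. A minimal sketch, assuming the exclude-file path below (the path is illustrative; the property name is standard):

```xml
<!-- hdfs-site.xml: point the NameNode at a decommission exclude file -->
<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hadoop/conf/dfs.exclude</value>  <!-- illustrative path -->
</property>
```

After adding the DataNode's hostname to that file, `hdfs dfsadmin -refreshNodes` tells the NameNode to begin decommissioning it.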

 

1. Raise the heap size of the DataNodes. DataNodes should be configured with at least a 4 GB heap to allow for the increase in iterations and max streams.

    a. Go to the HDFS service page.

    b. Click the Configuration tab.

    c. Under each DataNode role group (DataNode Default Group and any additional DataNode role groups), go to the Resource Management category and set the Java Heap Size of DataNode in Bytes property as recommended.

    d. Click Save Changes to commit the changes.
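On a plain Apache Hadoop deployment the equivalent heap setting lives in hadoop-env.sh rather than in a Cloudera Manager property; a sketch, assuming the 4 GB recommendation above:

```shell
# hadoop-env.sh -- give each DataNode at least a 4 GB heap.
# HADOOP_DATANODE_OPTS is the standard hook for DataNode JVM flags;
# prepending keeps any flags already set elsewhere.
export HADOOP_DATANODE_OPTS="-Xmx4g $HADOOP_DATANODE_OPTS"
```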

 

2. Set the DataNode balancing bandwidth:

    a. Expand the DataNode Default Group > Performance category.

    b. Configure the DataNode Balancing Bandwidth property to match the bandwidth of your disks and network.

    c. Click Save Changes to commit the changes.
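This Cloudera Manager property maps to the standard `dfs.datanode.balance.bandwidthPerSec` key in hdfs-site.xml; a sketch, with an illustrative 10 MB/s cap:

```xml
<!-- hdfs-site.xml: cap balancer/replication copy traffic per DataNode -->
<!-- 10 MB/s shown; pick a value your disks and network can absorb -->
<property>
  <name>dfs.datanode.balance.bandwidthPerSec</name>
  <value>10485760</value>
</property>
```

The same limit can also be changed on a running cluster with `hdfs dfsadmin -setBalancerBandwidth <bytes>`, which avoids a restart but does not persist across one.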

 

3. Increase the replication work multiplier per iteration (the default is 2; 10 is recommended):

    a. Expand the NameNode Default Group > Advanced category.

    b. Configure the Replication Work Multiplier Per Iteration property to a value such as 10.

    c. Click Save Changes to commit the changes.
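In hdfs-site.xml terms this is the standard `dfs.namenode.replication.work.multiplier.per.iteration` key:

```xml
<!-- hdfs-site.xml: schedule more re-replication work per NameNode iteration -->
<property>
  <name>dfs.namenode.replication.work.multiplier.per.iteration</name>
  <value>10</value>  <!-- default 2; 10 recommended before decommissioning -->
</property>
```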

 

4. Increase the maximum number of replication threads and the replication thread hard limit:

    a. Expand the NameNode Default Group > Advanced category.

    b. Configure the Maximum number of replication threads on a Datanode and Hard limit on the number of replication threads on a Datanode properties to 50 and 100, respectively.

    c. Click Save Changes to commit the changes.
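These two properties correspond to the `dfs.namenode.replication.max-streams` and `dfs.namenode.replication.max-streams-hard-limit` keys in hdfs-site.xml; a sketch with the recommended values:

```xml
<!-- hdfs-site.xml: raise per-DataNode replication stream limits -->
<property>
  <name>dfs.namenode.replication.max-streams</name>
  <value>50</value>
</property>
<property>
  <name>dfs.namenode.replication.max-streams-hard-limit</name>
  <value>100</value>
</property>
```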

 

5. Restart the HDFS service.
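Before restarting, the NameNode-side values from steps 3 and 4 can be sanity-checked against an hdfs-site.xml. A minimal Python sketch (the property names are the standard hdfs-site.xml keys; the thresholds are the recommendations above) that flags any value still missing or below the recommendation:

```python
import xml.etree.ElementTree as ET

# Recommended minimums from the tuning steps above (hdfs-site.xml keys).
RECOMMENDED = {
    "dfs.namenode.replication.work.multiplier.per.iteration": 10,
    "dfs.namenode.replication.max-streams": 50,
    "dfs.namenode.replication.max-streams-hard-limit": 100,
}

def undertuned(hdfs_site_xml: str) -> list[str]:
    """Return the tuning properties that are missing or below the recommendation."""
    root = ET.fromstring(hdfs_site_xml)
    current = {
        prop.findtext("name"): int(prop.findtext("value"))
        for prop in root.iter("property")
    }
    return [
        name for name, minimum in RECOMMENDED.items()
        if current.get(name, 0) < minimum
    ]

# Example: multiplier already raised, stream limits not yet touched.
sample = """
<configuration>
  <property>
    <name>dfs.namenode.replication.work.multiplier.per.iteration</name>
    <value>10</value>
  </property>
  <property>
    <name>dfs.namenode.replication.max-streams</name>
    <value>2</value>
  </property>
</configuration>
"""
print(undertuned(sample))
# -> ['dfs.namenode.replication.max-streams',
#     'dfs.namenode.replication.max-streams-hard-limit']
```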


Translator's note: my translation skill is limited; the hand-typed English original follows:

Configuring CDH and Managed Services

Tuning HDFS Prior to Decommissioning DataNodes

Required Role: Configurator, Cluster Administrator, Full Administrator

 

When a DataNode is decommissioned, the NameNode ensures that every block from the DataNode will still be available across the cluster as dictated by the replication factor. This procedure involves copying blocks off the DataNode in small batches. In cases where a DataNode has thousands of blocks, decommissioning can take several hours. Before decommissioning hosts with DataNodes, you should first tune HDFS:

 

1. Raise the heap size of the DataNodes. DataNodes should be configured with at least 4 GB heap size to allow for the increase in iterations and max streams.

  a. Go to the HDFS service page.

  b. Click the Configuration tab.

  c. Under each DataNode role group (DataNode Default Group and additional DataNode role groups) go to the Resource Management category, and set the Java Heap Size of DataNode in Bytes property as recommended.

  d. Click Save Changes to commit the changes.

 

2. Set the DataNode balancing bandwidth:

  a. Expand the DataNode Default Group > Performance category.

  b. Configure the DataNode Balancing Bandwidth property to the bandwidth you have on your disks and network.

  c. Click Save Changes to commit the changes.

 

3. Increase the replication work multiplier per iteration to a larger number (the default is 2, however 10 is recommended):

  a. Expand the NameNode Default Group > Advanced category.

  b. Configure the Replication Work Multiplier Per Iteration property to a value such as 10.

  c. Click Save Changes to commit the changes.

 

4. Increase the replication maximum threads and maximum replication thread hard limits:

  a. Expand the NameNode Default Group > Advanced category.

  b. Configure the Maximum number of replication threads on a Datanode and Hard limit on the number of replication threads on a Datanode properties to 50 and 100 respectively.

  c. Click Save Changes to commit the changes.

 

5. Restart the HDFS service.

