HDFS storage strategy for multi-disk nodes

http://hi.baidu.com/thinkdifferent/blog/item/95de0e2416c4da3fc89559b8.html

 

I had never found a solid reference for how HDFS handles storage on nodes with multiple disks; the Hadoop website only says, roughly, that nodes with multiple disks should be managed internally. Today I came across a blog post that pastes the relevant code snippet directly, so I'm reposting it here:

from: kzk's blog

To use multiple disks in Hadoop DataNode, you should add comma-separated directories to dfs.data.dir in hdfs-site.xml. The following is an example of using four disks.

  <property>
    <name>dfs.data.dir</name>
    <value>/disk1, /disk2, /disk3, /disk4</value>
  </property>

But how does Hadoop use these disks? I found the following code snippet in ./hdfs/org/apache/hadoop/hdfs/server/datanode/FSDataset.java in hadoop-0.20.1.

  synchronized FSVolume getNextVolume(long blockSize) throws IOException {
    int startVolume = curVolume;
    while (true) {
      FSVolume volume = volumes[curVolume];
      // Advance the round-robin pointer, wrapping around the volume list.
      curVolume = (curVolume + 1) % volumes.length;
      if (volume.getAvailable() > blockSize) { return volume; }
      // We have checked every volume and none has room for this block.
      if (curVolume == startVolume) {
        throw new DiskOutOfSpaceException("Insufficient space for an additional block");
      }
    }
  }

FSVolume represents a single directory specified in dfs.data.dir. This code places blocks across the disks in round-robin fashion, skipping any volume that does not have enough free space for the block.

One more thing: if disk utilization reaches 100%, other important data (e.g. error logs) can no longer be written. To prevent this, Hadoop provides the "dfs.datanode.du.reserved" setting. When Hadoop calculates a disk's available capacity, this value is always subtracted from the real capacity. Setting it to several hundred megabytes should be safe.
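For example, an hdfs-site.xml entry like the following reserves space on each data directory; the value is given in bytes, and the 500 MB figure below is only an illustrative choice, not a recommendation from the original post:

  <property>
    <name>dfs.datanode.du.reserved</name>
    <!-- Reserved space in bytes per volume; 524288000 bytes = 500 MB (illustrative value) -->
    <value>524288000</value>
  </property>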

This is Hadoop's default strategy, but I think taking the disk load average into account would be better: if one disk is busy, Hadoop should avoid using it. However, with that method the block distribution would no longer be even across the disks, so read performance would drop. This is a difficult trade-off. Can you come up with a better strategy?
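As a rough sketch of that idea (this is not Hadoop code: the Volume interface, getLoad(), and LOAD_THRESHOLD below are hypothetical), a load-aware chooser could keep the round-robin order but skip volumes whose recent I/O load is above a threshold, falling back to a busy disk only when every disk is busy:

  import java.io.IOException;
  import java.util.List;

  /** Hypothetical volume abstraction for illustration only (not Hadoop's FSVolume). */
  interface Volume {
    long getAvailable();   // free bytes on this disk
    double getLoad();      // recent utilization of this disk, 0.0 - 1.0 (hypothetical metric)
  }

  class LoadAwareVolumeChooser {
    /** Skip disks busier than this fraction, unless all of them are. Illustrative value. */
    private static final double LOAD_THRESHOLD = 0.8;

    private final List<Volume> volumes;
    private int curVolume = 0;

    LoadAwareVolumeChooser(List<Volume> volumes) {
      this.volumes = volumes;
    }

    synchronized Volume getNextVolume(long blockSize) throws IOException {
      Volume fallback = null;          // first busy volume that still has enough space
      int startVolume = curVolume;
      while (true) {
        Volume volume = volumes.get(curVolume);
        curVolume = (curVolume + 1) % volumes.size();
        if (volume.getAvailable() > blockSize) {
          if (volume.getLoad() < LOAD_THRESHOLD) {
            return volume;             // enough space and not too busy
          }
          if (fallback == null) {
            fallback = volume;         // remember a busy-but-usable disk
          }
        }
        if (curVolume == startVolume) {
          if (fallback != null) {
            return fallback;           // every usable disk is busy; take one anyway
          }
          throw new IOException("Insufficient space for an additional block");
        }
      }
    }
  }

When all disks are equally loaded this behaves just like the round-robin code above, but as soon as it starts skipping busy disks the block distribution skews, which is exactly the read-performance trade-off described above.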
