Recently, while running MapReduce jobs, task execution kept being unstable: some task would retry over and over, leaving the whole job stuck in a single phase as if it had hung, and after endless retries the job might take hours to finish. So I had no choice but to dig through the logs under each directory (troubleshooting notes: http://blog.csdn.net/rzhzhz/article/details/7536285), and found the following error in the datanode log:
2012-04-27 10:40:30,683 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.64.49.22:50010, storageID=DS-1420900310-10.64.49.22-50010-1332741432282, infoPort=50075, ipcPort=50020):DataXceiver
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:202)
at sun.nio.ch.IOUtil.read(IOUtil.java:175)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
at java.io.DataInputStream.read(DataInputStream.java:132)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:264)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:354)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:375)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:528)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:397)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:107)
at java.lang.Thread.run(Thread.java:662)
This is the official bug page; the issue is already in closed state.
The official description is as follows:
When a client reads data using read(), it closes the sockets after it is done.
Often it might not read till the end of a block. The datanode on the other side keeps writing data until the client connection is closed or end of the block is reached.
If the client does not read till the end of the block, Datanode writes an error message and stack trace to the datanode log. It should not.
This is not an error and it just pollutes the log and confuses the user.
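The scenario described above can be reproduced with plain Java sockets, without any Hadoop code. The sketch below is my own illustration (class name and structure are made up for the demo): a writer thread plays the datanode role and keeps streaming a "block" until the connection dies, while the client reads only a little and then closes the socket. Closing with unread data pending makes the OS send a TCP RST, so the writer's next write fails with the same kind of IOException the datanode logs, even though nothing is actually wrong.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class PartialReadDemo {
    // Returns true if the writer side saw an IOException after the
    // client closed early, which is the "error" the datanode logs.
    public static boolean run() throws Exception {
        final boolean[] writerFailed = {false};
        try (ServerSocket server = new ServerSocket(0)) { // ephemeral port
            int port = server.getLocalPort();

            // "Datanode" side: keep writing until end of block or error.
            Thread writer = new Thread(() -> {
                try (Socket s = server.accept();
                     OutputStream out = s.getOutputStream()) {
                    byte[] chunk = new byte[64 * 1024];
                    while (true) {       // stream the "block" indefinitely
                        out.write(chunk);
                        out.flush();
                    }
                } catch (IOException e) {
                    // Connection reset by peer / broken pipe lands here:
                    // the client simply stopped reading, nothing is broken.
                    writerFailed[0] = true;
                }
            });
            writer.start();

            // "Client" side: read only part of the block, then close.
            try (Socket s = new Socket("127.0.0.1", port)) {
                InputStream in = s.getInputStream();
                byte[] buf = new byte[4096];
                in.read(buf);            // partial read, nowhere near the end
            }                            // close() with unread data -> TCP RST

            writer.join();
        }
        return writerFailed[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("writer saw IOException: " + run());
    }
}
```

In other words, the stack trace in the log is just the datanode's side of a client that hung up early; the fix in the closed issue was to stop logging it at ERROR level, not to change any behavior.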