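Copying a map task's output is implemented by the ReduceTask method long copyOutput(MapOutputLocation loc), which proceeds in the following steps:
1. Check whether this output has already been copied. If it has, return CopyResult.OBSOLETE (-2) to indicate that the requested data is obsolete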
// check if we still need to copy the output from this location
if (copiedMapOutputs.contains(loc.getTaskId()) ||
obsoleteMapIds.contains(loc.getTaskAttemptId())) {
return CopyResult.OBSOLETE;
}
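2. Construct the path and file name of the map output, and the path of the local temporary file used to store the remote data
// map output file name: output/map_<taskId>.out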
Path filename =
new Path(String.format(
MapOutputFile.REDUCE_INPUT_FILE_FORMAT_STRING,
TaskTracker.OUTPUT, loc.getTaskId().getId()));
// Copy the map output to a temp file whose name is unique to this attempt
// name of the local temporary file that receives the copy
Path tmpMapOutput = new Path(filename+"-"+id);
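The naming scheme in step 2 can be tried out in isolation. The following is a minimal sketch that uses plain strings instead of Hadoop's Path; the format string "%s/map_%d.out" and the directory name "output" are assumptions inferred from the comment above, and MapOutputNames, finalName, tmpName and the suffix parameter (standing in for id) are hypothetical names introduced only for illustration.

```java
// Sketch only: mirrors the naming scheme with plain strings, not Hadoop's Path.
class MapOutputNames {
    // Assumed value of MapOutputFile.REDUCE_INPUT_FILE_FORMAT_STRING.
    static final String REDUCE_INPUT_FILE_FORMAT_STRING = "%s/map_%d.out";
    // Assumed value of TaskTracker.OUTPUT.
    static final String OUTPUT = "output";

    /** Final name of the copied map output, for example output/map_7.out. */
    static String finalName(int mapTaskId) {
        return String.format(REDUCE_INPUT_FILE_FORMAT_STRING, OUTPUT, mapTaskId);
    }

    /** Temporary name: the final name followed by "-" and a suffix. */
    static String tmpName(int mapTaskId, int suffix) {
        return finalName(mapTaskId) + "-" + suffix;
    }

    public static void main(String[] args) {
        System.out.println(finalName(7));
        System.out.println(tmpName(7, 0));
    }
}
```

Because the temporary name is derived from the final name, the later rename only has to drop the suffix.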
3. Copy the data
This step is carried out mainly by getMapOutput(), which is described in detail further on
// Copy the map output
MapOutput mapOutput = getMapOutput(loc, tmpMapOutput,
reduceId.getTaskID().getId());
4. Perform the following inside a block synchronized on ReduceTask.this
synchronized (ReduceTask.this) {}
1) Check again whether this map output has already been copied; if so, discard the new copy
if (copiedMapOutputs.contains(loc.getTaskId())) {
mapOutput.discard();
return CopyResult.OBSOLETE;
}
2) Check whether the size of the original map output is 0; if so, delete the file produced by the copy
// Special case: discard empty map-outputs
if (bytes == 0) {
try {
mapOutput.discard();
} catch (IOException ioe) {
LOG.info("Couldn't discard output of " + loc.getTaskId());
}
// Note that we successfully copied the map-output
noteCopiedMapOutput(loc.getTaskId());
return bytes;
}
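3) Process the copied data, which is held either in memory or in a local file
a. If the data was copied into memory, add the in-memory map output handle to the collection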
// Process map-output
if (mapOutput.inMemory) {
// Save it in the synchronized list of map-outputs
mapOutputsFilesInMemory.add(mapOutput);
}
b. If the data was stored in a local file, rename the temporary file to the final file
// Rename the temporary file to the final file;
// ensure it is on the same partition
// rename the temporary file produced by the copy to its final name
tmpMapOutput = mapOutput.file;
// rename a temporary file such as output/output/map_<taskId>.out-0
// to a file such as output/output/map_<taskId>.out
filename = new Path(tmpMapOutput.getParent(), filename.getName());
if (!localFileSys.rename(tmpMapOutput, filename)) {
localFileSys.delete(tmpMapOutput, true);
bytes = -1;
throw new IOException("Failed to rename map output " +
tmpMapOutput + " to " + filename);
}
4) Add this map task to the set of already-copied tasks and update the number of map outputs still to be copied
// Note that we successfully copied the map-output
// add this task id to copiedMapOutputs
// and set the number of map outputs still to copy to (total - number already copied)
noteCopiedMapOutput(loc.getTaskId());
The body of this method is:
/**
* Save the map taskid whose output we just copied.
* This function assumes that it has been synchronized on ReduceTask.this.
*
* @param taskId map taskid
*/
private void noteCopiedMapOutput(TaskID taskId) {
copiedMapOutputs.add(taskId);
ramManager.setNumCopiedMapOutputs(numMaps - copiedMapOutputs.size());
}
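The bookkeeping done inside the synchronized block (the second duplicate check, the empty-output special case, the in-memory versus on-disk split, and the final accounting) can be condensed into a small self-contained sketch. This is not the Hadoop code: CopyBookkeeping, commit and remaining are hypothetical names, map outputs are represented by plain strings, and the rename on the local file system is replaced by adding an entry to a list.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch only: a simplified stand-in for the state guarded by ReduceTask.this.
class CopyBookkeeping {
    static final long OBSOLETE = -2; // value the text gives for CopyResult.OBSOLETE

    private final int numMaps;
    private final Set<Integer> copiedMapOutputs = new HashSet<>();
    private final List<String> inMemoryOutputs = new ArrayList<>();
    private final List<String> onDiskOutputs = new ArrayList<>();

    CopyBookkeeping(int numMaps) {
        this.numMaps = numMaps;
    }

    /** Records one finished copy; returns the byte count, or OBSOLETE for a duplicate. */
    synchronized long commit(int mapTaskId, long bytes, boolean inMemory) {
        if (copiedMapOutputs.contains(mapTaskId)) {
            return OBSOLETE; // another copier finished first, so this copy is dropped
        }
        if (bytes == 0) {
            copiedMapOutputs.add(mapTaskId); // nothing to keep, but the map still counts as copied
            return bytes;
        }
        if (inMemory) {
            inMemoryOutputs.add("map_" + mapTaskId);
        } else {
            onDiskOutputs.add("map_" + mapTaskId + ".out"); // stands in for the rename step
        }
        copiedMapOutputs.add(mapTaskId);
        return bytes;
    }

    /** Number of map outputs still to be copied: total minus already copied. */
    synchronized int remaining() {
        return numMaps - copiedMapOutputs.size();
    }
}
```

The duplicate check is repeated under the lock because several copiers may fetch the same map output concurrently; only the first one to reach this point may register it.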
getMapOutput is the main method that implements the data copy. Its source is walked through below; the method signature is
private MapOutput getMapOutput(MapOutputLocation mapOutputLoc,
Path filename, int reduce)
throws IOException, InterruptedException
Internal steps:
1. Obtain the connection to the map output location and its input stream
// Connect
URL url = mapOutputLoc.getOutputLocation();
URLConnection connection = url.openConnection();
InputStream input = setupSecureConnection(mapOutputLoc, connection);
2. Check that the map output at this location is the one being requested
// Validate header from map output
TaskAttemptID mapId = null;
try {
mapId =
TaskAttemptID.forName(connection.getHeaderField(FROM_MAP_TASK));
} catch (IllegalArgumentException ia) {
LOG.warn("Invalid map id ", ia);
return null;
}
TaskAttemptID expectedMapId = mapOutputLoc.getTaskAttemptId();
if (!mapId.equals(expectedMapId)) {
LOG.warn("data from wrong map:" + mapId +
" arrived to reduce task " + reduce +
", where as expected map output should be from " + expectedMapId);
return null;
}
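If it is, execution continues; if not, the location the data was fetched from is wrong and the method returns null
3. Check that the map output lengths are not negative, for both the compressed and the decompressed size
// length of the decompressed data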
long decompressedLength =
Long.parseLong(connection.getHeaderField(RAW_MAP_OUTPUT_LENGTH));
// length of the compressed data
long compressedLength =
Long.parseLong(connection.getHeaderField(MAP_OUTPUT_LENGTH));
if (compressedLength < 0 || decompressedLength < 0) {
LOG.warn(getName() + " invalid lengths in map output header: id: " +
mapId + " compressed len: " + compressedLength +
", decompressed len: " + decompressedLength);
return null;
}
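4. Check that the map output partition belongs to this reduce task
// Check that this output is meant for this reduce task. My understanding is that the
// map-side partition output records the reduce task id; the map-side output code needs to be read to confirm this.
// A guess: when the job initializes its tasks, all map task IDs and reduce task IDs have already been created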
int forReduce =
(int)Integer.parseInt(connection.getHeaderField(FOR_REDUCE_TASK));
// reduce holds the id of the current reduce task
if (forReduce != reduce) {
LOG.warn("data for the wrong reduce: " + forReduce +
" with compressed len: " + compressedLength +
", decompressed len: " + decompressedLength +
" arrived to reduce task " + reduce);
return null;
}
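Steps 2 to 4 all validate response headers before any data is read. A minimal sketch of the three checks against a plain map of headers follows. The header keys used here are hypothetical placeholders for the FROM_MAP_TASK, RAW_MAP_OUTPUT_LENGTH, MAP_OUTPUT_LENGTH and FOR_REDUCE_TASK constants, and ShuffleHeaderCheck and isValid are names introduced for illustration. Unlike the quoted source, this sketch also treats a missing or unparsable number as invalid rather than letting the exception propagate.

```java
import java.util.Map;

// Sketch only: validates shuffle response headers held in a plain Map.
class ShuffleHeaderCheck {
    // Hypothetical placeholder keys, not confirmed Hadoop header names.
    static final String FROM_MAP_TASK = "from-map-task";
    static final String RAW_MAP_OUTPUT_LENGTH = "raw-map-output-length";
    static final String MAP_OUTPUT_LENGTH = "map-output-length";
    static final String FOR_REDUCE_TASK = "for-reduce-task";

    /** Returns true only if every header matches what this reduce task expects. */
    static boolean isValid(Map<String, String> headers, String expectedMapId, int reduce) {
        // Step 2: the data must come from the expected map attempt.
        String mapId = headers.get(FROM_MAP_TASK);
        if (mapId == null || !mapId.equals(expectedMapId)) {
            return false;
        }
        try {
            // Step 3: neither length may be negative (zero is allowed).
            long decompressed = Long.parseLong(headers.get(RAW_MAP_OUTPUT_LENGTH));
            long compressed = Long.parseLong(headers.get(MAP_OUTPUT_LENGTH));
            if (compressed < 0 || decompressed < 0) {
                return false;
            }
            // Step 4: the partition must be addressed to this reduce task.
            int forReduce = Integer.parseInt(headers.get(FOR_REDUCE_TASK));
            return forReduce == reduce;
        } catch (NumberFormatException e) {
            return false; // assumption of this sketch: malformed headers count as invalid
        }
    }
}
```

Checking the headers first means a response from the wrong map attempt or for the wrong partition is rejected before any memory or disk space is reserved for it.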
5. Copy the data
This step breaks down into the following sub-steps:
1) Check whether the remaining memory is enough to hold the data being copied
//We will put a file in memory if it meets certain criteria:
//1. The size of the (decompressed) file should be less than 25% of
// the total inmem fs
//2. There is space available in the inmem fs
// Check if this map-output can be saved in-memory
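// Compare the decompressed size of the output with the largest size allowed in memory:
// if it is smaller, the output can be held in memory; if it is larger, it cannot.
// The maximum is 0.25 times the value configured by mapred.job.reduce.total.mem.bytes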
boolean shuffleInMemory = ramManager.canFitInMemory(decompressedLength);
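The size test described in the comments above can be written out as a short sketch. It assumes only what those comments state: a single map output may be held in memory when its decompressed size is below 25% of the total in-memory budget. InMemoryLimit and its constructor argument are hypothetical; the real ramManager may derive its limit from further settings, and the additional Integer.MAX_VALUE bound is an assumption of this sketch, motivated by the fact that a Java byte array cannot be larger.

```java
// Sketch only: the 25% single-segment rule described in the comments.
class InMemoryLimit {
    static final double MAX_SINGLE_SEGMENT_FRACTION = 0.25;

    private final long maxSingleShuffleLimit;

    /** totalInMemBytes is the assumed total budget for in-memory map outputs. */
    InMemoryLimit(long totalInMemBytes) {
        this.maxSingleShuffleLimit = (long) (totalInMemBytes * MAX_SINGLE_SEGMENT_FRACTION);
    }

    /** True if one map output of this decompressed size may be shuffled into memory. */
    boolean canFitInMemory(long requestedSize) {
        // Assumed extra bound: the output must also fit in a single byte array.
        return requestedSize < Integer.MAX_VALUE && requestedSize < maxSingleShuffleLimit;
    }
}
```

With a budget of 1000 bytes the limit is 250, so an output of 249 bytes qualifies for memory while one of 250 bytes would go to disk.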