Because Hadoop's mapper and reducer components depend on the framework's runtime (the Context object in particular), traditional JUnit unit tests alone may not meet our needs here. We can use Mockito to stand in for the Hadoop components and unit test our mapper and reducer functions. Below I walk through unit testing a map function that computes the maximum temperature from NCDC weather data.
1. Add the Mockito jar to the project. A Hadoop distribution usually ships with this jar; if yours doesn't, download it yourself. No more on that here.
2. Write the Mapper function that we will shortly unit test with Mockito. The MaxTemperatureMapper class below reads NCDC weather records and emits the year as the key and the air temperature as the value:
package org.wucl.hadoop.maxtemperature;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTemperatureMapper extends
        Mapper<LongWritable, Text, Text, IntWritable> {

    private static final int MISSING = 9999;

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        // The year occupies characters 15-18 of the fixed-width record
        String year = line.substring(15, 19);
        // The signed air temperature (tenths of a degree Celsius) occupies
        // characters 87-91; skip a leading '+' so parseInt accepts it
        int airTemperature;
        if (line.charAt(87) == '+') {
            airTemperature = Integer.parseInt(line.substring(88, 92));
        } else {
            airTemperature = Integer.parseInt(line.substring(87, 92));
        }
        String quality = line.substring(92, 93);
        // Emit only readings that are present and of acceptable quality
        if (airTemperature != MISSING && quality.matches("[01459]")) {
            context.write(new Text(year), new IntWritable(airTemperature));
        }
    }
}
(If you're not familiar with the NCDC weather data format, look it up on Baidu or Google yourself; do the digging and you'll learn more.)
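To make the substring offsets above concrete, here is a minimal sketch (plain Java, no Hadoop needed) that pulls each field out of the sample record used in the tests below; the field meanings are my reading of the NCDC fixed-width format:

```java
public class NcdcFieldDemo {
    public static void main(String[] args) {
        // The same sample NCDC record the tests use; offsets are 0-based
        String line =
                "0043011990999991950051518004+68750+023550FM-12+0382" +
                "99999V0203201N00261220001CN9999999N9-00111+99999999999";
        System.out.println(line.substring(15, 19)); // year: "1950"
        System.out.println(line.charAt(87));        // temperature sign: '-'
        System.out.println(line.substring(87, 92)); // signed temperature, tenths of a degree: "-0011"
        System.out.println(line.substring(92, 93)); // quality code: "1"
    }
}
```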
3. Build a Context instance. To unit test the map function above, we first need a Context instance, which we can build with Mockito's mock() method:

Context context = mock(Context.class);

(Of course, the test class needs a static import of Mockito before mock() can be called directly like this; the complete code is listed at the end.)
4. Call the map function. Once the context instance exists, create a Mapper object and call its map method:

MaxTemperatureMapper mapper = new MaxTemperatureMapper();
Text value = new Text(
        "0043011990999991950051518004+68750+023550FM-12+0382" +
                      // Year ^^^^
        "99999V0203201N00261220001CN9999999N9-00111+99999999999");
                      // Temperature ^^^^^
mapper.map(null, value, context);
5. Assert. This works much like JUnit's Assert; the code makes it clear:

verify(context).write(new Text("1950"), new IntWritable(-11));

This verifies that, for the record above, the mapper wrote the key 1950 with the value -11 (the temperature in tenths of a degree Celsius).
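Mockito can also check the opposite: that the mapper writes nothing for a record it should ignore. A sketch continuing the snippet from step 4 (the record below is one I made up by putting the MISSING sentinel +9999 in the temperature field; never() and any() come from the same static Mockito imports):

```java
// Hypothetical record whose temperature field is the MISSING sentinel
// (+9999); the mapper should skip it, so no write() may hit the mock.
Text missing = new Text(
        "0043011990999991950051518004+68750+023550FM-12+0382" +
        "99999V0203201N00261220001CN9999999N9+99991+99999999999");
mapper.map(null, missing, context);
verify(context, never()).write(any(Text.class), any(IntWritable.class));
```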
Run the tests and they pass. The complete Mockito test code follows:
package org.wucl.hadoop.maxtemperature;

import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper.Context;
import org.junit.Test;

public class MaxTemperatureMapperTest {

    @Test
    public void processesValidRecord() throws IOException, InterruptedException {
        Context context = mock(Context.class);
        MaxTemperatureMapper mapper = new MaxTemperatureMapper();
        Text value = new Text(
                "0043011990999991950051518004+68750+023550FM-12+0382" +
                              // Year ^^^^
                "99999V0203201N00261220001CN9999999N9-00111+99999999999");
                              // Temperature ^^^^^
        mapper.map(null, value, context);
        verify(context).write(new Text("1950"), new IntWritable(-11));
    }

    @Test
    public void processesPositiveTemperatureRecord() throws IOException,
            InterruptedException {
        Context context = mock(Context.class);
        MaxTemperatureMapper mapper = new MaxTemperatureMapper();
        Text value = new Text(
                "0043011990999991950051518004+68750+023550FM-12+0382" +
                              // Year ^^^^
                "99999V0203201N00261220001CN9999999N9+00201+99999999999");
                              // Temperature ^^^^^
        mapper.map(null, value, context);
        verify(context).write(new Text("1950"), new IntWritable(20));
    }
}
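The intro mentioned testing reducers the same way, and the same mocking trick applies. A sketch, assuming a MaxTemperatureReducer (not shown above) that emits the maximum IntWritable per key:

```java
package org.wucl.hadoop.maxtemperature;

import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;

import java.io.IOException;
import java.util.Arrays;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer.Context;
import org.junit.Test;

public class MaxTemperatureReducerTest {

    @Test
    public void returnsMaximumIntegerInValues() throws IOException,
            InterruptedException {
        // Mock the reducer's Context, just like the mapper's
        Context context = mock(Context.class);
        // MaxTemperatureReducer is assumed here; it should pick the
        // maximum of the values it receives for a key
        MaxTemperatureReducer reducer = new MaxTemperatureReducer();
        reducer.reduce(new Text("1950"),
                Arrays.asList(new IntWritable(10), new IntWritable(5)),
                context);
        verify(context).write(new Text("1950"), new IntWritable(10));
    }
}
```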