0. Preface

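This post draws on the blog post at http://www.51itong.net/eclipse-hadoop2-7-0-12448.html
Before setting up the development environment, make sure a pseudo-distributed Hadoop installation is already in place. See my previous post:
http://blog.csdn.net/xummgg/article/details/51173072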

1. Download and install Eclipse

Download page: http://www.eclipse.org/downloads/

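Because this setup runs on Ubuntu, download the 64-bit Linux version (with Java EE support). By default the download is saved in the current user's Downloads directory.
Extract it with the following command:

After extraction, it appears under /usr/local:

Because new jar files will be added to Eclipse, grant elevated permissions on the eclipse folder.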

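2. Download the Hadoop plugin

The 2.6.4 plugin, hadoop-eclipse-plugin-2.6.4.jar, can be downloaded from:
http://download.csdn.net/download/tondayong1981/9437360
After the download finishes, place the plugin in the eclipse/plugins directory.

Running the command with sudo prompts for the user password.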

3. Configure Eclipse

Start Eclipse.

Open Window -> Preferences.

A new Hadoop Map/Reduce entry appears. Set it to the local Hadoop directory; mine is /usr/local/hadoop/hadoop-2.6.4/, as shown in the figure below:

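4. Configure Map/Reduce Locations

Note: before configuring, start Hadoop in the background, that is, start the DFS and YARN services of the pseudo-distributed Hadoop setup; see the previous post.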

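In Eclipse, open Window -> Open Perspective -> Other.

Select Map/Reduce and click OK.

The lower-right area then looks as shown in the figure below:

Click the Map/Reduce Locations tab, then click the blue elephant icon on the right to open the Hadoop Location configuration window.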

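Enter a Location Name; any name will do. Configure the Map/Reduce Master and the DFS Master, setting Host and Port to match the settings in core-site.xml, as shown in the figure below:

Click the "Finish" button to close the window.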

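Click DFS Locations -> myhadoop (the location name configured in the previous step) on the left. If the user directory is visible, the installation succeeded. Eclipse is now connected to the distributed file system, so its contents can be browsed from within Eclipse, which is convenient while programming.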
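
As a rough illustration of how the Host and Port fields relate to the HDFS address, the sketch below splits an HDFS URI into those two parts using only the JDK. The value hdfs://localhost:9000 is the address used in the run arguments later in this post. The class and method names are made up for illustration, and the assumption that this address is what the fs.defaultFS property in your core-site.xml contains should be checked against your own configuration.

```java
import java.net.URI;

// Illustrative sketch (hypothetical class name): split an HDFS URI into the
// Host and Port values that the DFS Master fields expect.
class DfsMasterFields {

    // Returns the host part of an HDFS URI, e.g. "localhost".
    static String hostOf(String hdfsUri) {
        return URI.create(hdfsUri).getHost();
    }

    // Returns the port part, or -1 if the URI does not specify a port.
    static int portOf(String hdfsUri) {
        return URI.create(hdfsUri).getPort();
    }

    public static void main(String[] args) {
        // Assumed value; replace it with the address from your own core-site.xml.
        String hdfsUri = "hdfs://localhost:9000";
        System.out.println("Host: " + hostOf(hdfsUri));
        System.out.println("Port: " + portOf(hdfsUri));
    }
}
```

If the values printed here differ from what was typed into the DFS Master fields, the DFS Locations tree will most likely fail to connect.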

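5. Create the WordCount project

Click File -> Project:

Select Map/Reduce Project and click Next:

Enter the project name WordCount and click Finish:

In the WordCount project, right-click src and create a new class with package name com.xxm (choose your own) and class name WordCount:

The code is as follows: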

package com.xxm; // change to your own package name

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

/**
 * Description: WordCount, explained by xxm
 * @author xxm
 */
public class WordCount {

 /**
 * Map class: defines our own map method
 */
 public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    /**
    * LongWritable, IntWritable, and Text are Hadoop classes that wrap Java data types.
    * They are all serializable, which makes it easy to exchange data in a distributed environment;
    * think of them as replacements for long, int, and String respectively.
    */
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();
    /**
    * The map method of the Mapper class:
    * protected void map(KEYIN key, VALUEIN value, Context context)
    * Maps a single input k/v pair to intermediate k/v pairs.
    * Context: collects the <k,v> pairs emitted by the Mapper.
    */
    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            context.write(word, one);
        }
    }
 } 

 /**
 * Reduce class: defines our own reduce method
 */
 public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {

    /**
    * The reduce method of the Reducer class:
    * protected void reduce(KEYIN key, Iterable<VALUEIN> values, Context context)
    * Called once per key with all intermediate values that share that key; here it sums them.
    * Context: collects the <k,v> pairs emitted by the Reducer.
    */
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        context.write(key, new IntWritable(sum));
    }
 }

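 /**
 * main method
 */
 public static void main(String[] args) throws Exception {

    Configuration conf = new Configuration(); // create a configuration object that holds all settings

    Job job = Job.getInstance(conf, "wordcount"); // create a job and name it (the new Job(conf, name) constructor is deprecated in Hadoop 2.x)
    job.setJarByClass(WordCount.class); // tell Hadoop which jar contains the job classes

    job.setOutputKeyClass(Text.class); // set the key class of the job output
    job.setOutputValueClass(IntWritable.class); // set the value class of the job output

    job.setMapperClass(Map.class); // set the Mapper class for the job
    job.setReducerClass(Reduce.class); // set the Reducer class for the job

    job.setInputFormatClass(TextInputFormat.class); // set the InputFormat implementation for the map-reduce job
    job.setOutputFormatClass(TextOutputFormat.class); // set the OutputFormat implementation for the map-reduce job

    FileInputFormat.addInputPath(job, new Path(args[0])); // set the input path of the map-reduce job
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // set the output path of the map-reduce job
    System.exit(job.waitForCompletion(true) ? 0 : 1); // run the job, wait for it to finish, and exit with its status
 }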

}
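
To make the data flow of the job easier to follow, here is a minimal plain-Java sketch of the same map, group, and sum steps. It does not use Hadoop at all and is not how the job actually runs on the cluster; the class name LocalWordCount, its method names, and the two sample lines are made up for illustration.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java sketch of the WordCount data flow (hypothetical class, no Hadoop involved).
class LocalWordCount {

    // "Map" step: emit one (word, 1) pair per whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<String, Integer>(tokenizer.nextToken(), 1));
        }
        return pairs;
    }

    // "Shuffle" and "reduce" steps: group the pairs by word and sum the counts per word.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>(); // keeps the words sorted
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Runs the map step on every line, then reduces all emitted pairs.
    static Map<String, Integer> countWords(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        return reduce(pairs);
    }

    public static void main(String[] args) {
        // Made-up sample input, standing in for the lines of a file in HDFS.
        List<String> lines = Arrays.asList("hello hadoop", "hello eclipse");
        System.out.println(countWords(lines)); // prints {eclipse=1, hadoop=1, hello=2}
    }
}
```

In the real job, the Map class plays the role of map above, Hadoop performs the grouping by key between the two phases, and the Reduce class does the summing.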

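6. Run

Before running, make sure the input directory in the distributed file system contains files. If HDFS has just been formatted, refer to the "Setting Up Pseudo-Distributed Hadoop" tutorial to create the input directory and upload files with content.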

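Right-click in the WordCount code editor and choose Run As -> Run Configurations to configure the program arguments, that is, the input and output directory paths:
hdfs://localhost:9000/user/xxm/input hdfs://localhost:9000/user/xxm/output/wordcount3
as shown in the figure below: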

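Click Run.
The results can be viewed by reconnecting myhadoop and then double-clicking the file under output. They can also be fetched with the HDFS get command. To reconnect myhadoop: in the project explorer window, right-click the blue elephant and choose Reconnect.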
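
Once a result file has been fetched to the local machine, it can be read back programmatically. The sketch below assumes the default behaviour of TextOutputFormat, which writes one "key TAB value" line per record; the class name OutputLineParser and the sample content are made up for illustration and are not the actual output of the run above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch (hypothetical class name): parse "word<TAB>count" lines into a map.
// Assumes the default tab separator of TextOutputFormat.
class OutputLineParser {

    static Map<String, Integer> parse(String content) {
        Map<String, Integer> counts = new LinkedHashMap<>(); // keeps the file order
        for (String line : content.split("\n")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            int tab = line.lastIndexOf('\t');
            if (tab < 0) {
                throw new IllegalArgumentException("Expected word<TAB>count but got: " + line);
            }
            String word = line.substring(0, tab);
            int count = Integer.parseInt(line.substring(tab + 1).trim());
            counts.put(word, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample content in the assumed output format.
        String sample = "eclipse\t1\nhadoop\t1\nhello\t2\n";
        for (Map.Entry<String, Integer> entry : parse(sample).entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```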

This completes the setup of the Eclipse development environment for Hadoop.