Hadoop - MapReduce

MapReduce

1. What is MapReduce

---- A parallel computing framework

Hadoop MapReduce is a software framework that makes it easy to write applications which run on large clusters of thousands of commodity machines and process terabyte-scale data sets in parallel, in a reliable, fault-tolerant way. This definition contains five key phrases:
software framework, parallel processing, reliability and fault tolerance, large cluster, and massive data set.

MapReduce is good at processing big data. Where does this ability come from? It follows from MapReduce's design idea:
"divide and conquer".

  • The Mapper is responsible for "dividing": it splits a complex task into a number of "simple tasks". "Simple" carries three meanings:

    • the scale of the data or computation is greatly reduced compared with the original task;
    • computation is moved close to the data, i.e. each task is assigned to a node that stores the data it needs;
    • the small tasks can be computed in parallel, with almost no dependencies between them.
  • The Reducer is responsible for aggregating the results of the map phase (a minimal local simulation follows below).
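
To make "divide and conquer" concrete, here is a minimal local sketch (plain Java, no Hadoop involved; the class and variable names are illustrative only) that walks a word count through the three phases: map each line to (word, 1) pairs, group the pairs by word, then reduce each group by summing.

import java.util.*;

// Toy single-process simulation of map -> shuffle -> reduce.
// Hadoop runs the same phases distributed across a cluster.
public class DivideAndConquerDemo {
    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello hadoop", "hello yarn");

        // Map phase: every line is processed independently (parallelizable).
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }

        // Shuffle phase: group values by key.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }

        // Reduce phase: aggregate each group independently (also parallelizable).
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int count : entry.getValue()) {
                sum += count;
            }
            System.out.println(entry.getKey() + "\t" + sum); // hadoop 1, hello 2, yarn 1
        }
    }
}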


2. What is Yarn

----- A resource management and scheduling platform for distributed clusters

https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html
Apache Hadoop YARN (Yet Another Resource Negotiator) is a newer Hadoop resource manager. It is a general-purpose resource management system that provides unified resource management and scheduling for upper-layer applications; its introduction brought major gains in cluster utilization, unified resource management, and data sharing. HBase, Hive, Spark on YARN, and MapReduce can all run on this framework.


  • ResourceManager: the cluster-wide resource manager. It manages the cluster, schedules resources, and receives reports from the NodeManagers in order to monitor them.

  • NodeManager: the per-machine framework agent. It is responsible for containers, monitoring their resource usage (CPU, memory, disk, network) and reporting it to the ResourceManager / Scheduler.

  • App Master: responsible for task monitoring and failover during the computation; there is exactly one per job, and it manages that one MR job.

  • Container: a container for a compute process, packaging a set of compute resources; the default size is 1 GB of memory (see the config sketch below).
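
As a hedged illustration of how container sizes are governed: the yarn-site.xml properties below bound the memory a single container may be allocated (1024 MB and 8192 MB are YARN's documented defaults; tune them to the cluster):

    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>1024</value>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>8192</value>
    </property>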

3. Architecture



MapReduce job flow

1. run job
2. get new application
3. copy job resources
4. submit job
5. init container
6. init MRAppMaster
7. retrieve input splits
8. allocate resources
9. init container (the compute container)
10. retrieve job resources (code, configuration, data)
11. run the map or reduce task
12. result

![](assets\Yarn 計算.png)

4. Environment setup

The following is configured on top of the existing HDFS installation.

  • Edit etc/hadoop/mapred-site.xml
    [root@node1 hadoop-2.6.0]# mv etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
    
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    
  • Edit etc/hadoop/yarn-site.xml
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>your-hostname (here: hadoop)</value>
    </property>
    
  • Start the services

    The HDFS services need to be started as well.

    [root@hadoop ~]# hdfs namenode -format
    # formatting the NameNode is only needed the first time Hadoop is used; do not repeat it on every start
    [root@hadoop ~]# start-dfs.sh
    # start HDFS

    # start YARN
    [root@hadoop hadoop-2.6.0]# start-yarn.sh
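
    To verify that everything came up, jps (shipped with the JDK) should list the Hadoop daemons; the exact set below assumes a single-node setup like this one:

    [root@hadoop ~]# jps
    # expect NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager (PIDs vary)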
    

5. Development example

  • Maven dependencies
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.6.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>2.6.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-common</artifactId>
    <version>2.6.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-core</artifactId>
    <version>2.6.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-jobclient</artifactId>
    <version>2.6.0</version>
</dependency>
How to use MapReduce:

1. Create a Maven project

2. Create the Mapper

package com.baizhi.yarn;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;


public class MyMapper extends Mapper<LongWritable,Text,Text,IntWritable>{
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // key: byte offset of the line in the file; value: the line itself
        String[] words = value.toString().split(" ");
        for (String word : words) {
            // emit one (word, 1) pair per word
            context.write(new Text(word), new IntWritable(1));
        }
    }
}
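
An optional refinement of the Mapper above (a common MapReduce idiom, not required for correctness): reuse the output Writable objects across map() calls instead of allocating new ones per word, which reduces garbage-collection pressure on large inputs. The class name MyMapperReuse is illustrative:

package com.baizhi.yarn;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;

public class MyMapperReuse extends Mapper<LongWritable, Text, Text, IntWritable> {
    // reused across calls: context.write() serializes the pair immediately,
    // so mutating these objects afterwards is safe
    private final Text outKey = new Text();
    private final IntWritable one = new IntWritable(1);

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        for (String word : value.toString().split(" ")) {
            outKey.set(word);
            context.write(outKey, one);
        }
    }
}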

3. Create the Reducer

package com.baizhi.yarn;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

import java.io.IOException;

public class MyReduce extends Reducer<Text,IntWritable,Text,IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        // sum every count emitted for this word
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        context.write(key,new IntWritable(sum));
    }
}

4. Create the driver (entry) class

package com.baizhi.yarn;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

import java.io.IOException;

public class InitMR {
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        // 1. Initialize the MR job object
        Configuration configuration = new Configuration();
        Job job = Job.getInstance(configuration, "Word COUNT");
        job.setJarByClass(InitMR.class);
        // 2. Set the input and output formats
        // the InputFormat decides how the data set is split and how each split is read
        // the OutputFormat decides how the results are written out
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        //3. Set the data source and the destination for the results
        /**
         * Option 1: run inside the virtual machine
         */
        /*TextInputFormat.addInputPath(job,new Path("hdfs://hadoop:9000/WordCount.txt"));
        TextOutputFormat.setOutputPath(job,new Path("hdfs://hadoop:9000/result1"));*/

        /**
         * Option 2: run from the main method in IDEA
         * local computation (using a locally installed Hadoop) + local files
         */
        /*TextInputFormat.addInputPath(job,new Path("file:///E://WordCount.txt"));
        TextOutputFormat.setOutputPath(job,new Path("file:///E://result"));
        */
        /**
         * Option 3: local computation + remote HDFS files
         */
        TextInputFormat.addInputPath(job,new Path("hdfs://hadoop:9000/WordCount.txt"));
        TextOutputFormat.setOutputPath(job,new Path("hdfs://hadoop:9000/result2"));

        //4. Set the key/value output types
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        //5. Set the Mapper and Reducer implementation classes for this MR job
        job.setMapperClass(MyMapper.class);
        job.setReducerClass(MyReduce.class);

        //6. Submit the MR job and wait for it to finish
        job.waitForCompletion(true);


    }
}
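
An optional, commonly used refinement (not part of the original driver above): because word-count reduction is plain summation, which is associative and commutative, the reducer class can also be registered as a combiner. Partial sums are then computed on the map side and less data crosses the network during the shuffle. Add in step 5:

        // optional: pre-aggregate map output locally before the shuffle
        job.setCombinerClass(MyReduce.class);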

5. Run the program using one of the four options below

-- Option 1: run the jar in the Hadoop environment to test the MapReduce program

   Package the code into a jar and copy it onto the virtual machine to run it.
 //3. Set the data source and the destination for the results
        /**
         * Option 1: run inside the virtual machine
         */
        TextInputFormat.addInputPath(job,new Path("hdfs://hadoop:9000/WordCount.txt"));
        TextOutputFormat.setOutputPath(job,new Path("hdfs://hadoop:9000/result1"));
[root@node1 hadoop-2.6.0]# bin/hadoop jar <path to jar> (/mr_demo-1.0-SNAPSHOT.jar) <main class> (com.baizhi.yarn.InitMR)
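
Once the job finishes, the output can be inspected with standard HDFS shell commands. The reducers write part-r-NNNNN files into the output directory, one "word<TAB>count" line per distinct word (the actual content depends on WordCount.txt):

[root@node1 hadoop-2.6.0]# hdfs dfs -ls /result1
[root@node1 hadoop-2.6.0]# hdfs dfs -cat /result1/part-r-00000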

  • -- Option 2: run via the main method
    • Local computation + local files

      Add the following code to the InitMR.java class:

      /**
       * Option 2: local computation (using a locally installed Hadoop) + local files
       */
      TextInputFormat.addInputPath(job,new Path("file:///E://WordCount.txt"));
      TextOutputFormat.setOutputPath(job,new Path("file:///E://result"));



      On Windows this also requires patching Hadoop's NativeIO source:

      In your project, create a package org.apache.hadoop.io.nativeio and place a NativeIO class in it so that it shadows Hadoop's own NativeIO class.
      At line 279 of the NativeIO source, change return access0(path, desiredAccess.accessRight()); to return true;

      //
      // Source code recreated from a .class file by IntelliJ IDEA
      // (powered by Fernflower decompiler)
      //
      
      package org.apache.hadoop.io.nativeio;
      
      import com.google.common.annotations.VisibleForTesting;
      import java.io.Closeable;
      import java.io.File;
      import java.io.FileDescriptor;
      import java.io.FileInputStream;
      import java.io.FileOutputStream;
      import java.io.IOException;
      import java.io.RandomAccessFile;
      import java.lang.reflect.Field;
      import java.nio.ByteBuffer;
      import java.nio.MappedByteBuffer;
      import java.nio.channels.FileChannel;
      import java.util.Map;
      import java.util.concurrent.ConcurrentHashMap;
      import org.apache.commons.logging.Log;
      import org.apache.commons.logging.LogFactory;
      import org.apache.hadoop.classification.InterfaceAudience.Private;
      import org.apache.hadoop.classification.InterfaceStability.Unstable;
      import org.apache.hadoop.conf.Configuration;
      import org.apache.hadoop.fs.HardLink;
      import org.apache.hadoop.io.IOUtils;
      import org.apache.hadoop.io.SecureIOUtils.AlreadyExistsException;
      import org.apache.hadoop.util.NativeCodeLoader;
      import org.apache.hadoop.util.PerformanceAdvisory;
      import org.apache.hadoop.util.Shell;
      import sun.misc.Cleaner;
      import sun.misc.Unsafe;
      import sun.nio.ch.DirectBuffer;
      
      @Private
      @Unstable
      public class NativeIO {
          private static boolean workaroundNonThreadSafePasswdCalls = false;
          private static final Log LOG = LogFactory.getLog(NativeIO.class);
          private static boolean nativeLoaded = false;
          private static final Map<Long, NativeIO.CachedUid> uidCache;
          private static long cacheTimeout;
          private static boolean initialized;
      
          public NativeIO() {
          }
      
          public static boolean isAvailable() {
              return NativeCodeLoader.isNativeCodeLoaded() && nativeLoaded;
          }
      
          private static native void initNative();
      
          static long getMemlockLimit() {
              return isAvailable() ? getMemlockLimit0() : 0L;
          }
      
          private static native long getMemlockLimit0();
      
          static long getOperatingSystemPageSize() {
              try {
                  Field f = Unsafe.class.getDeclaredField("theUnsafe");
                  f.setAccessible(true);
                  Unsafe unsafe = (Unsafe)f.get((Object)null);
                  return (long)unsafe.pageSize();
              } catch (Throwable var2) {
                  LOG.warn("Unable to get operating system page size.  Guessing 4096.", var2);
                  return 4096L;
              }
          }
      
          private static String stripDomain(String name) {
              int i = name.indexOf(92);
              if (i != -1) {
                  name = name.substring(i + 1);
              }
      
              return name;
          }
      
          public static String getOwner(FileDescriptor fd) throws IOException {
              ensureInitialized();
              if (Shell.WINDOWS) {
                  String owner = NativeIO.Windows.getOwner(fd);
                  owner = stripDomain(owner);
                  return owner;
              } else {
                  long uid = NativeIO.POSIX.getUIDforFDOwnerforOwner(fd);
                  NativeIO.CachedUid cUid = (NativeIO.CachedUid)uidCache.get(uid);
                  long now = System.currentTimeMillis();
                  if (cUid != null && cUid.timestamp + cacheTimeout > now) {
                      return cUid.username;
                  } else {
                      String user = NativeIO.POSIX.getUserName(uid);
                      LOG.info("Got UserName " + user + " for UID " + uid + " from the native implementation");
                      cUid = new NativeIO.CachedUid(user, now);
                      uidCache.put(uid, cUid);
                      return user;
                  }
              }
          }
      
          public static FileInputStream getShareDeleteFileInputStream(File f) throws IOException {
              if (!Shell.WINDOWS) {
                  return new FileInputStream(f);
              } else {
                  FileDescriptor fd = NativeIO.Windows.createFile(f.getAbsolutePath(), 2147483648L, 7L, 3L);
                  return new FileInputStream(fd);
              }
          }
      
          public static FileInputStream getShareDeleteFileInputStream(File f, long seekOffset) throws IOException {
              if (!Shell.WINDOWS) {
                  RandomAccessFile rf = new RandomAccessFile(f, "r");
                  if (seekOffset > 0L) {
                      rf.seek(seekOffset);
                  }
      
                  return new FileInputStream(rf.getFD());
              } else {
                  FileDescriptor fd = NativeIO.Windows.createFile(f.getAbsolutePath(), 2147483648L, 7L, 3L);
                  if (seekOffset > 0L) {
                      NativeIO.Windows.setFilePointer(fd, seekOffset, 0L);
                  }
      
                  return new FileInputStream(fd);
              }
          }
      
          public static FileOutputStream getCreateForWriteFileOutputStream(File f, int permissions) throws IOException {
              FileDescriptor fd;
              if (!Shell.WINDOWS) {
                  try {
                      fd = NativeIO.POSIX.open(f.getAbsolutePath(), 193, permissions);
                      return new FileOutputStream(fd);
                  } catch (NativeIOException var3) {
                      if (var3.getErrno() == Errno.EEXIST) {
                          throw new AlreadyExistsException(var3);
                      } else {
                          throw var3;
                      }
                  }
              } else {
                  try {
                      fd = NativeIO.Windows.createFile(f.getCanonicalPath(), 1073741824L, 7L, 1L);
                      NativeIO.POSIX.chmod(f.getCanonicalPath(), permissions);
                      return new FileOutputStream(fd);
                  } catch (NativeIOException var4) {
                      if (var4.getErrorCode() == 80L) {
                          throw new AlreadyExistsException(var4);
                      } else {
                          throw var4;
                      }
                  }
              }
          }
      
          private static synchronized void ensureInitialized() {
              if (!initialized) {
                  cacheTimeout = (new Configuration()).getLong("hadoop.security.uid.cache.secs", 14400L) * 1000L;
                  LOG.info("Initialized cache for UID to User mapping with a cache timeout of " + cacheTimeout / 1000L + " seconds.");
                  initialized = true;
              }
      
          }
      
          public static void renameTo(File src, File dst) throws IOException {
              if (!nativeLoaded) {
                  if (!src.renameTo(dst)) {
                      throw new IOException("renameTo(src=" + src + ", dst=" + dst + ") failed.");
                  }
              } else {
                  renameTo0(src.getAbsolutePath(), dst.getAbsolutePath());
              }
      
          }
      
          public static void link(File src, File dst) throws IOException {
              if (!nativeLoaded) {
                  HardLink.createHardLink(src, dst);
              } else {
                  link0(src.getAbsolutePath(), dst.getAbsolutePath());
              }
      
          }
      
          private static native void renameTo0(String var0, String var1) throws NativeIOException;
      
          private static native void link0(String var0, String var1) throws NativeIOException;
      
          public static void copyFileUnbuffered(File src, File dst) throws IOException {
              if (nativeLoaded && Shell.WINDOWS) {
                  copyFileUnbuffered0(src.getAbsolutePath(), dst.getAbsolutePath());
              } else {
                  FileInputStream fis = null;
                  FileOutputStream fos = null;
                  FileChannel input = null;
                  FileChannel output = null;
      
                  try {
                      fis = new FileInputStream(src);
                      fos = new FileOutputStream(dst);
                      input = fis.getChannel();
                      output = fos.getChannel();
                      long remaining = input.size();
                      long position = 0L;
      
                      for(long transferred = 0L; remaining > 0L; position += transferred) {
                          transferred = input.transferTo(position, remaining, output);
                          remaining -= transferred;
                      }
                  } finally {
                      IOUtils.cleanup(LOG, new Closeable[]{output});
                      IOUtils.cleanup(LOG, new Closeable[]{fos});
                      IOUtils.cleanup(LOG, new Closeable[]{input});
                      IOUtils.cleanup(LOG, new Closeable[]{fis});
                  }
              }
      
          }
      
          private static native void copyFileUnbuffered0(String var0, String var1) throws NativeIOException;
      
          static {
              if (NativeCodeLoader.isNativeCodeLoaded()) {
                  try {
                      initNative();
                      nativeLoaded = true;
                  } catch (Throwable var1) {
                      PerformanceAdvisory.LOG.debug("Unable to initialize NativeIO libraries", var1);
                  }
              }
      
              uidCache = new ConcurrentHashMap();
              initialized = false;
          }
      
          private static class CachedUid {
              final long timestamp;
              final String username;
      
              public CachedUid(String username, long timestamp) {
                  this.timestamp = timestamp;
                  this.username = username;
              }
          }
      
          public static class Windows {
              public static final long GENERIC_READ = 2147483648L;
              public static final long GENERIC_WRITE = 1073741824L;
              public static final long FILE_SHARE_READ = 1L;
              public static final long FILE_SHARE_WRITE = 2L;
              public static final long FILE_SHARE_DELETE = 4L;
              public static final long CREATE_NEW = 1L;
              public static final long CREATE_ALWAYS = 2L;
              public static final long OPEN_EXISTING = 3L;
              public static final long OPEN_ALWAYS = 4L;
              public static final long TRUNCATE_EXISTING = 5L;
              public static final long FILE_BEGIN = 0L;
              public static final long FILE_CURRENT = 1L;
              public static final long FILE_END = 2L;
              public static final long FILE_ATTRIBUTE_NORMAL = 128L;
      
              public Windows() {
              }
      
              public static native FileDescriptor createFile(String var0, long var1, long var3, long var5) throws IOException;
      
              public static native long setFilePointer(FileDescriptor var0, long var1, long var3) throws IOException;
      
              private static native String getOwner(FileDescriptor var0) throws IOException;
      
              private static native boolean access0(String var0, int var1);
      
              public static boolean access(String path, NativeIO.Windows.AccessRight desiredAccess) throws IOException {
                  // patched: bypasses the native Windows access check (a bug in the Hadoop source)
                  return true;
              }
      
              public static native void extendWorkingSetSize(long var0) throws IOException;
      
              static {
                  if (NativeCodeLoader.isNativeCodeLoaded()) {
                      try {
                          NativeIO.initNative();
                          NativeIO.nativeLoaded = true;
                      } catch (Throwable var1) {
                          PerformanceAdvisory.LOG.debug("Unable to initialize NativeIO libraries", var1);
                      }
                  }
      
              }
      
              public static enum AccessRight {
                  ACCESS_READ(1),
                  ACCESS_WRITE(2),
                  ACCESS_EXECUTE(32);
      
                  private final int accessRight;
      
                  private AccessRight(int access) {
                      this.accessRight = access;
                  }
      
                  public int accessRight() {
                      return this.accessRight;
                  }
              }
          }
      
          public static class POSIX {
              public static final int O_RDONLY = 0;
              public static final int O_WRONLY = 1;
              public static final int O_RDWR = 2;
              public static final int O_CREAT = 64;
              public static final int O_EXCL = 128;
              public static final int O_NOCTTY = 256;
              public static final int O_TRUNC = 512;
              public static final int O_APPEND = 1024;
              public static final int O_NONBLOCK = 2048;
              public static final int O_SYNC = 4096;
              public static final int O_ASYNC = 8192;
              public static final int O_FSYNC = 4096;
              public static final int O_NDELAY = 2048;
              public static final int POSIX_FADV_NORMAL = 0;
              public static final int POSIX_FADV_RANDOM = 1;
              public static final int POSIX_FADV_SEQUENTIAL = 2;
              public static final int POSIX_FADV_WILLNEED = 3;
              public static final int POSIX_FADV_DONTNEED = 4;
              public static final int POSIX_FADV_NOREUSE = 5;
              public static final int SYNC_FILE_RANGE_WAIT_BEFORE = 1;
              public static final int SYNC_FILE_RANGE_WRITE = 2;
              public static final int SYNC_FILE_RANGE_WAIT_AFTER = 4;
              private static final Log LOG = LogFactory.getLog(NativeIO.class);
              private static boolean nativeLoaded = false;
              private static boolean fadvisePossible = true;
              private static boolean syncFileRangePossible = true;
              static final String WORKAROUND_NON_THREADSAFE_CALLS_KEY = "hadoop.workaround.non.threadsafe.getpwuid";
              static final boolean WORKAROUND_NON_THREADSAFE_CALLS_DEFAULT = true;
              private static long cacheTimeout = -1L;
              private static NativeIO.POSIX.CacheManipulator cacheManipulator = new NativeIO.POSIX.CacheManipulator();
              private static final Map<Integer, NativeIO.POSIX.CachedName> USER_ID_NAME_CACHE;
              private static final Map<Integer, NativeIO.POSIX.CachedName> GROUP_ID_NAME_CACHE;
              public static final int MMAP_PROT_READ = 1;
              public static final int MMAP_PROT_WRITE = 2;
              public static final int MMAP_PROT_EXEC = 4;
      
              public POSIX() {
              }
      
              public static NativeIO.POSIX.CacheManipulator getCacheManipulator() {
                  return cacheManipulator;
              }
      
              public static void setCacheManipulator(NativeIO.POSIX.CacheManipulator cacheManipulator) {
                  // assign the static field (a plain self-assignment would be a no-op)
                  NativeIO.POSIX.cacheManipulator = cacheManipulator;
              }
      
              public static boolean isAvailable() {
                  return NativeCodeLoader.isNativeCodeLoaded() && nativeLoaded;
              }
      
              private static void assertCodeLoaded() throws IOException {
                  if (!isAvailable()) {
                      throw new IOException("NativeIO was not loaded");
                  }
              }
      
              public static native FileDescriptor open(String var0, int var1, int var2) throws IOException;
      
              private static native NativeIO.POSIX.Stat fstat(FileDescriptor var0) throws IOException;
      
              private static native void chmodImpl(String var0, int var1) throws IOException;
      
              public static void chmod(String path, int mode) throws IOException {
                  if (!Shell.WINDOWS) {
                      chmodImpl(path, mode);
                  } else {
                      try {
                          chmodImpl(path, mode);
                      } catch (NativeIOException var3) {
                          if (var3.getErrorCode() == 3L) {
                              throw new NativeIOException("No such file or directory", Errno.ENOENT);
                          }
      
                          LOG.warn(String.format("NativeIO.chmod error (%d): %s", var3.getErrorCode(), var3.getMessage()));
                          throw new NativeIOException("Unknown error", Errno.UNKNOWN);
                      }
                  }
      
              }
      
              static native void posix_fadvise(FileDescriptor var0, long var1, long var3, int var5) throws NativeIOException;
      
              static native void sync_file_range(FileDescriptor var0, long var1, long var3, int var5) throws NativeIOException;
      
              static void posixFadviseIfPossible(String identifier, FileDescriptor fd, long offset, long len, int flags) throws NativeIOException {
                  if (nativeLoaded && fadvisePossible) {
                      try {
                          posix_fadvise(fd, offset, len, flags);
                      } catch (UnsupportedOperationException var8) {
                          fadvisePossible = false;
                      } catch (UnsatisfiedLinkError var9) {
                          fadvisePossible = false;
                      }
                  }
      
              }
      
              public static void syncFileRangeIfPossible(FileDescriptor fd, long offset, long nbytes, int flags) throws NativeIOException {
                  if (nativeLoaded && syncFileRangePossible) {
                      try {
                          sync_file_range(fd, offset, nbytes, flags);
                      } catch (UnsupportedOperationException var7) {
                          syncFileRangePossible = false;
                      } catch (UnsatisfiedLinkError var8) {
                          syncFileRangePossible = false;
                      }
                  }
      
              }
      
              static native void mlock_native(ByteBuffer var0, long var1) throws NativeIOException;
      
              static void mlock(ByteBuffer buffer, long len) throws IOException {
                  assertCodeLoaded();
                  if (!buffer.isDirect()) {
                      throw new IOException("Cannot mlock a non-direct ByteBuffer");
                  } else {
                      mlock_native(buffer, len);
                  }
              }
      
              public static void munmap(MappedByteBuffer buffer) {
                  if (buffer instanceof DirectBuffer) {
                      Cleaner cleaner = ((DirectBuffer)buffer).cleaner();
                      cleaner.clean();
                  }
      
              }
      
              private static native long getUIDforFDOwnerforOwner(FileDescriptor var0) throws IOException;
      
              private static native String getUserName(long var0) throws IOException;
      
              public static NativeIO.POSIX.Stat getFstat(FileDescriptor fd) throws IOException {
                  NativeIO.POSIX.Stat stat = null;
                  if (!Shell.WINDOWS) {
                      stat = fstat(fd);
                      stat.owner = getName(NativeIO.POSIX.IdCache.USER, stat.ownerId);
                      stat.group = getName(NativeIO.POSIX.IdCache.GROUP, stat.groupId);
                  } else {
                      try {
                          stat = fstat(fd);
                      } catch (NativeIOException var3) {
                          if (var3.getErrorCode() == 6L) {
                              throw new NativeIOException("The handle is invalid.", Errno.EBADF);
                          }
      
                          LOG.warn(String.format("NativeIO.getFstat error (%d): %s", var3.getErrorCode(), var3.getMessage()));
                          throw new NativeIOException("Unknown error", Errno.UNKNOWN);
                      }
                  }
      
                  return stat;
              }
      
              private static String getName(NativeIO.POSIX.IdCache domain, int id) throws IOException {
                  Map<Integer, NativeIO.POSIX.CachedName> idNameCache = domain == NativeIO.POSIX.IdCache.USER ? USER_ID_NAME_CACHE : GROUP_ID_NAME_CACHE;
                  NativeIO.POSIX.CachedName cachedName = (NativeIO.POSIX.CachedName)idNameCache.get(id);
                  long now = System.currentTimeMillis();
                  String name;
                  if (cachedName != null && cachedName.timestamp + cacheTimeout > now) {
                      name = cachedName.name;
                  } else {
                      name = domain == NativeIO.POSIX.IdCache.USER ? getUserName(id) : getGroupName(id);
                      if (LOG.isDebugEnabled()) {
                          String type = domain == NativeIO.POSIX.IdCache.USER ? "UserName" : "GroupName";
                          LOG.debug("Got " + type + " " + name + " for ID " + id + " from the native implementation");
                      }
      
                      cachedName = new NativeIO.POSIX.CachedName(name, now);
                      idNameCache.put(id, cachedName);
                  }
      
                  return name;
              }
      
              static native String getUserName(int var0) throws IOException;
      
              static native String getGroupName(int var0) throws IOException;
      
              public static native long mmap(FileDescriptor var0, int var1, boolean var2, long var3) throws IOException;
      
              public static native void munmap(long var0, long var2) throws IOException;
      
              static {
                  if (NativeCodeLoader.isNativeCodeLoaded()) {
                      try {
                          Configuration conf = new Configuration();
                          NativeIO.workaroundNonThreadSafePasswdCalls = conf.getBoolean("hadoop.workaround.non.threadsafe.getpwuid", true);
                          NativeIO.initNative();
                          nativeLoaded = true;
                          cacheTimeout = conf.getLong("hadoop.security.uid.cache.secs", 14400L) * 1000L;
                          LOG.debug("Initialized cache for IDs to User/Group mapping with a  cache timeout of " + cacheTimeout / 1000L + " seconds.");
                      } catch (Throwable var1) {
                          PerformanceAdvisory.LOG.debug("Unable to initialize NativeIO libraries", var1);
                      }
                  }
      
                  USER_ID_NAME_CACHE = new ConcurrentHashMap();
                  GROUP_ID_NAME_CACHE = new ConcurrentHashMap();
              }
      
              private static enum IdCache {
                  USER,
                  GROUP;
      
                  private IdCache() {
                  }
              }
      
              private static class CachedName {
                  final long timestamp;
                  final String name;
      
                  public CachedName(String name, long timestamp) {
                      this.name = name;
                      this.timestamp = timestamp;
                  }
              }
      
              public static class Stat {
                  private int ownerId;
                  private int groupId;
                  private String owner;
                  private String group;
                  private int mode;
                  public static final int S_IFMT = 61440;
                  public static final int S_IFIFO = 4096;
                  public static final int S_IFCHR = 8192;
                  public static final int S_IFDIR = 16384;
                  public static final int S_IFBLK = 24576;
                  public static final int S_IFREG = 32768;
                  public static final int S_IFLNK = 40960;
                  public static final int S_IFSOCK = 49152;
                  public static final int S_IFWHT = 57344;
                  public static final int S_ISUID = 2048;
                  public static final int S_ISGID = 1024;
                  public static final int S_ISVTX = 512;
                  public static final int S_IRUSR = 256;
                  public static final int S_IWUSR = 128;
                  public static final int S_IXUSR = 64;
      
                  Stat(int ownerId, int groupId, int mode) {
                      this.ownerId = ownerId;
                      this.groupId = groupId;
                      this.mode = mode;
                  }
      
                  Stat(String owner, String group, int mode) {
                      if (!Shell.WINDOWS) {
                          this.owner = owner;
                      } else {
                          this.owner = NativeIO.stripDomain(owner);
                      }
      
                      if (!Shell.WINDOWS) {
                          this.group = group;
                      } else {
                          this.group = NativeIO.stripDomain(group);
                      }
      
                      this.mode = mode;
                  }
      
                  public String toString() {
                      return "Stat(owner='" + this.owner + "', group='" + this.group + "'" + ", mode=" + this.mode + ")";
                  }
      
                  public String getOwner() {
                      return this.owner;
                  }
      
                  public String getGroup() {
                      return this.group;
                  }
      
                  public int getMode() {
                      return this.mode;
                  }
              }
      
              @VisibleForTesting
              public static class NoMlockCacheManipulator extends NativeIO.POSIX.CacheManipulator {
                  public NoMlockCacheManipulator() {
                  }
      
                  public void mlock(String identifier, ByteBuffer buffer, long len) throws IOException {
                      NativeIO.POSIX.LOG.info("mlocking " + identifier);
                  }
      
                  public long getMemlockLimit() {
                      return 1125899906842624L;
                  }
      
                  public long getOperatingSystemPageSize() {
                      return 4096L;
                  }
      
                  public boolean verifyCanMlock() {
                      return true;
                  }
              }
      
              @VisibleForTesting
              public static class CacheManipulator {
                  public CacheManipulator() {
                  }
      
                  public void mlock(String identifier, ByteBuffer buffer, long len) throws IOException {
                      NativeIO.POSIX.mlock(buffer, len);
                  }
      
                  public long getMemlockLimit() {
                      return NativeIO.getMemlockLimit();
                  }
      
                  public long getOperatingSystemPageSize() {
                      return NativeIO.getOperatingSystemPageSize();
                  }
      
                  public void posixFadviseIfPossible(String identifier, FileDescriptor fd, long offset, long len, int flags) throws NativeIOException {
                      NativeIO.POSIX.posixFadviseIfPossible(identifier, fd, offset, len, flags);
                  }
      
                  public boolean verifyCanMlock() {
                      return NativeIO.isAvailable();
                  }
              }
          }
      }
      
      
      

      Right-click the main method and run it to test the MapReduce program.

    • Local computation + remote HDFS files

      Add the following code to the InitMR.java class:

       /**
        * Option 3: local computation + remote HDFS files
        */
       TextInputFormat.addInputPath(job,new Path("hdfs://hadoop:9000/WordCount.txt"));
       TextOutputFormat.setOutputPath(job,new Path("hdfs://hadoop:9000/result2"));
      

      Permission problems may occur. Two ways to handle them:

      • Disable the HDFS permission check
        • edit hdfs-site.xml:
          <property>
              <name>dfs.permissions.enabled</name>
              <value>false</value>
          </property>

        • or pass a JVM option naming the HDFS user:
      -DHADOOP_USER_NAME=root
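
      Equivalently (a common alternative; root is assumed to be the intended HDFS user, as above), the user can be set programmatically at the very top of main(), before any Configuration or FileSystem object is created:

      // read by Hadoop's UserGroupInformation when no login user is set
      System.setProperty("HADOOP_USER_NAME", "root");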
      
  • -- Option 4: remote computation + remote HDFS files

  • Add the following code to InitMR.java, package the project as a jar, and run the main method:

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    //===============================================================
    conf.set("fs.defaultFS", "hdfs://hadoop:9000/");
    // the jar that will be shipped to the cluster
    conf.set("mapreduce.job.jar", "file:///E:\\訓練營備課\\20180313_hadoop\\mr_demo\\target\\mr_demo-1.0-SNAPSHOT.jar");
    conf.set("mapreduce.framework.name", "yarn");
    conf.set("yarn.resourcemanager.hostname", "hadoop");
    conf.set("yarn.nodemanager.aux-services", "mapreduce_shuffle");
    conf.set("mapreduce.app-submission.cross-platform", "true");
    conf.set("dfs.replication", "1");
    //===============================================================
    // ......
    // Option 4: remote computation + remote HDFS files
    FileInputFormat.setInputPaths(job, "/user/word.txt");
    FileOutputFormat.setOutputPath(job, new Path("/user/result"));

}

Practice exercises

wordCount: count how many times each word occurs
flow: a traffic-statistics case study
custom Writable (a sketch follows below)
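
For the custom Writable exercise, here is a minimal, hedged sketch of the usual pattern (the FlowWritable name and its fields are illustrative, not from the original): implement org.apache.hadoop.io.Writable, serialize the fields in write(), and read them back in the same order in readFields(). The no-argument constructor is required because the framework instantiates the type reflectively.

package com.baizhi.yarn;

import org.apache.hadoop.io.Writable;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Hypothetical value type for the flow exercise: upstream/downstream traffic.
public class FlowWritable implements Writable {
    private long upFlow;
    private long downFlow;

    // required: the framework creates instances via reflection
    public FlowWritable() {
    }

    public FlowWritable(long upFlow, long downFlow) {
        this.upFlow = upFlow;
        this.downFlow = downFlow;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(upFlow);
        out.writeLong(downFlow);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // fields must be read in exactly the order they were written
        upFlow = in.readLong();
        downFlow = in.readLong();
    }

    @Override
    public String toString() {
        // TextOutputFormat renders values with toString()
        return upFlow + "\t" + downFlow + "\t" + (upFlow + downFlow);
    }
}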
