Table of Contents
Standalone Operation (official grep example)
Pseudo-Distributed Operation
Software requirements
- jdk1.8
- hadoop-2.7.2.tar.gz
Extract the archive to the target directory
[fengling@hadoop129 software]$ tar -zxvf hadoop-2.7.2.tar.gz -C /opt/module/
Add Hadoop to the environment variables
- Get the Hadoop installation path
[fengling@hadoop129 hadoop-2.7.2]$ pwd
/opt/module/hadoop-2.7.2
- Open the /etc/profile file
[fengling@hadoop129 hadoop-2.7.2]$ sudo vim /etc/profile
- Append the following at the end of the file
#HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop-2.7.2
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
Save and quit.
- Apply the changes immediately
[fengling@hadoop129 hadoop-2.7.2]$ source /etc/profile
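As a quick sanity check, you can confirm the two Hadoop directories really landed on the PATH. This sketch re-applies the same exports as above (useful if you opened a fresh shell without sourcing /etc/profile):

```shell
# Re-apply the exports from /etc/profile (same values as above)
export HADOOP_HOME=/opt/module/hadoop-2.7.2
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# Confirm both directories are now on the PATH
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) echo "hadoop bin on PATH" ;;
  *) echo "PATH not updated" ;;
esac
```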
- Verify the installation
[fengling@hadoop129 hadoop-2.7.2]$ hadoop version
Hadoop 2.7.2
Subversion Unknown -r Unknown
Compiled by root on 2017-05-22T10:49Z
Compiled with protoc 2.5.0
From source with checksum d0fda26633fa762bff87ec759ebe689c
This command was run using /opt/module/hadoop-2.7.2/share/hadoop/common/hadoop-common-2.7.2.jar
Running
Standalone Operation (official grep example)
[fengling@hadoop129 hadoop-2.7.2]$ pwd
/opt/module/hadoop-2.7.2
[fengling@hadoop129 hadoop-2.7.2]$ ls
bin etc include lib libexec LICENSE.txt NOTICE.txt README.txt sbin share
[fengling@hadoop129 hadoop-2.7.2]$ mkdir input
[fengling@hadoop129 hadoop-2.7.2]$ cp etc/hadoop/*.xml input
[fengling@hadoop129 hadoop-2.7.2]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep input output 'dfs[a-z.]+'
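For intuition, the example job does roughly what a local grep pipeline does: extract every match of the regex 'dfs[a-z.]+' from the input files, then count occurrences of each match. Here is a sketch on a made-up sample file (standing in for etc/hadoop/*.xml), not the actual MapReduce job:

```shell
# Sketch only: a hypothetical sample file standing in for etc/hadoop/*.xml
mkdir -p /tmp/grep-demo
cat > /tmp/grep-demo/sample.xml <<'EOF'
<property><name>dfs.replication</name><value>1</value></property>
<property><name>dfs.permissions</name><value>false</value></property>
EOF

# Extract each match of 'dfs[a-z.]+' and count occurrences,
# which is what the grep example's first MapReduce job computes
grep -Eoh 'dfs[a-z.]+' /tmp/grep-demo/sample.xml | sort | uniq -c
```

Note that, like the real job, a re-run needs a clean slate: the output directory must not already exist, so delete it (rm -r output) before running the example again.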
Pseudo-Distributed Operation
Configure the cluster
Create a temporary directory for data storage
[fengling@hadoop129 hadoop-2.7.2]$ mkdir -p data/tmp
[fengling@hadoop129 hadoop-2.7.2]$ cd data/tmp
[fengling@hadoop129 tmp]$ pwd
/opt/module/hadoop-2.7.2/data/tmp
$ vim etc/hadoop/core-site.xml
<configuration>
<!-- Address of the HDFS NameNode -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop129:9000</value>
</property>
<!-- Storage directory for files Hadoop generates at run time -->
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/module/hadoop-2.7.2/data/tmp</value>
</property>
</configuration>
$ vim etc/hadoop/hdfs-site.xml
<configuration>
<!-- Number of block replicas; the default is 3, but a single node can only hold 1 -->
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
$ vim etc/hadoop/hadoop-env.sh
[fengling@hadoop129 hadoop-2.7.2]$ echo $JAVA_HOME
/opt/module/jdk1.8.0_161
# The java implementation to use.
export JAVA_HOME=/opt/module/jdk1.8.0_161
If JAVA_HOME is not hard-coded here, daemons started over ssh may fail with an error such as "Error: JAVA_HOME is not set and could not be found."
Passwordless SSH login
# Check whether passwordless login already works
$ ssh localhost
# If the previous step prompted for a password, run the following
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
Run a MapReduce job locally
1. Format the NameNode
$ bin/hdfs namenode -format
Formatting succeeded if the log ends with a line like "Storage directory … has been successfully formatted."
2. Start HDFS
$ sbin/start-dfs.sh
The daemons are up once jps lists NameNode, DataNode, and SecondaryNameNode.
3. Open http://hadoop129:50070 in a browser, replacing hadoop129 with your NameNode's host. If the NameNode web UI appears, you are one step closer to success.
4. Create the MapReduce input directory on HDFS
[fengling@hadoop129 hadoop-2.7.2]$ bin/hdfs dfs -mkdir -p /user/fengling/input
5. Upload files to the input directory
[fengling@hadoop129 hadoop-2.7.2]$ bin/hdfs dfs -put etc/hadoop/ input
6. Run the official grep example:
[fengling@hadoop129 hadoop-2.7.2]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep input/hadoop output_201909251418 'dfs[a-z.]+'
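MapReduce refuses to write into an existing output directory, which is presumably why a timestamp is baked into output_201909251418 above. A small sketch of generating such a name automatically (the variable name OUT is made up for illustration):

```shell
# Build a unique output path from the current time, e.g. output_201909251418.
# The job fails if the output path already exists, so a fresh name avoids that.
OUT="output_$(date +%Y%m%d%H%M)"
echo "$OUT"

# Then pass it to the example job (run from the hadoop-2.7.2 directory):
# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar \
#     grep input/hadoop "$OUT" 'dfs[a-z.]+'
```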
7. View the results
[fengling@hadoop129 hadoop-2.7.2]$ bin/hdfs dfs -cat /user/fengling/output_201909251418/*
6 dfs.audit.logger
4 dfs.class
3 dfs.server.namenode.
2 dfs.period
2 dfs.audit.log.maxfilesize
2 dfs.audit.log.maxbackupindex
1 dfsmetrics.log
1 dfsadmin
1 dfs.servers
1 dfs.replication
1 dfs.file
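The listing above is ordered by match count, descending; the grep example runs a second MapReduce job just to produce that ordering. Locally, the same ordering is a numeric reverse sort on the first column (the sample data below is made up):

```shell
# Reproduce the ordering of the final listing with a plain sort:
# numeric (-n) and reversed (-r) on the first field only (-k1,1)
printf '1\tdfs.file\n6\tdfs.audit.logger\n2\tdfs.period\n' | sort -k1,1nr
# 6	dfs.audit.logger
# 2	dfs.period
# 1	dfs.file
```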