My Big Data Journey - Hadoop Single-Node Cluster

Contents

Software List

Extract to the Target Directory

Add Hadoop to the Environment Variables

Running

Standalone Operation, official grep example

Pseudo-Distributed Operation



Software List

  • jdk1.8
  • hadoop-2.7.2.tar.gz

Extract to the Target Directory

[fengling@hadoop129 software]$ tar -zxvf hadoop-2.7.2.tar.gz -C /opt/module/

Add Hadoop to the Environment Variables

  • Get the Hadoop installation path
[fengling@hadoop129 hadoop-2.7.2]$ pwd
/opt/module/hadoop-2.7.2
  • Open the /etc/profile file
[fengling@hadoop129 hadoop-2.7.2]$ sudo vim /etc/profile
  • Append the following configuration at the end of the file
#HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop-2.7.2
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin

Save and exit.

  • Apply the changes immediately
[fengling@hadoop129 hadoop-2.7.2]$ source /etc/profile
  • Verify that the installation succeeded
[fengling@hadoop129 hadoop-2.7.2]$ hadoop version
Hadoop 2.7.2
Subversion Unknown -r Unknown
Compiled by root on 2017-05-22T10:49Z
Compiled with protoc 2.5.0
From source with checksum d0fda26633fa762bff87ec759ebe689c
This command was run using /opt/module/hadoop-2.7.2/share/hadoop/common/hadoop-common-2.7.2.jar

Running

Standalone Operation, official grep example

[fengling@hadoop129 hadoop-2.7.2]$ pwd
/opt/module/hadoop-2.7.2
[fengling@hadoop129 hadoop-2.7.2]$ ls
bin  etc  include  lib  libexec  LICENSE.txt  NOTICE.txt  README.txt  sbin  share
[fengling@hadoop129 hadoop-2.7.2]$ mkdir input
[fengling@hadoop129 hadoop-2.7.2]$ cp etc/hadoop/*.xml input
[fengling@hadoop129 hadoop-2.7.2]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep input output 'dfs[a-z.]+'
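
In standalone mode the job writes its results to the local output/ directory rather than HDFS. A quick way to inspect them (this extra step mirrors the official single-node setup guide and is not part of the original walkthrough):

[fengling@hadoop129 hadoop-2.7.2]$ cat output/*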


Pseudo-Distributed Operation

Configure the cluster

Create a temporary directory for data storage

[fengling@hadoop129 hadoop-2.7.2]$ mkdir -p data/tmp
[fengling@hadoop129 tmp]$ pwd
/opt/module/hadoop-2.7.2/data/tmp

$ vim etc/hadoop/core-site.xml

<configuration>
    <!-- Address of the HDFS NameNode -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop129:9000</value>
    </property>
    <!-- Directory where files generated at Hadoop runtime are stored -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/module/hadoop-2.7.2/data/tmp</value>
    </property>
</configuration>

$ vim etc/hadoop/hdfs-site.xml

Set dfs.replication to 1, since this pseudo-distributed setup runs only a single DataNode:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

$ vim etc/hadoop/hadoop-env.sh

First find your JDK path:

[fengling@hadoop129 hadoop-2.7.2]$ echo $JAVA_HOME
/opt/module/jdk1.8.0_161

Then set JAVA_HOME explicitly in etc/hadoop/hadoop-env.sh:

# The java implementation to use.
export JAVA_HOME=/opt/module/jdk1.8.0_161

If JAVA_HOME is not set here, starting the daemons may fail with an error such as "Error: JAVA_HOME is not set and could not be found."

Passwordless SSH login

# Run the following command
 $ ssh localhost

# If the previous step prompted for a password, run the following

 $ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
 $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
 $ chmod 0600 ~/.ssh/authorized_keys
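
To confirm the key works, try ssh localhost again; it should log you in without prompting for a password (this verification step is an addition, not from the original post):

 $ ssh localhost
 $ exit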

Run a MapReduce job locally

1. Format the NameNode

$ bin/hdfs namenode -format

Formatting has succeeded when the log ends with a message such as "Storage directory ... has been successfully formatted."
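
As an additional check (not in the original post), the directory configured via hadoop.tmp.dir should now contain the NameNode metadata:

[fengling@hadoop129 hadoop-2.7.2]$ ls data/tmp/dfs/name/current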

2. Start HDFS

$ sbin/start-dfs.sh
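
A quick way to confirm the daemons came up (an added check, not part of the original post) is jps; the list should include NameNode, DataNode, and SecondaryNameNode:

[fengling@hadoop129 hadoop-2.7.2]$ jps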

If the NameNode, DataNode, and SecondaryNameNode processes are all running, HDFS has started successfully.

3. Open http://hadoop129:50070 in a browser; replace hadoop129 with the hostname of your HDFS NameNode.

If the NameNode web UI comes up, you are one step closer to success.

4. Create the MapReduce input directory on HDFS

[fengling@hadoop129 hadoop-2.7.2]$ bin/hdfs dfs -mkdir -p /user/fengling/input

5. Upload files to the input directory

[fengling@hadoop129 hadoop-2.7.2]$ bin/hdfs dfs -put etc/hadoop/ input
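
To verify the upload (an extra check, not in the original post), list the input directory; the -put above copies the local etc/hadoop directory to input/hadoop on HDFS:

[fengling@hadoop129 hadoop-2.7.2]$ bin/hdfs dfs -ls input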

6. Run the official grep example:

[fengling@hadoop129 hadoop-2.7.2]$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep input/hadoop output_201909251418 'dfs[a-z.]+'

7. View the results

[fengling@hadoop129 hadoop-2.7.2]$ bin/hdfs dfs -cat /user/fengling/output_201909251418/*
6       dfs.audit.logger
4       dfs.class
3       dfs.server.namenode.
2       dfs.period
2       dfs.audit.log.maxfilesize
2       dfs.audit.log.maxbackupindex
1       dfsmetrics.log
1       dfsadmin
1       dfs.servers
1       dfs.replication
1       dfs.file
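
When you are done, the daemons can be stopped as described in the official guide (this step is an addition to the original post):

[fengling@hadoop129 hadoop-2.7.2]$ sbin/stop-dfs.sh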
