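For software preparation, see "Spark 2.2.0 Cluster Setup and Deployment: Software Preparation".
For the basic configuration, see "Spark 2.2.0 Cluster Setup and Deployment: Basic Configuration".
For passwordless SSH access, see "Spark 2.2.0 Cluster Setup and Deployment: Passwordless Access".
Extract hadoop-2.7.7.tar.gz: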
tar -zxvf hadoop-2.7.7.tar.gz
Configure the environment variables by editing /etc/profile (vi /etc/profile) and appending:
export HADOOP_HOME=/root/xdb/hadoop-2.7.7
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_ROOT_LOGGER=INFO,console
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
Reload the configuration with source /etc/profile. Apply the settings above on every machine in the cluster.
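After reloading the profile, a quick sanity check can catch a mistyped path before any daemon is started. The following is a minimal sketch; check_hadoop_env is a hypothetical helper name, not something shipped with Hadoop, and it only verifies that the directory, the hadoop launcher and the two PATH entries are in place.

```shell
# Sketch only: check_hadoop_env is a hypothetical helper, not part of Hadoop.
# Usage: check_hadoop_env [HADOOP_HOME_DIR]   (defaults to $HADOOP_HOME)
check_hadoop_env() {
    local home="${1:-${HADOOP_HOME:-}}"
    local dir
    if [ -z "$home" ] || [ ! -d "$home" ]; then
        echo "HADOOP_HOME is unset or not a directory: '$home'"
        return 1
    fi
    if [ ! -x "$home/bin/hadoop" ]; then
        echo "missing executable: $home/bin/hadoop"
        return 1
    fi
    # Both bin and sbin are expected on PATH, as exported in /etc/profile
    for dir in "$home/bin" "$home/sbin"; do
        case ":$PATH:" in
            *":$dir:"*) ;;
            *) echo "not on PATH: $dir"; return 1 ;;
        esac
    done
    echo "ok: $home"
}
```

Running check_hadoop_env with no argument on each machine should print a line starting with "ok:" if the exports took effect.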
Edit $HADOOP_HOME/etc/hadoop/hadoop-env.sh:
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
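Set JAVA_HOME here explicitly even if it is already exported in the environment (the start scripts launch the daemons over SSH, where /etc/profile is typically not loaded); otherwise startup fails with "JAVA_HOME is not set and could not be found."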
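Since this edit has to be repeated on every node, it can be scripted. The sketch below is one possible way to do it; set_java_home is a hypothetical helper name, and it assumes hadoop-env.sh contains at most one line beginning with "export JAVA_HOME=".

```shell
# Sketch only: set_java_home is a hypothetical helper, not part of Hadoop.
# Usage: set_java_home FILE JAVA_HOME_PATH
set_java_home() {
    local file="$1" java_home="$2"
    if grep -q '^export JAVA_HOME=' "$file"; then
        # Rewrite the existing export line; '|' is the sed delimiter
        # because the path itself contains '/'
        sed "s|^export JAVA_HOME=.*|export JAVA_HOME=${java_home}|" "$file" > "$file.tmp" \
            && mv "$file.tmp" "$file"
    else
        # No export line yet: append one
        echo "export JAVA_HOME=${java_home}" >> "$file"
    fi
}
```

For the setup described here the call would be set_java_home "$HADOOP_HOME/etc/hadoop/hadoop-env.sh" /usr/lib/jvm/java-1.8.0-openjdk.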
Edit $HADOOP_HOME/etc/hadoop/slaves and list the worker hostnames, one per line:
slave1
Edit $HADOOP_HOME/etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/root/xdb/hadoop-2.7.7/tmp</value>
</property>
</configuration>
Edit $HADOOP_HOME/etc/hadoop/hdfs-site.xml:
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master:50090</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/root/xdb/hadoop-2.7.7/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/root/xdb/hadoop-2.7.7/hdfs/data</value>
</property>
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
</configuration>
Edit $HADOOP_HOME/etc/hadoop/mapred-site.xml (create it first with cp mapred-site.xml.template mapred-site.xml):
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
</property>
</configuration>
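When several similar properties are pasted into one *-site.xml file, it is easy to leave the same <name> in two <property> blocks, in which case only one of the values can take effect. A small check like the following sketch can flag that; dup_property_names is a hypothetical helper name, and it assumes each <name> element sits on a single line, as in the files shown here.

```shell
# Sketch only: dup_property_names is a hypothetical helper, not part of Hadoop.
# Usage: dup_property_names FILE
# Prints every property name that occurs more than once in FILE.
dup_property_names() {
    grep -o '<name>[^<]*</name>' "$1" \
        | sed -e 's|<name>||' -e 's|</name>||' \
        | sort | uniq -d
}
```

Running it over each file in $HADOOP_HOME/etc/hadoop should print nothing when every property name is unique.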
Edit $HADOOP_HOME/etc/hadoop/yarn-site.xml:
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:8088</value>
</property>
</configuration>
Copy the hadoop directory from the master node to slave1:
scp -r hadoop-2.7.7 slave1:/root/xdb/
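With more than one worker, the same copy can be driven from the slaves file instead of being typed per host. This is a rough sketch under that assumption; distribute_hadoop and the DRY_RUN switch are hypothetical names introduced for illustration, and passwordless SSH to every listed host is assumed to be in place.

```shell
# Sketch only: distribute_hadoop and DRY_RUN are hypothetical, not part of Hadoop.
# Usage: distribute_hadoop SLAVES_FILE SRC_DIR DEST_DIR
# With DRY_RUN=1 the scp commands are printed instead of executed.
distribute_hadoop() {
    local slaves_file="$1" src="$2" dest="$3" host
    # '|| [ -n "$host" ]' also handles a last line without a trailing newline
    while read -r host || [ -n "$host" ]; do
        # Skip blank lines and comment lines
        case "$host" in ''|'#'*) continue ;; esac
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "scp -r $src $host:$dest"
        else
            # Redirect stdin so scp cannot consume the rest of the host list
            scp -r "$src" "$host:$dest" < /dev/null
        fi
    done < "$slaves_file"
}
```

For example, DRY_RUN=1 distribute_hadoop "$HADOOP_HOME/etc/hadoop/slaves" hadoop-2.7.7 /root/xdb/ prints the commands first so they can be reviewed before the real copy.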
Start the cluster from the master node. Before the first start, format the NameNode (this only needs to be done once):
hdfs namenode -format
Start the cluster:
$HADOOP_HOME/sbin/start-all.sh
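To verify, run jps on each node.
On the NameNode host (master), the output should list NameNode, SecondaryNameNode and ResourceManager.
On the DataNode host (slave1), the output should list DataNode and NodeManager.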
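The jps check can also be scripted. The sketch below reads jps-style output ("<pid> <name>" per line) on stdin and reports which expected daemons are absent; missing_daemons is a hypothetical helper name. The expected lists are an assumption derived from the configuration in this article: NameNode, SecondaryNameNode and ResourceManager on master, DataNode and NodeManager on slave1.

```shell
# Sketch only: missing_daemons is a hypothetical helper, not part of Hadoop.
# Usage: jps | missing_daemons "NameNode SecondaryNameNode ResourceManager"
# Prints each expected daemon name that does not appear in the jps output.
missing_daemons() {
    local expected="$1" running name
    # jps prints "<pid> <name>"; keep only the name column
    running=$(awk '{print $2}')
    for name in $expected; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode
        printf '%s\n' "$running" | grep -qx "$name" || echo "$name"
    done
}
```

On slave1 the call would be jps | missing_daemons "DataNode NodeManager"; empty output means nothing expected is missing.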
The Hadoop management UI (the YARN ResourceManager web interface) is available at http://192.168.195.129:8088. Its Active Nodes count should show one entry, namely slave1.
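The same number can be read without a browser from the ResourceManager REST endpoint /ws/v1/cluster/metrics, whose JSON response includes an activeNodes field. The following is a minimal sketch; active_nodes is a hypothetical helper name, and it uses sed rather than a JSON parser on the assumption that no extra tools are installed.

```shell
# Sketch only: active_nodes is a hypothetical helper, not part of Hadoop.
# Usage: curl -s http://192.168.195.129:8088/ws/v1/cluster/metrics | active_nodes
# Reads the cluster-metrics JSON on stdin and prints the activeNodes value.
active_nodes() {
    sed -n 's/.*"activeNodes":[ ]*\([0-9][0-9]*\).*/\1/p'
}
```

For the two-machine cluster described here the printed value is expected to be 1 once slave1 has registered its NodeManager.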