1. Installation environment
- VMware 12
- ubuntukylin-15.10-desktop-amd64
- jdk1.8.0_73
- scala-2.10.5
- hadoop-2.6.0
- spark-1.6.0-bin-hadoop2.6
2. Install VMware
(omitted)
3. Install Ubuntu
Installation steps (omitted)
Configure Ubuntu
After installing one VM, clone it into three copies named work1, work2, and work3;
work1 is the master, and work2 and work3 are the slaves.
vim /etc/hosts
192.168.44.138 work1
192.168.44.139 work2
192.168.44.137 work3
For convenience, rename each machine to work1, work2, and work3 respectively:
vim /etc/hostname
4. Configure passwordless SSH login
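The SSH setup itself is not shown; a minimal sketch for generating and authorizing a key (run as the Hadoop user on work1; work2 and work3 are the hostnames mapped in /etc/hosts above):

```shell
# create ~/.ssh with the permissions sshd expects
mkdir -p ~/.ssh && chmod 700 ~/.ssh
# generate an RSA key pair without a passphrase, unless one already exists
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa -q
# authorize the key for logins to this machine
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```

Then push the key to each slave with `ssh-copy-id work2` and `ssh-copy-id work3`, and confirm that `ssh work2` logs in without asking for a password.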
5. Install the JDK
(installation steps omitted)
Set the environment variables:
vim /etc/profile
export JAVA_HOME=/usr/local/spark/jdk1.8.0_73
export PATH=$PATH:$JAVA_HOME/bin
6. Install Scala
(installation steps omitted)
vim /etc/profile
export SCALA_HOME=/usr/local/spark/scala-2.10.5
export PATH=$PATH:$SCALA_HOME/bin
7. Install Hadoop
(installation steps omitted)
vim /etc/profile
export HADOOP_HOME=/usr/local/spark/hadoop-2.6.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
8. Install Spark
(installation steps omitted)
vim /etc/profile
export SPARK_HOME=/usr/local/spark/spark-1.6.0-bin-hadoop2.6
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
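Once /etc/profile has been edited on a node, the four installs can be sanity-checked (a sketch; the printed versions should match the packages listed in section 1):

```shell
source /etc/profile      # reload the environment variables
java -version            # expect 1.8.0_73
scala -version           # expect 2.10.5
hadoop version           # expect 2.6.0
spark-submit --version   # expect 1.6.0
```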
9. Configure the Hadoop cluster
(1) Configure etc/hadoop/hadoop-env.sh
Set the Java and Hadoop paths:
export JAVA_HOME=/usr/local/spark/jdk1.8.0_73
export HADOOP_PREFIX=/usr/local/spark/hadoop-2.6.0
(2) Configure etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://work1:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/spark/hadoop-2.6.0/tmp</value>
</property>
</configuration>
(3) Configure etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>work1:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/spark/hadoop-2.6.0/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/spark/hadoop-2.6.0/dfs/data</value>
</property>
</configuration>
(4) Configure etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
(5) Configure etc/hadoop/yarn-env.sh
Set the Java path:
export JAVA_HOME=/usr/local/spark/jdk1.8.0_73
(6) Configure etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>work1</value>
</property>
</configuration>
(7) Configure etc/hadoop/slaves
work1
work2
work3
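Every node needs the same Hadoop configuration; one way to keep them in sync (a sketch, assuming identical install paths on all machines) is to push the config directory from work1:

```shell
# copy the whole etc/hadoop config directory to each slave
for host in work2 work3; do
  scp -r /usr/local/spark/hadoop-2.6.0/etc/hadoop "$host":/usr/local/spark/hadoop-2.6.0/etc/
done
```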
10. Start the Hadoop cluster
Format the file system
Start the NameNode and DataNodes
Start the ResourceManager and NodeManagers
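On Hadoop 2.6 these three steps map to the following commands, run on work1 (the format step is destructive and should be done only once, before the first start):

```shell
hdfs namenode -format   # format HDFS (first start only)
start-dfs.sh            # NameNode on work1, DataNodes on the slaves
start-yarn.sh           # ResourceManager on work1, NodeManagers on the slaves
jps                     # list the running Java daemons to verify
```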
11. Configure Spark
Configure the environment variables
Configure spark-env.sh
Configure slaves
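A minimal conf/spark-env.sh consistent with the paths used earlier (a sketch; the worker memory value is an assumption, tune it to the VMs' RAM):

```shell
# conf/spark-env.sh (on every node)
export JAVA_HOME=/usr/local/spark/jdk1.8.0_73
export SCALA_HOME=/usr/local/spark/scala-2.10.5
export HADOOP_CONF_DIR=/usr/local/spark/hadoop-2.6.0/etc/hadoop
export SPARK_MASTER_IP=work1      # the standalone master host
export SPARK_WORKER_MEMORY=1g     # assumed per-worker memory
```

conf/slaves simply lists one worker hostname per line, the same as the Hadoop slaves file.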
12. Start Spark
Start the Master node
Run:
start-master.sh
Start all Worker nodes:
start-slaves.sh
View the Spark cluster info in a browser (the master web UI is at http://work1:8080 by default).
Optionally, start the history server:
./start-history-server.sh
Run the bundled example:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://work1:7077 ./lib/spark-examples-1.6.0-hadoop2.6.0.jar 1000
./bin/spark-submit \
  --class <main-class> \
  --master <master-url> \
  --deploy-mode <deploy-mode> \
  --conf <key>=<value> \
  ... # other options
  <application-jar> \
  [application-arguments]
13. Stop the Spark cluster
Stop the Master node:
stop-master.sh
Stop the Worker nodes:
stop-slaves.sh
Stop the Hadoop cluster
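For symmetry with the start-up steps, the shutdown commands (run on work1):

```shell
stop-yarn.sh   # stop the ResourceManager and NodeManagers
stop-dfs.sh    # stop the NameNode and DataNodes
```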