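Reference: http://blog.csdn.net/jediael_lu/article/details/45310321
This article supplements and revises the reference above, based on my local environment and the problems I ran into during installation.
The following describes how to install Spark on a single machine, which is suitable for testing and development. It is divided into five parts:
(1) Environment preparation
(2) Installing Scala
(3) Installing Spark
(4) Verifying the installation
(5) Possible problems
1. Environment preparation
(1) Software version requirements: Spark runs on Java 6+ and Python 2.6+. For the Scala API, Spark 1.3.1 uses Scala 2.10. You will need to use a compatible Scala version (2.10.x).
(2) Install Linux, the JDK, and Python. Most Linux distributions ship with a JDK and Python preinstalled, but note that the default JDK is OpenJDK; installing the Oracle JDK instead is recommended.
(3) IP: 192.168.198.130, hostname: corp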
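The version requirements above can be checked with a small script. This is only a sketch: `version_ge` and `report` are hypothetical helper names, not part of Spark or the JDK, and the check runs only for tools that are actually on the PATH.

```shell
#!/bin/sh
# version_ge A B: succeed if dotted version A >= B, compared field by field.
# A suffix such as "_51" in "1.7.0_51" is ignored by the numeric conversion.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# report NAME VERSION MINIMUM: print whether VERSION satisfies MINIMUM.
report() {
    if version_ge "$2" "$3"; then
        echo "$1 $2: OK (>= $3)"
    else
        echo "$1 $2: too old (need >= $3)"
    fi
}

# Java 6 reports itself as 1.6; the version is the quoted string on line 1.
if command -v java >/dev/null 2>&1; then
    report Java "$(java -version 2>&1 | awk -F'"' 'NR == 1 { print $2 }')" 1.6
fi
# "python --version" prints e.g. "Python 2.7.6" (on stderr for Python 2).
if command -v python >/dev/null 2>&1; then
    report Python "$(python --version 2>&1 | awk '{ print $2 }')" 2.6
fi
```

Called directly, `report Java 1.7.0_51 1.6` reports the JDK used later in this article as new enough.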
2. Installing Scala
(1) Download Scala
wget http://downloads.typesafe.com/scala/2.10.5/scala-2.10.5.tgz
(2) Extract the archive
tar -zxvf scala-2.10.5.tgz
(3) Configure environment variables
$ vi /etc/profile
#SCALA VARIABLES START
export SCALA_HOME=/home/corey/setup/scala-2.10.5
export PATH=$PATH:$SCALA_HOME/bin
#SCALA VARIABLES END
$ source /etc/profile
$ scala -version
Scala code runner version 2.10.5 -- Copyright 2002-2013, LAMP/EPFL
(4) Verify Scala
$ scala
Welcome to Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_51).
Type in expressions to have them evaluated.
Type :help for more information.
scala> 9*9
res0: Int = 81
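Since Spark 1.3.1 needs a 2.10.x Scala, the banner printed by `scala -version` can be checked automatically. The following is a minimal sketch; `parse_scala_version` and `is_scala_210` are hypothetical helper names, and the live check is skipped when `scala` is not on the PATH.

```shell
#!/bin/sh
# parse_scala_version LINE: print the version number found in a banner such as
#   "Scala code runner version 2.10.5 -- Copyright 2002-2013, LAMP/EPFL"
parse_scala_version() {
    printf '%s\n' "$1" | sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p'
}

# is_scala_210 VERSION: succeed only for 2.10 or 2.10.x.
is_scala_210() {
    case "$1" in
        2.10|2.10.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Capture both stdout and stderr, since the banner may be written to stderr.
if command -v scala >/dev/null 2>&1; then
    ver=$(parse_scala_version "$(scala -version 2>&1)")
    if is_scala_210 "$ver"; then
        echo "Scala $ver is compatible with Spark 1.3.1"
    else
        echo "Scala $ver is not a 2.10.x release"
    fi
fi
```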
3. Installing Spark
(1) Download Spark
wget http://mirror.bit.edu.cn/apache/spark/spark-1.3.1/spark-1.3.1-bin-hadoop2.6.tgz
(2) Extract Spark
tar -zxvf spark-1.3.1-bin-hadoop2.6.tgz
(3) Configure environment variables
$ vim /etc/profile
#SPARK VARIABLES START
export SPARK_HOME=/home/corey/setup/spark-1.3.1-bin-hadoop2.6
export PATH=$PATH:$SPARK_HOME/bin
#SPARK VARIABLES END
$ source /etc/profile
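Because /etc/profile may be sourced more than once, appending to PATH unconditionally can leave duplicate entries. One possible way to guard against that is sketched below; `path_contains` and `path_append` are hypothetical helper names, and the default directories are simply the ones used in this article.

```shell
#!/bin/sh
# path_contains DIR: succeed if DIR is already an entry of $PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# path_append DIR: add DIR to the end of $PATH unless it is already present.
path_append() {
    path_contains "$1" || PATH="$PATH:$1"
    export PATH
}

# Fall back to this article's install locations when the variables are unset.
SCALA_HOME=${SCALA_HOME:-/home/corey/setup/scala-2.10.5}
SPARK_HOME=${SPARK_HOME:-/home/corey/setup/spark-1.3.1-bin-hadoop2.6}
export SCALA_HOME SPARK_HOME

path_append "$SCALA_HOME/bin"
path_append "$SPARK_HOME/bin"
```

Running the snippet a second time leaves PATH unchanged, which is the point of the guard.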
(4) Configure Spark
$ pwd
/home/corey/setup/spark-1.3.1-bin-hadoop2.6/conf
$ mv spark-env.sh.template spark-env.sh
$ vim spark-env.sh
export SCALA_HOME=/home/corey/setup/scala-2.10.5
export JAVA_HOME=/usr/bin/jvm/jdk1.7.0_79
export SPARK_MASTER_IP=192.168.198.130
export SPARK_WORKER_MEMORY=512m
export master=spark://192.168.198.130:7070
$ vi slaves
master
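A typo in spark-env.sh is easy to miss, so it can help to confirm that the file really exports the variables set above. This is a rough sketch only: `has_export` and `missing_exports` are hypothetical helper names, and the check simply looks for uncommented `export NAME=` lines without validating the values.

```shell
#!/bin/sh
# has_export FILE NAME: succeed if FILE has an uncommented "export NAME=" line.
has_export() {
    grep -Eq "^[[:space:]]*export[[:space:]]+$2=" "$1"
}

# missing_exports FILE NAME...: print each NAME that FILE does not export.
missing_exports() {
    file=$1
    shift
    for name in "$@"; do
        has_export "$file" "$name" || echo "$name"
    done
}

# Check the real file only if it exists; the path is the one from this article.
conf=${SPARK_HOME:-/home/corey/setup/spark-1.3.1-bin-hadoop2.6}/conf/spark-env.sh
if [ -f "$conf" ]; then
    missing_exports "$conf" SCALA_HOME JAVA_HOME SPARK_MASTER_IP SPARK_WORKER_MEMORY
fi
```

An empty result means all four variables are exported; any name printed is missing or commented out.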
(5) Start Spark
$ pwd
/home/corey/setup/spark-1.3.1-bin-hadoop2.6/sbin
$ ./start-all.sh
Note: Hadoop also ships a start-all.sh script, so you must change into this specific directory and run the script from there.
$ jps
30302 Worker
30859 Jps
30172 Master
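The jps listing above can also be checked by a script instead of by eye. The sketch below assumes the two-column "PID name" format shown above; `daemon_running` and `check_spark_daemons` are hypothetical helper names.

```shell
#!/bin/sh
# daemon_running NAME: read a jps listing on stdin and succeed if one of the
# listed processes is named exactly NAME.
daemon_running() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# check_spark_daemons: read a jps listing on stdin and report on the two
# processes that start-all.sh is expected to launch on a single machine.
check_spark_daemons() {
    listing=$(cat)
    for d in Master Worker; do
        if printf '%s\n' "$listing" | daemon_running "$d"; then
            echo "$d: running"
        else
            echo "$d: NOT running"
        fi
    done
}

# Run against the live process list only when jps is available.
if command -v jps >/dev/null 2>&1; then
    jps | check_spark_daemons
fi
```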
4. Verifying the installation
(1) Run a bundled example
$ bin/run-example org.apache.spark.examples.SparkPi
(2) View the cluster status
http://master:8080/
(3) Enter spark-shell
$ spark-shell
(4) View jobs and other information
http://master:4040/jobs/
5. Possible problems
(1) Spark fails to start
$ vim spark-env.sh
export master=spark://192.168.198.130:7070
Replace the host part of this URL with your own hostname, which you can look up as follows (corp on this machine):
corey@corp:~$ hostname
corp
Alternatively, add an IP mapping for master to the hosts file:
corey@corp:~$ sudo vim /etc/hosts
192.168.198.130 master
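Whether the mapping is in place can be confirmed with a short lookup against the hosts file. This is a sketch under the assumption that the file uses the usual "IP name [alias...]" layout; `hosts_lookup` is a hypothetical helper name.

```shell
#!/bin/sh
# hosts_lookup FILE NAME: print the first IP address that FILE maps NAME to,
# and fail if there is no such mapping. Comment lines and trailing "# ..."
# comments are ignored.
hosts_lookup() {
    awk -v name="$2" '
        /^[ \t]*#/ { next }
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break
                if ($i == name) { print $1; found = 1; exit }
            }
        }
        END { exit !found }
    ' "$1"
}

# Look up the name used in the slaves file and in the web UI URLs.
if [ -r /etc/hosts ]; then
    hosts_lookup /etc/hosts master || echo "no mapping for master in /etc/hosts"
fi
```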
(2) jps cannot be found
jps lives in the $JAVA_HOME/bin directory; add that directory to PATH:
$ vim /etc/profile
export PATH=$PATH:/usr/bin/jvm/jdk1.7.0_79/bin
$ source /etc/profile
(3) Running from IDEA fails with a ClassNotFoundException, for example:
15/05/30 19:00:25 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, 192.168.163.128): java.lang.ClassNotFoundException: com.spark.demo.SparkPi$$anonfun$1
To fix this, add the application JAR to the SparkConf via setJars so that the workers can load your application's classes.