Spark Cluster Environment Setup

1. Runtime environment configuration

1. Software environment

scala-2.11.7.tgz
spark-1.6.0-bin-hadoop2.6.tgz
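
If the packages are not already on hand, they can be downloaded from the Scala and Apache archives; the URLs below are assumed mirror paths and may need adjusting.

wget https://www.scala-lang.org/files/archive/scala-2.11.7.tgz
wget https://archive.apache.org/dist/spark/spark-1.6.0/spark-1.6.0-bin-hadoop2.6.tgz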

2. Extract to the /usr/local directory

sudo tar -zxvf scala-2.11.7.tgz -C /usr/local/
sudo tar -zxvf spark-1.6.0-bin-hadoop2.6.tgz -C /usr/local/
sudo mv /usr/local/scala-2.11.7 /usr/local/scala
sudo mv /usr/local/spark-1.6.0-bin-hadoop2.6 /usr/local/spark

3. Change the ownership

sudo chown -R jiutian:jiutian /usr/local/scala
sudo chown -R jiutian:jiutian /usr/local/spark

4. Environment variables: edit the /etc/environment file and add the Scala and Spark entries.

JAVA_HOME=/usr/local/jdk1.8.0
HADOOP_INSTALL=/usr/local/hadoop
SCALA_HOME=/usr/local/scala
SPARK_HOME=/usr/local/spark
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/local/jdk1.8.0/bin:/usr/local/hadoop/bin:/usr/local/scala/bin:/usr/local/spark/bin"
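
/etc/environment is read at login rather than by an already-running shell, so log out and back in (or reboot) before testing. A quick sanity check, assuming the paths above:

echo $SCALA_HOME    # should print /usr/local/scala
echo $SPARK_HOME    # should print /usr/local/spark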

5. Environment test

scala -version            # check the Scala version
run-example SparkPi 10    # run the SparkPi example
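
The SparkPi example buries its result in a large amount of log output; a small sketch for picking out just the result line:

run-example SparkPi 10 2>&1 | grep "Pi is roughly"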

2. Cluster environment setup

1. Edit the slaves file in the spark/conf directory and add the cluster machines

# A Spark Worker will be started on each of the machines listed below.
slave1
slave2

2. Sync all installation and configuration files to the slave machines; a minimal rsync sketch is shown below.
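
A minimal sync sketch, assuming the slave1/slave2 hostnames from the slaves file, the jiutian user from the chown step, passwordless SSH, and that /usr/local on each slave is writable by that user (otherwise copy to a temporary directory and move it into place with sudo):

for host in slave1 slave2; do
  rsync -az /usr/local/scala /usr/local/spark jiutian@$host:/usr/local/
  scp /etc/environment jiutian@$host:/tmp/environment    # move into /etc with sudo on the slave
done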

3. Start Spark

# Hadoop needs to be started first
./sbin/start-all.sh

4. Check the processes with jps

# on the master, jps should show the Master process
Master
# on each slave, jps should show the Worker process
Worker
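
Besides jps, the standalone master serves a web UI on port 8080 by default, which lists the registered workers; a quick check run on the master node:

curl -s http://localhost:8080 | grep -i worker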

5. Start spark-shell
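
A minimal launch sketch; spark://master:7077 assumes the master's hostname is master (7077 is the standalone master's default port):

cd /usr/local/spark
./bin/spark-shell --master spark://master:7077

At the scala> prompt, a quick sanity check such as sc.parallelize(1 to 1000).count() should return 1000, confirming that the workers are reachable.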