1. Download Spark 2.3.1
Download page: http://spark.apache.org/downloads.html (choose the pre-built package spark-2.3.1-bin-hadoop2.7)
2. Install Spark 2.3.1
Upload the archive to the /usr/spark directory, then extract it:
tar -zxvf spark-2.3.1-bin-hadoop2.7.tgz
3. Edit /etc/hosts so the nodes can resolve each other by hostname:
vim /etc/hosts
  127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
  ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
  192.168.2.185 sky1
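Only the sky1 entry is shown above; with four machines, every node's /etc/hosts needs a line for each host in the cluster, not just its own. A sketch, assuming all nodes sit on the same 192.168.2.x subnet (the sky2..sky4 addresses below are hypothetical; only 192.168.2.185 is given in this post):

```
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.2.185 sky1
192.168.2.186 sky2
192.168.2.187 sky3
192.168.2.188 sky4
```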
Edit /etc/sysconfig/network to set the hostname:
vim /etc/sysconfig/network
  NETWORKING=yes
  HOSTNAME=sky1
  GATEWAY=192.168.2.1
(Use the matching hostname, sky2 through sky4, on the other machines.)
4. Edit the Spark configuration files (a four-machine cluster is used as the example)
conf/slaves — one worker hostname per line:
vim conf/slaves
  sky1
  sky2
  sky3
  sky4
conf/spark-env.sh — master address and per-worker resources:
vim conf/spark-env.sh
  export JAVA_HOME=/usr/java/jdk
  export SPARK_MASTER_HOST=sky1
  export SPARK_MASTER_PORT=7077
  export SPARK_WORKER_CORES=1
  export SPARK_WORKER_MEMORY=1g
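For reference, the same spark-env.sh settings with each variable annotated (values as in this post; the comments are my reading of the standalone-mode documentation, not part of the original):

```shell
# conf/spark-env.sh -- annotated version of the settings above
export JAVA_HOME=/usr/java/jdk     # JDK install path; must be valid on every node
export SPARK_MASTER_HOST=sky1      # hostname the master binds to
export SPARK_MASTER_PORT=7077      # master RPC port; workers and spark-submit connect here
export SPARK_WORKER_CORES=1        # CPU cores each worker offers to executors
export SPARK_WORKER_MEMORY=1g      # memory each worker offers to executors
```

Because the file is copied to every machine in the next step, the same values end up on all nodes, which is fine when the machines are identical.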
5. After editing, copy the Spark directory to the other machines:
scp -r /usr/spark/spark-2.3.1-bin-hadoop2.7 root@sky2:/usr/spark
Repeat for sky3 and sky4.
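The copy step can be looped over the remaining workers. A sketch, assuming sky2..sky4 from conf/slaves (sky1, the master, already has the configured directory); the loop only prints the commands so it doubles as a dry run, and its output can be piped to `sh` to actually copy:

```shell
# Generate one scp command per remaining worker; pipe to `sh` to execute.
SPARK_DIR=/usr/spark/spark-2.3.1-bin-hadoop2.7
for host in sky2 sky3 sky4; do
  echo "scp -r $SPARK_DIR root@${host}:/usr/spark"
done
```

With passwordless SSH keys set up for root, the piped form copies to all three workers unattended.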
6. Start Spark
Stop the firewall before starting (service iptables stop on CentOS 6; on CentOS 7+ with firewalld, the equivalent is systemctl stop firewalld):
./sbin/start-all.sh
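Instead of disabling the firewall outright, the relevant ports can be opened (iptables syntax, matching the CentOS 6-era `service iptables` tooling the post assumes). Note that executors also communicate over ephemeral ports, which is why, on a trusted LAN, the stop-the-firewall approach above is the simpler option:

```
# Open the master RPC port (7077) and the web UI (8080) configured above
iptables -I INPUT -p tcp --dport 7077 -j ACCEPT
iptables -I INPUT -p tcp --dport 8080 -j ACCEPT
service iptables save   # persist the rules across reboots (CentOS 6)
```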
Other launch scripts (http://spark.apache.org/docs/latest/spark-standalone.html):
sbin/start-master.sh - Starts a master instance on the machine the script is executed on.
sbin/start-slaves.sh - Starts a slave instance on each machine specified in the conf/slaves file.
sbin/start-slave.sh - Starts a slave instance on the machine the script is executed on.
sbin/start-all.sh - Starts both a master and a number of slaves as described above.
sbin/stop-master.sh - Stops the master that was started via the sbin/start-master.sh script.
sbin/stop-slaves.sh - Stops all slave instances on the machines specified in the conf/slaves file.
sbin/stop-all.sh - Stops both the master and the slaves as described above.
7. Verify the startup:
http://IP:8080/ — the Spark master web UI; every worker should appear with state ALIVE
netstat -antlp — check that the Spark ports (7077 for the master, 8080 for the web UI) are listening
8. Test the cluster (http://spark.apache.org/docs/latest/submitting-applications.html):
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://sky1:7077 examples/jars/spark-examples_2.11-2.3.1.jar 10000
On success the driver output includes a line of the form "Pi is roughly 3.14...".
Other spark-submit examples (from the same documentation page):
# Run application locally on 8 cores
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master local[8] \
  /path/to/examples.jar \
  100

# Run on a Spark standalone cluster in client deploy mode
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://207.184.161.138:7077 \
  --executor-memory 20G \
  --total-executor-cores 100 \
  /path/to/examples.jar \
  1000

# Run on a Spark standalone cluster in cluster deploy mode with supervise
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  /path/to/examples.jar \
  1000

# Run on a YARN cluster
export HADOOP_CONF_DIR=XXX
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster \  # can be client for client mode
  --executor-memory 20G \
  --num-executors 50 \
  /path/to/examples.jar \
  1000

# Run a Python application on a Spark standalone cluster
./bin/spark-submit \
  --master spark://207.184.161.138:7077 \
  examples/src/main/python/pi.py \
  1000

# Run on a Mesos cluster in cluster deploy mode with supervise
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master mesos://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  http://path/to/examples.jar \
  1000

# Run on a Kubernetes cluster in cluster deploy mode
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master k8s://xx.yy.zz.ww:443 \
  --deploy-mode cluster \
  --executor-memory 20G \
  --num-executors 50 \
  http://path/to/examples.jar \
  1000