Using a Mesos Master URL
The master URLs for Mesos take the form mesos://host:5050 for a single-master Mesos cluster, or mesos://zk://host:2181 for a multi-master Mesos cluster using ZooKeeper.
The driver also needs some configuration in spark-env.sh to interact properly with Mesos:
- In spark-env.sh, set some environment variables:
  - export MESOS_NATIVE_LIBRARY=<path to libmesos.so>. This path is typically <prefix>/lib/libmesos.so, where the prefix is /usr/local by default. See the Mesos installation instructions above. On Mac OS X, the library is called libmesos.dylib instead of libmesos.so.
  - export SPARK_EXECUTOR_URI=<URL of spark-1.0.1.tar.gz uploaded above>.
- Also set spark.executor.uri to <URL of spark-1.0.1.tar.gz>.
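Putting these settings together, a conf/spark-env.sh might look like the following sketch. Both values are placeholders: the library path depends on your Mesos installation prefix, and the executor URI depends on where you uploaded the Spark tarball.

```shell
# conf/spark-env.sh -- example Mesos settings (values are placeholders).

# Path to the Mesos native library; use libmesos.dylib on Mac OS X.
export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so

# URL executors download the Spark distribution from (hypothetical HDFS path).
export SPARK_EXECUTOR_URI=hdfs://hadoop-master:9000/spark/spark-1.0.1.tar.gz
```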
Now, when starting a Spark application against the cluster, pass a mesos:// URL as the master when creating a SparkContext. For example:
val conf = new SparkConf()
.setMaster("mesos://HOST:5050")
.setAppName("My app")
.set("spark.executor.uri", "<path to spark-1.0.1.tar.gz uploaded above>")
val sc = new SparkContext(conf)
(You can also use spark-submit and configure spark.executor.uri in the conf/spark-defaults.conf file. Note that spark-submit currently only supports deploying the Spark driver in client mode for Mesos.)
When running a shell, the spark.executor.uri parameter is inherited from SPARK_EXECUTOR_URI, so it does not need to be redundantly passed in as a system property:
./bin/spark-shell --master mesos://host:5050
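For the spark-submit route mentioned above, the equivalent settings can live in conf/spark-defaults.conf. A minimal sketch, where the executor URI is a placeholder for wherever you uploaded the tarball:

```
spark.master          mesos://host:5050
spark.executor.uri    hdfs://hadoop-master:9000/spark/spark-1.0.1.tar.gz
```

With these defaults in place, spark-submit picks up the master and executor URI without repeating them on the command line.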
Mesos Run Modes
Spark can run over Mesos in two modes: “fine-grained” (default) and “coarse-grained”.
In “fine-grained” mode (default), each Spark task runs as a separate Mesos task. This allows multiple instances of Spark (and other frameworks) to share machines at a very fine granularity, where each application gets more or fewer machines as it ramps up and down, but it comes with an additional overhead in launching each task. This mode may be inappropriate for low-latency requirements like interactive queries or serving web requests.
The “coarse-grained” mode will instead launch only one long-running Spark task on each Mesos machine, and dynamically schedule its own “mini-tasks” within it. The benefit is much lower startup overhead, but at the cost of reserving the Mesos resources for the complete duration of the application.
To run in coarse-grained mode, set the spark.mesos.coarse
property in your SparkConf:
conf.set("spark.mesos.coarse", "true")
In addition, for coarse-grained mode, you can control the maximum number of resources Spark will acquire. By default, it will acquire all cores in the cluster (that get offered by Mesos), which only makes sense if you run just one application at a time. You can cap the maximum number of cores using conf.set("spark.cores.max", "10"), for example.
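Combining the two properties above, a coarse-grained SparkConf might be set up as follows; the master host and the cap of 10 cores are illustrative values, not recommendations:

```scala
val conf = new SparkConf()
  .setMaster("mesos://HOST:5050")
  .setAppName("My app")
  .set("spark.mesos.coarse", "true") // one long-running Mesos task per machine
  .set("spark.cores.max", "10")      // cap total cores so other apps still get offers
val sc = new SparkContext(conf)
```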
Start Spark:
./bin/spark-shell --master mesos://127.0.1.1:5050
# Test: run a simple word count against HDFS
scala> val file = sc.textFile("hdfs://hadoop-master:9000/tmp/WifiScan_None_20140723.csv")
scala> val count = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
scala> count.count()
Reference: http://spark.apache.org/docs/latest/running-on-mesos.html