Similar to the previous article, this one focuses on cluster mode.
1. Issue the command
./bin/spark-submit --class org.apache.spark.examples.JavaWordCount --deploy-mode cluster --master spark://gzsw-02:6066 lib/spark-examples-1.4.1-hadoop2.4.0.jar hdfs://host02:/user/hadoop/input.txt
note: 1) the --deploy-mode parameter must be specified as 'cluster'.
2) the --master param is the REST URL, i.e.
REST URL: spark://gzsw-02:6066 (cluster mode)
which is shown on the Spark master UI page, since Spark uses rest.RestSubmissionClient to submit the job.
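As a side note, the same submission can also be triggered from Java code instead of the shell, via the SparkLauncher API that ships with Spark 1.4+. A minimal sketch, reusing the paths and REST URL from the command above (SPARK_HOME and the jar path are taken from this cluster's layout and will differ in your setup):

import org.apache.spark.launcher.SparkLauncher;

public class SubmitWordCount {
  public static void main(String[] args) throws Exception {
    // programmatic equivalent of the spark-submit command in step 1
    Process spark = new SparkLauncher()
        .setSparkHome("/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4")
        .setAppResource("lib/spark-examples-1.4.1-hadoop2.4.0.jar")
        .setMainClass("org.apache.spark.examples.JavaWordCount")
        .setMaster("spark://gzsw-02:6066")
        .setDeployMode("cluster")
        .addAppArgs("hdfs://host02:/user/hadoop/input.txt")
        .launch();
    // launch() starts a spark-submit child process; wait for it to finish
    int exit = spark.waitFor();
    System.out.println("spark-submit exited with code " + exit);
  }
}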
2. Run logs on the user side (brief, since this is cluster mode)
Spark Command: /usr/local/jdk/jdk1.6.0_31/bin/java -cp /home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/conf/:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-assembly-1.4.1-hadoop2.4.0.jar:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar:/usr/local/hadoop/hadoop-2.5.2/etc/hadoop/ -XX:MaxPermSize=256m org.apache.spark.deploy.SparkSubmit --master spark://gzsw-02:6066 --deploy-mode cluster --class org.apache.spark.examples.JavaWordCount lib/spark-examples-1.4.1-hadoop2.4.0.jar hdfs://hd02:/user/hadoop/input.txt
========================================
- executed cmd returned by Main.java: /usr/local/jdk/jdk1.6.0_31/bin/java -cp /home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/conf/:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-assembly-1.4.1-hadoop2.4.0.jar:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar:/usr/local/hadoop/hadoop-2.5.2/etc/hadoop/ -XX:MaxPermSize=256m org.apache.spark.deploy.SparkSubmit --master spark://gzsw-02:6066 --deploy-mode cluster --class org.apache.spark.examples.JavaWordCount lib/spark-examples-1.4.1-hadoop2.4.0.jar hdfs://host02:/user/hadoop/input.txt
Running Spark using the REST application submission protocol.
16/09/19 11:26:06 INFO rest.RestSubmissionClient: Submitting a request to launch an application in spark://gzsw-02:6066.
16/09/19 11:26:07 INFO rest.RestSubmissionClient: Submission successfully created as driver-20160919112607-0001. Polling submission state...
16/09/19 11:26:07 INFO rest.RestSubmissionClient: Submitting a request for the status of submission driver-20160919112607-0001 in spark://gzsw-02:6066.
16/09/19 11:26:07 INFO rest.RestSubmissionClient: State of driver driver-20160919112607-0001 is now RUNNING.
16/09/19 11:26:07 INFO rest.RestSubmissionClient: Driver is running on worker worker-20160914175456-192.168.100.14-36693 at 192.168.100.14:36693.
16/09/19 11:26:07 INFO rest.RestSubmissionClient: Server responded with CreateSubmissionResponse:
{
  "action" : "CreateSubmissionResponse",
  "message" : "Driver successfully submitted as driver-20160919112607-0001",
  "serverSparkVersion" : "1.4.1",
  "submissionId" : "driver-20160919112607-0001",
  "success" : true
}
16/09/19 11:26:07 INFO util.Utils: Shutdown hook called
So we know the driver is running on worker 192.168.100.14:36693 (not on the local host).
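Since the logs show submission goes through the REST server on port 6066, you can also poll the driver state yourself once you have the submission id. A minimal sketch, assuming the unofficial status endpoint /v1/submissions/status/<submissionId> of the REST submission server (the same protocol rest.RestSubmissionClient speaks; it is not a documented public API, so treat this as an illustration):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class DriverStatus {
  public static void main(String[] args) throws Exception {
    // the submission id is printed in the logs above
    String submissionId = "driver-20160919112607-0001";
    URL url = new URL("http://gzsw-02:6066/v1/submissions/status/" + submissionId);
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("GET");
    // print the JSON response (driver state, worker host:port, etc.)
    BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
    String line;
    while ((line = in.readLine()) != null) {
      System.out.println(line);
    }
    in.close();
  }
}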
3. FAQ
1) In cluster mode, the driver info is shown on the Spark master UI page (but not in client mode).
(app-0000 and app-0001 were both run in cluster mode, so their corresponding drivers are shown in the 'Completed Drivers' block.)
2) The application detail UI can't be opened, i.e. when you click an app that was run in cluster mode, an error similar to this will complain:
Application history not found (app-20160919151936-0000)
No event logs found for application JavaWordCount in file:/home/hadoop/spark/spark-eventlog/. Did you specify the correct logging directory?
This message appears because, in cluster mode, the driver runs on some other worker instead of on the master's local host, so a request to the master finds nothing about this app in the local event-log directory.
Workaround: use an HDFS path instead of the local filesystem, i.e.
spark.eventLog.dir=hdfs://host02:8020/user/hadoop/spark-eventlog
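For completeness, event logging also has to be switched on, and the HDFS directory must exist before the app starts. A minimal sketch of the relevant lines in conf/spark-defaults.conf (reusing the HDFS path above):

spark.eventLog.enabled    true
spark.eventLog.dir        hdfs://host02:8020/user/hadoop/spark-eventlog

and create the directory up front:

hadoop fs -mkdir -p /user/hadoop/spark-eventlog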
3) Applications disappear after restarting Spark
Even though you set a distributed filesystem in 'spark.eventLog.dir' as mentioned above, you will still see nothing after restarting Spark. That means the Spark master only keeps application info in memory while it is alive and loses it on restart. Spark ships a history server (sbin/start-history-server.sh) to solve this problem [1].
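A minimal sketch of that fix: point the history server at the same event-log directory in conf/spark-defaults.conf, then start it; completed apps can then be browsed on its UI (port 18080 by default):

spark.history.fs.logDirectory    hdfs://host02:8020/user/hadoop/spark-eventlog

./sbin/start-history-server.sh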
ref: