
[spark-src-core] 3.3 run spark in standalone(cluster) mode

 

  similar to the previous article, this one focuses on cluster mode.

1. Issue the command

./bin/spark-submit  --class org.apache.spark.examples.JavaWordCount --deploy-mode cluster --master spark://gzsw-02:6066 lib/spark-examples-1.4.1-hadoop2.4.0.jar hdfs://host02:/user/hadoop/input.txt

   note: 1) the deploy-mode must be specified as 'cluster'.

   2) the 'master' param is the REST URL, i.e.

REST URL: spark://gzsw-02:6066 (cluster mode)

   which is shown on the Spark master UI page, since Spark uses rest.RestSubmissionClient to submit the job.
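
   For reference, a minimal sketch of the general form of a cluster-mode submission (placeholders in angle brackets are yours to fill in; the port follows this article's setup):

# generic standalone cluster-mode submission (REST endpoint on port 6066)
./bin/spark-submit \
  --class <your.main.Class> \
  --deploy-mode cluster \
  --master spark://<master-host>:6066 \
  <path/to/your-app.jar> \
  [app args...]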

   

2. Run logs on the user side (brief, as this is cluster mode)

Spark Command: /usr/local/jdk/jdk1.6.0_31/bin/java -cp /home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/conf/:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-assembly-1.4.1-hadoop2.4.0.jar:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar:/usr/local/hadoop/hadoop-2.5.2/etc/hadoop/ -XX:MaxPermSize=256m org.apache.spark.deploy.SparkSubmit --master spark://gzsw-02:6066 --deploy-mode cluster --class org.apache.spark.examples.JavaWordCount lib/spark-examples-1.4.1-hadoop2.4.0.jar hdfs://hd02:/user/hadoop/input.txt
========================================
-executed cmd retruned by Main.java:/usr/local/jdk/jdk1.6.0_31/bin/java -cp /home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/conf/:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-assembly-1.4.1-hadoop2.4.0.jar:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar:/usr/local/hadoop/hadoop-2.5.2/etc/hadoop/ -XX:MaxPermSize=256m org.apache.spark.deploy.SparkSubmit --master spark://gzsw-02:6066 --deploy-mode cluster --class org.apache.spark.examples.JavaWordCount lib/spark-examples-1.4.1-hadoop2.4.0.jar hdfs://host02:/user/hadoop/input.txt
Running Spark using the REST application submission protocol.
16/09/19 11:26:06 INFO rest.RestSubmissionClient: Submitting a request to launch an application in spark://gzsw-02:6066.
16/09/19 11:26:07 INFO rest.RestSubmissionClient: Submission successfully created as driver-20160919112607-0001. Polling submission state...
16/09/19 11:26:07 INFO rest.RestSubmissionClient: Submitting a request for the status of submission driver-20160919112607-0001 in spark://gzsw-02:6066.
16/09/19 11:26:07 INFO rest.RestSubmissionClient: State of driver driver-20160919112607-0001 is now RUNNING.
16/09/19 11:26:07 INFO rest.RestSubmissionClient: Driver is running on worker worker-20160914175456-192.168.100.14-36693 at 192.168.100.14:36693.
16/09/19 11:26:07 INFO rest.RestSubmissionClient: Server responded with CreateSubmissionResponse:
{
  "action" : "CreateSubmissionResponse",
  "message" : "Driver successfully submitted as driver-20160919112607-0001",
  "serverSparkVersion" : "1.4.1",
  "submissionId" : "driver-20160919112607-0001",
  "success" : true
}
16/09/19 11:26:07 INFO util.Utils: Shutdown hook called

    so we know the driver is running on worker 192.168.100.14:36693 (not on the local host)
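
    Since the driver runs on a remote worker, you can poll or kill it by submission id; a sketch using the id from this run (the REST path is what rest.RestSubmissionClient itself talks to under the hood):

# query the driver state
./bin/spark-submit --master spark://gzsw-02:6066 --status driver-20160919112607-0001

# or hit the standalone REST API directly
curl http://gzsw-02:6066/v1/submissions/status/driver-20160919112607-0001

# kill the driver if needed
./bin/spark-submit --master spark://gzsw-02:6066 --kill driver-20160919112607-0001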

 

3. FAQ

1) in cluster mode, the driver info is shown on the Spark master UI page (but not in client mode)

 

  (app-0000/0001 were both run in cluster mode, so the corresponding drivers are shown in the 'Completed Drivers' block)

 

2) can't open the application detail UI, i.e. when you click an app that ran in cluster mode, errors similar to the following appear:

Application history not found (app-20160919151936-0000)
No event logs found for application JavaWordCount in file:/home/hadoop/spark/spark-eventlog/. Did you specify the correct logging directory?

   this msg appears because, in cluster mode, the driver runs on some other worker instead of the master's local host; the event logs are therefore written to that worker's local filesystem, so a request to the master finds nothing about this app.

  workaround: use HDFS instead of the local fs, i.e.

spark.eventLog.dir=hdfs://host02:8020/user/hadoop/spark-eventlog
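
  Putting it together, a minimal sketch of the event-log setup (paths follow this article's example; spark.eventLog.enabled is the standard switch and must be true for event logs to be written at all):

# create the event-log directory on HDFS first
hdfs dfs -mkdir -p /user/hadoop/spark-eventlog

# enable event logging in conf/spark-defaults.conf
cat >> conf/spark-defaults.conf <<'EOF'
spark.eventLog.enabled  true
spark.eventLog.dir      hdfs://host02:8020/user/hadoop/spark-eventlog
EOF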

 

3) applications disappear after restarting Spark

  even though you set a distributed filesystem for 'spark.eventLog.dir' as mentioned above, you will still see nothing after restarting Spark: the master keeps application info in memory while it's alive, and loses it on restart. The history server (sbin/start-history-server.sh) is there to solve this problem [1], as sketched below.
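
  A minimal sketch of running the history server against the same event-log directory (spark.history.fs.logDirectory is the standard property; the history UI listens on port 18080 by default):

# point the history server at the event-log directory (conf/spark-defaults.conf)
cat >> conf/spark-defaults.conf <<'EOF'
spark.history.fs.logDirectory  hdfs://host02:8020/user/hadoop/spark-eventlog
EOF

# start the history server
./sbin/start-history-server.sh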

 

ref:

[1] Spark History Server configuration and usage

[spark-src-core] 3.2.run spark in standalone(client) mode
