
Setting Up and Testing Spark in Local Mode

 

Please credit the source when reposting.

lcs_168@163.com

 

Spark supports several run modes:

    Distributed deployment: Spark runs on a cluster, with the underlying resource scheduling handled by Mesos, Hadoop YARN, or Spark's own Standalone mode

    Pseudo-distributed deployment

    Local mode

To make getting started easy, and with personal learning costs in mind (a laptop only has so many resources!!!), this post covers how to run Spark in local mode. Let's get going!!!
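Which mode Spark runs in is determined by the master URL it is handed; for local mode that is the string local[N], where N is the number of worker threads. The spark-shell started in NO.4 below falls back to local mode on its own, but with the 0.9.x scripts the master can also be set explicitly through the MASTER environment variable (that is how I read the 0.9 launch scripts, so verify against your release):

# run from the bin directory, as in NO.4 below; local[2] = local mode with 2 worker threads
[root@CentOS bin]# MASTER=local[2] ./spark-shell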

 

NO.1 Resource Preparation

1.  VMware 10.0.1 build-1379776 (I downloaded it from the web; ask Baidu or Google for install guides)

2.  CentOS 6.5 (a download link was provided; that site's resources are fairly complete)

3.  JDK 1.7 (CentOS ships with OpenJDK; as everyone knows, the official Oracle build is the safer choice)

4.  spark-0.9.0-incubating-bin-hadoop2 (I used 0.9.0, the latest release at the time; 1.0 is already out, so if you are just exploring, feel free to grab the newest version. A download link was provided)

 

NO.2 Environment Setup

1.  Install VMware

2.  Install CentOS. Bridged networking is recommended; it keeps things simple and makes setting up FTP/SSH later much more convenient.

3.  Install JDK 1.7. After installing, be sure to configure the environment variables, otherwise the bundled OpenJDK will still be used (a minimal sketch follows this list).

4.  Upload spark-0.9.0-incubating-bin-hadoop2.tgz to the Linux machine.
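A minimal sketch of step 3, assuming the Oracle JDK was unpacked to /usr/java/jdk1.7.0_51 (an example path; adjust it to your actual install location):

# appended to /etc/profile -- JAVA_HOME below is an assumed example path
export JAVA_HOME=/usr/java/jdk1.7.0_51
export PATH=$JAVA_HOME/bin:$PATH

# reload the profile and confirm the Oracle JDK is picked up instead of the bundled OpenJDK
[root@CentOS ~]# source /etc/profile
[root@CentOS ~]# java -version

If java -version still reports OpenJDK, the new PATH entry is not taking effect.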

 

NO.3 Spark Framework Setup

1.  Unpack the tarball: tar -xvf spark-0.9.0-incubating-bin-hadoop2.tgz

2.  Change into the Spark home directory: cd spark-0.9.0-incubating-bin-hadoop2

3.  Run the sbt build: ./sbt/sbt assembly (with a decent connection [I have 10M fiber at home] this takes roughly half an hour)

4.  Edit the hosts file, e.g. vi /etc/hosts and add the line 192.168.1.53 CentOS (see the note after this list)

5.  Once the commands above have completed, Spark can run locally.
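Step 4 matters because, as the spark-shell log in NO.4 shows, the driver binds its Akka endpoint to the machine's hostname (akka.tcp://spark@CentOS:...); if that hostname does not resolve, startup fails. A quick check before moving on:

# the hostname added to /etc/hosts should now resolve to the machine's address
[root@CentOS spark-0.9.0-incubating-bin-hadoop2]# ping -c 1 CentOS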

NO.4 Verifying the Environment

1.  Go to the ~/spark-0.9.0-incubating-bin-hadoop2/bin directory

 

[root@CentOS bin]# ./spark-shell
14/06/08 06:27:47 INFO HttpServer: Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
14/06/08 06:27:47 INFO HttpServer: Starting HTTP Server
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 0.9.0
      /_/

Using Scala version 2.10.3 (Java HotSpot(TM) Client VM, Java 1.7.0_51)
Type in expressions to have them evaluated.
Type :help for more information.
14/06/08 06:27:51 INFO Slf4jLogger: Slf4jLogger started
14/06/08 06:27:51 INFO Remoting: Starting remoting
14/06/08 06:27:51 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://spark@CentOS:38659]
14/06/08 06:27:51 INFO Remoting: Remoting now listens on addresses: [akka.tcp://spark@CentOS:38659]
14/06/08 06:27:51 INFO SparkEnv: Registering BlockManagerMaster
14/06/08 06:27:51 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20140608062751-301e
14/06/08 06:27:51 INFO MemoryStore: MemoryStore started with capacity 297.0 MB.
14/06/08 06:27:51 INFO ConnectionManager: Bound socket to port 55885 with id = ConnectionManagerId(CentOS,55885)
14/06/08 06:27:51 INFO BlockManagerMaster: Trying to register BlockManager
14/06/08 06:27:51 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager CentOS:55885 with 297.0 MB RAM
14/06/08 06:27:51 INFO BlockManagerMaster: Registered BlockManager
14/06/08 06:27:51 INFO HttpServer: Starting HTTP Server
14/06/08 06:27:51 INFO HttpBroadcast: Broadcast server started at http://192.168.1.53:47324
14/06/08 06:27:51 INFO SparkEnv: Registering MapOutputTracker
14/06/08 06:27:51 INFO HttpFileServer: HTTP File server directory is /tmp/spark-d4a4b013-6a2c-4bb2-b3e6-f680cec875e7
14/06/08 06:27:51 INFO HttpServer: Starting HTTP Server
14/06/08 06:27:52 INFO SparkUI: Started Spark Web UI at http://CentOS:4040
14/06/08 06:27:53 INFO Executor: Using REPL class URI: http://192.168.1.53:38442
14/06/08 06:27:54 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Created spark context..
Spark context available as sc.

scala> println("hello,World!!")
hello,World!!
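println only proves that the Scala REPL is alive; to confirm that the SparkContext exposed as sc actually schedules work, a tiny RDD job can be run in the same shell. A minimal sketch using standard Spark API calls (parallelize, filter, count):

scala> val data = sc.parallelize(1 to 1000)    // distribute a local collection as an RDD
scala> data.filter(_ % 2 == 0).count()         // run a real job: count the even numbers

The count should come back as 500, and the shell prints the usual stage and task scheduling log lines while the job runs.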

 

 

NO.5 Demo Verification

 

[root@CentOS bin]# ./run-example org.apache.spark.examples.SparkLR local[2]
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/spark-0.9.0-incubating-bin-hadoop2/examples/target/scala-2.10/spark-examples_2.10-assembly-0.9.0-incubating.jar!/org/slf4j/impl/StaticLoggerBinder.class]
.......................(output omitted)...................
4883 [spark-akka.actor.default-dispatcher-4] INFO org.apache.spark.scheduler.DAGScheduler - Completed ResultTask(4, 0)
4883 [spark-akka.actor.default-dispatcher-4] INFO org.apache.spark.scheduler.DAGScheduler - Stage 4 (reduce at SparkLR.scala:64) finished in 0.075 s
4884 [main] INFO org.apache.spark.SparkContext - Job finished: reduce at SparkLR.scala:64, took 0.098657134 s
Final w: (5816.075967498865, 5222.008066011391, 5754.751978607454, 3853.1772062206846, 5593.565827145932, 5282.387874201054, 3662.9216051953435, 4890.78210340607, 4223.371512250292, 5767.368579668863)
[root@CentOS bin]#
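SparkLR is not the only bundled example; SparkPi is another quick one to try. It follows the same pattern of example class plus a local[2] master argument (the trailing slice count is how I recall the 0.9 example's usage, so treat it as an assumption):

# estimate Pi using 2 local worker threads and 10 slices
[root@CentOS bin]# ./run-example org.apache.spark.examples.SparkPi local[2] 10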

 
