
[spark-src] 1-overview

 

what is

  "Apache Spark™ is a fast and general engine for large-scale data processing....Run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk." stated in apache spark 

 

  Whether or not that is literally a fact, I think certain key concepts/components support these claims:

a. The Resilient Distributed Dataset (RDD) programming model differs greatly from earlier models such as MapReduce. Spark uses many optimized techniques (e.g. iterative computation, data locality) to spread the workload across the workers in a cluster, especially for reusing computed data (see the sketch after this list).

  RDD: "A resilient distributed dataset (RDD) is a read-only collection of objects partitioned across a set of machines that can be rebuilt if a partition is lost." [1]

 

b. Spark uses memory as far as possible. Most of its intermediate results are kept in memory rather than on disk, so it largely avoids I/O and serialization/deserialization overhead.

  In fact we already use many tools for similar purposes, like memcached and Redis.

c. Spark emphasizes parallelism.

d. Spark cuts down JVM supervision overhead, e.g. one long-lived executor hosts many tasks, instead of one container per task as in YARN MapReduce.
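
  To make points a, b, and d concrete, here is a minimal Scala sketch (my own, not from the Spark docs; the app name, data, and config value are placeholders). It builds an RDD, caches an intermediate result so a second action is served from memory, and sizes the executor once for the whole application:

    import org.apache.spark.{SparkConf, SparkContext}

    object RddSketch {
      def main(args: Array[String]): Unit = {
        // executor sizing is declared once per application; each executor then
        // runs many tasks, instead of one YARN container per task (point d).
        // "spark.executor.memory" is a real config key; the value is a placeholder.
        val conf = new SparkConf()
          .setAppName("rdd-sketch")
          .setMaster("local[4]")              // 4 local threads stand in for a cluster
          .set("spark.executor.memory", "2g")
        val sc = new SparkContext(conf)

        // an RDD: a partitioned, read-only collection that can be rebuilt
        // from its lineage if a partition is lost (point a)
        val words = sc.parallelize(Seq("spark", "hadoop", "spark", "rdd"))

        // keep the intermediate result in memory so later actions
        // do not recompute it (point b)
        val counts = words.map(w => (w, 1)).reduceByKey(_ + _).cache()

        println(counts.collect().mkString(", ")) // first action: computes and caches
        println(counts.count())                  // second action: reuses the cache
        sc.stop()
      }
    }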

 

architecture

  (the core component serves as a platform for the other components)

 

 

uses of spark

1. Iterative algorithms, e.g. machine learning, clustering, etc. (see the sketch after this list)

2. Interactive analytics, e.g. querying a large dataset loaded from disk into memory to reduce I/O latency.

3. Batch processing.
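
  As an illustration of the iterative case, here is a toy one-dimensional logistic regression in Scala (my own sketch, not code from Spark or MLlib; the data points and learning rate are made up). The training set is cached once, and every iteration rereads it from memory instead of from disk:

    import org.apache.spark.{SparkConf, SparkContext}

    object IterativeSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("lr-sketch").setMaster("local[*]"))

        // (label, feature) pairs; labels are +1/-1
        val points = sc.parallelize(
          Seq((1.0, 2.3), (-1.0, -0.7), (1.0, 1.1), (-1.0, -1.9))).cache()

        var w = 0.0                   // the single model weight
        for (_ <- 1 to 10) {          // each pass reuses the cached RDD
          val gradient = points.map { case (y, x) =>
            (1.0 / (1.0 + math.exp(-y * w * x)) - 1.0) * y * x
          }.reduce(_ + _)
          w -= 0.1 * gradient         // fixed learning rate
        }
        println(s"final weight: $w")
        sc.stop()
      }
    }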

 

programming languages

  Most of the source code is written in Scala (I think many of Spark's functions and ideas are inspired by Scala ;)), but you can also write Spark programs in Java or Python.

 

flexible integrations

  Spark integrates with many popular frameworks, e.g. Hadoop, HBase, Mesos, etc.
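
  For example, because Spark speaks Hadoop's InputFormat API, reading data from HDFS needs no special code. A minimal sketch, where the namenode address and file path are placeholders:

    import org.apache.spark.{SparkConf, SparkContext}

    object HdfsSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("hdfs-sketch").setMaster("local[*]"))

        // sc.textFile accepts any Hadoop-compatible URI
        val lines = sc.textFile("hdfs://namenode:8020/logs/access.log")
        println(lines.filter(_.contains("ERROR")).count())
        sc.stop()
      }
    }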

 

ref:

[1] M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, I. Stoica. Spark: Cluster Computing with Working Sets. HotCloud 2010.

