1. Code:
package spark.examples.fileformat

import org.apache.spark.{SparkConf, SparkContext}

object SequenceFileTest {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    conf.setAppName("SequenceFileTest")
    conf.setMaster("local[3]")
    val sc = new SparkContext(conf)

    val data = List(("ABC", 1), ("BCD", 2), ("CDE", 3), ("DEF", 4), ("FGH", 5))
    // A single partition, so the output is written to one file, part-00000
    val rdd = sc.parallelize(data, 1)
    val dir = "file:///D:/sequenceFile-" + System.currentTimeMillis()
    rdd.saveAsSequenceFile(dir)

    // Read the file back as (String, Int) pairs
    val rdd2 = sc.sequenceFile[String, Int](dir + "/part-00000")
    println(rdd2.collect().map(elem => (elem._1 + ", " + elem._2)).toList)
  }
}
2. Contents of the SequenceFile:
3. Notes:
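saveAsSequenceFile is a method defined in SequenceFileRDDFunctions, yet the code above never explicitly imports an implicit conversion. This works because the code runs on Spark 1.3, where the following comment in the SparkContext source explains the behavior: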
// The following implicit functions were in SparkContext before 1.3 and users had to
// `import SparkContext._` to enable them. Now we move them here to make the compiler find
// them automatically. However, we still keep the old functions in SparkContext for backward
// compatibility and forward to the following functions directly.

implicit def intWritableConverter(): WritableConverter[Int] =
  simpleWritableConverter[Int, IntWritable](_.get)

implicit def longWritableConverter(): WritableConverter[Long] =
  simpleWritableConverter[Long, LongWritable](_.get)

implicit def doubleWritableConverter(): WritableConverter[Double] =
  simpleWritableConverter[Double, DoubleWritable](_.get)

implicit def floatWritableConverter(): WritableConverter[Float] =
  simpleWritableConverter[Float, FloatWritable](_.get)

implicit def booleanWritableConverter(): WritableConverter[Boolean] =
  simpleWritableConverter[Boolean, BooleanWritable](_.get)

implicit def bytesWritableConverter(): WritableConverter[Array[Byte]] = {
  simpleWritableConverter[Array[Byte], BytesWritable] { bw =>
    // getBytes method returns array which is longer then data to be returned
    Arrays.copyOfRange(bw.getBytes, 0, bw.getLength)
  }
}

implicit def stringWritableConverter(): WritableConverter[String] =
  simpleWritableConverter[String, Text](_.toString)

implicit def writableWritableConverter[T <: Writable](): WritableConverter[T] =
  new WritableConverter[T](_.runtimeClass.asInstanceOf[Class[T]], _.asInstanceOf[T])
}
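The "make the compiler find them automatically" part of that comment relies on Scala's implicit scope: implicits defined in the companion object of a type are searched whenever an implicit value of that type is required, so no import is needed at the call site. The following is a minimal sketch of that mechanism in plain Scala, without Spark. `Converter`, `ImplicitScopeDemo`, and `read` are hypothetical names invented for illustration; they are not Spark API.

```scala
// Hypothetical stand-in for Spark's WritableConverter; not Spark API.
final class Converter[T](val convert: Any => T)

object Converter {
  // These implicits live in the companion object of Converter, so the
  // compiler finds them whenever an implicit Converter[T] is required,
  // without any import at the call site.
  implicit val intConverter: Converter[Int] =
    new Converter[Int](_.toString.toInt)

  implicit val stringConverter: Converter[String] =
    new Converter[String](_.toString)
}

object ImplicitScopeDemo {
  // Requires an implicit Converter[T], loosely analogous to how
  // sequenceFile[K, V] requires converters for K and V.
  def read[T](raw: Any)(implicit c: Converter[T]): T = c.convert(raw)

  def main(args: Array[String]): Unit = {
    val n: Int = read[Int]("42")      // resolved via Converter.intConverter
    val s: String = read[String](42)  // resolved via Converter.stringConverter
    println((n, s))
  }
}
```

Before 1.3 the equivalent implicits sat in `object SparkContext`, which is not in the implicit scope of the types involved, hence the explicit `import SparkContext._` the comment mentions.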
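SequenceFileRDDFunctions, in turn, applies to pair RDDs whose key and value types can both be viewed as Writable. Note that the view bound `<% Writable` requires an implicit conversion to Writable rather than direct inheritance from it: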
/**
 * Extra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile,
 * through an implicit conversion. Note that this can't be part of PairRDDFunctions because
 * we need more implicit parameters to convert our keys and values to Writable.
 *
 */
class SequenceFileRDDFunctions[K <% Writable: ClassTag, V <% Writable : ClassTag](
    self: RDD[(K, V)],
    _keyWritableClass: Class[_ <: Writable],
    _valueWritableClass: Class[_ <: Writable])
  extends Logging with Serializable {
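A view bound such as `K <% Writable` desugars to an implicit parameter of type `K => Writable`. The sketch below writes that parameter out explicitly, which is why a `(String, Int)` pair can be accepted even though neither type extends the target trait. It assumes Scala 2 (the Scala line Spark 1.3 was built with). `MiniWritable`, `MiniText`, `MiniInt`, and `PairWriter` are hypothetical stand-ins, not the real Hadoop or Spark classes.

```scala
import scala.language.implicitConversions

// Hypothetical stand-in for Hadoop's Writable; not the real interface.
trait MiniWritable { def render: String }

final case class MiniText(value: String) extends MiniWritable {
  def render: String = s"Text($value)"
}

final case class MiniInt(value: Int) extends MiniWritable {
  def render: String = s"Int($value)"
}

object MiniWritable {
  // Conversions placed in the companion object are found without an import.
  implicit def stringToWritable(s: String): MiniWritable = MiniText(s)
  implicit def intToWritable(i: Int): MiniWritable = MiniInt(i)
}

// Equivalent to `class PairWriter[K <% MiniWritable, V <% MiniWritable]`,
// with the view bounds spelled out as implicit function parameters.
final class PairWriter[K, V](pairs: Seq[(K, V)])(
    implicit toKey: K => MiniWritable, toValue: V => MiniWritable) {

  // Convert each key and value to its writable form, then render the pair.
  def lines: Seq[String] =
    pairs.map { case (k, v) => toKey(k).render + "\t" + toValue(v).render }
}
```

With these definitions, `new PairWriter(Seq(("ABC", 1), ("BCD", 2)))` compiles because conversions exist for both `String` and `Int`; a pair type with no such conversion would be rejected at compile time.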