@Public
@Stable
InputSplit represents the data to be processed by an individual Mapper.
Typically, it presents a byte-oriented view of the input, and it is the responsibility of the job's RecordReader to process this and present a record-oriented view.
---------------------
@InterfaceAudience.Public
@InterfaceStability.Stable
public abstract class InputSplit {
/**
* Get the size of the split, so that the input splits can be sorted by size.
* @return the number of bytes in the split
* @throws IOException
* @throws InterruptedException
*/
public abstract long getLength() throws IOException, InterruptedException;
/**
* Get the list of nodes by name where the data for the split would be local.
* The locations do not need to be serialized.
*
* @return an array of the names of the nodes.
* @throws IOException
* @throws InterruptedException
*/
public abstract
String[] getLocations() throws IOException, InterruptedException;
/**
* Gets info about which nodes the input split is stored on and how it is
* stored at each location.
*
* @return list of <code>SplitLocationInfo</code>s describing how the split
* data is stored at each location. A null value indicates that all the
* locations have the data stored on disk.
* @throws IOException
*/
@Evolving
public SplitLocationInfo[] getLocationInfo() throws IOException {
return null;
}
}
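The javadoc for getLength() notes that split sizes exist "so that the input splits can be sorted by size" (e.g. so a scheduler can launch the largest splits first). A stdlib-only sketch of that sorting step, where SimpleSplit is a hypothetical stand-in for a concrete InputSplit (no Hadoop dependency assumed):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SplitSortSketch {
    // Hypothetical stand-in for a concrete InputSplit; only getLength() matters here.
    static class SimpleSplit {
        private final long length;
        SimpleSplit(long length) { this.length = length; }
        long getLength() { return length; }
    }

    public static void main(String[] args) {
        List<SimpleSplit> splits = new ArrayList<>();
        splits.add(new SimpleSplit(128L));
        splits.add(new SimpleSplit(512L));
        splits.add(new SimpleSplit(64L));

        // Sort largest-first, as a scheduler might to start big splits earliest.
        splits.sort(Comparator.comparingLong(SimpleSplit::getLength).reversed());

        System.out.println("largest split: " + splits.get(0).getLength());
    }
}
```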
---------------------------
/**
 * An InputSplit that spans a set of rows.
 */
@InterfaceStability.Evolving
public static class DBInputSplit extends InputSplit implements Writable {
private long end = 0;
private long start = 0;
/**
* Default Constructor
*/
public DBInputSplit() {
}
/**
* Convenience Constructor
* @param start the index of the first row to select
* @param end the index of the last row to select
*/
public DBInputSplit(long start, long end) {
this.start = start;
this.end = end;
}
/** {@inheritDoc} */
public String[] getLocations() throws IOException {
// TODO Add a layer to enable SQL "sharding" and support locality
return new String[] {};
}
/**
* @return The index of the first row to select
*/
public long getStart() {
return start;
}
/**
* @return The index of the last row to select
*/
public long getEnd() {
return end;
}
/**
* @return The total row count in this split
*/
public long getLength() throws IOException {
return end - start;
}
/** {@inheritDoc} */
public void readFields(DataInput input) throws IOException {
start = input.readLong();
end = input.readLong();
}
/** {@inheritDoc} */
public void write(DataOutput output) throws IOException {
output.writeLong(start);
output.writeLong(end);
}
}
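The write()/readFields() pair above just serializes the two row indices as longs, in a fixed order. The same pattern can be sketched round-trip with plain java.io streams (an assumption for illustration; Hadoop's Writable machinery supplies the DataInput/DataOutput in the real job):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class DBSplitRoundTrip {
    public static void main(String[] args) throws IOException {
        long start = 100L, end = 250L;

        // write(): the split emits start, then end, as two longs.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeLong(start);
            out.writeLong(end);
        }

        // readFields(): read them back in the same order.
        DataInputStream in =
            new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
        long readStart = in.readLong();
        long readEnd = in.readLong();

        // getLength() of a DBInputSplit is just end - start (the row count).
        System.out.println("rows in split: " + (readEnd - readStart));
    }
}
```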