Appending to HDFS files

乡里伢崽

Blog category: hdfs

1. Edit hdfs-site.xml to enable append support:

    <property>
        <name>dfs.support.append</name>
        <value>true</value>
    </property>

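2. When I wrote this I had not found a way to append content to an HDFS file from the command line (recent Hadoop releases do provide hadoop fs -appendToFile). Either way, the API that Hadoop provides can append to a file. Here is a simple test program: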

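    import java.io.BufferedWriter;
    import java.io.IOException;
    import java.io.OutputStreamWriter;
    import java.net.URI;
    import java.nio.charset.StandardCharsets;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class AppendContent {
        public static void main(String[] args) {
            String hdfsPath = "/sort/sort"; // path of the existing file to append to
            Configuration conf = new Configuration();
            try {
                FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
                // fs.append opens an output stream at the end of the existing file.
                // Closing the BufferedWriter also closes the streams it wraps,
                // so one try-with-resources replaces the three separate close() calls.
                try (BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
                        fs.append(new Path(hdfsPath)), StandardCharsets.UTF_8))) {
                    writer.write("good!!");
                }
                System.out.println("success!!!");
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }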

3. Package the code and run it.
4. If it fails with this error:

    Exception in thread "main" java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[10.10.22.17:50010, 10.10.22.18:50010], original=[10.10.22.17:50010, 10.10.22.18:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:960)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1026)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1175)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:531)

5. Edit hdfs-site.xml again:

    <property>
        <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
        <value>NEVER</value>
    </property>
    <property>
        <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
        <value>true</value>
    </property>

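6. What these two properties mean:

    dfs.client.block.write.replace-datanode-on-failure.enable=true
    When a datanode or the network fails while data is being written through the pipeline (the way data is uploaded), the client tries to remove the failed datanode from the pipeline and keeps writing to the remaining datanodes. As a result, the pipeline has fewer datanodes. This feature adds new datanodes to the pipeline to make up for that. It is a site-wide option. When the cluster is very small, for example 3 nodes or fewer, the cluster administrator may want to disable it.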
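
To make the behaviour above concrete, here is a toy model in plain Java. It is not Hadoop code: the class PipelineRecoverySketch, the method recover, and the node names dn1..dn4 are all made up for illustration, and the model assumes the policy has already decided that a replacement is required. It only shows the three outcomes: a spare datanode is added, the write continues with fewer datanodes, or the write fails because no spare exists.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Toy model of write-pipeline recovery; hypothetical, not Hadoop code. */
class PipelineRecoverySketch {

    /**
     * Returns the pipeline to keep writing with after one datanode fails.
     *
     * @param pipeline datanodes currently in the pipeline
     * @param failed   the datanode that failed
     * @param spares   healthy datanodes that are not in the pipeline yet
     * @param replace  stands in for replace-datanode-on-failure being enabled
     *                 (assumption: the policy asks for a replacement)
     */
    static List<String> recover(List<String> pipeline, String failed,
                                List<String> spares, boolean replace) throws IOException {
        List<String> remaining = new ArrayList<>(pipeline);
        remaining.remove(failed); // drop the failed datanode
        if (!replace) {
            return remaining; // keep writing with fewer datanodes
        }
        if (spares.isEmpty()) {
            // A very small cluster ends up here: there is nothing left to add.
            throw new IOException("no more good datanodes being available to try");
        }
        remaining.add(spares.get(0)); // restore the pipeline length
        return remaining;
    }

    public static void main(String[] args) {
        List<String> pipeline = List.of("dn1", "dn2", "dn3");
        try {
            System.out.println(recover(pipeline, "dn2", List.of("dn4"), true)); // [dn1, dn3, dn4]
            System.out.println(recover(pipeline, "dn2", List.of(), false));     // [dn1, dn3]
            recover(pipeline, "dn2", List.of(), true); // no spare datanode: throws
        } catch (IOException e) {
            System.out.println("write failed: " + e.getMessage());
        }
    }
}
```

The last call is the small-cluster situation: replacement is wanted but no spare datanode exists, so the write fails instead of continuing on the remaining nodes.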

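    dfs.client.block.write.replace-datanode-on-failure.policy=DEFAULT
    This property only takes effect when dfs.client.block.write.replace-datanode-on-failure.enable is set to true.
        ALWAYS: always add a new datanode when an existing one is removed.
        NEVER: never add a new datanode.
        DEFAULT: let r be the replication factor and n the number of datanodes still in the pipeline. A new datanode is added only if r >= 3 and either (1) floor(r/2) >= n, or (2) r > n and the block has been hflushed/appended.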
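
The DEFAULT rule is easier to check as code. The sketch below is one reading of it, with r >= 3 required for both branches. It is not the Hadoop implementation, and the class DefaultPolicySketch and the method shouldAddDatanode are names invented here. Assuming the file in step 4 used a replication factor of 3 (the post does not say), the two datanodes listed in the error plus an append give r = 3, n = 2, which the rule answers with "add a datanode"; if no third datanode can be found, the append fails as shown.

```java
/** Sketch of the DEFAULT replace-datanode-on-failure rule; hypothetical, not Hadoop code. */
class DefaultPolicySketch {

    /**
     * @param r                  replication factor of the file
     * @param n                  number of datanodes still in the pipeline
     * @param hflushedOrAppended whether the block was hflushed or is being appended to
     * @return true if the rule asks for a new datanode to be added
     */
    static boolean shouldAddDatanode(int r, int n, boolean hflushedOrAppended) {
        if (r < 3) {
            return false; // low replication: never add
        }
        if (r / 2 >= n) {
            return true;  // integer division equals floor(r/2) for non-negative r
        }
        return r > n && hflushedOrAppended;
    }

    public static void main(String[] args) {
        System.out.println(shouldAddDatanode(3, 2, true));  // true: append with r > n
        System.out.println(shouldAddDatanode(3, 2, false)); // false: floor(3/2) = 1 < 2, no append
        System.out.println(shouldAddDatanode(3, 1, false)); // true: floor(3/2) = 1 >= 1
        System.out.println(shouldAddDatanode(2, 1, true));  // false: r < 3
    }
}
```

Setting the policy to NEVER, as in step 5, skips this check entirely, so the client keeps writing to whatever datanodes remain in the pipeline.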

2015-07-19 11:07