Symptom:
[hadoop@dtydb6 logs]$ vi hadoop-hadoop-datanode-dtydb6.log
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockInputStream(FSDataset.java:1094)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:168)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:81)
at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.verifyBlock(DataBlockScanner.java:453)
at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.verifyFirstBlock(DataBlockScanner.java:519)
at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.run(DataBlockScanner.java:617)
at java.lang.Thread.run(Thread.java:722)
2013-02-17 00:00:29,023 WARN org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Second Verification failed for blk_1408462853104263034_39617. Exception : java.io.FileNotFoundException: /hadoop/logdata/current/subdir2/subdir2/blk_1408462853104263034 (Too many open files)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockInputStream(FSDataset.java:1094)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:168)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:81)
at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.verifyBlock(DataBlockScanner.java:453)
at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.verifyFirstBlock(DataBlockScanner.java:519)
at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.run(DataBlockScanner.java:617)
at java.lang.Thread.run(Thread.java:722)
2013-02-17 00:00:29,023 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Reporting bad block blk_1408462853104263034_39617 to namenode.
2013-02-17 00:00:53,076 WARN org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: First Verification failed for blk_4328439663130931718_44579. Exception : java.io.FileNotFoundException: /hadoop/logdata/current/subdir9/subdir12/blk_4328439663130931718 (Too many open files)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockInputStream(FSDataset.java:1094)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:168)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:81)
at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.verifyBlock(DataBlockScanner.java:453)
at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.verifyFirstBlock(DataBlockScanner.java:519)
at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.run(DataBlockScanner.java:617)
at java.lang.Thread.run(Thread.java:722)
2013-02-17 00:00:53,077 WARN org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Second Verification failed for blk_4328439663130931718_44579. Exception : java.io.FileNotFoundException: /hadoop/logdata/current/subdir9/subdir12/blk_4328439663130931718 (Too many open files)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockInputStream(FSDataset.java:1094)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:168)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:81)
at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.verifyBlock(DataBlockScanner.java:453)
at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.verifyFirstBlock(DataBlockScanner.java:519)
at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.run(DataBlockScanner.java:617)
at java.lang.Thread.run(Thread.java:722)
2013-02-17 00:00:53,077 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Reporting bad block blk_4328439663130931718_44579 to namenode.
2013-02-17 00:01:10,115 WARN org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: First Verification failed for blk_2833765807455012512_10228. Exception : java.io.FileNotFoundException: /hadoop/logdata/current/subdir63/subdir25/blk_2833765807455012512 (Too many open files)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockInputStream(FSDataset.java:1094)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:168)
at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:81)
at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.verifyBlock(DataBlockScanner.java:453)
at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.verifyFirstBlock(DataBlockScanner.java:519)
A web search pointed to the Linux nofile (max open files) limit: the "(Too many open files)" errors mean a process has exhausted its per-process file-descriptor limit (RLIMIT_NOFILE). The current setting was 1024, the default value:
[hadoop@dtydb6 logs]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1064960
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1064960
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
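(Supplementary check, not from the original post: the per-process nofile limit shown above is separate from the kernel-wide file handle limit, which can be inspected as follows; the commands only read /proc and are safe to run.)
[root@dtydb6 ~]# cat /proc/sys/fs/file-max
[root@dtydb6 ~]# cat /proc/sys/fs/file-nr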
Yet lsof showed the flume process with 2932 open files (which is odd: how could it exceed 1024?). In the jps output below, PID 29828 is the flume "Application" process:
12988 Jps
26903 JobTracker
29828 Application
26545 DataNode
27100 TaskTracker
26719 SecondaryNameNode
26374 NameNode
[root@dtydb6 ~]# lsof -p 29828|wc -l
2932
[root@dtydb6 ~]# ps -ef|grep 29828
root 13133 12914 0 14:05 pts/3 00:00:00 grep 29828
hadoop 29828 1 32 Jan22 ? 8-10:51:15 /usr/java/jdk1.7.0_07/bin/java -Xmx2048m -cp /monitor/flume-1.3/conf:/monitor/flume-1.3/lib/*:/hadoop/hadoop-1.0.4/libexec/../conf:/usr/java/jdk1.7.0_07/lib/tools.jar:/hadoop/hadoop-1.0.4/libexec/..:/hadoop/hadoop-1.0.4/libexec/../hadoop-core-1.0.4.jar:/hadoop/hadoop-1.0.4/libexec/
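Why can lsof report 2932 entries when the nofile limit is 1024? lsof lists memory-mapped files (the many .jar and .so entries on a JVM classpath), sockets, and other objects in addition to plain file descriptors, while RLIMIT_NOFILE only caps file descriptors; a long-running process also keeps whatever limit it inherited at start time, which may differ from what ulimit -a shows in a fresh shell. A quick way to check the actual values for the running flume process (PID 29828 here; /proc/<pid>/limits requires a reasonably recent kernel):
[root@dtydb6 ~]# grep "open files" /proc/29828/limits   # limit the running process actually inherited
[root@dtydb6 ~]# ls /proc/29828/fd | wc -l              # true file-descriptor count, excluding mmapped files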
Solution:
1. Edit the limits configuration file and manually increase the nofile value:
vi /etc/security/limits.conf
* soft nofile 12580
* hard nofile 65536
2. Restart the flume process (PID 29828). The new limit only applies to processes started after it takes effect, so a restart is required; after this the problem was resolved (see the verification sketch below).
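A minimal verification sketch (not from the original post; the new flume PID is a placeholder): after editing limits.conf, log out and back in so pam_limits applies the new value, confirm the shell sees it, then restart flume from that session and re-check the limit the new process actually inherited:
[hadoop@dtydb6 ~]$ ulimit -n                              # should now report the soft limit, 12580
[root@dtydb6 ~]# grep "open files" /proc/<new_flume_pid>/limits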
References:
http://eryk.iteye.com/blog/1193487
http://blog.csdn.net/rzhzhz/article/details/7577122