
hadoop problems

 
Hadoop always prints "no namenode to stop" when running stop-all.sh


The cause is that stop-all.sh can no longer find the pid files.
In HADOOP_HOME/conf/hadoop-env.sh, set:
export HADOOP_PID_DIR=/home/hadoop/pids

The pid files live under /tmp by default, and /tmp is cleaned periodically by the system, so once the pid files have been deleted you get "no namenode to stop".
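The fix above can be sketched as follows; the directory /home/hadoop/pids is the one from this post, and $HADOOP_HOME is assumed to point at your installation:

```shell
# Persist Hadoop pid files outside /tmp so periodic cleanup cannot delete them.
mkdir -p /home/hadoop/pids
echo 'export HADOOP_PID_DIR=/home/hadoop/pids' >> "$HADOOP_HOME/conf/hadoop-env.sh"
# Restart so the daemons write their pids to the new directory.
"$HADOOP_HOME/bin/stop-all.sh"
"$HADOOP_HOME/bin/start-all.sh"
```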

I spent several days getting a Hadoop cluster working and hit the following problems along the way. I am recording them here to share, and to avoid tripping over them again.

(1). If formatting the namenode fails with "cannot create directory /usr/local/hadoop/hdfs/name/current", make the Hadoop directory writable by the current user: sudo chmod -R a+w /usr/local/hadoop



(2). If you hit "192.168.30.6: chown: changing ownership of '/usr/local/hadoop/../logs': Operation not permitted", fix it with sudo chown -R test:test /usr/local/hadoop, i.e. give ownership of the Hadoop home directory to the current user (test, in my case).



(3). If jps on a slave shows only the TaskTracker process and no DataNode, check the logs for "could only be replicated to 0 nodes, instead of 1". If that message is there, the likely cause is a firewall that was never disabled. After turning it off on both the server and the clients (sudo ufw disable on Ubuntu), all processes started normally and the status pages in the browser looked healthy again.

(4). Add the IP addresses of the master and slave machines to /etc/hosts on the server. This does not affect start-all.sh; it only means that without it, the JobTracker page in the browser cannot find the slave nodes.
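A minimal sketch of that /etc/hosts step; the hostnames and addresses below are made up for illustration, so substitute your own nodes:

```shell
# Append the cluster nodes to /etc/hosts on the master so the
# JobTracker web UI can resolve slave names.
sudo tee -a /etc/hosts >/dev/null <<'EOF'
192.168.30.5  master
192.168.30.6  slave1
192.168.30.7  slave2
EOF
```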

(5). For passwordless ssh login, watch the permissions on authorized_keys and on the .ssh directory. If scp-ing authorized_keys from the master to a slave is refused, run ssh-keygen on the slave as well so the directory and file get the right permissions, and passwordless login will then work; alternatively, copy the whole .ssh directory over. A .ssh directory created by hand on a slave seems to end up with different permissions than the one on the master.
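One way to do the key exchange above, assuming OpenSSH; 'slave1' is a placeholder hostname:

```shell
# Generate a key on the master (run ssh-keygen on the slave too, so its
# ~/.ssh directory is created with sane permissions).
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
# Install the public key on the slave, then tighten permissions:
# .ssh must be 700 and authorized_keys 600 or sshd may refuse key auth.
ssh-copy-id -i ~/.ssh/id_rsa.pub slave1
ssh slave1 'chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys'
```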


Once everything runs normally, you can test with the wordcount example:

bin/hadoop dfs -put /home/test-in input   # copy the local /home/test-in directory into the HDFS root, renamed to "input"
bin/hadoop jar hadoop-examples-0.20.203.0.jar wordcount input output
# Check the results:
# copy the files from HDFS back to the local file system, then view them:
bin/hadoop dfs -get output output
cat output/*
# or view them directly in HDFS:
bin/hadoop dfs -cat output/*

If the reduce phase fails with "Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out", one possible cause is a missing IP mapping in /etc/hosts; another is the open-file limit, which requires editing /etc/security/limits.conf (vi /etc/security/limits.conf). For details see: common Hadoop problems and their solutions.



If, when Hadoop starts, the slaves report connection refused, i.e. they cannot reach port 9000 on the master, and the namenode will not come up, first confirm that pseudo-distributed mode works on a single machine. The cause may be that ssh localhost was never run, or a problem with the IP mappings in /etc/hosts: make sure each machine's IP maps to exactly one server name, otherwise errors can follow.
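A few quick checks for that situation, assuming a master named 'master' (a placeholder) and that nc is installed:

```shell
# The master's name should map to exactly one IP in /etc/hosts.
grep -nw 'master' /etc/hosts
# Passwordless ssh to localhost should work before trying the cluster.
ssh localhost true
# Is the namenode actually listening on 9000?
nc -z master 9000 && echo "port 9000 reachable"
```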

1.http://wiki.apache.org/hadoop/ConnectionRefused

2.http://ubuntuforums.org/showthread.php?t=1057604

3.http://www.hadoopor.com/thread-1056-1-1.html



With one master and one slave, the wordcount test does not trigger the error above; I do not know why it appears once there are two slaves. The fix is as follows (in /etc/hosts on the master and on one of the slaves, change the master's entry so it maps to the server's name rather than a bare IP address):

1.http://hi.baidu.com/daodaowuhen/blog/item/b299b486f6c8aea86c811996.html

2.http://blog.sina.com.cn/s/blog_6e7e94bc0100pcjw.html



To run MapReduce programs in Eclipse, copy the Eclipse plugin from the Hadoop tree (hadoop/contrib/eclipse-plugin) into the plugins directory of the Eclipse installation. You can then copy Hadoop's wordcount example from src/examples into an Eclipse project, set the input and output arguments in the Open Run Dialog, and use Run As -> Java Application.
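The plugin copy is a one-liner; both paths below are assumptions about where Hadoop and Eclipse live, so adjust them to your layout:

```shell
# Install the bundled Hadoop plugin into Eclipse, then restart Eclipse.
cp "$HADOOP_HOME"/contrib/eclipse-plugin/*.jar /opt/eclipse/plugins/
```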



To use Run As -> Run on Hadoop, i.e. run the MapReduce program on the cluster, you must configure a Hadoop location with the IP addresses and ports of the master's namenode and jobtracker. Once DFS Locations connects correctly, you can run MapReduce programs in distributed mode. Along the way I hit the following problems:

1.Connecting to DFS test has encountered a problem

The cause was using hadoop 0.20.203.0 together with the Eclipse plugin bundled in that release: the MapReduce view appeared, but it could not connect to DFS at all. After much searching, many reports say this happens when the Hadoop Eclipse plugin and the Eclipse version do not match.



Reference 1: http://lucene.472066.n3.nabble.com/about-eclipse-plugin-td3047557.html

Reference 2: build the Eclipse plugin for your Hadoop version yourself



Of the two solutions above, I chose the first. Although it means using Hadoop 0.20.2, that version already has a prebuilt, working Eclipse plugin, whereas 0.20.203 only came out in May this year, is still fairly new, and fewer of its problems have surfaced. The second solution requires configuring and compiling the plugin yourself, apparently with ant, a tool I have not used, so I skipped the hassle.



2. error: Call to failed on local exception: java.io.EOFException

Cause: the eclipse-plugin and the Hadoop version did not match. (At first I did not want to change Hadoop versions, so I copied the plugin (hadoop-plugin-0.20.3-snapshot.jar) into the 0.20.203 installation and ran it, which produced the error above; after switching to Hadoop 0.20.2 it worked.)

Reference: http://lucene.472066.n3.nabble.com/Call-to-namenode-fails-with-java-io-EOFException-td2933149.html



3.Plug-in org.apache.hadoop.eclipse was unable to load class
org.apache.hadoop.eclipse.launch.HadoopApplicationLaunchShortcut.

Try switching a terminal into the Eclipse directory and running ./eclipse -clean to clear out previously loaded Hadoop plugins. After I hit this error, doing exactly that fixed it.

Reference: http://www.cloudobjects.de/2010/11/running-hadoop-jobs-using-eclipse.html



These are the three errors I hit during this process. If you see something similar, try the methods above to rule them out. The problems that come up while setting up Hadoop are endlessly varied: one small misstep and something fails later at runtime, and even identical error messages can have different causes. The above offers only one possible solution in each case.



References:

1. Installing Java on Ubuntu Server

2.http://www.infoq.com/cn/articles/hadoop-config-tip

3.http://www.ibm.com/developerworks/cn/opensource/os-cn-hadoop3/

4.http://www.iteye.com/topic/891735

5.http://www.iteye.com/topic/284849

6.http://muxiaolin.iteye.com/blog/1075575