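Some Hadoop operations rely on Linux shell commands, so installing Hadoop on Windows first requires a Linux runtime environment (Cygwin). Next, set up passwordless SSH login. Finally, edit Hadoop's XML configuration files.
Install Cygwin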
http://cygwin.com/install.html
Configure sshd in Cygwin
http://docs.oracle.com/cd/E24628_01/install.121/e22624/preinstall_req_cygwin_ssh.htm#CBHIAFGI
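Once sshd is running, the passwordless login that Hadoop's start scripts depend on can be sketched as follows. This is a minimal sketch rather than the procedure from the linked guide: the helper name setup_passwordless_ssh and the choice of an RSA key are assumptions made for illustration.

```shell
# Sketch: make "ssh localhost" work without a password prompt.
# setup_passwordless_ssh is a hypothetical helper; $1 is the SSH directory.
setup_passwordless_ssh() {
  ssh_dir=${1:-$HOME/.ssh}
  key="$ssh_dir/id_rsa"
  auth="$ssh_dir/authorized_keys"

  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1

  # Generate a key pair with an empty passphrase unless one already exists.
  if [ ! -f "$key" ]; then
    ssh-keygen -t rsa -P '' -f "$key" || return 1
  fi

  # Append the public key only once, so repeated runs stay idempotent.
  touch "$auth"
  if ! grep -qxF -- "$(cat "$key.pub")" "$auth"; then
    cat "$key.pub" >> "$auth"
  fi
  chmod 600 "$auth"
}

# Usage (afterwards, "ssh localhost" should log in without asking for a password):
# setup_passwordless_ssh "$HOME/.ssh"
```

The function only appends to authorized_keys, so it is safe to rerun if the first attempt was interrupted.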
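Pseudo-distributed configuration
Configuration guide: hadoop-1.1.0/docs/single_node_setup.html
Format the NameNode, then start or stop all daemons:
bin/hadoop namenode -format
bin/start-all.sh
bin/stop-all.sh
Web interfaces:
http://localhost:50030 (JobTracker)
http://localhost:50070 (NameNode)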
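For reference, the pseudo-distributed settings from single_node_setup.html can be written out with a small script. This is a minimal sketch: write_pseudo_conf is a hypothetical helper, and the ports 9000 and 9001 are the values used in the Hadoop 1.x single-node guide, so adjust them if your setup differs. Note that it overwrites any existing files of the same name.

```shell
# Sketch: write the three pseudo-distributed config files into a conf directory.
# write_pseudo_conf is a hypothetical helper; $1 is the target directory.
write_pseudo_conf() {
  conf_dir=${1:-conf}
  mkdir -p "$conf_dir" || return 1

  # Default filesystem: HDFS on the local machine.
  cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

  # Only one DataNode exists, so keep a single replica of each block.
  cat > "$conf_dir/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

  # Address of the JobTracker that TaskTrackers and clients connect to.
  cat > "$conf_dir/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
EOF
}

# Usage:
# write_pseudo_conf "$HOME/hadoop-1.1.0/conf"
```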
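Problems encountered and solutions:
Several problems show up once Hadoop is actually run:
1. Configured paths are not resolved as Cygwin (Linux) paths.
hadoop.tmp.dir defaults to a directory under /tmp. In theory that should be C:\cygwin\tmp, but the path actually used is C:\tmp.
Because the two do not match, set the directory explicitly:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/cygwin/home/Winseliu/cloud</value>
</property>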
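One way to see why the value /cygwin/home/Winseliu/cloud works: the JVM is a native Windows program, so it resolves a leading / against the root of the current drive rather than against the Cygwin root. The sketch below only imitates the two interpretations with string manipulation. The helpers jvm_view and cygwin_view are hypothetical, and C:\cygwin as the install directory is taken from the paths in this post. Inside a real Cygwin shell, cygpath -w performs the Cygwin-side conversion.

```shell
# How a native Windows JVM sees a POSIX-style path: rooted at the current drive.
# $1 is the path, $2 the drive (assumed to be C: when omitted).
jvm_view() {
  printf '%s%s\n' "${2:-C:}" "$(printf '%s' "$1" | tr '/' '\\')"
}

# How Cygwin sees the same path: rooted at its install directory
# (assumed here to be C:\cygwin, i.e. /cygwin on drive C:).
cygwin_view() {
  jvm_view "${2:-/cygwin}$1"
}

jvm_view /tmp                          # C:\tmp
cygwin_view /tmp                       # C:\cygwin\tmp
jvm_view /cygwin/home/Winseliu/cloud   # C:\cygwin\home\Winseliu\cloud
cygwin_view /home/Winseliu/cloud       # same directory as the line above
```

Prefixing the value with /cygwin therefore makes the JVM and the Cygwin shell agree on a single directory.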
2. Path permission errors occur when starting the DataNode, JobTracker, and TaskTracker.
2012-11-25 13:53:05,031 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:Winseliu cause:java.io.IOException: Failed to set permissions of path: C:\cygwin\home\Winseliu\hadoop-1.1.0\logs\history to 0755
2012-11-25 13:53:05,032 FATAL org.apache.hadoop.mapred.JobTracker: java.io.IOException: Failed to set permissions of path: C:\cygwin\home\Winseliu\hadoop-1.1.0\logs\history to 0755
    at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
    at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:670)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
    at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
    at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
    at org.apache.hadoop.mapred.JobHistory.init(JobHistory.java:510)
2012-11-25 13:53:04,389 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /cygwin/home/Winseliu/cloud/mapred/local
2012-11-25 13:53:04,396 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.io.IOException: Failed to set permissions of path: \cygwin\home\Winseliu\cloud\mapred\local\taskTracker to 0755
    at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
    at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:670)
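This is a permissions problem. The workaround is to modify the checkReturnValue() method of FileUtil directly and replace FileUtil.class inside hadoop-core-1.1.0.jar:
private static void checkReturnValue(boolean rv, File p, FsPermission permission)
    throws IOException {
  if (!rv) {
    // FIXME: workaround for Cygwin. The exception is caught and logged here
    // instead of propagating, so a failed chmod no longer aborts daemon startup.
    try {
      throw new IOException("Failed to set permissions of path: " + p + " to "
          + String.format("%04o", permission.toShort()));
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}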
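Swapping the patched class into the jar can be done with the JDK's jar tool. The following is a minimal sketch under stated assumptions: patch_fileutil is a hypothetical helper, and it assumes the modified FileUtil has already been compiled into a class tree (for example a directory named patch) that mirrors the package layout org/apache/hadoop/fs. It also copies any FileUtil$*.class files the compiler emits for nested classes, and keeps a backup of the original jar.

```shell
# Sketch: replace FileUtil*.class inside a jar with patched copies.
# $1: path to the hadoop-core jar
# $2: directory holding the compiled class tree (hypothetical layout)
patch_fileutil() {
  jar_file=$1
  classes_dir=$2
  pkg=org/apache/hadoop/fs

  # Keep the original jar so the change can be rolled back.
  cp "$jar_file" "$jar_file.bak" || return 1

  for class_file in "$classes_dir/$pkg"/FileUtil.class "$classes_dir/$pkg"/FileUtil\$*.class; do
    # An unmatched glob stays literal; skip anything that is not a real file.
    [ -f "$class_file" ] || continue
    # "jar uf" updates an existing archive; -C switches directory before adding,
    # so the entry keeps its package-relative path inside the jar.
    jar uf "$jar_file" -C "$classes_dir" "$pkg/$(basename "$class_file")" || return 1
  done
}

# Usage:
# patch_fileutil hadoop-core-1.1.0.jar patch
```

Restart the daemons afterwards so they load the updated jar.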
3. jps does not list all of the Java processes that are actually running, so it is unclear which daemons have started and which have not.
Winseliu@WINSE ~/hadoop-1.1.0
$ jps
6364 NameNode
7168 JobTracker
2692 Jps
Winseliu@WINSE ~/hadoop-1.1.0
$ ps aux | grep java
7880 1 5544 7028 ? 1001 13:20:40 /cygdrive/c/Java/jdk1.7.0_02/bin/java
5968 1 7500 4592 ? 1001 13:20:36 /cygdrive/c/Java/jdk1.7.0_02/bin/java
5784 1 484 6364 pty0 1001 13:20:31 /cygdrive/c/Java/jdk1.7.0_02/bin/java
6732 1 484 7168 pty0 1001 13:20:38 /cygdrive/c/Java/jdk1.7.0_02/bin/java
7976 1 5716 5628 ? 1001 13:20:34 /cygdrive/c/Java/jdk1.7.0_02/bin/java
4492 0 0 4492 pty0 1001 Jan 1 /cygdrive/c/Java/jdk1.7.0_02/bin/java
Simply run start-all.sh a second time. Any daemon that is already running reports that it must be stopped first:
Winseliu@WINSE ~
$ cd hadoop-1.1.0/
Winseliu@WINSE ~/hadoop-1.1.0
$ bin/start-all.sh
starting namenode, logging to /home/Winseliu/hadoop-1.1.0/libexec/../logs/hadoop-Winseliu-namenode-WINSE.out
localhost: starting datanode, logging to /home/Winseliu/hadoop-1.1.0/libexec/../logs/hadoop-Winseliu-datanode-WINSE.out
localhost: starting secondarynamenode, logging to /home/Winseliu/hadoop-1.1.0/libexec/../logs/hadoop-Winseliu-secondarynamenode-WINSE.out
starting jobtracker, logging to /home/Winseliu/hadoop-1.1.0/libexec/../logs/hadoop-Winseliu-jobtracker-WINSE.out
localhost: starting tasktracker, logging to /home/Winseliu/hadoop-1.1.0/libexec/../logs/hadoop-Winseliu-tasktracker-WINSE.out
Winseliu@WINSE ~/hadoop-1.1.0
$ bin/start-all.sh
namenode running as process 2648. Stop it first.
localhost: datanode running as process 3512. Stop it first.
localhost: secondarynamenode running as process 2468. Stop it first.
jobtracker running as process 2388. Stop it first.
localhost: tasktracker running as process 860. Stop it first.
Winseliu@WINSE ~/hadoop-1.1.0
$
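The "Stop it first" message comes from a PID-file check in Hadoop's daemon script, and the same check can be run directly without attempting a restart. This is a minimal sketch assuming the Hadoop 1.x defaults: PID files named hadoop-<user>-<daemon>.pid in HADOOP_PID_DIR, which falls back to /tmp. The helper name daemon_status is hypothetical.

```shell
# Sketch: report which Hadoop 1.x daemons are alive, based on their PID files.
# $1: PID directory (defaults to $HADOOP_PID_DIR, then /tmp)
# $2: identity string used in the file names (defaults to $HADOOP_IDENT_STRING, then $USER)
daemon_status() {
  pid_dir=${1:-${HADOOP_PID_DIR:-/tmp}}
  ident=${2:-${HADOOP_IDENT_STRING:-$USER}}

  for daemon in namenode datanode secondarynamenode jobtracker tasktracker; do
    pid_file="$pid_dir/hadoop-$ident-$daemon.pid"
    # kill -0 sends no signal; it only tests whether the process exists.
    if [ -f "$pid_file" ] && kill -0 "$(cat "$pid_file")" 2>/dev/null; then
      echo "$daemon running as process $(cat "$pid_file")"
    else
      echo "$daemon not running"
    fi
  done
}

# Usage:
# daemon_status
```

A stale PID file left behind by a crashed daemon is reported as "not running", because the process check fails even though the file exists.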