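Installing Hadoop on CentOS (pseudo-distributed mode)
The setup runs in a CentOS 5.5 virtual machine on the local computer.
Software used: JDK 1.6 U26
Hadoop: hadoop-0.20.203.tar.gz
Check and configure SSH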
- [root@localhost ~]# ssh-keygen -t rsa
- Generating public/private rsa key pair.
- Enter file in which to save the key (/root/.ssh/id_rsa):
- Created directory '/root/.ssh'.
- Enter passphrase (empty for no passphrase):
- Enter same passphrase again:
- Your identification has been saved in /root/.ssh/id_rsa.
- Your public key has been saved in /root/.ssh/id_rsa.pub.
- The key fingerprint is:
- a8:7a:3e:f6:92:85:b8:c7:be:d9:0e:45:9c:d1:36:3b root@localhost.localdomain
- [root@localhost ~]#
- [root@localhost ~]# cd ..
- [root@localhost /]# cd root
- [root@localhost ~]# ls
- anaconda-ks.cfg Desktop install.log install.log.syslog
- [root@localhost ~]# cd .ssh
- [root@localhost .ssh]# cat id_rsa.pub > authorized_keys
- [root@localhost .ssh]#
- [root@localhost .ssh]# ssh localhost
- The authenticity of host 'localhost (127.0.0.1)' can't be established.
- RSA key fingerprint is 41:c8:d4:e4:60:71:6f:6a:33:6a:25:27:62:9b:e3:90.
- Are you sure you want to continue connecting (yes/no)? yes
- Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
- Last login: Tue Jun 21 22:40:31 2011
- [root@localhost ~]#
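The session above overwrites authorized_keys with a single key. On a machine that already has authorized keys, appending is safer. The following is a minimal sketch, not part of the original walkthrough: the function name `authorize_key` is hypothetical, and it assumes a standard `grep` and an sshd running with its default StrictModes behaviour.

```shell
# Append a public key to an authorized_keys file only if it is not already
# there, then tighten permissions. Both paths are passed in explicitly.
authorize_key() {
  pub_file=$1
  auth_file=$2
  auth_dir=$(dirname "$auth_file")

  mkdir -p "$auth_dir"
  touch "$auth_file"

  # -x matches whole lines, -F treats the key as a fixed string, not a regex.
  if ! grep -qxF -f "$pub_file" "$auth_file"; then
    cat "$pub_file" >> "$auth_file"
  fi

  # With StrictModes enabled, sshd rejects key files and directories
  # that other users are able to write to.
  chmod 700 "$auth_dir"
  chmod 600 "$auth_file"
}
```

For the setup described here it would be called as `authorize_key /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys`. Running it twice leaves a single copy of the key.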
Install the JDK
- [root@localhost java]# chmod +x jdk-6u26-linux-i586.bin
- [root@localhost java]# ./jdk-6u26-linux-i586.bin
- ......
- ......
- ......
- For more information on what data Registration collects and
- how it is managed and used, see:
- http://java.sun.com/javase/registration/JDKRegistrationPrivacy.html
- Press Enter to continue.....
- Done.
The installer creates the directory jdk1.6.0_26.
Configure environment variables
- [root@localhost java]# vi /etc/profile
- # Add the following lines
- # set java environment
- export JAVA_HOME=/usr/java/jdk1.6.0_26
- export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
- export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH:$HOME/bin
- export HADOOP_HOME=/usr/local/hadoop/hadoop-0.20.203
- export PATH=$PATH:$HADOOP_HOME/bin
- [root@localhost java]# chmod +x /etc/profile
- [root@localhost java]# source /etc/profile
- [root@localhost java]#
- [root@localhost java]# java -version
- java version "1.6.0_26"
- Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
- Java HotSpot(TM) Client VM (build 20.1-b02, mixed mode, sharing)
- [root@localhost java]#
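Both /etc/profile and Hadoop's hadoop-env.sh depend on JAVA_HOME pointing at a real JDK, and the later steps use jps, which ships in the JDK's bin directory. A quick check along these lines may catch a wrong path early. It is a sketch only; the function name `check_java_home` is hypothetical.

```shell
# Report whether the JDK tools used in this walkthrough exist under the
# given JAVA_HOME. Returns non-zero if any of them is missing.
check_java_home() {
  status=0
  for tool in java jps; do
    if [ -x "$1/bin/$tool" ]; then
      echo "ok: $1/bin/$tool"
    else
      echo "missing: $1/bin/$tool" >&2
      status=1
    fi
  done
  return "$status"
}
```

With the paths from this article it would be run as `check_java_home /usr/java/jdk1.6.0_26`.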
Edit /etc/hosts
- [root@localhost conf]# vi /etc/hosts
- # Do not remove the following line, or various programs
- # that require network functionality will fail.
- 127.0.0.1 localhost.localdomain localhost
- ::1 localhost6.localdomain6 localhost6
- 127.0.0.1 namenode datanode01
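The Hadoop configuration that follows refers to the names namenode and datanode01, so both must map to 127.0.0.1. One way to confirm this is to read the hosts file directly. This is a rough sketch under that assumption; `lookup_host` is a hypothetical helper, not a system command.

```shell
# Print the IP address that a hosts-format file maps a name to.
# $1 = host name, $2 = hosts file (defaults to /etc/hosts)
lookup_host() {
  awk -v name="$1" '
    /^[[:space:]]*#/ { next }    # skip comment lines
    {
      # field 1 is the address, every later field is a name or alias
      for (i = 2; i <= NF; i++) {
        if ($i == name) { print $1; found = 1; exit }
      }
    }
    END { exit !found }
  ' "${2:-/etc/hosts}"
}
```

After the edit above, `lookup_host datanode01` should print 127.0.0.1, and a name that is not listed makes the function return a non-zero status.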
Unpack and install Hadoop
- [root@localhost hadoop]# tar zxvf hadoop-0.20.203.tar.gz
- ......
- ......
- ......
- hadoop-0.20.203.0/src/contrib/ec2/bin/image/create-hadoop-image-remote
- hadoop-0.20.203.0/src/contrib/ec2/bin/image/ec2-run-user-data
- hadoop-0.20.203.0/src/contrib/ec2/bin/launch-hadoop-cluster
- hadoop-0.20.203.0/src/contrib/ec2/bin/launch-hadoop-master
- hadoop-0.20.203.0/src/contrib/ec2/bin/launch-hadoop-slaves
- hadoop-0.20.203.0/src/contrib/ec2/bin/list-hadoop-clusters
- hadoop-0.20.203.0/src/contrib/ec2/bin/terminate-hadoop-cluster
- [root@localhost hadoop]#
Go to Hadoop's conf directory and edit the configuration files
- ####################################
- [root@localhost conf]# vi hadoop-env.sh
- # Add the following
- # set java environment
- export JAVA_HOME=/usr/java/jdk1.6.0_26
- #####################################
- [root@localhost conf]# vi core-site.xml
- <?xml version="1.0"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!-- Put site-specific property overrides in this file. -->
- <configuration>
- <property>
- <name>fs.default.name</name>
- <value>hdfs://namenode:9000/</value>
- </property>
- <property>
- <name>hadoop.tmp.dir</name>
- <value>/usr/local/hadoop/hadooptmp</value>
- </property>
- </configuration>
- #######################################
- [root@localhost conf]# vi hdfs-site.xml
- <?xml version="1.0"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!-- Put site-specific property overrides in this file. -->
- <configuration>
- <property>
- <name>dfs.name.dir</name>
- <value>/usr/local/hadoop/hdfs/name</value>
- </property>
- <property>
- <name>dfs.data.dir</name>
- <value>/usr/local/hadoop/hdfs/data</value>
- </property>
- <property>
- <name>dfs.replication</name>
- <value>1</value>
- </property>
- </configuration>
- #########################################
- [root@localhost conf]# vi mapred-site.xml
- <?xml version="1.0"?>
- <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
- <!-- Put site-specific property overrides in this file. -->
- <configuration>
- <property>
- <name>mapred.job.tracker</name>
- <value>namenode:9001</value>
- </property>
- <property>
- <name>mapred.local.dir</name>
- <value>/usr/local/hadoop/mapred/local</value>
- </property>
- <property>
- <name>mapred.system.dir</name>
- <value>/tmp/hadoop/mapred/system</value>
- </property>
- </configuration>
- #########################################
- [root@localhost conf]# vi masters
- #localhost
- namenode
- #########################################
- [root@localhost conf]# vi slaves
- #localhost
- datanode01
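The three site files name several local directories (hadoop.tmp.dir, dfs.name.dir, dfs.data.dir, mapred.local.dir). To double-check what a file actually contains, a value can be pulled out with awk. This is a minimal sketch that assumes each `<name>` and `<value>` element sits on a single line, as in the listings above; it is not a general XML parser, and `get_hadoop_property` is a hypothetical name.

```shell
# Print the value of one property from a Hadoop *-site.xml file.
# $1 = config file, $2 = property name
get_hadoop_property() {
  awk -v key="$2" '
    /<name>/ {
      # remember the most recent property name
      n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n); name = n
    }
    /<value>/ {
      # print the value only when it belongs to the requested name
      v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
      if (name == key) { print v; exit }
    }
  ' "$1"
}
```

For example, `get_hadoop_property hdfs-site.xml dfs.name.dir` would print /usr/local/hadoop/hdfs/name for the file shown above. A property that is not present produces no output.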
Start Hadoop
- ##################### Format the namenode ##############
- [root@localhost bin]# hadoop namenode -format
- 11/06/23 00:43:54 INFO namenode.NameNode: STARTUP_MSG:
- /************************************************************
- STARTUP_MSG: Starting NameNode
- STARTUP_MSG: host = localhost.localdomain/127.0.0.1
- STARTUP_MSG: args = [-format]
- STARTUP_MSG: version = 0.20.203.0
- STARTUP_MSG: build = http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333; compiled by 'oom' on Wed May 4 07:57:50 PDT 2011
- ************************************************************/
- 11/06/23 00:43:55 INFO util.GSet: VM type = 32-bit
- 11/06/23 00:43:55 INFO util.GSet: 2% max memory = 19.33375 MB
- 11/06/23 00:43:55 INFO util.GSet: capacity = 2^22 = 4194304 entries
- 11/06/23 00:43:55 INFO util.GSet: recommended=4194304, actual=4194304
- 11/06/23 00:43:56 INFO namenode.FSNamesystem: fsOwner=root
- 11/06/23 00:43:56 INFO namenode.FSNamesystem: supergroup=supergroup
- 11/06/23 00:43:56 INFO namenode.FSNamesystem: isPermissionEnabled=true
- 11/06/23 00:43:56 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
- 11/06/23 00:43:56 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
- 11/06/23 00:43:56 INFO namenode.NameNode: Caching file names occuring more than 10 times
- 11/06/23 00:43:57 INFO common.Storage: Image file of size 110 saved in 0 seconds.
- 11/06/23 00:43:57 INFO common.Storage: Storage directory /usr/local/hadoop/hdfs/name has been successfully formatted.
- 11/06/23 00:43:57 INFO namenode.NameNode: SHUTDOWN_MSG:
- /************************************************************
- SHUTDOWN_MSG: Shutting down NameNode at localhost.localdomain/127.0.0.1
- ************************************************************/
- [root@localhost bin]#
- ###########################################
- [root@localhost bin]# ./start-all.sh
- starting namenode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-namenode-localhost.localdomain.out
- datanode01: starting datanode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-datanode-localhost.localdomain.out
- namenode: starting secondarynamenode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-secondarynamenode-localhost.localdomain.out
- starting jobtracker, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-jobtracker-localhost.localdomain.out
- datanode01: starting tasktracker, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-tasktracker-localhost.localdomain.out
- [root@localhost bin]# jps
- 11971 TaskTracker
- 11807 SecondaryNameNode
- 11599 NameNode
- 12022 Jps
- 11710 DataNode
- 11877 JobTracker
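A healthy pseudo-distributed start shows five daemons in the jps listing: NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker. Comparing the list by eye is easy to get wrong, so a small filter can do it instead. This is a sketch; the function name `missing_daemons` is hypothetical, and it assumes jps prints one "pid class-name" pair per line as shown above.

```shell
# Read `jps` output on stdin and print each expected Hadoop daemon that is
# not listed. Exit status is non-zero when at least one daemon is missing.
missing_daemons() {
  jps_output=$(cat)
  missing=0
  for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    # Compare the second field exactly; a substring match would let
    # SecondaryNameNode stand in for NameNode.
    if ! printf '%s\n' "$jps_output" |
         awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }'; then
      echo "$daemon"
      missing=1
    fi
  done
  return "$missing"
}
```

It would be used as `jps | missing_daemons`. No output and a zero exit status mean all five daemons are running.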
Check the cluster status
- [root@localhost bin]# hadoop dfsadmin -report
- Configured Capacity: 4055396352 (3.78 GB)
- Present Capacity: 464142351 (442.64 MB)
- DFS Remaining: 464089088 (442.59 MB)
- DFS Used: 53263 (52.01 KB)
- DFS Used%: 0.01%
- Under replicated blocks: 0
- Blocks with corrupt replicas: 0
- Missing blocks: 0
- -------------------------------------------------
- Datanodes available: 1 (1 total, 0 dead)
- Name: 127.0.0.1:50010
- Decommission Status: Normal
- Configured Capacity: 4055396352 (3.78 GB)
- DFS Used: 53263 (52.01 KB)
- Non DFS Used: 3591254001 (3.34 GB)
- DFS Remaining: 464089088 (442.59 MB)
- DFS Used%: 0%
- DFS Remaining%: 11.44%
- Last contact: Thu Jun 23 01:11:15 PDT 2011
- [root@localhost bin]#
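For a scripted health check, the datanode count can be extracted from the report. The sketch below assumes the summary line reads "Datanodes available: N (...)" as in the report above; other Hadoop versions may word it differently. The function name `available_datanodes` is hypothetical.

```shell
# Read `hadoop dfsadmin -report` output on stdin and print the number of
# datanodes reported as available. Prints nothing if the line is absent.
available_datanodes() {
  sed -n 's/^Datanodes available: *\([0-9][0-9]*\).*/\1/p'
}
```

Used as `hadoop dfsadmin -report | available_datanodes`, it should print 1 for this single-node setup.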
Other issues: 1
- #################### Error on startup ##########
- [root@localhost bin]# ./start-all.sh
- starting namenode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-namenode-localhost.localdomain.out
- The authenticity of host 'datanode01 (127.0.0.1)' can't be established.
- RSA key fingerprint is 41:c8:d4:e4:60:71:6f:6a:33:6a:25:27:62:9b:e3:90.
- Are you sure you want to continue connecting (yes/no)? y
- Please type 'yes' or 'no': yes
- datanode01: Warning: Permanently added 'datanode01' (RSA) to the list of known hosts.
- datanode01: starting datanode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-datanode-localhost.localdomain.out
- datanode01: Unrecognized option: -jvm
- datanode01: Could not create the Java virtual machine.
- namenode: starting secondarynamenode, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-secondarynamenode-localhost.localdomain.out
- starting jobtracker, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-jobtracker-localhost.localdomain.out
- datanode01: starting tasktracker, logging to /usr/local/hadoop/hadoop-0.20.203/bin/../logs/hadoop-root-tasktracker-localhost.localdomain.out
- [root@localhost bin]# jps
- 10442 JobTracker
- 10533 TaskTracker
- 10386 SecondaryNameNode
- 10201 NameNode
- 10658 Jps
- ################################################
- [root@localhost bin]# vi hadoop
- elif [ "$COMMAND" = "datanode" ] ; then
- CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'
- if [[ $EUID -eq 0 ]]; then
- HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
- else
- HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
- fi
- # http://javoft.net/2011/06/hadoop-unrecognized-option-jvm-could-not-create-the-java-virtual-machine/
- # Change it to
- elif [ "$COMMAND" = "datanode" ] ; then
- CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'
- # if [[ $EUID -eq 0 ]]; then
- # HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
- # else
- HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
- # fi
- # Alternatively, start Hadoop as a non-root user
- # After this change the datanode starts successfully
2. Turn off the firewall before starting Hadoop.
Check the running state:
http://localhost:50070
- NameNode 'localhost.localdomain:9000'
- Started: Thu Jun 23 01:07:18 PDT 2011
- Version: 0.20.203.0, r1099333
- Compiled: Wed May 4 07:57:50 PDT 2011 by oom
- Upgrades: There are no upgrades in progress.
- Browse the filesystem
- Namenode Logs
- Cluster Summary
- 6 files and directories, 1 blocks = 7 total. Heap Size is 31.38 MB / 966.69 MB (3%)
- Configured Capacity: 3.78 GB
- DFS Used: 52.01 KB
- Non DFS Used: 3.34 GB
- DFS Remaining: 442.38 MB
- DFS Used%: 0%
- DFS Remaining%: 11.44%
- Live Nodes: 1
- Dead Nodes: 0
- Decommissioning Nodes: 0
- Number of Under-Replicated Blocks: 0
- NameNode Storage:
- Storage Directory: /usr/local/hadoop/hdfs/name
- Type: IMAGE_AND_EDITS
- State: Active
http://localhost:50030
- namenode Hadoop Map/Reduce Administration
- Quick Links
- * Scheduling Info
- * Running Jobs
- * Retired Jobs
- * Local Logs
- State: RUNNING
- Started: Thu Jun 23 01:07:30 PDT 2011
- Version: 0.20.203.0, r1099333
- Compiled: Wed May 4 07:57:50 PDT 2011 by oom
- Identifier: 201106230107
- Cluster Summary (Heap Size is 15.31 MB/966.69 MB)
- Running Map Tasks: 0
- Running Reduce Tasks: 0
- Total Submissions: 0
- Nodes: 1
- Occupied Map Slots: 0
- Occupied Reduce Slots: 0
- Reserved Map Slots: 0
- Reserved Reduce Slots: 0
- Map Task Capacity: 2
- Reduce Task Capacity: 2
- Avg. Tasks/Node: 4.00
- Blacklisted Nodes: 0
- Graylisted Nodes: 0
- Excluded Nodes: 0
- Scheduling Information
- Queue Name: default
- State: running
- Scheduling Information: N/A
- Filter (Jobid, Priority, User, Name)
- Example: 'user:smith 3200' will filter by 'smith' only in the user field and '3200' in all fields
- Running Jobs
- none
- Retired Jobs
- none
- Local Logs
- Log directory, Job Tracker History
- This is Apache Hadoop release 0.20.203.0
Test:
- ########## Create a directory ##########
- [root@localhost bin]# hadoop fs -mkdir testFolder
- ############### Copy a file into the directory
- [root@localhost local]# ls
- bin etc games hadoop include lib libexec sbin share src SSH_key_file
- [root@localhost local]# hadoop fs -copyFromLocal SSH_key_file testFolder
The uploaded file can then be viewed in the web interface.
Reference: http://bxyzzy.blog.51cto.com/854497/352692
Appendix: set up FTP with yum install vsftpd (convenient for transferring files; unrelated to Hadoop)
Stop the firewall: service iptables stop
Start FTP: service vsftpd start