[2] Hadoop Configuration

chakey

Hadoop Configuration
Add the hadoopuser user
[root@noc rou]# adduser
bash: adduser: command not found
[root@noc rou]# cd /usr/bin/
[root@noc bin]# ln -s /usr/sbin/adduser adduser
[root@noc bin]# adduser hadoopuser

passwd hadoopuser
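Raise the system limit on open files
A program sometimes needs to open many files at once for analysis. The system default limit is usually 1024 (check it with ulimit -n), which is enough for ordinary use but too low for a program like this.
To raise it, edit two files:
1) /etc/security/limits.conf
vi /etc/security/limits.conf
Add:
* soft nofile 8192
* hard nofile 20480

2) /etc/pam.d/login
Add:
session    required     /lib/security/pam_limits.so
Note: the new limits apply only to new login sessions, so log out and back in (for example, close PuTTY and open it again).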
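After logging back in, the new limits can be confirmed from the shell. The following is a minimal sketch; the helper name check_nofile is made up for illustration, and the threshold 8192 is simply the soft limit configured above.

```shell
#!/bin/bash
# check_nofile MIN: succeed if the soft open-file limit is at least MIN.
# (Hypothetical helper; 8192 below matches the soft limit set in limits.conf.)
check_nofile() {
    local min="$1"
    local soft
    soft="$(ulimit -Sn)"
    # "unlimited" always satisfies the requirement.
    if [ "$soft" = "unlimited" ] || [ "$soft" -ge "$min" ]; then
        echo "OK: soft nofile limit is $soft"
    else
        echo "TOO LOW: soft nofile limit is $soft, expected at least $min"
        return 1
    fi
}

# Show the soft and hard limits of the current session, then check them.
echo "soft: $(ulimit -Sn)  hard: $(ulimit -Hn)"
check_nofile 8192 || true
```

If the check still reports 1024, the session was probably opened before the files were edited.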
Create the MySQL user kwps with password kwps
grant all privileges on *.* to 'kwps'@'%' identified by 'kwps';
flush privileges;
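Shortcut for logging in to a node
sudo -s                            switch to root
vi /usr/bin/wpsop                  create the script with the content below
#!/bin/bash
# $1 is the node number; -l sets the login user
ssh "s$1-opdev-wps.rdev.kingsoft.net" -l hadoopuser
chmod +x /usr/bin/wpsop            make the script executable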
Change the hosts file and the hostname
1) sudo vi /etc/hosts
2) sudo vi /etc/sysconfig/network
3) hostname -v newhostname
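Passwordless SSH public-key authentication
1) mkdir .ssh
2) cd .ssh
chmod 700 . // this step is important
3) ssh-keygen -t rsa
4) cat id_rsa.pub >> authorized_keys
Alternatively: cp id_rsa.pub authorized_keys
Use scp to send the public key to the other servers, taking care not to overwrite files that already exist there!!
5) chmod 644 authorized_keys // this step is important
Note: every node must be able to SSH to every other node, and to itself, without a password.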
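The requirement in the note can be checked with a short loop. This is a sketch, not part of the original setup: the function name check_ssh is hypothetical, and the short host names in the example are the ones used in the masters and slaves files of this cluster.

```shell
#!/bin/bash
# check_ssh HOST...: try a non-interactive login to every host and report
# which ones still require a password. BatchMode=yes makes ssh fail
# instead of prompting, so the loop never blocks waiting for input.
check_ssh() {
    local failed=0
    local host
    for host in "$@"; do
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}

# Example (run on each node in turn; the list includes the node itself):
# check_ssh s2 s5 s6 s7 s8 s9
```

Any FAIL line points at a node whose authorized_keys file or directory permissions still need fixing.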

Unpack Hadoop-0.19.1
tar -xzvf hadoop-0.19.1.tar.gz
Hadoop configuration
Hadoop download locations
http://apache.etoak.com/hadoop/core/
http://hadoop.apache.org/common/releases.html
Environment used here:
Version: Hadoop-0.19.1
Operating system: CentOS
Six servers:
S2 (namenode)
S5 (secondarynamenode datanode)
S6 (datanode)
S7 (datanode)
S8 (datanode)
S9 (datanode)

***/home/wps/hadoop-0.19.1/conf***
Edit masters:
s5
Edit slaves:
s5
s6
s7
s8
s9
Edit log4j.properties
hadoop.log.dir=/data/hadoop-0.19.1/logs
Edit hadoop-env.sh
export JAVA_HOME=/opt/JDK-1.6.0.14
export HADOOP_HEAPSIZE=4000
Edit hadoop-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>



<configuration>

<property>
<name>fs.default.name</name>
<value>hdfs://s2-opdev-wps.rdev.kingsoft.net:9000/</value>
<description>The name of the default file system. Either the literal string "local" or a host:port for DFS.</description>
</property>

<property>
<name>mapred.job.tracker</name>
<value>s2-opdev-wps.rdev.kingsoft.net:9001</value>
<description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
</property>

<property>
<name>dfs.name.dir</name>
<value>/data/hadoop-0.19.1/name</value>
<description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.</description>
</property>

<property>
<name>dfs.data.dir</name>
<value>/data/hadoop-0.19.1/dfsdata</value>
<description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
</property>

<property>
<name>hadoop.tmp.dir</name>
<value>/data/hadoop-0.19.1/tmp</value>
<description>A base for other temporary directories.</description>
</property>

<property>
<name>dfs.replication</name>
<value>3</value>
<description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.</description>
</property>

<property>
<name>fs.checkpoint.dir</name>
<value>/data/hadoop-0.19.1/namesecondary</value>
<description>Determines where on the local filesystem the DFS secondary
      name node should store the temporary images to merge.
      If this is a comma-delimited list of directories then the image is
      replicated in all of the directories for redundancy.
</description>
</property>

<property>
<name>dfs.http.address</name>
<value>s2-opdev-wps.rdev.kingsoft.net:50070</value>
<description>
    The address and the base port where the dfs namenode web ui will listen on.
    If the port is 0 then the server will start on a free port.
</description>
</property>

<property>
<name>mapred.map.tasks</name>
<value>50</value>
<description>The default number of map tasks per job. Typically set
to a prime several times greater than number of available hosts.
Ignored when mapred.job.tracker is "local".
</description>
</property>

<property>
<name>mapred.reduce.tasks</name>
<value>7</value>
<description>The default number of reduce tasks per job. Typically set
to a prime close to the number of available hosts. Ignored when
mapred.job.tracker is "local".
</description>
</property>

</configuration>
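Before copying hadoop-site.xml to every node, it can help to review the settings as plain name=value pairs. The sketch below assumes that each name and value element sits on its own line, as in the file above; the function name list_props is hypothetical, and this is a line-based filter rather than a real XML parser.

```shell
#!/bin/bash
# list_props FILE: print every property in a Hadoop site file as name=value.
# Assumes <name>...</name> and <value>...</value> each occupy a single line.
list_props() {
    awk '
        /<name>/  { gsub(/.*<name>|<\/name>.*/, "");   name = $0 }
        /<value>/ { gsub(/.*<value>|<\/value>.*/, ""); print name "=" $0 }
    ' "$1"
}

# Example:
# list_props conf/hadoop-site.xml
```

Comparing this output across nodes is a quick way to spot a machine that still has an old copy of the file.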
Start Hadoop
bin/hadoop namenode -format
Warning: do not format a running Hadoop namenode; doing so erases all data in the HDFS filesystem.
bin/start-all.sh
bin/stop-all.sh
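Because formatting destroys existing data, the format command can be wrapped in a guard. This is a sketch under stated assumptions: the function name safe_format and the HADOOP_CMD variable are hypothetical, and it assumes that an already formatted name directory contains a "current" subdirectory.

```shell
#!/bin/bash
# safe_format NAME_DIR: format the namenode only if NAME_DIR does not
# already look formatted.
# Assumption: a formatted name directory contains a "current" subdirectory.
# HADOOP_CMD is a hypothetical override; it defaults to bin/hadoop.
safe_format() {
    local name_dir="$1"
    if [ -d "$name_dir/current" ]; then
        echo "Refusing to format: $name_dir already holds HDFS metadata" >&2
        return 1
    fi
    "${HADOOP_CMD:-bin/hadoop}" namenode -format
}

# Example, using the dfs.name.dir value from hadoop-site.xml:
# safe_format /data/hadoop-0.19.1/name
```

On a fresh cluster the guard passes straight through to the format command; on an existing one it stops before anything is erased.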
List the HDFS root directory:
bin/hadoop fs -ls /

View the data blocks:
/home/wpsop/hadoop-0.19.1/running/dfsdata/current
bin/hadoop fs -ls /data/user/hiveware

2009-09-26 21:23
Comment 1, di1984HIT, 2013-02-13

Noting this down for future reference.
