Adding and Removing Nodes in HDFS


From  http://developer.yahoo.com/hadoop/tutorial/module2.html


Rebalancing Blocks

How to add a new node to the cluster:

New nodes can be added to a cluster in a straightforward manner. On the new node, the same Hadoop version and configuration (conf/hadoop-site.xml) as on the rest of the cluster should be installed. Starting the DataNode daemon on the machine will cause it to contact the NameNode and join the cluster. (The new node should be added to the slaves file on the master server as well, to inform the master how to invoke script-based commands on the new node.)
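As a minimal sketch of that sequence, assuming the classic Hadoop layout this tutorial describes (the install path and hostname below are placeholders):

  # On the new node, with the same Hadoop version and conf/hadoop-site.xml installed:
  cd /path/to/hadoop
  bin/hadoop-daemon.sh start datanode      # contacts the NameNode and joins HDFS
  bin/hadoop-daemon.sh start tasktracker   # optional: also join MapReduce

  # On the master, record the new node so script-based commands reach it:
  echo "new-datanode.example.com" >> conf/slaves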

 

How to balance data onto the new node:

But the new DataNode will have no data on board initially; it is therefore not alleviating space concerns on the existing nodes. New files will be stored on the new DataNode in addition to the existing ones, but for optimum usage, storage should be evenly balanced across all nodes.

This can be achieved with the automatic balancer tool included with Hadoop. The Balancer class will intelligently balance blocks across the nodes to achieve an even distribution of blocks within a given threshold, expressed as a percentage. (The default is 10%.) Smaller percentages make nodes more evenly balanced, but may require more time to achieve this state. Perfect balancing (0%) is unlikely to actually be achieved.

 

The balancer script can be run by starting bin/start-balancer.sh in the Hadoop directory. The script can be provided a balancing threshold percentage with the -threshold parameter; e.g., bin/start-balancer.sh -threshold 5.

The balancer will automatically terminate when it achieves its goal, when an error occurs, or when it cannot find more candidate blocks to move to achieve better balance. The balancer can always be terminated safely by the administrator by running bin/stop-balancer.sh.

The balancing script can be run when nobody else is using the cluster (e.g., overnight), but it can also be run in an "online" fashion while many other jobs are ongoing. To prevent the rebalancing process from consuming large amounts of bandwidth and significantly degrading the performance of other processes on the cluster, the dfs.balance.bandwidthPerSec configuration parameter can be used to limit the number of bytes per second each node may devote to rebalancing its data store.
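For example, a hedged conf/hadoop-site.xml excerpt that caps rebalancing traffic at roughly 1 MB/s per node (the value is in bytes per second; the figure below is illustrative, not a recommendation):

  <property>
    <name>dfs.balance.bandwidthPerSec</name>
    <value>1048576</value>
    <description>Limit each DataNode to about 1 MB/s of balancer traffic.</description>
  </property>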

Copying Large Sets of Files

When migrating a large number of files from one location to another (from one HDFS cluster to another, from S3 into HDFS or vice versa, etc.), the task should be divided between multiple nodes to allow them all to share in the bandwidth required for the process. Hadoop includes a tool called distcp for this purpose.

By invoking bin/hadoop distcp src dest, Hadoop will start a MapReduce job to distribute the burden of copying a large number of files from src to dest. These two parameters may specify a full URL for the path to copy; e.g., "hdfs://SomeNameNode:9000/foo/bar/" and "hdfs://OtherNameNode:2000/baz/quux/" will copy the children of /foo/bar on one cluster to the directory tree rooted at /baz/quux on the other. The paths are assumed to be directories and are copied recursively. S3 URLs can be specified with s3://bucket-name/key.
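For instance, reusing the placeholder NameNode addresses above (the S3 bucket and paths are hypothetical, and S3 credentials are assumed to be configured separately):

  # Copy a directory tree from one HDFS cluster to another:
  bin/hadoop distcp hdfs://SomeNameNode:9000/foo/bar/ hdfs://OtherNameNode:2000/baz/quux/

  # Copy data from an S3 bucket into HDFS:
  bin/hadoop distcp s3://my-bucket/backups/ hdfs://SomeNameNode:9000/restored/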

Decommissioning Nodes

How to remove nodes from the cluster:

In addition to allowing nodes to be added to the cluster on the fly, nodes can also be removed from a cluster while it is running, without data loss. But if nodes are simply shut down "hard," data loss may occur, as they may hold the sole copy of one or more file blocks.

Nodes must be retired on a schedule that allows HDFS to ensure that no blocks are entirely replicated within the to-be-retired set of DataNodes.

HDFS provides a decommissioning feature which ensures that this process is performed safely. To use it, follow the steps below:

Step 1: Cluster configuration. If you anticipate that nodes may be retired from your cluster, then before it is started, an excludes file must be configured. Add a key named dfs.hosts.exclude to your conf/hadoop-site.xml file. The value associated with this key provides the full path to a file on the NameNode's local file system which contains a list of machines that are not permitted to connect to HDFS.
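A hedged example of that key in conf/hadoop-site.xml (the excludes-file path is a placeholder; any path readable by the NameNode works):

  <property>
    <name>dfs.hosts.exclude</name>
    <value>/home/hadoop/conf/excludes</value>
    <description>Full path, on the NameNode's local file system, to the excludes file.</description>
  </property>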

Step 2: Determine hosts to decommission. Each machine to be decommissioned should be added to the file identified by dfs.hosts.exclude, one per line. This will prevent them from connecting to the NameNode.

Step 3: Force configuration reload. Run the command bin/hadoop dfsadmin -refreshNodes. This will force the NameNode to reread its configuration, including the newly updated excludes file. It will decommission the nodes over a period of time, allowing time for each node's blocks to be replicated onto machines which are scheduled to remain active.

Step 4: Shut down nodes. After the decommission process has completed, the decommissioned hardware can be safely shut down for maintenance, etc. The bin/hadoop dfsadmin -report command will describe which nodes are connected to the cluster.
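Putting steps 2 through 4 together, an end-to-end sketch might look like the following (the hostname and excludes-file path are placeholders matching the configuration example above):

  # On the NameNode:
  echo "retiring-datanode.example.com" >> /home/hadoop/conf/excludes
  bin/hadoop dfsadmin -refreshNodes    # NameNode rereads its host lists and begins decommissioning
  bin/hadoop dfsadmin -report          # check each node's decommission status before powering it off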

Step 5: Edit excludes file again. Once the machines have been decommissioned, they can be removed from the excludes file. Running bin/hadoop dfsadmin -refreshNodes again will read the excludes file back into the NameNode, allowing the DataNodes to rejoin the cluster once maintenance has been completed, or when additional capacity is needed in the cluster again, etc.

 

 

 

Verifying File System Health

After decommissioning nodes, restarting a cluster, or periodically during its lifetime, you may want to ensure that the file system is healthy: that files are not corrupted or under-replicated, and that blocks are not missing.

Hadoop provides an fsck command to do exactly this. It can be launched at the command line like so:

  bin/hadoop fsck [path] [options]

If run with no arguments, it will print usage information and exit. If run with the argument /, it will check the health of the entire file system and print a report. If provided with a path to a particular directory or file, it will only check files under that path. If an option argument is given but no path, it will start from the file system root (/). The options fall into two different types:

Action options specify what action should be taken when corrupted files are found. This can be -move, which moves corrupt files to /lost+found, or -delete, which deletes corrupted files.

Information options specify how verbose the tool should be in its report. The -files option will list all files it checks as it encounters them. This information can be further expanded by adding the -blocks option, which prints the list of blocks for each file. Adding -locations to these two options will then print the addresses of the DataNodes holding these blocks. Still more information can be retrieved by adding -racks to the end of this list, which then prints the rack topology information for each location. (See the next subsection for more information on configuring network rack awareness.) Note that the later options do not imply the former; you must use them in conjunction with one another.

Also note that the Hadoop program uses -files in a "common argument parser" shared by the different commands such as dfsadmin, fsck, dfs, etc. This means that if you omit a path argument to fsck, it will not receive the -files option that you intend. You can separate common options from fsck-specific options by using -- as an argument, like so:

  bin/hadoop fsck -- -files -blocks

The -- is not required if you provide a path to start the check from, or if you specify another argument first such as -move .

By default, fsck will not operate on files still open for write by another client. A list of such files can be produced with the -openforwrite option.
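A few illustrative invocations of the options above (the directory path is a placeholder):

  # Check the entire file system and print a summary report:
  bin/hadoop fsck /

  # Verbose report for one directory: files, their blocks, DataNode addresses, and racks:
  bin/hadoop fsck /user/someuser/data -files -blocks -locations -racks

  # Move any corrupt files to /lost+found:
  bin/hadoop fsck / -move

  # List files currently open for write (excluded from checking by default):
  bin/hadoop fsck / -openforwrite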
