1. FSNamesystem Overview
FSNamesystem does the bookkeeping for the DataNodes: put plainly, every request that reaches a DataNode has first been processed by FSNamesystem. FSNamesystem manages several key data structures (a toy sketch of how they fit together follows the list):
- filename -> data blocks (persisted in the FSImage and the edit log)
- the list of valid data blocks (the inverse of the mapping above)
- data block -> DataNodes (kept only in memory, rebuilt dynamically from the reports DataNodes send in)
- the data blocks stored on each DataNode (the inverse of the mapping above)
- the DataNodes that have most recently sent heartbeats (LRU)
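To make these relationships concrete, here is a minimal, self-contained sketch of how the forward and inverse mappings fit together. All class and field names below are hypothetical simplifications (real HDFS uses BlocksMap, DatanodeDescriptor, and so on), and a block is reduced to a long id:

```java
import java.util.*;

// Hypothetical, simplified model of the NameNode's core mappings.
public class NamespaceSketch {
    // filename -> blocks: the persistent part (FSImage + edit log)
    Map<String, List<Long>> fileToBlocks = new HashMap<>();
    // block -> DataNodes holding it: memory only, rebuilt from block reports
    Map<Long, Set<String>> blockToDataNodes = new HashMap<>();
    // DataNode -> blocks it stores: the inverse of the mapping above
    Map<String, Set<Long>> dataNodeToBlocks = new HashMap<>();

    // A block report from one DataNode rebuilds both in-memory directions.
    void processBlockReport(String dataNode, List<Long> blocks) {
        dataNodeToBlocks.put(dataNode, new HashSet<>(blocks));
        for (long b : blocks) {
            blockToDataNodes.computeIfAbsent(b, k -> new HashSet<>()).add(dataNode);
        }
    }

    public static void main(String[] args) {
        NamespaceSketch ns = new NamespaceSketch();
        ns.fileToBlocks.put("/user/a.txt", Arrays.asList(1L, 2L));
        ns.processBlockReport("dn-1", Arrays.asList(1L, 2L));
        ns.processBlockReport("dn-2", Arrays.asList(2L));
        System.out.println("block 2 lives on: " + ns.blockToDataNodes.get(2L));
    }
}
```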
2. FSNamesystem Member Variables
The member variables are covered group by group below.
三. FSDirectory
一个文件系统,一个FSNamesystem 一个FSDirectory。FSNamesystem 初始化时会初始化FSDirectory。
public FSDirectory dir;
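As a rough sketch of that one-to-one relationship (the `*Sketch` classes and the constructor below are hypothetical simplifications, not the actual FSNamesystem code):

```java
// Hypothetical simplification: FSNamesystem owns exactly one FSDirectory,
// created during initialization.
class FSDirectorySketch {
    final FSNamesystemSketch ns;
    FSDirectorySketch(FSNamesystemSketch ns) { this.ns = ns; }
}

class FSNamesystemSketch {
    public FSDirectorySketch dir;

    FSNamesystemSketch() {
        // one file system -> one FSNamesystem -> one FSDirectory
        this.dir = new FSDirectorySketch(this);
    }

    public static void main(String[] args) {
        FSNamesystemSketch ns = new FSNamesystemSketch();
        System.out.println("dir initialized: " + (ns.dir != null));
    }
}
```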
4. Permission-Related Fields

```java
// Owner (user and group) of local files. Can be set via hadoop.job.ugi; if it is
// not set, the user who started HDFS (via `whoami`) and that user's groups
// (via `groups`) are used instead.
private UserGroupInformation fsOwner;

// The system's super group, from the configuration key dfs.permissions.supergroup
// (default "supergroup"; the user who starts Hadoop is usually the superuser).
// Used in defaultPermission.
private String supergroup;

// Default permission status: the default user is fsOwner, the default group is
// supergroup, and the default mode is 0777, adjustable via dfs.upgrade.permission.
private PermissionStatus defaultPermission;
```
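A minimal sketch of how the three fields relate, assuming a simple key/value `Config` stand-in; the class and default-handling below are illustrative, not Hadoop's configuration API:

```java
import java.util.*;

// Illustrative sketch of how the three permission fields are derived.
public class PermissionSketch {
    static class Config {
        Map<String, String> kv = new HashMap<>();
        String get(String key, String dflt) { return kv.getOrDefault(key, dflt); }
    }

    public static void main(String[] args) {
        Config conf = new Config();
        // fsOwner: falls back to the user who started HDFS (cf. `whoami`)
        String fsOwner = conf.get("hadoop.job.ugi", System.getProperty("user.name"));
        // supergroup: dfs.permissions.supergroup, default "supergroup"
        String supergroup = conf.get("dfs.permissions.supergroup", "supergroup");
        // default permission bits: 0777 unless dfs.upgrade.permission overrides them
        int defaultPerm = Integer.parseInt(conf.get("dfs.upgrade.permission", "777"), 8);

        System.out.printf("owner=%s group=%s mode=%o%n", fsOwner, supergroup, defaultPerm);
    }
}
```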
5. Blocks in Various States

```java
// Number of blocks with replication currently in flight
volatile long pendingReplicationBlocksCount = 0L;
// Number of corrupt block replicas
volatile long corruptReplicaBlocksCount = 0L;
// Number of blocks that still need to be replicated
volatile long underReplicatedBlocksCount = 0L;
// Number of replication jobs currently scheduled
volatile long scheduledReplicationBlocksCount = 0L;
// Number of replicas in excess of the target replication factor
volatile long excessBlocksCount = 0L;
// Number of blocks pending deletion
volatile long pendingDeletionBlocksCount = 0L;

// Blocks that need to be replicated
private UnderReplicatedBlocks neededReplications = new UnderReplicatedBlocks();
// We also store pending replication-orders:
// information about blocks whose replication is in flight
private PendingReplicationBlocks pendingReplications;

// Blocks that are invalid but still present on each DataNode,
// as a StorageID -> ArrayList<Block> mapping
private Map<String, Collection<Block>> recentInvalidateSets =
    new TreeMap<String, Collection<Block>>();
// Blocks that are valid but exceed the target replication factor and must be
// deleted, as a StorageID -> TreeSet<Block> mapping
Map<String, Collection<Block>> excessReplicateMap =
    new TreeMap<String, Collection<Block>>();
// Invalid blocks (e.g. failed checksum verification), as a Block -> DataNode mapping
public CorruptReplicasMap corruptReplicas = new CorruptReplicasMap();
```
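As an illustration of this per-DataNode bookkeeping, here is a toy version of queueing an invalid replica for deletion; the `addToInvalidates` helper mirrors the idea only, and Block is reduced to a long id:

```java
import java.util.*;

// Toy illustration of the StorageID -> blocks bookkeeping used for invalidation.
public class InvalidateSketch {
    // Invalid replicas still present on each DataNode, keyed by StorageID
    static Map<String, Collection<Long>> recentInvalidateSets = new TreeMap<>();

    // Queue one replica on one DataNode for deletion.
    static void addToInvalidates(long blockId, String storageID) {
        recentInvalidateSets
            .computeIfAbsent(storageID, k -> new ArrayList<>())
            .add(blockId);
    }

    public static void main(String[] args) {
        addToInvalidates(42L, "DS-001");
        addToInvalidates(43L, "DS-001");
        System.out.println(recentInvalidateSets); // {DS-001=[42, 43]}
    }
}
```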
6. DataNode State

```java
// Block -> BlockInfo (INode, datanodes, previous BlockInfo, next BlockInfo) mapping
final BlocksMap blocksMap = new BlocksMap(DEFAULT_INITIAL_MAP_CAPACITY,
                                          DEFAULT_MAP_LOAD_FACTOR);
// All DataNodes in the system, as a StorageID -> DatanodeDescriptor mapping
NavigableMap<String, DatanodeDescriptor> datanodeMap =
    new TreeMap<String, DatanodeDescriptor>();
// All DataNodes currently considered alive; the HeartbeatMonitor thread
// checks this list periodically
ArrayList<DatanodeDescriptor> heartbeats = new ArrayList<DatanodeDescriptor>();
// Host -> DatanodeDescriptor lookup
private Host2NodesMap host2DataNodeMap = new Host2NodesMap();
```
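A simplified stand-in for `datanodeMap` and `heartbeats`, with DatanodeDescriptor reduced to a storage ID plus a last-heartbeat timestamp (all names below are illustrative):

```java
import java.util.*;

// Simplified stand-ins for the NameNode's DataNode registries.
public class DatanodeMapSketch {
    static class Descriptor {
        final String storageID;
        long lastHeartbeat;
        Descriptor(String id) { this.storageID = id; }
    }

    // StorageID -> descriptor, sorted like the real TreeMap-backed datanodeMap
    static NavigableMap<String, Descriptor> datanodeMap = new TreeMap<>();
    // Nodes currently considered alive
    static List<Descriptor> heartbeats = new ArrayList<>();

    static void register(String storageID) {
        Descriptor d = new Descriptor(storageID);
        d.lastHeartbeat = System.currentTimeMillis();
        datanodeMap.put(storageID, d);
        heartbeats.add(d);
    }

    public static void main(String[] args) {
        register("DS-001");
        register("DS-002");
        System.out.println("alive nodes: " + heartbeats.size());
    }
}
```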
7. System Capacity

```java
// Total / used / remaining capacity of the cluster
private long capacityTotal = 0L, capacityUsed = 0L, capacityRemaining = 0L;
// Total connection count across the cluster, updated from DataNode heartbeats
private int totalLoad = 0;
```
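A toy version of this bookkeeping: on each heartbeat the node's previous contribution is subtracted and its fresh numbers are added. The `NodeStats` type and the `updateStats` signature are assumptions for illustration:

```java
// Toy version of the cluster-wide capacity counters.
public class CapacitySketch {
    static long capacityTotal = 0L, capacityUsed = 0L, capacityRemaining = 0L;
    static int totalLoad = 0;

    static class NodeStats {
        long capacity, used, remaining;
        int xceiverCount; // active connections, reported in the heartbeat
    }

    static void updateStats(NodeStats old, NodeStats fresh) {
        if (old != null) { // node already counted: remove its old numbers first
            capacityTotal -= old.capacity;
            capacityUsed -= old.used;
            capacityRemaining -= old.remaining;
            totalLoad -= old.xceiverCount;
        }
        capacityTotal += fresh.capacity;
        capacityUsed += fresh.used;
        capacityRemaining += fresh.remaining;
        totalLoad += fresh.xceiverCount;
    }

    public static void main(String[] args) {
        NodeStats s = new NodeStats();
        s.capacity = 100; s.used = 40; s.remaining = 60; s.xceiverCount = 3;
        updateStats(null, s);
        System.out.println(capacityUsed + "/" + capacityTotal + " load=" + totalLoad);
    }
}
```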
8. Lease Manager

```java
// Lease manager
public LeaseManager leaseManager = new LeaseManager(this);
```
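A minimal sketch of what a lease tracks: the client holding it, the paths it is writing, and a renewal timestamp checked against an expiry window. The limit constant and method names here are illustrative, not HDFS's actual values:

```java
import java.util.*;

// Minimal sketch of lease bookkeeping: a lease ties a client (holder) to the
// files it is writing and is kept alive by periodic renew() calls.
public class LeaseSketch {
    static final long HARD_LIMIT_MS = 60 * 60 * 1000L; // assumed expiry window

    static class Lease {
        final String holder;
        final Set<String> paths = new TreeSet<>();
        long lastRenewed = System.currentTimeMillis();
        Lease(String holder) { this.holder = holder; }
        void renew() { lastRenewed = System.currentTimeMillis(); }
        boolean expired(long now) { return now - lastRenewed > HARD_LIMIT_MS; }
    }

    public static void main(String[] args) {
        Lease lease = new Lease("DFSClient_1");
        lease.paths.add("/user/a.txt");
        lease.renew();
        System.out.println("expired: " + lease.expired(System.currentTimeMillis()));
    }
}
```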
9. Replication Factors

```java
// The maximum number of replicas we should allow for a single block
private int maxReplication;
// How many outgoing replication streams a given node should have at one time
private int maxReplicationStreams;
// MIN_REPLICATION is how many copies we need in place or else we disallow the write
private int minReplication;
// Default replication
private int defaultReplication;
```
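A sketch of how the minimum/maximum bounds might gate a requested replication factor; the `verifyReplication` name and the error-message format are illustrative:

```java
// Sketch of bounds-checking a requested replication factor.
public class ReplicationCheckSketch {
    static int minReplication = 1;
    static int maxReplication = 512;

    static void verifyReplication(String src, short replication) {
        if (replication > maxReplication || replication < minReplication) {
            throw new IllegalArgumentException(
                "Requested replication " + replication + " for " + src +
                " is out of range [" + minReplication + ", " + maxReplication + "]");
        }
    }

    public static void main(String[] args) {
        verifyReplication("/user/a.txt", (short) 3);     // accepted
        try {
            verifyReplication("/user/b.txt", (short) 0); // rejected: below minimum
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```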
10. Heartbeat Intervals

```java
// heartbeatRecheckInterval is how often the namenode checks for expired datanodes
private long heartbeatRecheckInterval;
// heartbeatExpireInterval is how long the namenode waits for a datanode
// to report a heartbeat
private long heartbeatExpireInterval;
// replicationRecheckInterval is how often the namenode checks for new
// replication work
private long replicationRecheckInterval;
```
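A sketch of how the expiry window can be derived from the two intervals; the `2 x recheck + 10 x heartbeat` formula follows older FSNamesystem code, and the default values below are illustrative:

```java
// Sketch of deriving heartbeatExpireInterval from the configured intervals.
public class HeartbeatIntervalSketch {
    public static void main(String[] args) {
        long heartbeatIntervalMs = 3 * 1000L;             // DataNode heartbeat period
        long heartbeatRecheckIntervalMs = 5 * 60 * 1000L; // expiry-check period
        // A node is declared dead only after missing heartbeats for
        // two recheck rounds plus ten heartbeat periods.
        long heartbeatExpireIntervalMs =
            2 * heartbeatRecheckIntervalMs + 10 * heartbeatIntervalMs;
        System.out.println("expire after " + heartbeatExpireIntervalMs / 1000 + "s");
    }
}
```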
11. Network Topology

```java
// DataNode network topology
NetworkTopology clusterMap = new NetworkTopology();
// Resolves DataNode hosts to rack locations
private DNSToSwitchMapping dnsToSwitchMapping;
// For block replica placement
ReplicationTargetChooser replicator;
```
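A stand-in for what DNSToSwitchMapping provides: resolving hosts to rack paths so the placement policy can spread replicas across racks. The rack table below is hypothetical:

```java
import java.util.*;

// Stand-in for DNSToSwitchMapping: host -> rack path resolution.
public class TopologySketch {
    static Map<String, String> hostToRack = new HashMap<>();

    static String resolve(String host) {
        return hostToRack.getOrDefault(host, "/default-rack");
    }

    public static void main(String[] args) {
        hostToRack.put("dn-1.example.com", "/dc1/rack1");
        hostToRack.put("dn-2.example.com", "/dc1/rack2");
        // A rack-aware chooser would place the second replica on another rack.
        System.out.println(resolve("dn-1.example.com"));
        System.out.println(resolve("unknown-host")); // falls back to /default-rack
    }
}
```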
12. Threads

```java
// HeartbeatMonitor thread
Daemon hbthread = null;
// LeaseMonitor thread
public Daemon lmthread = null;
// SafeModeMonitor thread
Daemon smmthread = null;
// Replication thread
public Daemon replthread = null;
// ReplicationMonitor instance (the Runnable behind replthread; also
// tracks replication metrics)
private ReplicationMonitor replmon = null;
```
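A sketch of the daemon-thread pattern these fields follow: each monitor is a Runnable wrapped in a daemon thread, much as Hadoop's Daemon class does; the loop body here is simplified:

```java
// Sketch of the monitor-as-daemon-thread pattern.
public class DaemonSketch {
    static class HeartbeatMonitor implements Runnable {
        public void run() {
            while (!Thread.currentThread().isInterrupted()) {
                // a real monitor would scan for expired DataNodes here
                try { Thread.sleep(1000); }
                catch (InterruptedException e) { return; }
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread hbthread = new Thread(new HeartbeatMonitor(), "HeartbeatMonitor");
        hbthread.setDaemon(true); // the JVM may exit without waiting for it
        hbthread.start();
        Thread.sleep(100);
        System.out.println("monitor alive: " + hbthread.isAlive());
    }
}
```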