• The Motivation For Hadoop
· Problems with traditional large-scale systems
· Requirements for a new approach
• Hadoop Basic Concepts
· An Overview of Hadoop
· The Hadoop Distributed File System
· How MapReduce Works
· Anatomy of a Hadoop Cluster
· Other Hadoop Ecosystem Components
• Writing a MapReduce Program
· The MapReduce Flow
· Examining a Sample MapReduce Program
· Basic MapReduce API Concepts
· The Driver Code
· The Mapper
· The Reducer
· Hadoop’s Streaming API
· Using Eclipse for Rapid Development
• Integrating Hadoop Into The Workflow
· Relational Database Management Systems
· Storage Systems
· Creating workflows with Oozie
· Importing Data from RDBMSs With Sqoop
· Importing Real-Time Data with Flume
· Accessing HDFS Using FuseDFS and Hoop
• Delving Deeper Into The Hadoop API
· Using Combiners
· Using LocalJobRunner Mode for Faster Development
· Reducing Intermediate Data with Combiners
· The configure and close methods for MapReduce Setup and Teardown
· Writing Partitioners for Better Load Balancing
· Directly Accessing HDFS
· Using The Distributed Cache
• Using Hive and Pig
· Hive Basics
· Pig Basics
• Common MapReduce Algorithms
· Sorting and Searching
· Indexing
· Machine Learning with Mahout
· Term Frequency - Inverse Document Frequency
· Word Co-Occurrence
• Practical Development Tips and Techniques
· Testing with MRUnit
· Debugging MapReduce Code
· Using LocalJobRunner Mode for Easier Debugging
· Eclipse development techniques
· Retrieving Job Information with Counters
· Logging
· Splittable File Formats
· Determining the Optimal Number of Reducers
· Map-Only MapReduce Jobs
· Implementing Multiple Mappers using ChainMapper
• More Advanced MapReduce Programming
· Custom Writables and WritableComparables
· Saving Binary Data using SequenceFiles and Avro Files
· Creating InputFormats and OutputFormats
• Joining Data Sets in MapReduce Jobs
· Map-Side Joins
· The Secondary Sort
· Reduce-Side Joins
• Graph Manipulation in Hadoop
· Introduction to graph techniques
· Representing Graphs in Hadoop
· Implementing a sample algorithm: Single Source Shortest Path
• Creating Workflows with Oozie
· The Motivation for Oozie
· Oozie’s Workflow Definition Format