Plan:
namenode: crxy1, crxy2
datanode: crxy3, crxy4, crxy5
journalnode: crxy1, crxy2, crxy3
resourcemanager: crxy1
nodemanager: crxy3, crxy4, crxy5
zookeeper: crxy1, crxy2, crxy3
--------------------------------------------------
Basic setup
---Configure a static IP
After cloning the VM, ifconfig shows only eth1 and no eth0.
1 Edit /etc/udev/rules.d/70-persiste ...
Pseudo-distributed installation of Hadoop 2.2.0
- Blog category:
- hadoop2.x
Software install directory:
/opt/modules/
Installation:
0) Notes
1. OS: CentOS 6.4, 64-bit
2. Disable the firewall and SELinux
service iptables status
service iptables stop
chkconfig iptables off
vi /etc/sysconfig/selinux
...
MapReduce code example -- custom grouping
- Blog category:
- hadoop1.x
package group;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apa ...
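The excerpt above is cut off, but the core idea of a custom grouping comparator can be sketched outside Hadoop in plain Java (the class and field names below are illustrative, not from the original post): keys that differ only in a secondary field are treated as equal, so their values arrive in a single reduce call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch: a "grouping comparator" decides which sorted keys
// land in the same reduce() call by comparing only part of the key.
public class GroupSketch {
    // Composite key: {id, value} -- stands in for a custom Writable with two fields.
    static final Comparator<int[]> GROUP_BY_ID =
        (a, b) -> Integer.compare(a[0], b[0]); // ignore the second field

    // Partition a key-sorted list into groups, like reduce-side grouping does.
    static List<List<int[]>> group(List<int[]> sortedKeys) {
        List<List<int[]>> groups = new ArrayList<>();
        for (int[] k : sortedKeys) {
            if (groups.isEmpty()
                || GROUP_BY_ID.compare(groups.get(groups.size() - 1).get(0), k) != 0) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(k);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<int[]> keys = Arrays.asList(
            new int[]{1, 2}, new int[]{1, 5}, new int[]{2, 3});
        // (1,2) and (1,5) fall into one group because only the id is compared
        System.out.println(group(keys).size());
    }
}
```

In a real job this comparator logic lives in a `WritableComparator` subclass registered via `Job.setGroupingComparatorClass`.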
MapReduce code example -- secondary sort
- Blog category:
- hadoop1.x
package sort;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apac ...
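The excerpt is truncated, but the heart of a secondary sort is the composite key's compare method. Stripped of the Hadoop `WritableComparable` plumbing, it is a two-level comparison (class and field names here are illustrative, not taken from the original code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative sketch of a secondary-sort composite key: compare the
// primary field first, and break ties with the secondary field.
public class SecondarySortKey implements Comparable<SecondarySortKey> {
    final long first;
    final long second;

    SecondarySortKey(long first, long second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public int compareTo(SecondarySortKey o) {
        int c = Long.compare(first, o.first);               // primary field
        return c != 0 ? c : Long.compare(second, o.second); // tie-break on secondary
    }

    public static void main(String[] args) {
        List<SecondarySortKey> keys = new ArrayList<>(Arrays.asList(
            new SecondarySortKey(3, 3),
            new SecondarySortKey(1, 2),
            new SecondarySortKey(3, 1)));
        Collections.sort(keys);
        for (SecondarySortKey k : keys) {
            System.out.println(k.first + "\t" + k.second); // (1,2) (3,1) (3,3)
        }
    }
}
```

In the Hadoop version, the same logic goes into the key's `compareTo`, while `readFields`/`write` handle serialization.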
package join;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.LongWritable;
import org.apache.ha ...
package join;
import java.io.BufferedReader;
import java.io.FileReader;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.FSDataInputStream;
import org.ap ...
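Judging by the `DistributedCache` and `HashMap` imports above, this excerpt is a map-side join. The cached idea, minus the Hadoop plumbing, is: load the small table into an in-memory map once, then look up each record of the big table as it streams past, so no shuffle is needed. The table names and contents below are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative map-side join sketch: the small (dimension) table is held
// in memory, each big-table record is joined by a map lookup.
public class MapJoinSketch {
    static List<String> join(Map<String, String> small, List<String[]> big) {
        List<String> out = new ArrayList<>();
        for (String[] rec : big) {            // rec = {joinKey, payload}
            String dim = small.get(rec[0]);   // in-memory lookup, no shuffle
            if (dim != null) {
                out.add(rec[0] + "\t" + dim + "\t" + rec[1]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> cities = new HashMap<>();
        cities.put("010", "Beijing");         // the "cached" small table
        List<String[]> orders = Arrays.asList(
            new String[]{"010", "order-1"},
            new String[]{"021", "order-2"});
        System.out.println(join(cities, orders)); // only the matching row survives
    }
}
```

In the real job, the `small` map would be filled in the mapper's `setup()` by reading the file distributed through `DistributedCache`.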
MapReduce code example -- custom partitioner
- Blog category:
- hadoop1.x
package partitioner;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import ...
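The truncated code above defines a custom partitioner. A minimal sketch of the rule most custom partitioners start from (the default hash scheme: mask the sign bit, then take the remainder by the number of reduce tasks) can be shown without any Hadoop dependency:

```java
// Illustrative sketch of the hash-partition rule: every key maps to a
// stable partition in [0, numReduceTasks), so equal keys always reach
// the same reducer.
public class PartitionSketch {
    static int getPartition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int p = getPartition("hello", 3);
        // stable: the same key yields the same partition every call
        System.out.println(p == getPartition("hello", 3));
    }
}
```

A custom `Partitioner` subclass overrides this method to route keys by business rules (e.g. by phone-number prefix) instead of the raw hash.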
package combiner;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop. ...
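The `combiner` excerpt is cut off, but the concept is easy to sketch in plain Java: a combiner runs the reduce logic locally on each mapper's output, shrinking what gets shuffled. For a word-count-style sum it is just a local pre-aggregation (the data below is invented for the demo):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative combiner sketch: pre-sum (key, count) pairs locally before
// they would be shuffled to the reducers.
public class CombinerSketch {
    static Map<String, Long> combine(List<Map.Entry<String, Long>> mapOutput) {
        Map<String, Long> local = new TreeMap<>();
        for (Map.Entry<String, Long> e : mapOutput) {
            local.merge(e.getKey(), e.getValue(), Long::sum); // local pre-sum
        }
        return local;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> mapOutput = Arrays.asList(
            Map.entry("hello", 1L), Map.entry("world", 1L), Map.entry("hello", 1L));
        // three pairs collapse to two before the shuffle
        System.out.println(combine(mapOutput));
    }
}
```

This only works because summing is associative and commutative; a combiner must never change the final result, only reduce intermediate volume.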
package counter;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apac ...
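The `counter` excerpt is also truncated. Conceptually, Hadoop counters are named tallies that tasks bump while processing records and that the driver reads after the job; outside Hadoop the same idea is an enum-keyed count (the enum name and "bad record" rule below are invented for the demo):

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;

// Illustrative counter sketch: an enum-named tally bumped per record,
// mirroring context.getCounter(...).increment(1) in a mapper.
public class CounterSketch {
    enum MyCounters { BAD_RECORDS }

    static long countBad(List<String> lines) {
        EnumMap<MyCounters, Long> counters = new EnumMap<>(MyCounters.class);
        counters.put(MyCounters.BAD_RECORDS, 0L);
        for (String line : lines) {
            if (line.isEmpty()) {                          // demo "bad" rule
                counters.merge(MyCounters.BAD_RECORDS, 1L, Long::sum);
            }
        }
        return counters.get(MyCounters.BAD_RECORDS);
    }

    public static void main(String[] args) {
        System.out.println(countBad(Arrays.asList("a", "", "b"))); // one empty line
    }
}
```

In a real job the tally survives across tasks because the framework aggregates each task's counters for the driver.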
MapReduce code example -- the old API
- Blog category:
- hadoop1.x
package old;
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.File ...
package cmd;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import o ...
Setting up a Hadoop 1.x environment
- Blog category:
- hadoop1.x
1.1 Local mode: runs without HDFS
1.2 Pseudo-distributed mode: runs the whole Hadoop cluster on a single node
1.3 Cluster mode: used in real production; the Hadoop daemons run across many nodes of a cluster
2. Setting up a cluster development environment
On Windows, add the IP-to-hostname mappings in C:\Windows\System32\drivers\etc\hosts
2.1 Configure a static IP
(1) Edit /etc/sysconfig/network-scripts/ifcfg-eth2 with vi
BOOTPROTO=static
IPADDR=192.168.1.191
NETMASK=255.255.255.0
GA ...
1. Loading data into a table
load data local inpath '${env:HOME}/california-employees'
overwrite into table employees
partition (country='US', state='CA');
2. Inserting data into a table from a query
insert overwrite/into table employees
partition (country='US', state='OR')
select * from staged_employees se
where se.cnty='US' and se.st='OR';
Full-table sca ...
Hive CLI variables and properties
- Blog category:
- hive
set       -- prints all variables in the hivevar, hiveconf, system, and env namespaces
set -v    -- additionally prints Hadoop-defined properties, e.g. HDFS and MapReduce properties
$ hive --define foo=bar;  |  $ hive --hivevar foo=bar;
hive> set foo;
foo=bar
hive> set hivevar:foo;
hivevar:foo=bar
// show the current database name in the prompt
$ hive --hiveconf hive.cli.print.current.db=true
hive (default) > set hiv ...
1. Unpack the archive
tar -zxvf hive-0.9.0.tar.gz
2. In $HIVE_HOME/conf, copy the config files and drop the .template suffix, e.g.
cp hive-default.xml.template hive-site.xml
3. In $HIVE_HOME/bin, edit hive-config.sh and add the following three lines
export JAVA_HOME=/usr/local/jdk
export HIVE_HOME=/usr/local/hive
export HADOOP_HOME=/usr/local/hadoop
4. Switch Hive's metastore to MySQ ...