2。hbase CRUD--Lease in hbase
After several days of reading material and writing code, I have finally gotten a handle on this thing.
```java
package linhon.crud;

import java.util.Date;
import java.util.Map.Entry;
import java.util.NavigableMap;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

/***
 * test hbase crud operations
 * @author leibnitz
 * @create jan,12,11
 */
public class TestHbaseCrud {

    /**
     * Inserts a new row if the row key does not exist; otherwise updates the
     * given column(s). These operations are atomic at the row level.
     */
    public static void add(String tableName, int rowkey, String content, Date addTime) throws Exception {
        HBaseConfiguration hbaseConf = new HBaseConfiguration();
        HTable htable = new HTable(hbaseConf, tableName);
        htable.setAutoFlush(false);
        htable.setWriteBufferSize(1024 * 5);

        byte[] rowKey = Bytes.toBytes(rowkey);
        Put put = new Put(rowKey);
        if (content != null)
            put.add(Bytes.toBytes("info"), Bytes.toBytes("content"), addTime.getTime(), Bytes.toBytes(content));
        if (addTime != null) // more than one column can be added in the same Put
            put.add(Bytes.toBytes("info"), Bytes.toBytes("add_time"), addTime.getTime(), Bytes.toBytes(addTime.getTime()));
        htable.put(put);
        htable.flushCommits();
        htable.close(); // invokes flushCommits() as well
    }

    /**
     * add a column (member) to the specified row
     */
    public static void addColumnOnly(String tableName, int rowkey, String family, String column) throws Exception {
        HBaseConfiguration hbaseConf = new HBaseConfiguration();
        HTable htable = new HTable(hbaseConf, tableName);
        htable.setAutoFlush(false);
        htable.setWriteBufferSize(1024 * 5);

        byte[] rowKey = Bytes.toBytes(rowkey);
        Put put = new Put(rowKey);
        put.add(Bytes.toBytes(family), Bytes.toBytes(column), Bytes.toBytes(""));
        htable.put(put);
        htable.flushCommits();
        htable.close(); // invokes flushCommits() as well
    }

    public static void query(String tblName, int rowKey, String family, String... columns) throws Exception {
        HBaseConfiguration hconf = new HBaseConfiguration();
        HTable htbl = new HTable(hconf, tblName);
        Scan s = new Scan();
        ResultScanner scan = htbl.getScanner(s); // add a filter param if necessary
        Result rst = null;
        while ((rst = scan.next()) != null) { // scan by row
            int row = Bytes.toInt(rst.getRow());
            System.out.println("row:" + row);
            for (String col : columns) { // NOTE: rst.list() shows all columns
                if (col.contains("time") || col.contains("date")) {
                    System.out.printf(" %s:%2$tF %2$tH:%2$tM:%2$tS ", col,
                            Bytes.toLong(rst.getValue(Bytes.toBytes(family), Bytes.toBytes(col))));
                } else {
                    String content = Bytes.toString(rst.getValue(Bytes.toBytes(family), Bytes.toBytes(col)));
                    System.out.printf(" %s:%s ", col, content);
                }
                byte[] key = Bytes.toBytes(rowKey);
                long ts = 1295977940837l; // also tried: 1294813460620l, 1295977421536l, 1295976774855l, 1295969908063l, 1294813460625l
                // note: the second param of addColumn is the family, not the column
                // String qualifier = family + KeyValue.COLUMN_FAMILY_DELIMITER + col;
                final Get g = new Get(key);
                g.addColumn(Bytes.toBytes(family), Bytes.toBytes(col));
                g.setTimeStamp(ts); // query by time range; this means the range [ts, ts+1)
                boolean b = htbl.exists(g);
                System.out.println(" has versions:" + ts + "," + b);
            }
        }
        scan.close();
        htbl.close();
    }

    /**
     * test retrieval by versions
     * @param maxVersions the table was created to keep only two versions, so a
     *        value greater than 2 will still never print three versions.
     */
    public static void queryByMaxVersions(String tblName, int rowKey, String family, int maxVersions,
            String... columns) throws Exception {
        HBaseConfiguration hconf = new HBaseConfiguration();
        HTable htbl = new HTable(hconf, tblName);
        final Get g = new Get(Bytes.toBytes(rowKey));
        if (columns == null || columns.length == 0)
            g.addColumn(Bytes.toBytes(family));
        else {
            for (String col : columns) {
                g.addColumn(Bytes.toBytes(family), Bytes.toBytes(col));
            }
        }
        g.setMaxVersions(maxVersions);
        Result rst = htbl.get(g);
        // System.out.println(rst.getMap());
        for (Entry<byte[], NavigableMap<byte[], NavigableMap<Long, byte[]>>> entry : rst.getMap().entrySet()) {
            System.out.println("family: " + Bytes.toString(entry.getKey()));
            for (Entry<byte[], NavigableMap<Long, byte[]>> entry2 : entry.getValue().entrySet()) {
                String col = Bytes.toString(entry2.getKey());
                System.out.println("  qualifier: " + col);
                for (Entry<Long, byte[]> entry3 : entry2.getValue().entrySet()) {
                    if (col.contains("time") || col.contains("date")) {
                        System.out.println("    version: " + entry3.getKey() + ",value:" + Bytes.toLong(entry3.getValue()));
                    } else {
                        System.out.println("    version: " + entry3.getKey() + ",value:" + Bytes.toString(entry3.getValue()));
                    }
                }
            }
        }
        // When all columns are requested and maxVersions >= 2, the output is:
        // family: info
        //   qualifier: add_time
        //     version: 1295977940837,value:1295977940837   newest version printed first
        //     version: 1295977489609,value:1295977488769   older version
        //   qualifier: content
        //     version: 1295977940837,value:linhon          same ordering
        //     version: 1295976774855,value:bye,linhon
        htbl.close();
    }

    // see add()
    public static void modify() {
    }

    /**
     * A delete can be scoped by:
     * 1. family, or family + column
     * 2. timestamp range
     * 3. regexp
     */
    public static void deleteColumnData(String tblName, int rowKey, String family, String column, long timestamp)
            throws Exception {
        HBaseConfiguration hconf = new HBaseConfiguration();
        HTable htbl = new HTable(hconf, tblName);
        Delete dlt = new Delete(Bytes.toBytes(rowKey));
        dlt.deleteColumn(Bytes.toBytes(family), Bytes.toBytes(column), timestamp);
        htbl.delete(dlt);
        htbl.flushCommits();
        htbl.close();
    }

    /**
     * delete the whole column family (and its data)
     */
    public static void deleteColumnFamily(String tblName, String family, String column) throws Exception {
        HBaseConfiguration hconf = new HBaseConfiguration();
        HBaseAdmin admin = new HBaseAdmin(hconf);
        // disabling the table first is a must
        if (admin.isTableEnabled(tblName))
            admin.disableTable(tblName);
        // the columnName param may be any combination of family, ':' and qualifier;
        // the qualifier is optional
        admin.deleteColumn(tblName, family /*+ ":" + column*/);
        // admin.enableTable(tblName); // this is a trick
        admin.flush(tblName);
    }

    public static void main(String[] args) throws Exception {
        // add("test_user", 1, "linhon", new Date());
        // add("test_user", 1, "hello,linhon", new Date());
        // add("test_user", 1, "bye,linhon", new Date());
        // add("test_user", 1, null, new Date());
        // add("test_user", 1, null, new Date());
        // System.out.println(System.currentTimeMillis());
        // query("test_user", 1, "info", new String[]{"content", "add_time"});
        // queryByMaxVersions("test_user", 1, "info", 3, new String[]{"content", "add_time"});
        // queryByMaxVersions("test_user", 1, "info", 3, new String[]{"content"/*, "add_time"*/});
        // addColumnOnly("test_user", 1, "info", "age");
        // deleteColumnData("test_user", 1, "info", "age", 1296030610746l);
        // deleteColumnFamily("test_user", "info", "age");
        addColumnOnly("test_user2", 1, "num", "age");
        // deleteColumnFamily("test_user2", "num", "age"); // test table
    }
}
```

My thinking is this: since HBase offers horizontal partitioning (at least the books say so; I have only run it in pseudo-distributed mode, never on a real cluster, so I cannot confirm it), unstructured storage, and versioning, it should not be used only for simple CRUD as if it were an ordinary table, so I dug out some of its newer feature points.

Points to note:

1. In the old API, exists(final byte[] row, final byte[] column, long timestamp) interprets the timestamp as the time range from 0 up to timestamp; the newer exists(Get) can target a specific timestamp range rather than one that always starts at 0.

```
hbase(main):014:0> scan 'test_user'
ROW                COLUMN+CELL
 \x00\x00\x00\x01  column=info:add_time, timestamp=1294813460625, value=\x00\x00\x01-x\xE5uw
 \x00\x00\x00\x01  column=info:content, timestamp=1295976774855, value=bye,linhon
```

2. In Put or Get, addColumn(column) with a single parameter means the column is in the old format, i.e. <family:column>.
3. HTable modifies and queries table data; HBaseAdmin operates on table structure.
4. A scan in the shell prints only the latest version's value for each cell.
5. When adding data, the row key must be specified.
6. When adding a new column to a table that already holds data, HTable requires a row key, and the column is added only to that row; the other rows will not have this column.
7. deleteColumn(tbl, col), when col is family + ":" + column, deletes the entire column family.
8. HBase cannot add or delete column families dynamically (the table must be disabled first), and deletes operate only at the column-family level; an individual column member cannot be deleted on its own.
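The version semantics in points 1 and 4 can be illustrated without a cluster. Below is a minimal, self-contained sketch that simulates one cell's versions with a plain NavigableMap, the same newest-first ordering HBase uses; the class and method names (CellVersions, existsUpTo, existsAt) are my own illustrations, not HBase API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Cluster-free simulation of HBase cell-version lookups (illustrative names). */
public class CellVersions {
    // all versions of one cell, keyed by timestamp (ascending)
    private final NavigableMap<Long, String> versions = new TreeMap<>();

    void put(long ts, String value) {
        versions.put(ts, value);
    }

    /** Latest n values, newest first — like Get.setMaxVersions(n). */
    List<String> getMaxVersions(int n) {
        List<String> out = new ArrayList<>();
        for (String v : versions.descendingMap().values()) {
            if (out.size() == n) break;
            out.add(v);
        }
        return out;
    }

    /** Old-style exists(row, column, ts): any version in the range [0, ts]? */
    boolean existsUpTo(long ts) {
        return !versions.headMap(ts, true).isEmpty();
    }

    /** Get.setTimeStamp(ts) semantics: a version in [ts, ts+1), i.e. exactly ts. */
    boolean existsAt(long ts) {
        return versions.containsKey(ts);
    }

    public static void main(String[] args) {
        CellVersions content = new CellVersions();
        content.put(1295976774855L, "bye,linhon");
        content.put(1295977940837L, "linhon");
        // newest version comes first, like the get/scan output in the post
        System.out.println(content.getMaxVersions(2));            // [linhon, bye,linhon]
        // only the latest value, like a shell scan
        System.out.println(content.getMaxVersions(1));            // [linhon]
        System.out.println(content.existsUpTo(1295976774855L));   // true  (range [0, ts])
        System.out.println(content.existsAt(1295976774856L));     // false (exact ts only)
    }
}
```

This makes the difference in point 1 concrete: existsUpTo matches any version at or before the timestamp, while existsAt matches only a version written at exactly that timestamp.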