Collecting Hadoop's logs

http://blog.cloudera.com/blog/2008/11/configuring-and-using-scribe-for-hadoop-log-collection/
http://www.myhowto.org/java/2013/01/20/collecting-diagnostic-information-from-mapreduce-jobs-in-hadoop/

2013-07-05 16:28
Views 496
Comments (0)
Category: Open Source Software

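First, look at the explanation in the official API documentation: -------- intern public String intern() Returns a canonical representation for the string object. A pool of strings, initially empty, is maintained privately by the class String. When the intern method is invoked, if the pool already contains a string equal to this String object (as determined by the equals(Object) method), the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned. It follows that for any two strings s and t, if and only if s.equals(t) is t ...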
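The pool behavior quoted above can be illustrated with a minimal Java sketch. The class name InternDemo and the helper sameCanonical are hypothetical names chosen for this illustration; only String.intern() and the == comparison come from the Java API itself.

```java
final class InternDemo {
    // True when both strings resolve to the same pooled (canonical) object.
    // Per the String.intern() Javadoc this holds exactly when s.equals(t).
    static boolean sameCanonical(String s, String t) {
        return s.intern() == t.intern();
    }

    public static void main(String[] args) {
        String literal = "hadoop";            // string literals are already in the pool
        String built = new String("hadoop");  // a distinct object with equal content

        System.out.println(literal == built);              // false: different objects
        System.out.println(literal == built.intern());     // true: intern() returns the pooled string
        System.out.println(sameCanonical(literal, built)); // true: equal content, same canonical object
    }
}
```

Comparing interned references with == is only a demonstration of the pool; for ordinary equality checks, equals remains the usual choice.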

2013-06-19 08:08
Views 571
Comments (0)
Category: Programming Languages

Collecting log4j logs into HDFS with Flume

1. The architecture used to collect data from the web server cluster 2. Start a Flume agent on each web server (Flume 1.3.1: http://flume.apache.org/download.html) with the command: ./bin/flume-ng agent --conf-file ./conf/flume.conf --name a1 -Dflume.root.logger=INFO,console The flume-conf file is as follows: # Name the components on this agent a1.sources = r1 a1.sinks ...

2013-05-29 17:24
Views 4946
Comments (0)
Category: Internet

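Installing the Maven plugin in Eclipse (repost)

1. First install the Subclipse plugin, which provides SVN support: svn - http://subclipse.tigris.org/update_1.6.x Mine is grayed out, which means I have already installed it; the screenshot is only for illustration, so I will not install it again. Installing these items is enough; more is unnecessary. Exceptions may appear during installation; ignore them. According to Subclipse, they occur because the plugin has not been marked by Eclipse, which has no real effect on installation or use, so just continue. After installation, restart Eclipse and click the button at the top right; if the SVN repository exploring item appears, the SVN plugin for Eclipse has been installed. 2. Then install the Maven plugin: m2e - http://m2eclipse.son ...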

2013-05-22 11:22
Views 654
Comments (0)
Category: Open Source Software

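Configuring a Hadoop Eclipse debugging environment on Mac OS

1. Configure Hadoop: Unpack the downloaded Hadoop archive, find the conf directory, open core-site.xml, and change it as shown below: XML code <?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>  <confi ...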

2013-05-20 15:13
Views 5149
Comments (0)
Category: Internet

(Repost) Building the Hadoop Eclipse plugin with Ant

Reposted from http://xiaoruoen.blog.51cto.com/4828946/872274 Go to %Hadoop_HOME%\src\contrib\, edit build-contrib.xml, and add <property name="version" value="1.0.3"/> <property name="eclipse.home" location="D:/soft/eclipse-jee-indigo-SR1-w ...

2013-05-20 15:00
Views 631
Comments (0)
Category: Internet

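[Hadoop in Action] Configuring a pseudo-distributed Hadoop environment on Mac OS

Big data has become very popular lately, and Hadoop is a powerful tool for analyzing it; since my work also calls for it, I have recently been studying Hadoop. Studying is one thing, but it takes practice, and practice requires an environment first. The methods in the textbooks are all configured on Linux; since Mac is close to Linux, I practiced on Mac OS. The Mac OS version is 10.8.1, and what I configured is a single-machine pseudo-distributed environment, with the goal of learning to write Hadoop programs; I am not interested in setting up a Hadoop cluster for now. The main reference is chapter 2 of "Hadoop in Action", pp. 18-21. For the download of "Hadoop in Action" (Chinese edition + English text edition + source code, PDF), see http://www.linuxidc.com/Linu ...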

2013-05-20 12:46
Views 1474
Comments (0)
Category: Internet

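Top 10 data mining algorithms (1): PageRank

1. Preface: This series of articles covers the top 10 data mining algorithms selected in 2006 (see Figure 1). The articles focus on the origin of each algorithm and its main ideas, and do not cover concrete implementations. If you find an error in the text, please point it out so we can discuss it together. ...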

2013-05-17 14:37
Views 1345
Comments (0)
Category: Internet

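B Tree, B-Tree, B+ Tree, and B* Tree

The "B tree" here is the binary search tree: 1. Every non-leaf node has at most two children (Left and Right); 2. Every node stores one key; 3. A non-leaf node's left pointer points to the subtree of keys smaller than its own key, and its right pointer points to those greater than its key ...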
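The three properties listed in this excerpt can be sketched in Java as follows. This is a minimal illustration, not the original post's code: the names BstSketch, Node, insert, and contains are hypothetical, keys are assumed to be ints, and duplicate keys are simply ignored.

```java
final class BstSketch {
    // Each node stores exactly one key and has at most two children.
    static final class Node {
        final int key;
        Node left;   // subtree of keys smaller than this key
        Node right;  // subtree of keys greater than this key
        Node(int key) { this.key = key; }
    }

    // Inserts key into the tree rooted at root and returns the (possibly new) root.
    static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        if (key < root.key) {
            root.left = insert(root.left, key);
        } else if (key > root.key) {
            root.right = insert(root.right, key);
        }
        return root; // an equal key is already present, so nothing changes
    }

    // Walks left or right at each node according to the ordering property.
    static boolean contains(Node root, int key) {
        Node current = root;
        while (current != null) {
            if (key == current.key) return true;
            current = key < current.key ? current.left : current.right;
        }
        return false;
    }
}
```

A tree is built by starting from a null root and calling insert repeatedly, for example `root = BstSketch.insert(root, 8)`; contains then answers membership queries without visiting subtrees that cannot hold the key.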

2013-05-07 14:25
Views 690
Comments (0)
Category: Databases

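Introduction to HBase technology

http://www.searchtb.com/2011/01/understanding-hbase.html Introduction to HBase: HBase (Hadoop Database) is a highly reliable, high-performance, column-oriented, scalable distributed storage system; with HBase, a large-scale structured storage cluster can be built on inexpensive PC servers. HBase is Google Bigtable's ...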

2013-05-03 16:31
Views 523
Comments (0)
Category: Open Source Software

Hbase/MultipleMasters - Hadoop Wiki

http://wiki.apache.org/hadoop/Hbase/MultipleMasters This document is still a draft. Since version 0.20.0, HBase supports multiple Masters to provide higher availability. It works in the same way that Bigtable does, as explained in the 2006 paper. This page contains the information you need to set ...

2013-05-03 10:49
Views 682
Comments (0)
Category: Open Source Software

Setup Multi Hbase master on Hadoop Cluster

http://2hei.net/setup-multi-hbase-master-on-hadoop-cluster.html Set up multiple HBase masters on a Hadoop cluster to avoid a single point of failure. When the active master fails or stays down for longer than the expected timeout, a backup master becomes active and takes over the master role; see the value ...

2013-05-03 10:46
Views 995
Comments (0)
Category: Open Source Software

HBase HMaster Architecture

http://blog.zahoor.in/2012/08/hbase-hmaster-architecture/ HBase architecture follows the traditional master-slave model, where a master makes the decisions and one or more slaves do the real work. In HBase, the master is called HMaster and slaves are called HRegionServers (yes ...

2013-05-03 10:44
Views 946
Comments (0)
Category: Open Source Software

Using the libjars option with Hadoop

http://grepalex.com/2013/02/25/hadoop-libjars/ When working with MapReduce, one of the challenges encountered early on is determining how to make your third-party JARs available to the map and reduce tasks. One common approach is to create a fat jar, which is a JAR that contains your c ...

2013-05-03 10:43
Views 727
Comments (0)
Category: Open Source Software

Hadoop interview questions (repost)

Q1. Name the most common InputFormats defined in Hadoop. Which one is the default? The following 3 are the most common InputFormats defined in Hadoop - TextInputFormat - KeyValueTextInputFormat - SequenceFileInputFormat Q2. What is the difference between the TextInputFormat and KeyValueTextInputFormat classes? TextIn ...

2013-04-08 15:14
Views 721
Comments (0)
Category: Internet
