A Simple TinkerPop + JanusGraph + HBase Setup
0. Environment
192.168.1.2 master: JDK 1.8 installed and configured; a working Hadoop + HBase + ZooKeeper cluster is available. ... Other cluster details are up to you.
Note: the host IPs, hostnames, and installation directories in this document are for reference only; adjust them to your actual environment.
Apache TinkerPop™ is a graph computing framework for both graph databases (OLTP) and graph analytic systems (OLAP).
JanusGraph is a scalable graph database optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-machine cluster. JanusGraph is a transactional database that can support thousands of concurrent users executing complex graph traversals in real time.
1. Setting up the Hadoop + HBase + ZooKeeper cluster
This setup is not especially difficult, but explaining it properly would take considerable time, so it is omitted here.
2. Creating a directory and downloading TinkerPop
According to <<Table B.1. Version Compatibility Matrix>> in the official JanusGraph documentation, this setup uses TinkerPop 3.2.7 and JanusGraph 0.2.0. See http://docs.janusgraph.org/latest/version-compat.html.
1. Create the directory
[root@master ~]# mkdir -p /usr/local/program
[root@master ~]# cd /usr/local/program/
[root@master program]# pwd
/usr/local/program
2. Download TinkerPop
[root@master program]# wget http://www-eu.apache.org/dist/tinkerpop/3.2.7/apache-tinkerpop-gremlin-console-3.2.7-bin.zip    # console; see http://tinkerpop.apache.org/
[root@master program]# wget http://mirror.bit.edu.cn/apache/tinkerpop/3.2.7/apache-tinkerpop-gremlin-server-3.2.7-bin.zip    # server; see http://tinkerpop.apache.org/
3. Unzip TinkerPop
[root@master program]# unzip apache-tinkerpop-gremlin-server-3.2.7-bin.zip
[root@master program]# unzip apache-tinkerpop-gremlin-console-3.2.7-bin.zip
[root@master program]# ls -1
apache-tinkerpop-gremlin-console-3.2.7
apache-tinkerpop-gremlin-server-3.2.7
apache-tinkerpop-gremlin-console-3.2.7-bin.zip
apache-tinkerpop-gremlin-server-3.2.7-bin.zip
3. Installing the JanusGraph dependencies into TinkerPop Server
Note: Section 4 describes how to configure grapeConfig.xml; if the download fails, refer to Section 4.
1. Enter apache-tinkerpop-gremlin-server-3.2.7 and download the dependencies
Install with: bin/gremlin-server.sh -i org.janusgraph janusgraph-all $VERSION. Here we install JanusGraph 0.2.0.
[root@master apache-tinkerpop-gremlin-server-3.2.7]# bin/gremlin-server.sh -i org.janusgraph janusgraph-all 0.2.0
2. Important note
The following is quoted from section 7.4.2. Using TinkerPop Gremlin Server with JanusGraph of the official JanusGraph documentation:
The above command uses Groovy Grape and if it is not configured properly download errors may ensue. Please refer to this section of the TinkerPop documentation for more information around setting up ~/.groovy/grapeConfig.xml.
See http://docs.janusgraph.org/latest/server.html.
The TinkerPop documentation section referenced above: http://tinkerpop.apache.org/docs/3.2.6/reference/#gremlin-applications.
4. Modifying Groovy Grape to avoid dependency download failures
Note: Windows directory: C:\Users\[User_Name]\.groovy; Linux directory: ~/.groovy
Make the following changes:
1. Create the grapeConfig.xml file in the directory above; if the directory does not exist, create it first.
2. Set the file content to:
<ivysettings>
  <settings defaultResolver="downloadGrapes"/>
  <property name="m2-pattern" value="${user.home}/.m2/repository/org/apache/tinkerpop/[module]/[revision]/[module]-[revision](-[classifier]).[ext]"/>
  <property name="m2-pattern-ivy" value="${user.home}/.m2/repository/org/apache/tinkerpop/[module]/[revision]/[module]-[revision](-[classifier]).pom"/>
  <caches>
    <cache name="nocache" useOrigin="true"/>
  </caches>
  <resolvers>
    <chain name="downloadGrapes">
      <filesystem name="local-maven2" checkmodified="true" changingPattern=".*" changingMatcher="regexp" m2compatible="true" cache="nocache">
        <artifact pattern="${m2-pattern}"/>
        <ivy pattern="${m2-pattern-ivy}"/>
      </filesystem>
      <filesystem name="cachedGrapes">
        <ivy pattern="${user.home}/.groovy/grapes/[organisation]/[module]/ivy-[revision].xml"/>
        <artifact pattern="${user.home}/.groovy/grapes/[organisation]/[module]/[type]s/[artifact]-[revision].[ext]"/>
      </filesystem>
      <ibiblio name="ibiblio" m2compatible="true"/>
      <ibiblio name="local" root="file:${user.home}/.m2/repository/" m2compatible="true"/>
      <ibiblio name="oracle" root="http://download.oracle.com/maven" m2compatible="true"/>
    </chain>
  </resolvers>
</ivysettings>
3. Re-run the steps in Section 3.
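Hand-edited XML breaks easily, so it is worth verifying grapeConfig.xml is well-formed before re-running the install. The sketch below is a plain Python helper, not part of the original walkthrough; the default path is an assumption based on the Linux Grape location mentioned above.

```python
import sys
import xml.etree.ElementTree as ET
from pathlib import Path

def check_grape_config(path):
    """Return True if the file exists, parses as XML, and has <ivysettings> as its root."""
    p = Path(path).expanduser()
    if not p.is_file():
        print(f"{p}: not found")
        return False
    try:
        root = ET.parse(p).getroot()
    except ET.ParseError as e:
        print(f"{p}: malformed XML ({e})")
        return False
    print(f"{p}: OK (root element <{root.tag}>)")
    return root.tag == "ivysettings"

if __name__ == "__main__":
    # Default Grape location on Linux; pass another path as the first argument.
    check_grape_config(sys.argv[1] if len(sys.argv) > 1 else "~/.groovy/grapeConfig.xml")
```

If this reports a parse error, fix the file before retrying `bin/gremlin-server.sh -i`.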
5. Configuration (on the master node)
1. Create janusgraph-hbase-server.properties in the ${TINKERPOP_HOME}/conf/ directory with the following content:
gremlin.graph=org.janusgraph.core.JanusGraphFactory
# Storage backend: HBase.
storage.backend=hbase
# ZooKeeper address; change to your actual hosts (comma-separated for multiple).
storage.hostname=localhost

# The cache settings below are optional and do not affect the remaining steps.
cache.db-cache=true
cache.db-cache-clean-wait=20
cache.db-cache-time=180000
cache.db-cache-size=0.5

# The Elasticsearch index settings below are optional and do not affect the remaining steps.
# Index backend.
index.search.backend=elasticsearch
# Elasticsearch host(s), comma-separated for multiple.
index.search.hostname=localhost
# Elasticsearch port.
index.search.port=9200
index.search.elasticsearch.client-only=false
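The file above is a plain key=value properties file. As a sanity check, the minimal sketch below (a plain Python helper, not from the original guide) parses such text and confirms the three keys the rest of this walkthrough depends on are present; the key names are taken directly from the file above.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and full-line # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props

# Keys the remaining steps rely on (taken from the properties file above).
REQUIRED = ("gremlin.graph", "storage.backend", "storage.hostname")

def missing_keys(text):
    """Return the required keys that are absent from the properties text."""
    props = parse_properties(text)
    return [k for k in REQUIRED if k not in props]
```

Run `missing_keys(open("conf/janusgraph-hbase-server.properties").read())`; an empty list means the required keys are in place.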
2. Create janusgraph-gremlin-server.yaml in the ${TINKERPOP_HOME}/conf/ directory with the following content:
host: 0.0.0.0
port: 8182
scriptEvaluationTimeout: 300000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
  graph: conf/janusgraph-hbase-server.properties }
plugins:
  - janusgraph.imports
scriptEngines: {
  gremlin-groovy: {
    imports: [java.lang.Math],
    staticImports: [java.lang.Math.PI],
    scripts: [scripts/empty-sample.groovy]}}
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistryV1d0] }}
processors:
  - { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
  - { className: org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor, config: { cacheExpirationTime: 600000, cacheMaxSize: 1000 }}
metrics: {
  consoleReporter: {enabled: true, interval: 180000},
  csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: true},
  slf4jReporter: {enabled: true, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}}
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
3. Create empty-sample.groovy in the ${TINKERPOP_HOME}/scripts/ directory (this file exists by default and can be used as-is) with the following content:
def globals = [:]
globals << [g : graph.traversal()]
6. Running TinkerPop Server
[root@master apache-tinkerpop-gremlin-server-3.2.7]# bin/gremlin-server.sh conf/janusgraph-gremlin-server.yaml
1. When output like the following appears, the server started successfully:
......
[INFO] GremlinServer - Executing start up LifeCycleHook
[INFO] Logger$info - Executed once at startup of Gremlin Server.
[INFO] AbstractChannelizer - Configured application/vnd.gremlin-v1.0+gryo with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0
[WARN] AbstractChannelizer - The org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0 serialization class is deprecated.
[INFO] AbstractChannelizer - Configured application/vnd.gremlin-v1.0+gryo-lite with org.apache.tinkerpop.gremlin.driver.ser.GryoLiteMessageSerializerV1d0
[INFO] AbstractChannelizer - Configured application/vnd.gremlin-v1.0+gryo-stringd with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0
[INFO] AbstractChannelizer - Configured application/vnd.gremlin-v1.0+json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0
[INFO] AbstractChannelizer - Configured application/vnd.gremlin-v2.0+json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV2d0
[INFO] AbstractChannelizer - Configured application/json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0
[INFO] GremlinServer$1 - Gremlin Server configured with worker thread pool of 1, gremlin pool of 32 and boss thread pool of 1.
[INFO] GremlinServer$1 - Channel started at port 8182.
2. If the following error appears, edit the YAML file and remove the \t characters (YAML forbids tabs for indentation):
[ERROR] GremlinServer - Configuration file at conf/janusgraph-gremlin-server.yaml could not be found or parsed properly. [while scanning for the next token
found character '\t(TAB)' that cannot start any token. (Do not use \t(TAB) for indentation)
 in 'reader', line 6, column 1:
    graph: conf/janusgraph-hbase-se ...
    ^
]
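Since YAML rejects tab indentation, a small check like the sketch below (a plain Python helper, not part of the original guide) can locate the offending lines before restarting the server:

```python
def find_tab_indented_lines(text):
    """Return (line_number, line) pairs for lines whose leading whitespace contains a tab."""
    offenders = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Slice off the leading whitespace and look for tabs in it.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            offenders.append((lineno, line))
    return offenders
```

Run it over the contents of conf/janusgraph-gremlin-server.yaml and replace the tabs on any reported lines with spaces.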
7. Testing via the TinkerPop Console
1. Enter the apache-tinkerpop-gremlin-console-3.2.7 directory and run:
[root@master apache-tinkerpop-gremlin-console-3.2.7]# ./bin/gremlin.sh
Feb 08, 2018 4:57:21 PM java.util.prefs.FileSystemPreferences$1 run
INFO: Created user preferences directory.

         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.tinkergraph
gremlin> :remote connect tinkerpop.server conf/remote.yaml
==>Configured localhost/127.0.0.1:8182
gremlin> :> g.V().count()
==>0
If everything runs without errors, the deployment is complete and you can proceed from here.