Installing and using Pig on Linux


0. Prerequisites: Hadoop servers

10.156.50.35 yanfabu2-35.base.app.dev.yf zk1  hadoop1 master1 master
10.156.50.36 yanfabu2-36.base.app.dev.yf zk2  hadoop2 master2
10.156.50.37 yanfabu2-37.base.app.dev.yf zk3  hadoop3 slaver1

 

2. Extract Pig

 tar xf pig-0.17.0.tar.gz 
 mv pig-0.17.0 pig

vim ~/.bash_profile

export PIG_HOME=/home/zkkafka/pig
export PATH=$PATH:$PIG_HOME/bin

source ~/.bash_profile

scp -r ~/.bash_profile  zkkafka@10.156.50.36:/home/zkkafka/
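
After sourcing the profile it can help to confirm that the two exports took effect. A minimal sketch, assuming a POSIX shell; `check_pig_env` is a hypothetical helper name, not something shipped with Pig:

```shell
# check_pig_env: report whether PIG_HOME points at a Pig install and
# whether its bin directory is on PATH. Reads only the environment.
check_pig_env() {
  if [ -z "${PIG_HOME:-}" ]; then
    echo "PIG_HOME is not set"
    return 1
  fi
  if [ ! -x "$PIG_HOME/bin/pig" ]; then
    echo "no executable found at $PIG_HOME/bin/pig"
    return 1
  fi
  case ":$PATH:" in
    *":$PIG_HOME/bin:"*) echo "ok: $PIG_HOME/bin is on PATH" ;;
    *) echo "$PIG_HOME/bin is not on PATH"; return 1 ;;
  esac
}
```

Run `check_pig_env` on each host that received the copied profile; a non-zero return means one of the two exports is missing or points at the wrong directory.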

 

3. Edit the configuration file

vim pig.properties

# value taken from core-site.xml
fs.default.name=hdfs://master
# value taken from mapred-site.xml (jobhistory address)
mapred.job.tracker=master1:10020

scp -r ../conf/  zkkafka@10.156.50.36:/home/zkkafka/pig/conf/
scp -r ../conf/  zkkafka@10.156.50.37:/home/zkkafka/pig/conf/
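
Both keys set in pig.properties are reported as deprecated in the log output later in this post, which also names their replacements. A small sketch that lists such keys in a properties file; `check_deprecated_keys` is a hypothetical helper, and the two mappings are only the ones that appear in those log messages:

```shell
# check_deprecated_keys FILE: print each deprecated key found in FILE
# together with the replacement name suggested by the Hadoop log messages.
check_deprecated_keys() {
  file="$1"
  # Split each line at the first '='; also handle a last line without newline.
  while IFS='=' read -r key _rest || [ -n "$key" ]; do
    case "$key" in
      fs.default.name)    echo "$key -> fs.defaultFS" ;;
      mapred.job.tracker) echo "$key -> mapreduce.jobtracker.address" ;;
    esac
  done < "$file"
}
```

For example, `check_deprecated_keys pig.properties` run from the conf directory. Whether to rename the keys is a separate decision; the old names still worked in the session shown here.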

 

4. Check the Pig version

pig -version
[zkkafka@yanfabu2-35 pig]$ pig -version
19/06/05 19:58:19 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
Apache Pig version 0.17.0 (r1797386) 
compiled Jun 02 2017, 15:41:58

 

 

5. Prepare the data

vim tel.txt

1363157985066	13726230503	00-FD-07-A4-72-B8:CMCC	120.196.100.82	i02.c.aliimg.com	24	27	2481	24681	200

 

hdfs dfs -mkdir -p /hdfs/pig/
hdfs dfs -put /home/zkkafka/pig/data/tel.txt  /hdfs/pig/
hdfs dfs -lsr /hdfs/pig

 

[zkkafka@yanfabu2-35 conf]$ hdfs dfs -lsr /hdfs/pig
lsr: DEPRECATED: Please use 'ls -R' instead.
-rw-r--r--   2 zkkafka supergroup       2546 2019-06-05 21:03 /hdfs/pig/tel.txt

 

6. Start the Pig shell (grunt)

 

[zkkafka@yanfabu2-37 ~]$ pig
19/06/06 16:44:27 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
19/06/06 16:44:27 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
19/06/06 16:44:27 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2019-06-06 16:44:27,558 [main] INFO  org.apache.pig.Main - Apache Pig version 0.17.0 (r1797386) compiled Jun 02 2017, 15:41:58
2019-06-06 16:44:27,558 [main] INFO  org.apache.pig.Main - Logging error messages to: /home/zkkafka/pig_1559810667556.log
2019-06-06 16:44:27,605 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file /home/zkkafka/.pigbootup not found
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/zkkafka/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/zkkafka/hbase/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2019-06-06 16:44:28,312 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-06-06 16:44:28,312 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://master/
2019-06-06 16:44:28,859 [main] INFO  org.apache.pig.PigServer - Pig Script ID for the session: PIG-default-3d2427ca-7fdf-4252-ab78-cfb6ed2be36e
2019-06-06 16:44:28,859 [main] WARN  org.apache.pig.PigServer - ATS is disabled since yarn.timeline-service.enabled set to false
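
The log above shows Pig trying the LOCAL and MAPREDUCE execution types and picking MAPREDUCE. The mode can also be chosen explicitly with the `-x` option. A minimal sketch; `start_pig` is a hypothetical wrapper, and it deliberately handles only the two modes named in the log:

```shell
# start_pig [MODE]: start the grunt shell in an explicitly chosen mode.
# MODE defaults to mapreduce; this sketch only accepts local or mapreduce.
start_pig() {
  mode="${1:-mapreduce}"
  case "$mode" in
    local|mapreduce) ;;
    *) echo "mode not handled by this sketch: $mode" >&2; return 2 ;;
  esac
  if ! command -v pig >/dev/null 2>&1; then
    echo "pig not found on PATH" >&2
    return 127
  fi
  pig -x "$mode"
}
```

`start_pig local` is handy for trying statements against a file on the local disk without submitting a job to the cluster.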

 

7. Using Pig

7.1 Load the data into Pig

 t_wlan = LOAD '/hdfs/pig/tel.txt' USING PigStorage('\t')   AS (t0:long, msisdn:chararray, t2:chararray, t3:chararray, t4:chararray, t5:chararray, t6:long, t7:long, t8:long, t9:long, t10:chararray);

 

7.2 Query the relation t_wlan

dump t_wlan;

grunt> dump t_wlan;
2019-06-06 16:59:05,805 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2019-06-06 16:59:05,840 [main] WARN  org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2019-06-06 16:59:05,840 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2019-06-06 16:59:05,847 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2019-06-06 16:59:05,848 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2019-06-06 16:59:05,848 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2019-06-06 16:59:05,880 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2019-06-06 16:59:05,881 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-06-06 16:59:05,883 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2019-06-06 16:59:06,472 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp-1906860032/tmp-489322267/pig-0.17.0-core-h2.jar
2019-06-06 16:59:06,598 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-1906860032/tmp1532488090/automaton-1.11-8.jar
2019-06-06 16:59:07,094 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-1906860032/tmp731737639/antlr-runtime-3.4.jar
2019-06-06 16:59:07,190 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp-1906860032/tmp-2081706505/joda-time-2.9.3.jar
2019-06-06 16:59:07,192 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2019-06-06 16:59:07,192 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2019-06-06 16:59:07,193 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2019-06-06 16:59:07,193 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2019-06-06 16:59:07,202 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-06-06 16:59:07,264 [JobControl] WARN  org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2019-06-06 16:59:07,286 [JobControl] INFO  org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-06-06 16:59:07,289 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 16:59:07,289 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-06-06 16:59:07,291 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2019-06-06 16:59:07,487 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2019-06-06 16:59:07,590 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0014
2019-06-06 16:59:07,598 [JobControl] INFO  org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2019-06-06 16:59:07,856 [JobControl] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0014
2019-06-06 16:59:07,862 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0014/
2019-06-06 16:59:07,862 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0014
2019-06-06 16:59:07,862 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan
2019-06-06 16:59:07,862 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan[4,9],t_wlan[-1,-1] C:  R: 
2019-06-06 16:59:07,872 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2019-06-06 16:59:07,873 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0014]
2019-06-06 16:59:20,161 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2019-06-06 16:59:20,161 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0014]
2019-06-06 16:59:23,200 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 16:59:23,409 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 16:59:23,505 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 16:59:23,573 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2019-06-06 16:59:23,574 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: 

HadoopVersion	PigVersion	UserId	StartedAt	FinishedAt	Features
2.6.5	0.17.0	zkkafka	2019-06-06 16:59:05	2019-06-06 16:59:23	UNKNOWN

Success!

Job Stats (time in seconds):
JobId	Maps	Reduces	MaxMapTime	MinMapTime	AvgMapTime	MedianMapTime	MaxReduceTime	MinReduceTime	AvgReduceTime	MedianReducetime	Alias	Feature	Outputs
job_1559370613628_0014	1	0	4	4	4	4	0	0	0	0	t_wlan	MAP_ONLY	hdfs://master/tmp/temp-1906860032/tmp1645766804,

Input(s):
Successfully read 1 records (459 bytes) from: "/hdfs/pig/tel.txt"

Output(s):
Successfully stored 1 records (106 bytes) in: "hdfs://master/tmp/temp-1906860032/tmp1645766804"

Counters:
Total records written : 1
Total bytes written : 106
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1559370613628_0014


2019-06-06 16:59:23,582 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 16:59:23,639 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 16:59:23,698 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 16:59:23,753 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning ACCESSING_NON_EXISTENT_FIELD 1 time(s).
2019-06-06 16:59:23,753 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-06-06 16:59:23,755 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-06 16:59:23,764 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 16:59:23,764 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(1363157985066,13726230503,00-FD-07-A4-72-B8:CMCC,120.196.100.82,i02.c.aliimg.com,24,27,2481,24681,200,)
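
The empty last field in the tuple above and the ACCESSING_NON_EXISTENT_FIELD warning in the log are consistent with the LOAD schema naming 11 fields (t0 through t10) while the sample line has only 10 values. A quick local check, assuming the record is tab-separated as PigStorage('\t') expects; the variable and function names are for illustration only:

```shell
# One sample record with the same 10 tab-separated values as tel.txt.
sample_line=$(printf '1363157985066\t13726230503\t00-FD-07-A4-72-B8:CMCC\t120.196.100.82\ti02.c.aliimg.com\t24\t27\t2481\t24681\t200')

# Names declared in the AS (...) clause: t0, msisdn, t2 ... t10.
schema_fields=11

# count_fields: print the number of tab-separated fields on each input line.
count_fields() {
  awk -F'\t' '{ print NF }'
}

data_fields=$(printf '%s\n' "$sample_line" | count_fields)
echo "schema declares $schema_fields fields, record has $data_fields"
```

Pig fills the missing t10 with null, which is why the dump still succeeds.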

 

7.2 Extract columns from t_wlan into a new relation

 

t_wlan_simple = FOREACH t_wlan GENERATE msisdn, t6, t7, t8, t9;
dump t_wlan_simple;

 

grunt> dump t_wlan_simple;
2019-06-06 17:03:42,827 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2019-06-06 17:03:42,869 [main] WARN  org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2019-06-06 17:03:42,870 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2019-06-06 17:03:42,884 [main] INFO  org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned for t_wlan: $0, $2, $3, $4, $5, $10
2019-06-06 17:03:42,891 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2019-06-06 17:03:42,893 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2019-06-06 17:03:42,893 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2019-06-06 17:03:42,923 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2019-06-06 17:03:42,923 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-06-06 17:03:42,924 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2019-06-06 17:03:43,081 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp-1906860032/tmp1408006038/pig-0.17.0-core-h2.jar
2019-06-06 17:03:43,178 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-1906860032/tmp1149486211/automaton-1.11-8.jar
2019-06-06 17:03:43,281 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-1906860032/tmp1835019327/antlr-runtime-3.4.jar
2019-06-06 17:03:43,378 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp-1906860032/tmp2065709292/joda-time-2.9.3.jar
2019-06-06 17:03:43,382 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2019-06-06 17:03:43,383 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2019-06-06 17:03:43,383 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2019-06-06 17:03:43,383 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2019-06-06 17:03:43,399 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-06-06 17:03:43,481 [JobControl] WARN  org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2019-06-06 17:03:43,510 [JobControl] INFO  org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-06-06 17:03:43,519 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 17:03:43,519 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-06-06 17:03:43,522 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2019-06-06 17:03:44,131 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2019-06-06 17:03:44,228 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0015
2019-06-06 17:03:44,232 [JobControl] INFO  org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2019-06-06 17:03:44,471 [JobControl] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0015
2019-06-06 17:03:44,475 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0015/
2019-06-06 17:03:44,475 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0015
2019-06-06 17:03:44,475 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan,t_wlan_simple
2019-06-06 17:03:44,475 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan[4,9],t_wlan_simple[-1,-1] C:  R: 
2019-06-06 17:03:44,480 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2019-06-06 17:03:44,480 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0015]
2019-06-06 17:03:58,648 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2019-06-06 17:03:58,649 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0015]
2019-06-06 17:04:04,679 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:04:04,910 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:04:04,977 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:04:05,043 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2019-06-06 17:04:05,044 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: 

HadoopVersion	PigVersion	UserId	StartedAt	FinishedAt	Features
2.6.5	0.17.0	zkkafka	2019-06-06 17:03:42	2019-06-06 17:04:05	UNKNOWN

Success!

Job Stats (time in seconds):
JobId	Maps	Reduces	MaxMapTime	MinMapTime	AvgMapTime	MedianMapTime	MaxReduceTime	MinReduceTime	AvgReduceTime	MedianReducetime	Alias	Feature	Outputs
job_1559370613628_0015	1	0	4	4	4	4	0	0	0	0	t_wlan,t_wlan_simple	MAP_ONLY	hdfs://master/tmp/temp-1906860032/tmp1236017200,

Input(s):
Successfully read 1 records (459 bytes) from: "/hdfs/pig/tel.txt"

Output(s):
Successfully stored 1 records (29 bytes) in: "hdfs://master/tmp/temp-1906860032/tmp1236017200"

Counters:
Total records written : 1
Total bytes written : 29
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1559370613628_0015


2019-06-06 17:04:05,058 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:04:05,137 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:04:05,223 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:04:05,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-06-06 17:04:05,337 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-06 17:04:05,382 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 17:04:05,382 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(13726230503,27,2481,24681,200)
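
With the schema from 7.1, msisdn, t6, t7, t8 and t9 are the 2nd, 7th, 8th, 9th and 10th values of each input line. The same projection can be sketched locally with awk to sanity-check the result; this only mirrors the output format and is not how Pig executes the statement. `project_fields` is a hypothetical name:

```shell
# One sample record with the same 10 tab-separated values as tel.txt.
sample_line=$(printf '1363157985066\t13726230503\t00-FD-07-A4-72-B8:CMCC\t120.196.100.82\ti02.c.aliimg.com\t24\t27\t2481\t24681\t200')

# project_fields: keep fields 2, 7, 8, 9, 10 and print them as a Pig-style tuple.
project_fields() {
  awk -F'\t' -v OFS=',' '{ print "(" $2, $7, $8, $9, $10 ")" }'
}

printf '%s\n' "$sample_line" | project_fields
```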

 

7.3 Group the data

 

t_wlan_simple_group = GROUP t_wlan_simple BY msisdn;	
dump t_wlan_simple_group;

 

grunt> dump t_wlan_simple_group;
2019-06-06 17:06:28,589 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY
2019-06-06 17:06:28,640 [main] WARN  org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2019-06-06 17:06:28,641 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2019-06-06 17:06:28,646 [main] INFO  org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned for t_wlan: $0, $2, $3, $4, $5, $10
2019-06-06 17:06:28,661 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2019-06-06 17:06:28,674 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2019-06-06 17:06:28,674 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2019-06-06 17:06:28,715 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2019-06-06 17:06:28,716 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-06-06 17:06:28,717 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
2019-06-06 17:06:28,723 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2019-06-06 17:06:28,729 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=102
2019-06-06 17:06:28,729 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2019-06-06 17:06:28,730 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2019-06-06 17:06:28,929 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp-1906860032/tmp-412980928/pig-0.17.0-core-h2.jar
2019-06-06 17:06:29,039 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-1906860032/tmp-1182557529/automaton-1.11-8.jar
2019-06-06 17:06:29,543 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-1906860032/tmp-1112811524/antlr-runtime-3.4.jar
2019-06-06 17:06:30,043 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp-1906860032/tmp432932811/joda-time-2.9.3.jar
2019-06-06 17:06:30,046 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2019-06-06 17:06:30,047 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2019-06-06 17:06:30,047 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2019-06-06 17:06:30,047 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2019-06-06 17:06:30,111 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-06-06 17:06:30,174 [JobControl] WARN  org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2019-06-06 17:06:30,189 [JobControl] INFO  org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-06-06 17:06:30,191 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 17:06:30,191 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-06-06 17:06:30,193 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2019-06-06 17:06:30,391 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2019-06-06 17:06:30,488 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0016
2019-06-06 17:06:30,492 [JobControl] INFO  org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2019-06-06 17:06:30,734 [JobControl] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0016
2019-06-06 17:06:30,738 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0016/
2019-06-06 17:06:30,738 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0016
2019-06-06 17:06:30,738 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan,t_wlan_simple,t_wlan_simple_group
2019-06-06 17:06:30,738 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan[4,9],t_wlan_simple[-1,-1],t_wlan_simple_group[6,22] C:  R: 
2019-06-06 17:06:30,745 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2019-06-06 17:06:30,745 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0016]
2019-06-06 17:06:44,943 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2019-06-06 17:06:44,943 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0016]
2019-06-06 17:06:50,964 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0016]
2019-06-06 17:06:55,983 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:06:56,181 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:06:56,283 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:06:56,335 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2019-06-06 17:06:56,335 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: 

HadoopVersion	PigVersion	UserId	StartedAt	FinishedAt	Features
2.6.5	0.17.0	zkkafka	2019-06-06 17:06:28	2019-06-06 17:06:56	GROUP_BY

Success!

Job Stats (time in seconds):
JobId	Maps	Reduces	MaxMapTime	MinMapTime	AvgMapTime	MedianMapTime	MaxReduceTime	MinReduceTime	AvgReduceTime	MedianReducetime	Alias	Feature	Outputs
job_1559370613628_0016	1	1	4	4	4	4	4	4	4	4	t_wlan,t_wlan_simple,t_wlan_simple_group	GROUP_BY	hdfs://master/tmp/temp-1906860032/tmp912427234,

Input(s):
Successfully read 1 records (459 bytes) from: "/hdfs/pig/tel.txt"

Output(s):
Successfully stored 1 records (46 bytes) in: "hdfs://master/tmp/temp-1906860032/tmp912427234"

Counters:
Total records written : 1
Total bytes written : 46
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1559370613628_0016


2019-06-06 17:06:56,345 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:06:56,403 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:06:56,474 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:06:56,554 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-06-06 17:06:56,556 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-06 17:06:56,568 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 17:06:56,568 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(13726230503,{(13726230503,27,2481,24681,200)})
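
GROUP produces one tuple per key: the key itself, followed by a bag holding every record that carries that key. A local awk sketch of that shape, assuming comma-separated input records whose first field is the key; `group_by_first` is a hypothetical name, and unlike this sketch Pig does not guarantee the order of tuples inside a bag:

```shell
# group_by_first: group comma-separated records by their first field and
# print each group as (key,{(record),(record),...}) in first-seen key order.
group_by_first() {
  awk -F',' '
    {
      key = $1
      if (!(key in bag)) order[++n] = key          # remember first-seen order
      bag[key] = bag[key] (bag[key] == "" ? "" : ",") "(" $0 ")"
    }
    END {
      for (i = 1; i <= n; i++) printf "(%s,{%s})\n", order[i], bag[order[i]]
    }
  '
}

printf '13726230503,27,2481,24681,200\n' | group_by_first
```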

 

7.4 Sum the traffic

 

t_wlan_simple_group_sum = FOREACH t_wlan_simple_group GENERATE group, SUM(t_wlan_simple.t6), SUM(t_wlan_simple.t7), SUM(t_wlan_simple.t8), SUM(t_wlan_simple.t9);
dump t_wlan_simple_group_sum;
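
The statement sums t6 through t9 separately within each msisdn group. The per-key arithmetic can be sketched locally with awk, assuming comma-separated records of the form msisdn,t6,t7,t8,t9. This is a local illustration only, not output captured from Pig, and `sum_by_first` is a hypothetical name:

```shell
# sum_by_first: for each distinct first field, sum fields 2..5 and print
# (key,sum2,sum3,sum4,sum5) in first-seen key order.
sum_by_first() {
  awk -F',' -v OFS=',' '
    {
      key = $1
      if (!(key in seen)) { seen[key] = 1; order[++n] = key }
      s6[key] += $2; s7[key] += $3; s8[key] += $4; s9[key] += $5
    }
    END {
      for (i = 1; i <= n; i++) {
        k = order[i]
        print "(" k, s6[k], s7[k], s8[k], s9[k] ")"
      }
    }
  '
}

printf '13726230503,27,2481,24681,200\n' | sum_by_first
```

With a single record per key, as in the sample file, each sum equals the input value; the grouping only matters once several records share one msisdn.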

 

grunt> dump t_wlan_simple_group_sum
2019-06-06 17:15:39,824 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY
2019-06-06 17:15:39,877 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-06 17:15:39,878 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2019-06-06 17:15:39,885 [main] INFO  org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned for t_wlan: $0, $2, $3, $4, $5, $10
2019-06-06 17:15:39,904 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2019-06-06 17:15:39,908 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.CombinerOptimizerUtil - Choosing to move algebraic foreach to combiner
2019-06-06 17:15:39,972 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2019-06-06 17:15:39,972 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2019-06-06 17:15:40,000 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2019-06-06 17:15:40,001 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-06-06 17:15:40,002 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
2019-06-06 17:15:40,002 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2019-06-06 17:15:40,005 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=102
2019-06-06 17:15:40,005 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2019-06-06 17:15:40,005 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2019-06-06 17:15:40,602 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp-1906860032/tmp-784677978/pig-0.17.0-core-h2.jar
2019-06-06 17:15:40,699 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-1906860032/tmp-1113714067/automaton-1.11-8.jar
2019-06-06 17:15:40,796 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-1906860032/tmp-1701171835/antlr-runtime-3.4.jar
2019-06-06 17:15:40,910 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp-1906860032/tmp-725132195/joda-time-2.9.3.jar
2019-06-06 17:15:40,914 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2019-06-06 17:15:40,915 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2019-06-06 17:15:40,915 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2019-06-06 17:15:40,915 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2019-06-06 17:15:40,968 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-06-06 17:15:41,035 [JobControl] WARN  org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2019-06-06 17:15:41,055 [JobControl] INFO  org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-06-06 17:15:41,057 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 17:15:41,057 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-06-06 17:15:41,060 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2019-06-06 17:15:41,282 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2019-06-06 17:15:41,432 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0018
2019-06-06 17:15:41,438 [JobControl] INFO  org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2019-06-06 17:15:41,686 [JobControl] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0018
2019-06-06 17:15:41,691 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0018/
2019-06-06 17:15:41,692 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0018
2019-06-06 17:15:41,692 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan,t_wlan_simple,t_wlan_simple_group,t_wlan_simple_group_sum
2019-06-06 17:15:41,692 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan[4,9],t_wlan_simple[-1,-1],t_wlan_simple_group_sum[7,26],t_wlan_simple_group[6,22] C: t_wlan_simple_group_sum[7,26],t_wlan_simple_group[6,22] R: t_wlan_simple_group_sum[7,26]
2019-06-06 17:15:41,698 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2019-06-06 17:15:41,698 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0018]
2019-06-06 17:15:55,903 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2019-06-06 17:15:55,903 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0018]
2019-06-06 17:16:00,962 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0018]
2019-06-06 17:16:06,981 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:16:07,185 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:16:07,257 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:16:07,332 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2019-06-06 17:16:07,333 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: 

HadoopVersion	PigVersion	UserId	StartedAt	FinishedAt	Features
2.6.5	0.17.0	zkkafka	2019-06-06 17:15:39	2019-06-06 17:16:07	GROUP_BY

Success!

Job Stats (time in seconds):
JobId	Maps	Reduces	MaxMapTime	MinMapTime	AvgMapTime	MedianMapTime	MaxReduceTime	MinReduceTime	AvgReduceTime	MedianReducetime	Alias	Feature	Outputs
job_1559370613628_0018	1	1	3	3	3	3	3	3	3	3	t_wlan,t_wlan_simple,t_wlan_simple_group,t_wlan_simple_group_sum	GROUP_BY,COMBINER	hdfs://master/tmp/temp-1906860032/tmp2100428296,

Input(s):
Successfully read 1 records (459 bytes) from: "/hdfs/pig/tel.txt"

Output(s):
Successfully stored 1 records (29 bytes) in: "hdfs://master/tmp/temp-1906860032/tmp2100428296"

Counters:
Total records written : 1
Total bytes written : 29
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1559370613628_0018


2019-06-06 17:16:07,343 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:16:07,402 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:16:07,456 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-06 17:16:07,512 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-06-06 17:16:07,513 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-06 17:16:07,529 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-06 17:16:07,529 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(13726230503,27,2481,24681,200)
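
As a sanity check, the same per-number totals can be reproduced locally with awk. This is only a sketch: it assumes a tab-separated copy of the record shown in step 5, the file name `tel_sample.txt` and the function name `sum_by_phone` are made up for illustration, and awk's 1-based fields `$7` to `$10` correspond to Pig's `t6` to `t9`.

```shell
# Hypothetical local sample: the single record from step 5, tab-separated.
printf '1363157985066\t13726230503\t00-FD-07-A4-72-B8:CMCC\t120.196.100.82\ti02.c.aliimg.com\t24\t27\t2481\t24681\t200\n' > tel_sample.txt

# Group by phone number (field 2) and sum fields 7-10,
# which are t6-t9 in Pig's 0-based naming.
sum_by_phone() {
  awk -F'\t' '
    { t6[$2] += $7; t7[$2] += $8; t8[$2] += $9; t9[$2] += $10 }
    END { for (k in t6) printf "%s\t%d\t%d\t%d\t%d\n", k, t6[k], t7[k], t8[k], t9[k] }
  ' "$1"
}

sum_by_phone tel_sample.txt
```

For the one-record sample this prints the same five values as the DUMP tuple above, separated by tabs.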

 

7.5 Store the result in HDFS

STORE t_wlan_simple_group_sum INTO '/hdfs/pig/wlan_result';

 

[zkkafka@yanfabu2-36 ~]$ hdfs dfs -text /hdfs/pig/wlan_result/part-r-00000
13726230503	27	2481	24681	200
[zkkafka@yanfabu2-36 ~]$ 
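
Pig refuses to STORE into a directory that already exists, so repeating this step means clearing `/hdfs/pig/wlan_result` first. A minimal sketch of a helper for that, assuming the `hdfs` client is on the PATH; the function name `clean_output_dir` is made up for illustration.

```shell
# Hypothetical helper: remove an HDFS output directory if it exists,
# so that a repeated STORE does not fail on the existing path.
clean_output_dir() {
  dir="$1"
  if hdfs dfs -test -d "$dir"; then
    hdfs dfs -rm -r "$dir"
  fi
}

# Usage (run before repeating the STORE statement):
#   clean_output_dir /hdfs/pig/wlan_result
```

The `-test -d` guard keeps the helper quiet when the directory does not exist yet, for example on the first run.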

 

7.6 Sorting

t_wlan_simple_group_sum_group = ORDER t_wlan_simple_group_sum BY group;
DUMP t_wlan_simple_group_sum_group;

 

grunt> DUMP t_wlan_simple_group_sum_group;
2019-06-12 15:35:33,188 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY,ORDER_BY
2019-06-12 15:35:33,235 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-12 15:35:33,236 [main] INFO  org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2019-06-12 15:35:33,242 [main] INFO  org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned for t_wlan: $0, $2, $3, $4, $5, $10
2019-06-12 15:35:33,255 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2019-06-12 15:35:33,280 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.CombinerOptimizerUtil - Choosing to move algebraic foreach to combiner
2019-06-12 15:35:33,291 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.SecondaryKeyOptimizerMR - Using Secondary Key Optimization for MapReduce node scope-283
2019-06-12 15:35:33,292 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3
2019-06-12 15:35:33,292 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 3
2019-06-12 15:35:33,328 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2019-06-12 15:35:33,329 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-06-12 15:35:33,330 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
2019-06-12 15:35:33,330 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2019-06-12 15:35:33,332 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=102
2019-06-12 15:35:33,333 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2019-06-12 15:35:33,333 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2019-06-12 15:35:33,510 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp1544583298/tmp-955805369/pig-0.17.0-core-h2.jar
2019-06-12 15:35:33,595 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp1544583298/tmp712002240/automaton-1.11-8.jar
2019-06-12 15:35:34,074 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp1544583298/tmp1938988919/antlr-runtime-3.4.jar
2019-06-12 15:35:34,154 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp1544583298/tmp1704097364/joda-time-2.9.3.jar
2019-06-12 15:35:34,157 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2019-06-12 15:35:34,158 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2019-06-12 15:35:34,158 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2019-06-12 15:35:34,158 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2019-06-12 15:35:34,193 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-06-12 15:35:34,256 [JobControl] WARN  org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2019-06-12 15:35:34,277 [JobControl] INFO  org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2019-06-12 15:35:34,288 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-12 15:35:34,289 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-06-12 15:35:34,291 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2019-06-12 15:35:34,450 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2019-06-12 15:35:34,952 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0024
2019-06-12 15:35:34,960 [JobControl] INFO  org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2019-06-12 15:35:35,211 [JobControl] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0024
2019-06-12 15:35:35,216 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0024/
2019-06-12 15:35:35,216 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0024
2019-06-12 15:35:35,216 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan,t_wlan_simple,t_wlan_simple_group,t_wlan_simple_group_sum
2019-06-12 15:35:35,216 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan[1,9],t_wlan_simple[-1,-1],t_wlan_simple_group_sum[4,26],t_wlan_simple_group[3,22] C: t_wlan_simple_group_sum[4,26],t_wlan_simple_group[3,22] R: t_wlan_simple_group_sum[4,26]
2019-06-12 15:35:35,231 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2019-06-12 15:35:35,231 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0024]
2019-06-12 15:35:47,386 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 16% complete
2019-06-12 15:35:47,386 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0024]
2019-06-12 15:35:54,902 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 33% complete
2019-06-12 15:35:54,902 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0024]
2019-06-12 15:36:00,424 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:00,596 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:00,651 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:00,688 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2019-06-12 15:36:00,688 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-06-12 15:36:00,689 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
2019-06-12 15:36:00,689 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2019-06-12 15:36:00,698 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=29
2019-06-12 15:36:00,699 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2019-06-12 15:36:00,699 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2019-06-12 15:36:01,245 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp1544583298/tmp-87045202/pig-0.17.0-core-h2.jar
2019-06-12 15:36:01,308 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp1544583298/tmp568012746/automaton-1.11-8.jar
2019-06-12 15:36:01,405 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp1544583298/tmp780878190/antlr-runtime-3.4.jar
2019-06-12 15:36:01,485 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp1544583298/tmp772462384/joda-time-2.9.3.jar
2019-06-12 15:36:01,487 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2019-06-12 15:36:01,487 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2019-06-12 15:36:01,487 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2019-06-12 15:36:01,487 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2019-06-12 15:36:01,508 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-06-12 15:36:01,559 [JobControl] WARN  org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2019-06-12 15:36:01,582 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-12 15:36:01,582 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-06-12 15:36:01,582 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2019-06-12 15:36:01,749 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2019-06-12 15:36:02,233 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0025
2019-06-12 15:36:02,237 [JobControl] INFO  org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2019-06-12 15:36:02,472 [JobControl] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0025
2019-06-12 15:36:02,476 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0025/
2019-06-12 15:36:02,476 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0025
2019-06-12 15:36:02,476 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan_simple_group_sum_group
2019-06-12 15:36:02,476 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan_simple_group_sum_group[6,32] C:  R: 
2019-06-12 15:36:16,558 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2019-06-12 15:36:16,558 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0025]
2019-06-12 15:36:24,572 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 66% complete
2019-06-12 15:36:24,572 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0025]
2019-06-12 15:36:27,589 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:27,756 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:27,814 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:27,850 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2019-06-12 15:36:27,850 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2019-06-12 15:36:27,854 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers.
2019-06-12 15:36:27,854 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2019-06-12 15:36:27,854 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2019-06-12 15:36:27,995 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp1544583298/tmp-1238945561/pig-0.17.0-core-h2.jar
2019-06-12 15:36:28,103 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp1544583298/tmp1385874378/automaton-1.11-8.jar
2019-06-12 15:36:28,223 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp1544583298/tmp2107107107/antlr-runtime-3.4.jar
2019-06-12 15:36:28,297 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/zkkafka/pig/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp1544583298/tmp-637573401/joda-time-2.9.3.jar
2019-06-12 15:36:28,301 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2019-06-12 15:36:28,302 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2019-06-12 15:36:28,302 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2019-06-12 15:36:28,302 [main] INFO  org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2019-06-12 15:36:28,374 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2019-06-12 15:36:28,445 [JobControl] WARN  org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2019-06-12 15:36:28,465 [JobControl] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-12 15:36:28,465 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2019-06-12 15:36:28,465 [JobControl] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2019-06-12 15:36:28,599 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2019-06-12 15:36:28,675 [JobControl] INFO  org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1559370613628_0026
2019-06-12 15:36:28,679 [JobControl] INFO  org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2019-06-12 15:36:28,918 [JobControl] INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1559370613628_0026
2019-06-12 15:36:28,921 [JobControl] INFO  org.apache.hadoop.mapreduce.Job - The url to track the job: http://master1:8088/proxy/application_1559370613628_0026/
2019-06-12 15:36:28,921 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1559370613628_0026
2019-06-12 15:36:28,921 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases t_wlan_simple_group_sum_group
2019-06-12 15:36:28,921 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: t_wlan_simple_group_sum_group[6,32] C:  R: 
2019-06-12 15:36:44,145 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 83% complete
2019-06-12 15:36:44,146 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0026]
2019-06-12 15:36:51,164 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1559370613628_0026]
2019-06-12 15:36:54,180 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,330 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,369 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,401 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2019-06-12 15:36:54,527 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: 

HadoopVersion	PigVersion	UserId	StartedAt	FinishedAt	Features
2.6.5	0.17.0	zkkafka	2019-06-12 15:35:33	2019-06-12 15:36:54	GROUP_BY,ORDER_BY

Success!

Job Stats (time in seconds):
JobId	Maps	Reduces	MaxMapTime	MinMapTime	AvgMapTime	MedianMapTime	MaxReduceTime	MinReduceTime	AvgReduceTime	MedianReducetime	Alias	Feature	Outputs
job_1559370613628_0024	1	1	3	3	3	3	4	4	4	4	t_wlan,t_wlan_simple,t_wlan_simple_group,t_wlan_simple_group_sum	GROUP_BY,COMBINER	
job_1559370613628_0025	1	1	5	5	5	5	5	5	5	5	t_wlan_simple_group_sum_group	SAMPLER	
job_1559370613628_0026	1	1	3	3	3	3	4	4	4	4	t_wlan_simple_group_sum_group	ORDER_BY	hdfs://master/tmp/temp1544583298/tmp-717585849,

Input(s):
Successfully read 1 records (459 bytes) from: "/hdfs/pig/tel.txt"

Output(s):
Successfully stored 1 records (29 bytes) in: "hdfs://master/tmp/temp1544583298/tmp-717585849"

Counters:
Total records written : 1
Total bytes written : 29
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1559370613628_0024	->	job_1559370613628_0025,
job_1559370613628_0025	->	job_1559370613628_0026,
job_1559370613628_0026


2019-06-12 15:36:54,532 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,584 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,623 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,664 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,702 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,735 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,776 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,836 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,871 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-06-12 15:36:54,928 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2019-06-12 15:36:54,929 [main] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2019-06-12 15:36:54,934 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2019-06-12 15:36:54,934 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(13726230503,27,2481,24681,200)

 

 

8. Running a script

pig -x mapreduce  t_wlan.pig
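
Assuming `t_wlan.pig` simply collects the grunt statements from step 7 into one file, a minimal sketch of creating and running it could look like this. The `LOAD` schema (eleven fields, their types, and the field name `t1`) is an assumption inferred from the aliases and the "Columns pruned" log lines, not copied from the author's script, and the output path `/hdfs/pig/wlan_sorted` is made up for illustration.

```shell
# Write a hypothetical t_wlan.pig; the LOAD schema is an assumption.
cat > t_wlan.pig <<'EOF'
-- Load the tab-separated records (PigStorage splits on tabs by default).
t_wlan = LOAD '/hdfs/pig/tel.txt' AS (t0:long, t1:chararray, t2:chararray, t3:chararray, t4:chararray, t5:long, t6:long, t7:long, t8:long, t9:long, t10:chararray);
-- Keep the phone number and the four traffic counters.
t_wlan_simple = FOREACH t_wlan GENERATE t1, t6, t7, t8, t9;
t_wlan_simple_group = GROUP t_wlan_simple BY t1;
t_wlan_simple_group_sum = FOREACH t_wlan_simple_group GENERATE group, SUM(t_wlan_simple.t6), SUM(t_wlan_simple.t7), SUM(t_wlan_simple.t8), SUM(t_wlan_simple.t9);
t_wlan_simple_group_sum_group = ORDER t_wlan_simple_group_sum BY group;
-- STORE fails if the target directory already exists.
STORE t_wlan_simple_group_sum_group INTO '/hdfs/pig/wlan_sorted';
EOF

# Submit the script to the cluster only when the pig launcher is available.
if command -v pig >/dev/null 2>&1; then
  pig -x mapreduce t_wlan.pig
fi
```

A script uses STORE instead of DUMP so the result stays in HDFS after the job finishes; it can then be read back with `hdfs dfs -text`, as in 7.5.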

Personal homepage: http://knight-black-bob.iteye.com/