
Time distribution in a Nutch fetch job


Below is a breakdown of the time spent in the map, shuffle, and reduce phases of a Nutch fetch job:

server name Fri Mar 05 09:45:13 GMT 2010 job_201003050945_0006 fetch crawl/segments/20100305102846 user name


User : username   -- user name
JobName : fetch crawl/segments/20100305102846  -- job name
JobConf : hdfs://servername:9000/opt/crawler/data/mapred/system /job_201003050945_0006/job.xml -- location of the job configuration file
Submitted At : 5/03 10:30:29 -- submission time
Launched At : 5/03 10:30:30 (0sec) -- launch time
Finished At : 6/03 17:04:09 (30hrs, 33mins, 38sec) -- finish time
Status : SUCCESS  -- final status

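--- From the analysis below, the average map task time is about 22 hrs
--- From the analysis below, the average shuffle time is about 30 hrs
--- From the analysis below, the average reduce task time is about 29 mins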
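
To put the three averages in proportion, the durations printed in the report can be converted to seconds and compared with the total job time of 30hrs, 33mins, 38sec. The following is a minimal sketch; the helper name `parse_duration` is hypothetical, and the script assumes durations always use the `hrs`, `mins`, and `sec` units seen in this report.

```python
import re

# Seconds per unit, for the unit spellings that appear in the report.
UNIT_SECONDS = {"hrs": 3600, "mins": 60, "sec": 1}

def parse_duration(text):
    """Convert a string such as '30hrs, 33mins, 38sec' to seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(hrs|mins|sec)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total

# Figures copied from the report.
job_total = parse_duration("30hrs, 33mins, 38sec")
averages = {
    "map": parse_duration("22hrs, 6mins, 40sec"),
    "shuffle": parse_duration("30hrs, 2mins, 10sec"),
    "reduce": parse_duration("29mins, 38sec"),
}

# Share of the job's wall-clock time taken by the average task of each phase.
shares = {phase: seconds / job_total for phase, seconds in averages.items()}
for phase, share in shares.items():
    print(f"{phase}: {share:.1%} of job wall-clock time")
```

The shares add up to more than 100% because the phases overlap in time; they are per-task averages measured against the job's wall-clock time, not a partition of it.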

Time taken by best performing Map task task_201003050945_0006_m_000014 : 14hrs, 5mins, 23sec

Average time taken by Map tasks: 22hrs, 6mins, 40sec

Worse performing map tasks

Task Id Time taken
task_201003050945_0006_m_000010 24hrs, 47mins, 14sec
task_201003050945_0006_m_000011 24hrs, 44mins, 1sec
task_201003050945_0006_m_000013 24hrs, 42mins, 23sec
task_201003050945_0006_m_000012 24hrs, 29mins, 6sec
task_201003050945_0006_m_000007 24hrs, 19mins, 44sec
task_201003050945_0006_m_000006 24hrs, 18mins, 54sec
task_201003050945_0006_m_000001 24hrs, 18mins, 41sec
task_201003050945_0006_m_000008 24hrs, 18mins, 26sec
task_201003050945_0006_m_000000 24hrs, 17mins, 7sec
task_201003050945_0006_m_000005 24hrs, 16mins, 2sec

The last Map task task_201003050945_0006_m_000016 finished at (relative to the Job launch time): 6/03 16:32:44 (30hrs, 2mins, 14sec)


Time taken by best performing shuffle task_201003050945_0006_r_000004 : 30hrs, 2mins, 0sec

Average time taken by Shuffle: 30hrs, 2mins, 10sec

Worse performing Shuffle(s)

Task Id Time taken
task_201003050945_0006_r_000000 30hrs, 2mins, 26sec
task_201003050945_0006_r_000002 30hrs, 2mins, 18sec
task_201003050945_0006_r_000001 30hrs, 2mins, 18sec
task_201003050945_0006_r_000003 30hrs, 2mins, 4sec
task_201003050945_0006_r_000005 30hrs, 2mins, 3sec
task_201003050945_0006_r_000006 30hrs, 2mins, 2sec
task_201003050945_0006_r_000004 30hrs, 2mins, 0sec

The last Shuffle task_201003050945_0006_r_000000 finished at (relative to the Job launch time): 6/03 16:33:08 (30hrs, 2mins, 37sec)
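
The two "finished at" timestamps can be compared directly. The sketch below computes how long after the last map task the last shuffle ended. The timestamps are read as day/month, which matches the Mar 05 date in the header line; the report omits the year, so only differences between timestamps are meaningful. The helper name `parse_stamp` is hypothetical.

```python
from datetime import datetime

# Day/month and time as printed in the report; the year is absent.
FMT = "%d/%m %H:%M:%S"

def parse_stamp(text):
    """Parse a report timestamp such as '6/03 16:32:44'."""
    return datetime.strptime(text, FMT)

# Figures copied from the report.
launched = parse_stamp("5/03 10:30:30")
last_map_done = parse_stamp("6/03 16:32:44")
last_shuffle_done = parse_stamp("6/03 16:33:08")

map_elapsed = last_map_done - launched
gap = last_shuffle_done - last_map_done

print("last map finished after", map_elapsed)
print("last shuffle finished", gap.total_seconds(), "seconds after the last map")
```

The gap is only a few seconds. One plausible reading, not stated in the report, is that the reducers spent most of their roughly 30 hours of shuffle time waiting for map output, so the shuffle figure largely reflects map duration rather than copy cost.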


Time taken by best performing Reduce task : task_201003050945_0006_r_000002 : 27mins, 43sec

Average time taken by Reduce tasks: 29mins, 38sec

Worse performing reduce tasks

Task Id Time taken
task_201003050945_0006_r_000000 31mins, 9sec
task_201003050945_0006_r_000001 30mins, 36sec
task_201003050945_0006_r_000003 29mins, 54sec
task_201003050945_0006_r_000005 29mins, 27sec
task_201003050945_0006_r_000004 29mins, 22sec
task_201003050945_0006_r_000006 29mins, 14sec
task_201003050945_0006_r_000002 27mins, 43sec