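A Qunar interview question: count how many times each IP appears in a file. When the data set is small, awk is the most convenient tool, but I had not used it in a long time and had forgotten how awk arrays work.
Here is a quick refresher.
Assume the data looks like this: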
178.60.128.31 www.google.com.hk
193.192.250.158 www.google.com
210.242.125.35 adwords.google.com
210.242.125.35 accounts.google.com.hk
210.242.125.35 accounts.google.com
210.242.125.35 accounts.l.google.com
64.233.181.49 www.google.com
212.188.10.167 www.google.com
23.239.5.106 www.google.com
64.233.168.41 www.google.com
62.1.38.89 www.google.com
62.1.38.89 chrome.google.com
193.192.250.172 www.google.com
212.188.10.241 www.google.com
37.228.69.57 www.google.com
222.255.120.42 www.google.com
222.255.120.42 www.gstatic.com
212.188.10.167 www.googleapis.com
64.233.181.49 www.googleapis.com
64.233.181.49 fonts.googleapis.com
193.192.250.158 plus.google.com
193.192.250.158 talkgadget.google.com
193.192.250.158 ssl.gstatic.com
193.192.250.158 images-pos-opensocial.googleusercontent.com
193.192.250.158 images1-focus-opensocial.googleusercontent.com
193.192.250.158 images2-focus-opensocial.googleusercontent.com
193.192.250.158 images3-focus-opensocial.googleusercontent.com
193.192.250.158 images4-focus-opensocial.googleusercontent.com
193.192.250.158 images5-focus-opensocial.googleusercontent.com
193.192.250.158 images6-focus-opensocial.googleusercontent.com
193.192.250.158 clients4.google.com
222.255.120.42 google.com
222.255.120.42 apis.google.com
222.255.120.42 clients1.google.com
193.192.250.158 clients2.google.com
193.192.250.158 clients3.google.com
193.192.250.158 clients5.google.com
64.233.181.49 maps.google.com
64.233.181.49 mts0.google.com
64.233.181.49 maps.gstatic.com
The awk code for counting:
awk '{arr[$1]++;}END{for(i in arr){print i , arr[i] }}' test.txt
Output:
[blog@AY1310301904525972ddZ ~]$ awk '{arr[$1]++;}END{for(i in arr){print i , arr[i] }}' test.txt
212.188.10.241 1
64.233.168.41 1
23.239.5.106 1
193.192.250.158 15
178.60.128.31 1
37.228.69.57 1
212.188.10.167 2
193.192.250.172 1
62.1.38.89 2
64.233.181.49 6
210.242.125.35 4
222.255.120.42 5
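The one-liner above can be unpacked into a commented form. This is only a minimal sketch: the function name count_by_first_field and the three sample lines are made up for illustration, and the final sort is there only because awk does not guarantee the order in which `for (key in array)` visits the keys.

```shell
# Hypothetical helper: count how often each value of column 1 appears on stdin.
count_by_first_field() {
    awk '
        { count[$1]++ }          # an unset element counts as 0, so ++ makes it 1
        END {
            for (ip in count)    # iteration order is unspecified in awk
                print ip, count[ip]
        }
    '
}

# Three made-up lines in the same "IP hostname" format as test.txt
printf '%s\n' \
    '10.0.0.1 a.example.com' \
    '10.0.0.2 b.example.com' \
    '10.0.0.1 c.example.com' | count_by_first_field | sort
```

With this sample input the pipeline prints `10.0.0.1 2` followed by `10.0.0.2 1`.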
Adding a sort:
[blog@AY1310301904525972ddZ ~]$ awk '{arr[$1]++;}END{for(i in arr){print i , arr[i] }}' test.txt | sort -n -k 2
178.60.128.31 1
193.192.250.172 1
212.188.10.241 1
23.239.5.106 1
37.228.69.57 1
64.233.168.41 1
212.188.10.167 2
62.1.38.89 2
210.242.125.35 4
222.255.120.42 5
64.233.181.49 6
193.192.250.158 15
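`sort -n -k 2` orders the lines by the second column in ascending numeric order. If the most frequent IP should come first, one possible variant is sketched below. The function name top_by_count is hypothetical; `-k2,2nr` restricts the key to the count column and reverses it, and `-k1,1` is an assumed tie-breaker on the IP text so that equal counts come out in a stable order.

```shell
# Hypothetical helper: per-key counts from stdin, largest count first.
top_by_count() {
    awk '{ count[$1]++ } END { for (ip in count) print ip, count[ip] }' |
        sort -k2,2nr -k1,1   # key 1: count, numeric, descending; key 2: IP text
}

# Made-up sample: 10.0.0.3 appears three times, the other two twice each
printf '%s\n' \
    '10.0.0.1 a' '10.0.0.3 b' '10.0.0.2 c' '10.0.0.3 d' \
    '10.0.0.1 e' '10.0.0.2 f' '10.0.0.3 g' | top_by_count
```

For a "top N" report, the output could then be piped through `head -n N`.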
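============= Addendum in response to reader [hattah] ===============
I tested the efficiency of the two approaches:
In theory, the more data sort has to process, the slower it gets, so sorting every input line should cost more than sorting only the per-IP totals that awk emits.
Measured results (test.txt here is a larger file than the sample above; the counts add up to 46400 lines):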
[blog@AY1310301904525972ddZ ~]$ time awk '{print $1}' test.txt |sort|uniq -c
   1380 178.60.128.31
  17312 193.192.250.158
   1160 193.192.250.172
   4640 210.242.125.35
   2320 212.188.10.167
   1160 212.188.10.241
   5734 222.255.120.42
   1160 23.239.5.106
   1160 37.228.69.57
   2320 62.1.38.89
   1160 64.233.168.41
   6894 64.233.181.49

real 0m0.236s
user 0m0.228s
sys 0m0.004s
[blog@AY1310301904525972ddZ ~]$ time awk '{arr[$1]++;}END{for(i in arr){print i , arr[i] }}' test.txt | sort -n -k 2
193.192.250.172 1160
212.188.10.241 1160
23.239.5.106 1160
37.228.69.57 1160
64.233.168.41 1160
178.60.128.31 1380
212.188.10.167 2320
62.1.38.89 2320
210.242.125.35 4640
222.255.120.42 5734
64.233.181.49 6894
193.192.250.158 17312

real 0m0.025s
user 0m0.022s
sys 0m0.001s
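The two pipelines timed above should produce the same counts, only with the columns in a different order. A minimal sketch for checking that, assuming a shell where mktemp is available; the function names via_sort_uniq and via_awk_array and the sample lines are made up for illustration:

```shell
# Both hypothetical helpers take a file name and print "ip count" lines, sorted by IP text.
via_sort_uniq() {
    # uniq -c prints "count ip", so swap the columns to match the awk version
    awk '{ print $1 }' "$1" | sort | uniq -c | awk '{ print $2, $1 }' | sort
}
via_awk_array() {
    awk '{ arr[$1]++ } END { for (i in arr) print i, arr[i] }' "$1" | sort
}

sample=$(mktemp)                      # throwaway input file
printf '%s\n' \
    '10.0.0.1 a' '10.0.0.2 b' '10.0.0.1 c' '10.0.0.3 d' > "$sample"

if [ "$(via_sort_uniq "$sample")" = "$(via_awk_array "$sample")" ]; then
    echo "outputs match"
else
    echo "outputs differ"
fi
rm -f "$sample"
```

The timing gap is plausible because the awk version reads the file once, accumulates counts in an associative array, and hands sort only one line per distinct IP, whereas `sort | uniq -c` has to sort every input line first.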