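Linux awk: deduplicating by multiple columns to compute statistics
This is something I use in my day-to-day work for log statistics, so I am writing it down here.
Suppose we have a log like this:
The goal is to count push clicks, deduplicated by cookie, because a single device may click more than once.
Step 1: print the columns of interest to take a look: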
awk -F "," '{print $2" "$3" "$6" "$7" "$9}' pushLog.log
Step 2: deduplicate on those columns, keeping only the first line for each combination of values:
awk -F "," '!a[$2,$3,$6,$7,$9]++' pushLog.log
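The `!a[...]++` pattern is terse, so here is a minimal sketch of how it behaves. The sample rows and their column layout are invented for illustration and are not taken from the real pushLog.log; the array is named `seen` instead of `a` only for readability.

```shell
# Hypothetical sample data: 9 comma-separated columns, made up for this sketch.
log=$(mktemp)
cat > "$log" <<'EOF'
1,push,news,x,x,click,c1,x,ios
2,push,news,x,x,click,c1,x,ios
3,push,news,x,x,click,c2,x,ios
4,push,sale,x,x,click,c1,x,android
EOF

# seen[key] is 0 the first time a key appears, so !seen[key]++ is true and
# awk runs its default action (print the line). The post-increment then makes
# the counter non-zero, so later lines with the same key are skipped.
deduped=$(awk -F "," '!seen[$2,$3,$6,$7,$9]++' "$log")
printf '%s\n' "$deduped"

rm -f "$log"
```

With this sample, row 2 is dropped because it repeats the key of row 1, while rows 3 and 4 differ in at least one of the five columns and are kept.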
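Step 3: count. The deduplicated lines are grouped by columns 2, 3, 6 and 9, the groups are sorted by count in descending order, and the top 15 are shown: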
awk -F "," '!a[$2,$3,$6,$7,$9]++' pushLog.log |awk -F "," '{a[$2" "$3" "$6" "$9]+=1}END{for(i in a) printf "%s %s\n",i,a[i]}' | sort -k 5 -n -r | head -n 15
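The dedup and the count can also be folded into a single awk pass, which avoids the second process. The following is a sketch under assumptions, not a drop-in replacement: the sample rows are invented, and the array names `seen` and `count` are my own. Like the original pipeline, it assumes the fields contain no spaces, since `sort` treats the count as the fifth whitespace-separated field.

```shell
# Hypothetical sample data: 9 comma-separated columns, made up for this sketch.
log=$(mktemp)
cat > "$log" <<'EOF'
1,push,news,x,x,click,c1,x,ios
2,push,news,x,x,click,c1,x,ios
3,push,news,x,x,click,c2,x,ios
4,push,sale,x,x,click,c1,x,android
EOF

# Count a line only the first time its 5-column key is seen, grouping the
# count by columns 2, 3, 6 and 9 (column 7 is left out of the group key).
top=$(awk -F "," '
    !seen[$2,$3,$6,$7,$9]++ { count[$2" "$3" "$6" "$9]++ }
    END { for (k in count) printf "%s %s\n", k, count[k] }
' "$log" | sort -k5,5nr | head -n 15)
printf '%s\n' "$top"

rm -f "$log"
```

Here `-k5,5nr` restricts the sort key to the fifth field only, whereas `-k 5` extends the key to the end of the line; for this output the two give the same order.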
The result is shown in the figure: