$ awk '{total+=$6} END{print "Club student total points: " total}' grade.txt
Club student total points: 155
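The accumulation does not have to be wrapped in parentheses, but doing so can make the code easier to read.
Another example of summing a column, this time totalling the size of the files in the current directory: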
$ ls -lAG | awk '$1~/^[^d]/ {print $9"\t"$5} {total+=$5} END {print "total " total/1024 " KB"}'
...file name followed by size...
grade.txt 176
total 234.544 KB
#!/usr/bin/awk -f
# all comment lines must start with a hash `#`
# name: student_total.awk
# to call: student_total.awk grade.txt
# print total and average of club student points

# print a header first:
BEGIN{
    print "Student Date Member No. Grade Age Points Max"
    print "Name Joined Gained Point Available"
    print "================================================================"
}

# let's add the scores of points gained
(total+=$6)

# finished processing, now let's print the total and average points:
END{
    print "Club students total points: " total
    print "Average club students points: " total/NR
}
$ student_total.awk grade.txt
Student Date Member No. Grade Age Points Max
Name Joined Gained Point Available
================================================================
M.Tansley 05/99 48311 Green 8 40 44
J.Lulu 06/99 48317 green 9 24 26
P.Bunny 02/99 48 Yellow 12 35 28
J.Troll 07/99 4842 Brown-3 12 26 26
L.Transly 05/99 4712 Brown-2 12 30 28
Club students total points: 155
Average club students points: 31
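The script above prints every record even though its main body contains no print statement. That is because `(total+=$6)` stands alone as a pattern with no action, and awk's default action for a matching pattern is to print the record. The following sketch contrasts the two forms; the sample records are copied from the output above, and the variable names `as_pattern` and `as_action` are only illustrative.

```shell
# Sample records copied from the grade.txt output shown above.
grades='M.Tansley 05/99 48311 Green 8 40 44
J.Lulu 06/99 48317 green 9 24 26
P.Bunny 02/99 48 Yellow 12 35 28
J.Troll 07/99 4842 Brown-3 12 26 26
L.Transly 05/99 4712 Brown-2 12 30 28'

# Pattern form: the parenthesised expression is a pattern without an action,
# so each record is printed whenever the running total is non-zero.
as_pattern=$(printf '%s\n' "$grades" | awk '
(total += $6)
END { print "total: " total }')

# Action form: inside braces the same expression only accumulates.
as_action=$(printf '%s\n' "$grades" | awk '
{ total += $6 }
END { print "total: " total }')

printf '%s\n' "$as_pattern"
echo "---"
printf '%s\n' "$as_action"
```

With the pattern form, a record would be skipped if the running total happened to be zero at that point, so the action form is the safer choice when only the sum is wanted.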
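As an aside, when you run ls on OS X you may notice that the permission bits of some files or directories end with an @ or a + sign. The @ means the file or directory has extended attributes, while the + means it carries ACL-style, non-standard access control. Adding the -@ option to ls shows the extended attributes. For details, see “ls” on mac and extended attributes.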
Scenario 2: removing duplicate output:
While debugging a program, suppose we run into a log file like this:
$ cat error.log
INVALID LCSD 98GJ23
ERROR*
ERROR*
CAUTION LPSS ERROR ON ACC NO.
ERROR*
ERROR*
ERROR*
ERROR*
ERROR*
PASS FIELD INVALID ON LDPS
ERROR*
ERROR*
PASS FIELD INVALID ON GHSI
ERROR*
CAUTION LPSS ERROR ON ACC NO.
ERROR*
ERROR*
We want to collapse each run of repeated ERROR* lines into a single line:
#!/usr/bin/awk -f
# name: error_strip.awk
# to call: error_strip.awk <filename>
# strips out the ERROR* lines if there are more than one
# ERROR* lines after each failed record.

BEGIN{error_line=""}

# skip an ERROR* line when the previous line was also ERROR*
{
    if($0=="ERROR*" && error_line=="ERROR*")
        # goto next line
        next;
    error_line=$0;
    print
}
Result:
$ error_strip.awk error.log
INVALID LCSD 98GJ23
ERROR*
CAUTION LPSS ERROR ON ACC NO.
ERROR*
PASS FIELD INVALID ON LDPS
ERROR*
PASS FIELD INVALID ON GHSI
ERROR*
CAUTION LPSS ERROR ON ACC NO.
ERROR*
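The script works by remembering the previous line and skipping an ERROR* line that directly follows another one. The same idea can be sketched as a compact inline program. This is a variant, not the script above: the variable name `prev` is an illustrative stand-in for `error_line`, and the input is the first few lines of the log shown earlier.

```shell
# First seven lines of the error.log sample shown above.
log='INVALID LCSD 98GJ23
ERROR*
ERROR*
CAUTION LPSS ERROR ON ACC NO.
ERROR*
ERROR*
ERROR*'

deduped=$(printf '%s\n' "$log" | awk '
# Print the line unless it is an ERROR* that directly follows another ERROR*.
!($0 == "ERROR*" && prev == "ERROR*")
# Remember the current line for the next comparison.
{ prev = $0 }')

printf '%s\n' "$deduped"
```

Because only the immediately preceding line is compared, ERROR* lines separated by other records are all kept, which matches the behaviour of the script.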
Setting the field separator FS in a script:
As a reminder, when awk is used on the shell command line, the separator is given with -F:
$ awk -F: '{print $0}' input-file
In a script you set the FS variable instead. Note that the assignment to FS needs to go in the BEGIN block:
#!/usr/bin/awk -f
# to call: passwd.awk /private/etc/passwd
# print out the first and fifth fields

BEGIN{FS=":"}

{
    # skip the comment lines of the file
    if($0 ~ /^#/)
        next;
    print $1"\t"$5
}
Result:
nobody Unprivileged User
root System Administrator
daemon System Services
...
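The reason FS belongs in BEGIN is that awk splits a record into fields when the record is read, so an assignment made in the main body only takes effect from the next record onward. The sketch below illustrates this with two made-up passwd-style lines (the sample data and the variable names `late` and `early` are assumptions for illustration, not taken from a real /private/etc/passwd).

```shell
# Hypothetical colon-separated sample, loosely modelled on passwd entries.
sample='root:x:0:0:System Administrator
daemon:x:1:1:System Services'

# FS set in the main body: the first record has already been split on
# whitespace, so its $1 is everything up to the first space.
late=$(printf '%s\n' "$sample" | awk '{ FS = ":" } { print $1 }')

# FS set in BEGIN: every record, including the first, is split on ":".
early=$(printf '%s\n' "$sample" | awk 'BEGIN { FS = ":" } { print $1 }')

printf 'late:\n%s\n' "$late"
printf 'early:\n%s\n' "$early"
```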
Passing arguments to a script:
As mentioned earlier, when awk is used on the shell command line, a variable is passed like this:
$ awk '{if($5<AGE) print $0}' AGE=10 grade.txt
With a script, the form is essentially the same:
awk -f script_file var=value input_file
Examples:
#!/usr/bin/awk -f
# check on how many fields in a file
# name: field_check.awk
# to call: field_check.awk MAX=n FS=<separator> input-file

NF!=MAX {
    print("line " NR " does not have " MAX " fields")
}
#!/usr/bin/awk -f
# name: age.awk
# to call: age.awk AGE=10 grade.txt
# print students whose age are lower than age supplied on the command line
{
    if($5<AGE)
        print $0
}
Run:
$ age.awk AGE=10 grade.txt
M.Tansley 05/99 48311 Green 8 40 44
J.Lulu 06/99 48317 green 9 24 26
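One detail worth knowing about the `var=value` form: the assignment is processed when awk reaches it among the file arguments, which is after the BEGIN block has run. If the value is needed inside BEGIN, the standard `-v` option can be used instead. The sketch below shows both cases; the records are copied from the grade.txt output above, and the shell variable names are only illustrative.

```shell
# Records copied from the grade.txt output shown above.
grades='M.Tansley 05/99 48311 Green 8 40 44
J.Lulu 06/99 48317 green 9 24 26
P.Bunny 02/99 48 Yellow 12 35 28
J.Troll 07/99 4842 Brown-3 12 26 26
L.Transly 05/99 4712 Brown-2 12 30 28'

# var=value placed before the input ("-" means standard input):
# AGE is set by the time the records are processed.
young=$(printf '%s\n' "$grades" | awk '{ if ($5 < AGE) print $1 }' AGE=10 -)

# The same assignment is not yet visible inside BEGIN ...
in_begin=$(awk 'BEGIN { print "AGE=[" AGE "]" }' AGE=10 </dev/null)

# ... whereas -v assigns the variable before BEGIN runs.
with_v=$(awk -v AGE=10 'BEGIN { print "AGE=[" AGE "]" }')

printf '%s\n' "$young" "$in_begin" "$with_v"
```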
Using a script through a pipe:
Taking data from the du command and processing its output:
#!/usr/bin/awk -f
# name: du.awk
# to call: du | du.awk
# print file/direc's in bytes and blocks
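Only the header of du.awk appears here. As a rough sketch of how an awk program can consume du-style output from a pipe, the example below feeds simulated du lines (block count, a tab, then a path) into an inline program. Everything in it is an assumption for illustration: the sample paths are made up, and the conversion assumes 512-byte blocks, which is the POSIX default for du but not what every implementation reports (GNU du uses 1024-byte units by default).

```shell
# Simulated `du` output: block count, a tab, then the path (hypothetical data).
report=$(printf '8\t./docs\n24\t./src\n32\t.\n' | awk '
BEGIN {
    OFS = "\t"                      # separate output columns with tabs
    print "name", "bytes", "blocks"
}
# Assumes each block is 512 bytes; adjust for your du implementation.
{ print $2, $1 * 512, $1 }')

printf '%s\n' "$report"
```

With a real directory tree the same program would be used as `du | awk '...'`, or saved to a file and invoked the way the header comment describes.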
#!/usr/bin/awk -f
# name: array_test.awk
# prints out an array
BEGIN{
    recode="123#456#789";
    split(recode, myarray, "#");
}
END{
    for(i in myarray)
        print i, myarray[i]
}
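Two properties of this script are easy to miss. The order in which `for (i in myarray)` visits the elements is not specified by awk, and the END block only runs once the input has been consumed, so the script needs an input file or an end-of-file on standard input. A minimal variant that avoids both, assuming the elements should come out in index order, uses the count returned by `split` and does all the work in BEGIN (the name `record` here is just an illustrative rename of `recode`):

```shell
ordered=$(awk 'BEGIN {
    record = "123#456#789"
    n = split(record, myarray, "#")   # split returns the number of pieces
    for (i = 1; i <= n; i++)          # walk the indices 1..n in order
        print i, myarray[i]
}')

printf '%s\n' "$ordered"
```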
#!/usr/bin/awk -f
# name: belts.awk
# to call: belts.awk grade_student.txt
# loop through the file and count how many belts we have
# in (yellow, orange, red) also count how many adults and
# juniors we have.
#
# start from BEGIN
# set FS and load the arrays with our values
BEGIN{
    FS="#"
    # load the belt colors we are interested in only
    belt["Yellow"]
    belt["Orange"]
    belt["Red"]
    # load the type of students
    student["Junior"]
    student["Senior"]
}
# loop through array that holds the belt colors against field-1
# if we have a match, keep a running total
{
    for(color in belt)
        if($1==color)
            belt[color]++
}
# loop through array that holds the student type against field-2
# if we have a match, keep a running total
{
    for(type in student)
        if($2==type)
            student[type]++
}
# finish processing, print out the match for each array
END{
    for(color in belt)
        print "The club has", belt[color], color, "belts"
    for(type in student)
        print "The club has", student[type], type, "students"
}
Run:
$ belts.awk grade_student.txt
The club has 2 Orange belts
The club has 2 Red belts
The club has 3 Yellow belts
The club has 8 Junior students
The club has 7 Senior students
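The script relies on the fact that merely referencing `belt["Yellow"]` creates that element with an empty value, so the BEGIN block effectively registers the keys of interest. As a sketch of an alternative to the inner loops (a different technique from the one in the script), awk's `in` operator can test membership directly. The contents of grade_student.txt are not shown in this section, so the `#`-separated sample below is hypothetical.

```shell
# Hypothetical sample in the "belt#type" layout the script expects.
sample='Yellow#Junior
Orange#Senior
Yellow#Junior
Purple#Junior
Red#Senior'

counts=$(printf '%s\n' "$sample" | awk '
BEGIN {
    FS = "#"
    # Referencing an element creates it, so these become keys with empty values.
    belt["Yellow"]; belt["Orange"]; belt["Red"]
}
# Count the record only if its first field is one of the registered keys.
$1 in belt { belt[$1]++ }
END {
    for (color in belt)
        print color, belt[color]
}' | sort)

printf '%s\n' "$counts"
```

The output is piped through `sort` because the traversal order of `for (color in belt)` is unspecified; the Purple record is ignored since it was never registered as a key.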