h10

Task 10.1: The Wild and Weird awk Command

The general form of an awk command is awk '{ commands }'.

Two commonly used flags to awk are:

-f file    reads the awk instructions from the named file rather than from the command line

-Fc        uses the character c as the separator between fields of information, rather than the default of white space
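As a minimal sketch of the -F flag, the following splits comma-separated records on commas and prints the second field. The sample data is made up for illustration and is not from the notes above.

```shell
# Hypothetical comma-separated sample data (an assumption, not from the text).
sample='apple,red,3
banana,yellow,7'

# -F, tells awk to split each line on commas instead of white space.
colors=$(printf '%s\n' "$sample" | awk -F, '{ print $2 }')
printf '%s\n' "$colors"
```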

 

1.

$ who | awk '{ print }'

root     console Nov  9 07:31

yuenca   ttyAo   Nov 27 17:39

limyx4   ttyAp   Nov 27 16:22

wifey    ttyAx   Nov 27 17:16

tobster  ttyAz   Nov 27 17:59

taylor   ttyqh   Nov 27 17:43   (vax1.umkc.edu)

 

A line of input is broken into specific fields of information, each field being assigned a unique identifier. Field one is $1, field two $2, and so on:

$ who | awk '{ print $1 }'

root

yuenca

limyx4

wifey

tobster

taylor

 

The good news is that you also can specify any other information to print by surrounding it with double quotes:

$ who | awk '{ print "User " $1 " is on terminal line " $2 }'

User root is on terminal line console

User yuenca is on terminal line ttyAo

User limyx4 is on terminal line ttyAp

User hawk is on terminal line ttyAw

User wifey is on terminal line ttyAx

User taylor is on terminal line ttyqh

 

2.

$ grep taylor /etc/passwd | awk -F: '{ print "User " $1 " has " $7 " as their login shell." }'

User taylorj has /bin/csh as their login shell.

User mtaylor has /usr/local/bin/tcsh as their login shell.

User dataylor has /usr/local/lib/msh as their login shell.

User taylorjr has /bin/csh as their login shell.

User taylorrj has /bin/csh as their login shell.

User taylormx has /bin/csh as their login shell.

User taylor has /bin/csh as their login shell.

 

3.

How many different login shells are used at my site, and which one is most popular?

$ awk -F: '{print $7}' /etc/passwd | sort | uniq -c

   2

3365 /bin/csh

   1 /bin/false

  84 /bin/ksh

  21 /bin/sh

  11 /usr/local/bin/ksh

 353 /usr/local/bin/tcsh

  45 /usr/local/lib/msh
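To pick out the most popular shell directly, one option is to sort the counts numerically in reverse and keep the first line. This is a sketch on made-up passwd-style records, not the real /etc/passwd; the account names and paths are assumptions.

```shell
# Hypothetical passwd-style records; the seventh colon-separated field is the shell.
sample='a:x:1:1:A,,,,:/home/a:/bin/csh
b:x:2:2:B,,,,:/home/b:/bin/ksh
c:x:3:3:C,,,,:/home/c:/bin/csh'

# Count each shell, sort the counts in reverse numeric order, keep the top line.
top=$(printf '%s\n' "$sample" | awk -F: '{ print $7 }' | sort | uniq -c | sort -rn | head -1)
printf '%s\n' "$top"
```

The leading spaces in front of the count come from uniq -c and vary between systems.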

 

4.

Sticking with the password file, notice that the names in it are all in first-name-then-last-name format; the name field for my account, for example, is Dave Taylor,,,,. A common requirement is to generate a report of system users sorted by last name.

$ grep taylor /etc/passwd | awk -F: '{print $5}'

James Taylor,,,,

Mary Taylor,,,,

Dave Taylor,,,,

James Taylor,,,,

Robert Taylor,,,,

Melanie Taylor,,,,

Dave Taylor,,,,

 

$ grep taylor /etc/passwd | awk -F: '{print $5}' | sed 's/,//g' | awk '{print $2", "$1}' | sort

Taylor, Dave

Taylor, Dave

Taylor, James

Taylor, James

Taylor, Mary

Taylor, Melanie

Taylor, Robert

 

Note: the second awk has no -F flag, so it splits fields on white space, which is the default separator for awk.
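The same report can probably be produced with a single awk call by using the built-in split() function instead of the awk, sed, awk chain. A minimal sketch, assuming every name field has the form "First Last,,,," (the records below are made up):

```shell
# Hypothetical passwd-style records in the same shape as the ones above.
sample='taylor:x:101:10:Dave Taylor,,,,:/home/taylor:/bin/csh
mtaylor:x:102:10:Mary Taylor,,,,:/home/mtaylor:/bin/csh'

# split() breaks field 5 at the commas, then the first piece at white space.
names=$(printf '%s\n' "$sample" |
  awk -F: '{ split($5, gecos, ","); split(gecos[1], name, " "); print name[2] ", " name[1] }' |
  sort)
printf '%s\n' "$names"
```

Names with more than two words would need extra handling that this sketch does not attempt.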

 

5.

The script earlier that looked for the login shell isn’t quite correct. It turns out that if the user wants to have /bin/sh—the Bourne shell—as his or her default shell, the final field can be left blank:

joe:?:45:555:Joe-Bob Billiard,,,,:/home/joe:

 

NF

Used without a dollar sign, NF is the number of fields on the current line.

Used with a dollar sign, $NF is always the value of the last field on the line.

$ who | head -3 | awk '{ print NF }'

5

5

5

$ who | head -3 | awk '{ print $NF }'

07:31

16:22

18:21
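Because NF is just a number, it can also be used in arithmetic inside $( ), for example to reach the next-to-last field. A small sketch on made-up lines with different field counts:

```shell
# Hypothetical lines with different numbers of fields.
sample='one two three
alpha beta gamma delta'

# NF is the field count; $NF is the last field; $(NF-1) is the one before it.
out=$(printf '%s\n' "$sample" | awk '{ print NF, $NF, $(NF-1) }')
printf '%s\n' "$out"
```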

 

$ awk -F: '{print $NF}' /etc/passwd | sort | uniq -c

3365 /bin/csh

   1 /bin/false

  84 /bin/ksh

  21 /bin/sh

  11 /usr/local/bin/ksh

 353 /usr/local/bin/tcsh

  45 /usr/local/lib/msh
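One way to count accounts with a blank shell field as /bin/sh users is to substitute the default inside awk. This is a sketch under the assumption stated above (a blank final field means the Bourne shell); the ann record is made up.

```shell
# Hypothetical records; joe's last field is empty, as in the example above.
sample='joe:?:45:555:Joe-Bob Billiard,,,,:/home/joe:
ann:?:46:555:Ann Example,,,,:/home/ann:/bin/csh'

# Substitute /bin/sh whenever the last field is empty.
shells=$(printf '%s\n' "$sample" |
  awk -F: '{ shell = ($NF == "") ? "/bin/sh" : $NF; print $1 ": " shell }')
printf '%s\n' "$shells"
```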

 

6.

NR keeps track of the number of records (or lines) read so far. Here’s a quick way to number a file:

$ ls -l | awk '{ print NR": "$0 }'

1: total 29

2: drwx------  2 taylor        512 Nov 21 10:39 Archives/

3: drwx------  3 taylor        512 Nov 16 21:55 InfoWorld/

4: drwx------  2 taylor       1024 Nov 27 18:02 Mail/

5: drwx------  2 taylor        512 Oct  6 09:36 News/

6: drwx------  3 taylor        512 Nov 21 12:39 OWL/

7: drwx------  2 taylor        512 Oct 13 10:45 bin/

8: -rw-rw----  1 taylor      12556 Nov 16 09:49 keylime.pie

9: -rw-------  1 taylor      11503 Nov 27 18:05 randy

10: drwx------  2 taylor        512 Oct 13 10:45 src/

11: drwxrwx---  2 taylor        512 Nov  8 22:20 temp/

12: -rw-rw----  1 taylor          0 Nov 27 18:29 testme

Here you can see that the zero field of a line, $0, is the entire line.
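NR can also be used as a condition in front of the braces, for instance to skip the "total" line that ls -l prints first. A minimal sketch on a made-up listing:

```shell
# Hypothetical listing whose first line is a summary, like "total 29" above.
sample='total 29
drwx------  2 taylor  512 Nov 21 10:39 Archives/
-rw-------  1 taylor  11503 Nov 27 18:05 randy'

# The pattern NR > 1 skips the first record; NR - 1 renumbers the rest from 1.
numbered=$(printf '%s\n' "$sample" | awk 'NR > 1 { print NR - 1 ": " $NF }')
printf '%s\n' "$numbered"
```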

 

$ who | awk '{ print $2": "$0 }'

ttyAp: limyx4   ttyAp   Nov 27 16:22

ttyAt: ltbei    ttyAt   Nov 27 18:21

ttyAu: woodson  ttyAu   Nov 27 18:19

ttyAv: morning  ttyAv   Nov 27 18:19

ttyAw: hawk     ttyAw   Nov 27 18:12

ttyAx: wifey    ttyAx   Nov 27 17:16

ttyAz: wiwatr   ttyAz   Nov 27 18:22

ttyAA: chong    ttyAA   Nov 27 13:56

ttyAB: ishidahx ttyAB   Nov 27 18:20

 

7.

$ ls -lF | awk '{ print $9 "  " $5 }'

Archives/ 512

InfoWorld/ 512

Mail/ 1024

News/ 512

OWL/ 512

bin/ 512

keylime.pie 12556

randy 11503

src/ 512

temp/ 512

testme 582

 

There are two special character sequences that can be embedded in the quoted arguments to print:

\n generates a newline

\t generates a tab character

 

$ ls -lF | awk '{ print $5 "\t" $9 }'

512     Archives/

512     InfoWorld/

1024    Mail/

512     News/

512     OWL/

512     bin/

12556   keylime.pie

11503   randy

512     src/

512     temp/

582     testme

 

$ ls -l | awk '{print $5 "\t" $9 }' | sort -rn | head -5

12556   keylime.pie

11503   randy

1024    Mail/

582     testme

512     temp/
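The effect of the two escape sequences can be checked on a couple of made-up name and size pairs. Note that awk expects them written with a backslash. This is only an illustration; the sample data is an assumption.

```shell
# Hypothetical name/size pairs.
sample='keylime.pie 12556
randy 11503'

# \t puts a tab between the fields; \n adds an extra blank line after each record.
out=$(printf '%s\n' "$sample" | awk '{ print $2 "\t" $1 "\n" }')
printf '%s\n' "$out"
```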

 

8.

The awk program basically looks for a pattern to appear in a line and then, if the pattern is found, executes the instructions that follow the pattern in the awk script. There are two special patterns in awk: BEGIN and END.

 

The instructions that follow BEGIN are executed before any lines of input are read.

The instructions that follow END are executed only after all the input has been read.
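A short sketch of the ordering, using made-up input: the BEGIN block prints before any line is processed, the unlabeled block runs once per line, and the END block prints last.

```shell
# Hypothetical input lines.
sample='alpha
beta'

# BEGIN runs before the first line is read, END after the last one.
out=$(printf '%s\n' "$sample" |
  awk 'BEGIN { print "start" } { print "line: " $0 } END { print "done, " NR " lines" }')
printf '%s\n' "$out"
```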

 

This can be very useful for computing the sum of a series of numbers. For example, I’d like to know the total number of bytes I’m using for all my files:

$ ls -l | awk '{print $5}'

512

512

1024

512

512

512

12556

11503

512

512

582

 

$ ls -l | awk '{ totalsize += $5; print totalsize }'

512

1024

2048

2560

3072

3584

16140

27643

28155

28667

29249

 

$ ls -l | awk '{ totalsize += $5; print totalsize }' | tail -1

29249

 

$ ls -l | awk '{ totalsize += $5 } END { print totalsize }'

29249

 

$ ls -l | awk '{ totalsize += $5 } END { print "You have a total of " totalsize " bytes used in files." }'

You have a total of 29249 bytes used in files.

 

9.

$ ls -l | awk '{ totalsize += $5 } END { print "You have a total of " totalsize " bytes used across "NR" files." }'

You have a total of 29249 bytes used across 11 files.

 

An easier way to see all this is to create an awk program file:

$ cat << 'EOF' > script

>{ totalsize += $5 } END { print "You have a total of "totalsize " bytes used across "NR" files."}

>EOF

$ ls -l | awk -f script

You have a total of 29249 bytes used across 11 files.
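One caveat when creating the program file with a here-document: if the delimiter is unquoted, the shell expands $5 before awk ever sees it. Quoting the delimiter keeps the text literal. A sketch, using a throwaway directory and made-up listing lines (both are assumptions for illustration):

```shell
# Work in a throwaway directory so nothing in the current directory is overwritten.
tmpdir=$(mktemp -d)

# Quoting the delimiter ('EOF') stops the shell from expanding $5 in the here-document.
cat << 'EOF' > "$tmpdir/script"
{ totalsize += $5 } END { print "total: " totalsize " bytes in " NR " lines" }
EOF

# Hypothetical ls -l style lines; the fifth field is the size.
total=$(printf '%s\n' '-rw-r--r-- 1 u g 100 Jan 1 00:00 a' '-rw-r--r-- 1 u g 250 Jan 1 00:00 b' |
  awk -f "$tmpdir/script")
printf '%s\n' "$total"
rm -r "$tmpdir"
```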

 

10.

Scripts in awk are really programs and have all the flow-control capabilities. One thing you can do within an awk script is to have conditional execution of statements, the if-then condition.

For example, to count the accounts whose first field (the account name) is exactly two characters long:

$ awk -F: '{ if (length($1) == 2) print $0 }' /etc/passwd | wc -l
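The same test can also be written as a pattern in front of the braces rather than as an if statement, with the counting done in an END block instead of wc -l. A sketch on made-up records:

```shell
# Hypothetical passwd-style records; two of the three names are two letters long.
sample='al:x:1:1::/home/al:/bin/sh
bob:x:2:2::/home/bob:/bin/sh
jo:x:3:3::/home/jo:/bin/sh'

# The condition selects matching lines; n + 0 prints 0 if nothing matched.
count=$(printf '%s\n' "$sample" | awk -F: 'length($1) == 2 { n++ } END { print n + 0 }')
printf '%s\n' "$count"
```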

 

11.

$ cat << 'EOF' > awkscript

>{

>count[length($1)]++

>}

>END{

>for (i=1; i < 9; i++)

>print "There are " count[i] " accounts with " i " letter names."

>}

>EOF

$ awk -F: -f awkscript /etc/passwd

There are 1 accounts with 1 letter names.

There are 26 accounts with 2 letter names.

There are 303 accounts with 3 letter names.

There are 168 accounts with 4 letter names.

There are 368 accounts with 5 letter names.

There are 611 accounts with 6 letter names.

There are 906 accounts with 7 letter names.

There are 1465 accounts with 8 letter names.

 

$ awk -F: '{ if (length($1) == 1) print $0 }' < /etc/passwd
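If no account has a given name length, count[i] is empty and the report prints a blank where the number should be. Adding 0 forces a numeric value. A minimal sketch with made-up account names:

```shell
# Hypothetical account records; no name here is two letters long.
sample='a:x
bob:x
sue:x
dave:x'

# count[i] + 0 yields a numeric 0 for lengths that never occurred.
out=$(printf '%s\n' "$sample" | awk -F: '
  { count[length($1)]++ }
  END {
    for (i = 1; i <= 4; i++)
      print i ": " (count[i] + 0)
  }')
printf '%s\n' "$out"
```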

 

Task 10.2: Re-routing the Pipeline with tee

The most commonly used option to tee is -a, which appends the output to the specified file rather than replacing the contents of the file each time.

$ ls -l | awk '{ print $5 "\t" $9 }' | sort -rn | tee bigfiles | head -5

12556   keylime.pie

8729    owl.c

1024    Mail/

582     testme

512     temp/

 

$ cat bigfiles

12556   keylime.pie

8729    owl.c

1024    Mail/

582     testme

512     temp/

512     src/

512     bin/

512     OWL/

512     News/

512     InfoWorld/

512     Archives/

207     sample2

199     sample

126     awkscript
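The difference between tee and tee -a can be seen with a small experiment. This is a sketch that writes to a throwaway file; the file name and the echoed words are assumptions for illustration.

```shell
# Use a throwaway directory so no existing file is touched.
tmpdir=$(mktemp -d)
log="$tmpdir/log"

# Without -a, each run replaces the file; with -a, output is added to the end.
echo first  | tee    "$log" > /dev/null
echo second | tee    "$log" > /dev/null   # file now holds only "second"
echo third  | tee -a "$log" > /dev/null   # file now holds "second" and "third"

contents=$(cat "$log")
printf '%s\n' "$contents"
rm -r "$tmpdir"
```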
