Quick Reference to sed One-Line Scripts (Unix Stream Editor)

Original post: http://club.topsage.com/thread-424121-1-1.html

-------------------------------------------------------------------------

USEFUL ONE-LINE SCRIPTS FOR SED (Unix stream editor)        Dec. 29, 2005
Compiled by Eric Pement - pemente[at]northpark[dot]edu        version 5.5


Latest version of this file (in English) is usually at:
   http://sed.sourceforge.net/sed1line.txt
   http://www.pement.org/sed/sed1line.txt


This file is also available in other languages:
  Chinese     - http://sed.sourceforge.net/sed1line_zh-CN.html
  Czech       - http://sed.sourceforge.net/sed1line_cz.html
  Dutch       - http://sed.sourceforge.net/sed1line_nl.html
  French      - http://sed.sourceforge.net/sed1line_fr.html
  German      - http://sed.sourceforge.net/sed1line_de.html
  Italian     - http://sed.sourceforge.net/sed1line_it.html
  Portuguese  - http://sed.sourceforge.net/sed1line_pt-BR.html
  Spanish     - http://sed.sourceforge.net/sed1line_es.html


FILE SPACING:


 # double space a file
 sed G


 # double space a file which already has blank lines in it. Output file
 # should contain no more than one blank line between lines of text.
 sed '/^$/d;G'


 # triple space a file
 sed 'G;G'


 # undo double-spacing (assumes even-numbered lines are always blank)
 sed 'n;d'


 # insert a blank line above every line which matches "regex"
 sed '/regex/{x;p;x;}'


 # insert a blank line below every line which matches "regex"
 sed '/regex/G'


 # insert a blank line above and below every line which matches "regex"
 sed '/regex/{x;p;x;G;}'
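
As a worked illustration of the spacing scripts above, the following
sketch double-spaces a small sample and then undoes it again. The
three-line input and the variable names are made up for the example.

```shell
# Hypothetical three-line sample; $(...) drops the trailing newline.
input=$(printf 'one\ntwo\nthree\n')

# G appends the (empty) hold space, adding a blank line after each line.
doubled=$(printf '%s\n' "$input" | sed G)

# n prints the current line and loads the next (blank) one; d deletes it.
restored=$(printf '%s\n' "$doubled" | sed 'n;d')

printf '%s\n' "$restored"
```

The round trip only works because every even-numbered line of the
doubled text is blank, which is the assumption stated for 'n;d'.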


NUMBERING:


 # number each line of a file (simple left alignment). Using a tab (see
 # note on '\t' at end of file) instead of space will preserve margins.
 sed = filename | sed 'N;s/\n/\t/'


 # number each line of a file (number on left, right-aligned)
 sed = filename | sed 'N; s/^/     /; s/ *\(.\{6,\}\)\n/\1  /'


 # number each line of file, but only print numbers if line is not blank
 sed '/./=' filename | sed '/./N; s/\n/ /'


 # count lines (emulates "wc -l")
 sed -n '$='


TEXT CONVERSION AND SUBSTITUTION:


 # IN UNIX ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format.
 sed 's/.$//'               # assumes that all lines end with CR/LF
 sed 's/^M$//'              # in bash/tcsh, press Ctrl-V then Ctrl-M
 sed 's/\x0D$//'            # works on ssed, gsed 3.02.80 or higher
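
Where typing a literal Ctrl-M is inconvenient, one possible alternative
is to let printf produce the carriage return and pass it to sed through
a shell variable. This is only a sketch: it assumes a POSIX shell with
printf, and the variable names are illustrative.

```shell
# Carriage return produced by printf (assumes a POSIX printf).
cr=$(printf '\r')

# Double quotes let the shell expand $cr; \$ keeps sed's end-of-line anchor.
unix=$(printf 'line1\r\nline2\r\n' | sed "s/$cr\$//")

printf '%s\n' "$unix"
```

Unlike 's/.$//', this form leaves lines without a trailing CR alone.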


 # IN UNIX ENVIRONMENT: convert Unix newlines (LF) to DOS format.
 sed "s/$/`echo -e \\\r`/"            # command line under ksh
 sed 's/$'"/`echo \\\r`/"             # command line under bash
 sed "s/$/`echo \\\r`/"               # command line under zsh
 sed 's/$/\r/'                        # gsed 3.02.80 or higher


 # IN DOS ENVIRONMENT: convert Unix newlines (LF) to DOS format.
 sed "s/$//"                          # method 1
 sed -n p                             # method 2


 # IN DOS ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format.
 # Can only be done with UnxUtils sed, version 4.0.7 or higher. The
 # UnxUtils version can be identified by the custom "--text" switch
 # which appears when you use the "--help" switch. Otherwise, changing
 # DOS newlines to Unix newlines cannot be done with sed in a DOS
 # environment. Use "tr" instead.
 sed "s/\r//" infile >outfile         # UnxUtils sed v4.0.7 or higher
 tr -d \r <infile >outfile            # GNU tr version 1.22 or higher


 # delete leading whitespace (spaces, tabs) from front of each line
 # aligns all text flush left
 sed 's/^[ \t]*//'                    # see note on '\t' at end of file


 # delete trailing whitespace (spaces, tabs) from end of each line
 sed 's/[ \t]*$//'                    # see note on '\t' at end of file


 # delete BOTH leading and trailing whitespace from each line
 sed 's/^[ \t]*//;s/[ \t]*$//'


 # insert 5 blank spaces at beginning of each line (make page offset)
 sed 's/^/     /'


 # align all text flush right on a 79-column width
 sed -e :a -e 's/^.\{1,78\}$/ &/;ta'  # set at 78 plus 1 space


 # center all text in the middle of 79-column width. In method 1,
 # spaces at the beginning of the line are significant, and trailing
 # spaces are appended at the end of the line. In method 2, spaces at
 # the beginning of the line are discarded in centering the line, and
 # no trailing spaces appear at the end of lines.
 sed  -e :a -e 's/^.\{1,77\}$/ & /;ta'                     # method 1
 sed  -e :a -e 's/^.\{1,77\}$/ &/;ta' -e 's/\( *\)\1/\1/'  # method 2


 # substitute (find and replace) "foo" with "bar" on each line
 sed 's/foo/bar/'             # replaces only 1st instance in a line
 sed 's/foo/bar/4'            # replaces only 4th instance in a line
 sed 's/foo/bar/g'            # replaces ALL instances in a line
 sed 's/\(.*\)foo\(.*foo\)/\1bar\2/' # replace the next-to-last case
 sed 's/\(.*\)foo/\1bar/'            # replace only the last case


 # substitute "foo" with "bar" ONLY for lines which contain "baz"
 sed '/baz/s/foo/bar/g'


 # substitute "foo" with "bar" EXCEPT for lines which contain "baz"
 sed '/baz/!s/foo/bar/g'


 # change "scarlet" or "ruby" or "puce" to "red"
 sed 's/scarlet/red/g;s/ruby/red/g;s/puce/red/g'   # most seds
 gsed 's/scarlet\|ruby\|puce/red/g'                # GNU sed only


 # reverse order of lines (emulates "tac")
 # bug/feature in HHsed v1.5 causes blank lines to be deleted
 sed '1!G;h;$!d'               # method 1
 sed -n '1!G;h;$p'             # method 2
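
To see how method 1 uses the hold space, it may help to run it on a
tiny sample. The three-line input below is hypothetical; the comments
describe what each command does on every cycle.

```shell
# 1!G  on every line but the first, append the hold space to the pattern space
# h    copy the pattern space (the reversed text so far) into the hold space
# $!d  delete everything except on the last line, which is then printed
reversed=$(printf 'a\nb\nc\n' | sed '1!G;h;$!d')

printf '%s\n' "$reversed"
```

Because the whole file accumulates in the hold space, this approach is
best suited to files that fit comfortably in memory.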


 # reverse each character on the line (emulates "rev")
 sed '/\n/!G;s/\(.\)\(.*\n\)/&\2\1/;//D;s/.//'


 # join pairs of lines side-by-side (like "paste")
 sed '$!N;s/\n/ /'


 # if a line ends with a backslash, append the next line to it
 sed -e :a -e '/\\$/N; s/\\\n//; ta'
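
A small sample shows the continuation-line script at work. The input
text and the variable name are invented for this sketch.

```shell
# The first sample line ends with a backslash, marking a continued line.
joined=$(printf 'first \\\nsecond\nthird\n' |
  sed -e :a -e '/\\$/N; s/\\\n//; ta')

printf '%s\n' "$joined"
```

The 'ta' branch repeats the test, so a run of several continued lines
is folded into one.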


 # if a line begins with an equal sign, append it to the previous line
 # and replace the "=" with a single space
 sed -e :a -e '$!N;s/\n=/ /;ta' -e 'P;D'


 # add commas to numeric strings, changing "1234567" to "1,234,567"
 gsed ':a;s/\B[0-9]\{3\}\>/,&/;ta'                     # GNU sed
 sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'  # other seds
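
The portable version works from the right: each pass inserts one comma
before the last group of three digits that still follows a digit, and
'ta' repeats until no substitution succeeds. A quick check, using the
sample value from the comment above:

```shell
# Pass 1 turns 1234567 into 1234,567; pass 2 turns that into 1,234,567.
with_commas=$(printf '1234567\n' |
  sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta')

printf '%s\n' "$with_commas"
```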


 # add commas to numbers with decimal points and minus signs (GNU sed)
 gsed -r ':a;s/(^|[^0-9.])([0-9]+)([0-9]{3})/\1\2,\3/g;ta'


 # add a blank line every 5 lines (after lines 5, 10, 15, 20, etc.)
 gsed '0~5G'                  # GNU sed only
 sed 'n;n;n;n;G;'             # other seds


SELECTIVE PRINTING OF CERTAIN LINES:


 # print first 10 lines of file (emulates behavior of "head")
 sed 10q


 # print first line of file (emulates "head -1")
 sed q


 # print the last 10 lines of a file (emulates "tail")
 sed -e :a -e '$q;N;11,$D;ba'


 # print the last 2 lines of a file (emulates "tail -2")
 sed '$!N;$!D'


 # print the last line of a file (emulates "tail -1")
 sed '$!d'                    # method 1
 sed -n '$p'                  # method 2


 # print the next-to-the-last line of a file
 sed -e '$!{h;d;}' -e x              # for 1-line files, print blank line
 sed -e '1{$q;}' -e '$!{h;d;}' -e x  # for 1-line files, print the line
 sed -e '1{$d;}' -e '$!{h;d;}' -e x  # for 1-line files, print nothing


 # print only lines which match regular expression (emulates "grep")
 sed -n '/regexp/p'           # method 1
 sed '/regexp/!d'             # method 2


 # print only lines which do NOT match regexp (emulates "grep -v")
 sed -n '/regexp/!p'          # method 1, corresponds to above
 sed '/regexp/d'              # method 2, simpler syntax


 # print the line immediately before a regexp, but not the line
 # containing the regexp
 sed -n '/regexp/{g;1!p;};h'


 # print the line immediately after a regexp, but not the line
 # containing the regexp
 sed -n '/regexp/{n;p;}'


 # print 1 line of context before and after regexp, with line number
 # indicating where the regexp occurred (similar to "grep -A1 -B1")
 sed -n -e '/regexp/{=;x;1!p;g;$!N;p;D;}' -e h


 # grep for AAA and BBB and CCC (in any order)
 sed '/AAA/!d; /BBB/!d; /CCC/!d'


 # grep for AAA and BBB and CCC (in that order)
 sed '/AAA.*BBB.*CCC/!d'


 # grep for AAA or BBB or CCC (emulates "egrep")
 sed -e '/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d    # most seds
 gsed '/AAA\|BBB\|CCC/!d'                        # GNU sed only


 # print paragraph if it contains AAA (blank lines separate paragraphs)
 # HHsed v1.5 must insert a 'G;' after 'x;' in the next 3 scripts below
 sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;'
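
The paragraph scripts collect non-blank lines in the hold space with H
and examine the collected paragraph when a blank line (or the end of the
file) is reached. The sketch below runs the first script on a made-up
two-paragraph sample.

```shell
# Two hypothetical paragraphs separated by a blank line; only the first
# one contains AAA.
found=$(printf 'intro line\nAAA here\n\nother para\nno match\n' |
  sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;')

printf '%s\n' "$found"
```

Each printed paragraph starts with an empty line, because H puts a
newline in front of every line it appends to the hold space.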


 # print paragraph if it contains AAA and BBB and CCC (in any order)
 sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;/BBB/!d;/CCC/!d'


 # print paragraph if it contains AAA or BBB or CCC
 sed -e '/./{H;$!d;}' -e 'x;/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d
 gsed '/./{H;$!d;};x;/AAA\|BBB\|CCC/b;d'         # GNU sed only


 # print only lines of 65 characters or longer
 sed -n '/^.\{65\}/p'


 # print only lines of less than 65 characters
 sed -n '/^.\{65\}/!p'        # method 1, corresponds to above
 sed '/^.\{65\}/d'            # method 2, simpler syntax


 # print section of file from regular expression to end of file
 sed -n '/regexp/,$p'


 # print section of file based on line numbers (lines 8-12, inclusive)
 sed -n '8,12p'               # method 1
 sed '8,12!d'                 # method 2


 # print line number 52
 sed -n '52p'                 # method 1
 sed '52!d'                   # method 2
 sed '52q;d'                  # method 3, efficient on large files


 # beginning at line 3, print every 7th line
 gsed -n '3~7p'               # GNU sed only
 sed -n '3,${p;n;n;n;n;n;n;}' # other seds


 # print section of file between two regular expressions (inclusive)
 sed -n '/Iowa/,/Montana/p'             # case sensitive


SELECTIVE DELETION OF CERTAIN LINES:


 # print all of file EXCEPT section between 2 regular expressions
 sed '/Iowa/,/Montana/d'


 # delete duplicate, consecutive lines from a file (emulates "uniq").
 # First line in a set of duplicate lines is kept, rest are deleted.
 sed '$!N; /^\(.*\)\n\1$/!P; D'
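
The "uniq" script keeps two lines in the pattern space at a time: P
prints the first one only when it differs from the second, and D drops
it and restarts the cycle with what is left. A short check on invented
input:

```shell
# Runs of duplicate lines collapse to their first line.
deduped=$(printf 'a\na\nb\nb\nb\nc\n' | sed '$!N; /^\(.*\)\n\1$/!P; D')

printf '%s\n' "$deduped"
```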


 # delete duplicate, nonconsecutive lines from a file. Beware not to
 # overflow the buffer size of the hold space, or else use GNU sed.
 sed -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P'


 # delete all lines except duplicate lines (emulates "uniq -d").
 sed '$!N; s/^\(.*\)\n\1$/\1/; t; D'


 # delete the first 10 lines of a file
 sed '1,10d'


 # delete the last line of a file
 sed '$d'


 # delete the last 2 lines of a file
 sed 'N;$!P;$!D;$d'


 # delete the last 10 lines of a file
 sed -e :a -e '$d;N;2,10ba' -e 'P;D'   # method 1
 sed -n -e :a -e '1,10!{P;N;D;};N;ba'  # method 2


 # delete every 8th line
 gsed '0~8d'                           # GNU sed only
 sed 'n;n;n;n;n;n;n;d;'                # other seds


 # delete lines matching pattern
 sed '/pattern/d'


 # delete ALL blank lines from a file (same as "grep '.' ")
 sed '/^$/d'                           # method 1
 sed '/./!d'                           # method 2


 # delete all CONSECUTIVE blank lines from file except the first; also
 # deletes all blank lines from top and end of file (emulates "cat -s")
 sed '/./,/^$/!d'          # method 1, allows 0 blanks at top, 1 at EOF
 sed '/^$/N;/\n$/D'        # method 2, allows 1 blank at top, 0 at EOF


 # delete all CONSECUTIVE blank lines from file except the first 2:
 sed '/^$/N;/\n$/N;//D'


 # delete all leading blank lines at top of file
 sed '/./,$!d'


 # delete all trailing blank lines at end of file
 sed -e :a -e '/^\n*$/{$d;N;ba' -e '}'  # works on all seds
 sed -e :a -e '/^\n*$/N;/\n$/ba'        # ditto, except for gsed 3.02.*


 # delete the last line of each paragraph
 sed -n '/^$/{p;h;};/./{x;/./p;}'


SPECIAL APPLICATIONS:


 # remove nroff overstrikes (char, backspace) from man pages. The 'echo'
 # command may need an -e switch if you use Unix System V or bash shell.
 sed "s/.`echo \\\b`//g"    # double quotes required for Unix environment
 sed 's/.^H//g'             # in bash/tcsh, press Ctrl-V and then Ctrl-H
 sed 's/.\x08//g'           # hex expression for sed 1.5, GNU sed, ssed


 # get Usenet/e-mail message header
 sed '/^$/q'                # deletes everything after first blank line


 # get Usenet/e-mail message body
 sed '1,/^$/d'              # deletes everything up to first blank line


 # get Subject header, but remove initial "Subject: " portion
 sed '/^Subject: */!d; s///;q'


 # get return address header
 sed '/^Reply-To:/q; /^From:/h; /./d;g;q'


 # parse out the address proper. Pulls out the e-mail address by itself
 # from the 1-line return address header (see preceding script)
 sed 's/ *(.*)//; s/>.*//; s/.*[:<] *//'


 # add a leading angle bracket and space to each line (quote a message)
 sed 's/^/> /'


 # delete leading angle bracket & space from each line (unquote a message)
 sed 's/^> //'


 # remove most HTML tags (accommodates multiple-line tags)
 sed -e :a -e 's/<[^>]*>//g;/</N;//ba'


 # extract multi-part uuencoded binaries, removing extraneous header
 # info, so that only the uuencoded portion remains. Files passed to
 # sed must be passed in the proper order. Version 1 can be entered
 # from the command line; version 2 can be made into an executable
 # Unix shell script. (Modified from a script by Rahul Dhesi.)
 sed '/^end/,/^begin/d' file1 file2 ... fileX | uudecode   # vers. 1
 sed '/^end/,/^begin/d' "$@" | uudecode                    # vers. 2


 # sort paragraphs of file alphabetically. Paragraphs are separated by blank
 # lines. GNU sed uses \v for vertical tab, or any unique char will do.
 sed '/./{H;d;};x;s/\n/={NL}=/g' file | sort | sed '1s/={NL}=//;s/={NL}=/\n/g'
 gsed '/./{H;d};x;y/\n/\v/' file | sort | sed '1s/\v//;y/\v/\n/'


 # zip up each .TXT file individually, deleting the source file and
 # setting the name of each .ZIP file to the basename of the .TXT file
 # (under DOS: the "dir /b" switch returns bare filenames in all caps).
 echo @echo off >zipup.bat
 dir /b *.txt | sed "s/^\(.*\)\.TXT/pkzip -mo \1 \1.TXT/" >>zipup.bat


TYPICAL USE: Sed takes one or more editing commands and applies all of
them, in sequence, to each line of input. After all the commands have
been applied to the first input line, that line is output and a second
input line is taken for processing, and the cycle repeats. The
preceding examples assume that input comes from the standard input
device (i.e., the console; normally this will be piped input). One or
more filenames can be appended to the command line if the input does
not come from stdin. Output is sent to stdout (the screen). Thus:


 cat filename | sed '10q'        # uses piped input
 sed '10q' filename              # same effect, avoids a useless "cat"
 sed '10q' filename > newfile    # redirects output to disk


For additional syntax instructions, including the way to apply editing
commands from a disk file instead of the command line, consult "sed &
awk, 2nd Edition," by Dale Dougherty and Arnold Robbins (O'Reilly,
1997; http://www.ora.com), "UNIX Text Processing," by Dale Dougherty
and Tim O'Reilly (Hayden Books, 1987) or the tutorials by Mike Arst
distributed in U-SEDIT2.ZIP (many sites). To fully exploit the power
of sed, one must understand "regular expressions." For this, see
"Mastering Regular Expressions" by Jeffrey Friedl (O'Reilly, 1997).
The manual ("man") pages on Unix systems may be helpful (try "man
sed", "man regexp", or the subsection on regular expressions in "man
ed"), but man pages are notoriously difficult. They are not written to
teach sed use or regexps to first-time users, but as a reference text
for those already acquainted with these tools.


QUOTING SYNTAX: The preceding examples use single quotes ('...')
instead of double quotes ("...") to enclose editing commands, since
sed is typically used on a Unix platform. Single quotes prevent the
Unix shell from interpreting the dollar sign ($) and backquotes
(`...`), which are expanded by the shell if they are enclosed in
double quotes. Users of the "csh" shell and derivatives will also need
to quote the exclamation mark (!) with the backslash (i.e., \!) to
properly run the examples listed above, even within single quotes.
Versions of sed written for DOS invariably require double quotes
("...") instead of single quotes to enclose editing commands.
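
The difference between the two quoting styles can be sketched as
follows for a Bourne-type shell. The sample text and variable names
are assumptions made for the illustration.

```shell
# Single quotes: the shell passes $ through, so sed sees "last line".
last=$(printf 'a\nb\nc\n' | sed -n '$p')

# Double quotes: escape $ so that the shell does not try to expand it.
last_dq=$(printf 'a\nb\nc\n' | sed -n "\$p")

# Double quotes are handy when shell expansion is wanted on purpose.
word=foo    # hypothetical search word
replaced=$(printf 'foo bar\n' | sed "s/$word/baz/")

printf '%s\n' "$last" "$last_dq" "$replaced"
```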


USE OF '\t' IN SED SCRIPTS: For clarity in documentation, we have used
the expression '\t' to indicate a tab character (0x09) in the scripts.
However, most versions of sed do not recognize the '\t' abbreviation,
so when typing these scripts from the command line, you should press
the TAB key instead. '\t' is supported as a regular expression
metacharacter in awk, perl, and HHsed, sedmod, and GNU sed v3.02.80.
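
In a shell script, where pressing the TAB key is not an option, one
possible workaround is to have printf generate the tab and to splice
it into the sed script. This is a sketch that assumes a POSIX shell
and printf; the variable names are illustrative.

```shell
# A literal tab character, produced by printf.
tab=$(printf '\t')

# The bracket expression now holds a space and a real tab.
stripped=$(printf '\t  indented\n' | sed "s/^[ $tab]*//")

printf '%s\n' "$stripped"
```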


VERSIONS OF SED: Versions of sed do differ, and some slight syntax
variation is to be expected. In particular, most do not support the
use of labels (:name) or branch instructions (b,t) within editing
commands, except at the end of those commands. We have used the syntax
which will be portable to most users of sed, even though the popular
GNU versions of sed allow a more succinct syntax. When the reader sees
a fairly long command such as this:


   sed -e '/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d


it is heartening to know that GNU sed will let you reduce it to:


   sed '/AAA/b;/BBB/b;/CCC/b;d'      # or even
   sed '/AAA\|BBB\|CCC/b;d'


In addition, remember that while many versions of sed accept a command
like "/one/ s/RE1/RE2/", some do NOT allow "/one/! s/RE1/RE2/", which
contains space before the 's'. Omit the space when typing the command.


OPTIMIZING FOR SPEED: If execution speed needs to be increased (due to
large input files or slow processors or hard disks), substitution will
be executed more quickly if the "find" expression is specified before
giving the "s/.../.../" instruction. Thus:


   sed 's/foo/bar/g' filename         # standard replace command
   sed '/foo/ s/foo/bar/g' filename   # executes more quickly
   sed '/foo/ s//bar/g' filename      # shorthand sed syntax


On line selection or deletion in which you only need to output lines
from the first part of the file, a "quit" command (q) in the script
will drastically reduce processing time for large files. Thus:


   sed -n '45,50p' filename           # print line nos. 45-50 of a file
   sed -n '51q;45,50p' filename       # same, but executes much faster
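
The following sketch checks only that the two forms print the same
lines; it does not measure speed. The 100-line input is generated with
awk purely for the example.

```shell
# Hypothetical input: the numbers 1 to 100, one per line.
nums=$(awk 'BEGIN { for (i = 1; i <= 100; i++) print i }')

# Reads all 100 lines.
slow=$(printf '%s\n' "$nums" | sed -n '45,50p')

# Quits on reaching line 51, so the rest of the input is never read.
fast=$(printf '%s\n' "$nums" | sed -n '51q;45,50p')

printf '%s\n' "$fast"
```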


If you have any additional scripts to contribute or if you find errors
in this document, please send e-mail to the compiler. Indicate the
version of sed you used, the operating system it was compiled for, and
the nature of the problem. To qualify as a one-liner, the command line
must be 65 characters or less. Various scripts in this file have been
written or contributed by:


 Al Aab                   # founder of "seders" list
 Edgar Allen              # various
 Yiorgos Adamopoulos      # various
 Dale Dougherty           # author of "sed & awk"
 Carlos Duarte            # author of "do it with sed"
 Eric Pement              # author of this document
 Ken Pizzini              # author of GNU sed v3.02
 S.G. Ravenhall           # great de-html script
 Greg Ubben               # many contributions & much help

-------------------------------------------------------------------------


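The following is translated from the Chinese version of this file:



The latest (English) version of this document can be found at:
   http://sed.sourceforge.net/sed1line.txt
   http://www.pement.org/sed/sed1line.txt


Versions in other languages:
  Chinese     - http://sed.sourceforge.net/sed1line_zh-CN.html
  Czech       - http://sed.sourceforge.net/sed1line_cz.html
  Dutch       - http://sed.sourceforge.net/sed1line_nl.html
  French      - http://sed.sourceforge.net/sed1line_fr.html
  German      - http://sed.sourceforge.net/sed1line_de.html
  Italian     - http://sed.sourceforge.net/sed1line_it.html
  Portuguese  - http://sed.sourceforge.net/sed1line_pt-BR.html
  Spanish     - http://sed.sourceforge.net/sed1line_es.html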


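FILE SPACING:
--------


# add a blank line after every line
sed G


# delete all existing blank lines, then add a blank line after every
# line, so that each line of output is followed by exactly one blank line
sed '/^$/d;G'


# add two blank lines after every line
sed 'G;G'


# delete all the blank lines produced by the first script (that is,
# delete every even-numbered line)
sed 'n;d'


# insert a blank line above every line which matches "regex"
sed '/regex/{x;p;x;}'


# insert a blank line below every line which matches "regex"
sed '/regex/G'


# insert a blank line above and below every line which matches "regex"
sed '/regex/{x;p;x;G;}'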


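NUMBERING:
--------


# number each line of a file (simple left alignment). A tab (see the
# note on '\t' at the end of this document) is used instead of a space
# to keep the margins aligned.
sed = filename | sed 'N;s/\n/\t/'


# number each line of a file (number on left, right-aligned)
sed = filename | sed 'N; s/^/     /; s/ *\(.\{6,\}\)\n/\1  /'


# number each line of a file, but only print numbers for non-blank lines
sed '/./=' filename | sed '/./N; s/\n/ /'


# count lines (emulates "wc -l")
sed -n '$='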


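TEXT CONVERSION AND SUBSTITUTION:
--------


# IN UNIX ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format.
sed 's/.$//'                     # assumes that all lines end with CR/LF
sed 's/^M$//'                    # in bash/tcsh, press Ctrl-V then Ctrl-M
sed 's/\x0D$//'                  # ssed, gsed 3.02.80 or higher


# IN UNIX ENVIRONMENT: convert Unix newlines (LF) to DOS format.
sed "s/$/`echo -e \\\r`/"        # command line under ksh
sed 's/$'"/`echo \\\r`/"         # command line under bash
sed "s/$/`echo \\\r`/"           # command line under zsh
sed 's/$/\r/'                    # gsed 3.02.80 or higher


# IN DOS ENVIRONMENT: convert Unix newlines (LF) to DOS format.
sed "s/$//"                      # method 1
sed -n p                         # method 2


# IN DOS ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format.
# The script below works only with UnxUtils sed 4.0.7 or higher. The
# UnxUtils version of sed can be identified by its custom "--text"
# option: run the help option ("--help") and check whether a "--text"
# entry is listed. Other DOS versions of sed cannot perform this
# conversion, but "tr" can be used instead.
sed "s/\r//" infile >outfile     # UnxUtils sed v4.0.7 or higher
tr -d \r <infile >outfile        # GNU tr 1.22 or higher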


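# delete leading whitespace (spaces, tabs) from the front of each line,
# aligning the text flush left
sed 's/^[ \t]*//'                # see note on '\t' at end of document


# delete trailing whitespace (spaces, tabs) from the end of each line
sed 's/[ \t]*$//'                # see note on '\t' at end of document


# delete both leading and trailing whitespace from each line
sed 's/^[ \t]*//;s/[ \t]*$//'


# insert 5 spaces at the beginning of each line (shifts the whole text
# 5 characters to the right)
sed 's/^/     /'


# align all text flush right on a 79-column width
sed -e :a -e 's/^.\{1,78\}$/ &/;ta'  # 78 characters plus 1 space


# center all text within a 79-column width. In method 1, spaces are
# padded at both the front and the end of each line to center the text.
# In method 2, spaces are padded only at the front of the text while
# centering, and half of those spaces are finally removed; no spaces are
# padded at the end of lines.
sed  -e :a -e 's/^.\{1,77\}$/ & /;ta'                     # method 1
sed  -e :a -e 's/^.\{1,77\}$/ &/;ta' -e 's/\( *\)\1/\1/'  # method 2


# find the string "foo" on each line and replace it with "bar"
sed 's/foo/bar/'                 # replaces only the 1st "foo" in a line
sed 's/foo/bar/4'                # replaces only the 4th "foo" in a line
sed 's/foo/bar/g'                # replaces every "foo" in a line
sed 's/\(.*\)foo\(.*foo\)/\1bar\2/' # replaces the next-to-last "foo"
sed 's/\(.*\)foo/\1bar/'            # replaces the last "foo"


# replace "foo" with "bar" only on lines which contain "baz"
sed '/baz/s/foo/bar/g'


# replace "foo" with "bar" only on lines which do NOT contain "baz"
sed '/baz/!s/foo/bar/g'


# change "scarlet" or "ruby" or "puce" to "red"
sed 's/scarlet/red/g;s/ruby/red/g;s/puce/red/g'  # works on most seds
gsed 's/scarlet\|ruby\|puce/red/g'               # GNU sed only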


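# reverse the order of lines, so that the first line becomes the last,
# and so on (emulates "tac").
# for some reason, HHsed v1.5 deletes the blank lines in the file when
# running the commands below
sed '1!G;h;$!d'               # method 1
sed -n '1!G;h;$p'             # method 2


# reverse the characters on each line, so that the first character
# becomes the last, and so on (emulates "rev")
sed '/\n/!G;s/\(.\)\(.*\n\)/&\2\1/;//D;s/.//'


# join every two lines into one (like "paste")
sed '$!N;s/\n/ /'


# if the current line ends with a backslash "\", append the next line
# to it and remove the trailing backslash
sed -e :a -e '/\\$/N; s/\\\n//; ta'


# if the current line begins with an equal sign, append it to the
# previous line and replace the leading "=" with a single space
sed -e :a -e '$!N;s/\n=/ /;ta' -e 'P;D'


# add comma separators to numeric strings, changing "1234567" to
# "1,234,567"
gsed ':a;s/\B[0-9]\{3\}\>/,&/;ta'                     # GNU sed
sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta'  # other seds


# add comma separators to numbers with decimal points and minus signs
# (GNU sed)
gsed -r ':a;s/(^|[^0-9.])([0-9]+)([0-9]{3})/\1\2,\3/g;ta'


# add a blank line every 5 lines (after lines 5, 10, 15, 20, etc.)
gsed '0~5G'                      # GNU sed only
sed 'n;n;n;n;G;'                 # other seds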


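SELECTIVE PRINTING OF CERTAIN LINES:
--------


# print the first 10 lines of a file (emulates behavior of "head")
sed 10q


# print the first line of a file (emulates "head -1")
sed q


# print the last 10 lines of a file (emulates "tail")
sed -e :a -e '$q;N;11,$D;ba'


# print the last 2 lines of a file (emulates "tail -2")
sed '$!N;$!D'


# print the last line of a file (emulates "tail -1")
sed '$!d'                        # method 1
sed -n '$p'                      # method 2


# print the next-to-the-last line of a file
sed -e '$!{h;d;}' -e x              # for 1-line files, print a blank line
sed -e '1{$q;}' -e '$!{h;d;}' -e x  # for 1-line files, print the line
sed -e '1{$d;}' -e '$!{h;d;}' -e x  # for 1-line files, print nothing


# print only lines which match the regular expression (emulates "grep")
sed -n '/regexp/p'               # method 1
sed '/regexp/!d'                 # method 2


# print only lines which do NOT match the regular expression (emulates
# "grep -v")
sed -n '/regexp/!p'              # method 1, corresponds to above
sed '/regexp/d'                  # method 2, simpler syntax


# find "regexp" and print the line before the matching line, but not
# the matching line itself
sed -n '/regexp/{g;1!p;};h'


# find "regexp" and print the line after the matching line, but not
# the matching line itself
sed -n '/regexp/{n;p;}'


# print the line containing "regexp" with the lines before and after
# it, preceded by the line number of the line where "regexp" occurred
# (similar to "grep -A1 -B1")
sed -n -e '/regexp/{=;x;1!p;g;$!N;p;D;}' -e h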


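# print lines which contain "AAA" and "BBB" and "CCC" (in any order)
sed '/AAA/!d; /BBB/!d; /CCC/!d'  # the order of the strings does not matter


# print lines which contain "AAA" and "BBB" and "CCC" (in that order)
sed '/AAA.*BBB.*CCC/!d'


# print lines which contain "AAA" or "BBB" or "CCC" (emulates "egrep")
sed -e '/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d    # most seds
gsed '/AAA\|BBB\|CCC/!d'                        # GNU sed only


# print paragraph if it contains "AAA" (blank lines separate paragraphs)
# HHsed v1.5 must insert a "G;" after "x;"; this applies to the next 3
# scripts
sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;'


# print paragraph if it contains "AAA" and "BBB" and "CCC" (in any order)
sed -e '/./{H;$!d;}' -e 'x;/AAA/!d;/BBB/!d;/CCC/!d'


# print paragraph if it contains any one of "AAA", "BBB", or "CCC"
# (in any order)
sed -e '/./{H;$!d;}' -e 'x;/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d
gsed '/./{H;$!d;};x;/AAA\|BBB\|CCC/b;d'         # GNU sed only


# print only lines of 65 characters or longer
sed -n '/^.\{65\}/p'


# print only lines of less than 65 characters
sed -n '/^.\{65\}/!p'            # method 1, corresponds to above
sed '/^.\{65\}/d'                # method 2, a slightly simpler way


# print part of the text, from the line containing the regular
# expression to the last line
sed -n '/regexp/,$p'


# print part of the text by line-number range (lines 8-12, inclusive)
sed -n '8,12p'                   # method 1
sed '8,12!d'                     # method 2


# print line number 52
sed -n '52p'                     # method 1
sed '52!d'                       # method 2
sed '52q;d'                      # method 3, more efficient on large files


# beginning at line 3, print every 7th line
gsed -n '3~7p'                   # GNU sed only
sed -n '3,${p;n;n;n;n;n;n;}'     # other seds


# print the text between two regular expressions (inclusive)
sed -n '/Iowa/,/Montana/p'       # case sensitive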


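SELECTIVE DELETION OF CERTAIN LINES:
--------


# print the whole document EXCEPT the section between two regular
# expressions
sed '/Iowa/,/Montana/d'


# delete duplicate, consecutive lines from a file (emulates "uniq").
# The first line in a set of duplicate lines is kept, the rest deleted.
sed '$!N; /^\(.*\)\n\1$/!P; D'


# delete duplicate lines from a file, whether consecutive or not. Mind
# the buffer size that the hold space can support, or else use GNU sed.
sed -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P'


# delete all lines except duplicate lines (emulates "uniq -d")
sed '$!N; s/^\(.*\)\n\1$/\1/; t; D'


# delete the first 10 lines of a file
sed '1,10d'


# delete the last line of a file
sed '$d'


# delete the last 2 lines of a file
sed 'N;$!P;$!D;$d'


# delete the last 10 lines of a file
sed -e :a -e '$d;N;2,10ba' -e 'P;D'   # method 1
sed -n -e :a -e '1,10!{P;N;D;};N;ba'  # method 2


# delete every 8th line
gsed '0~8d'                           # GNU sed only
sed 'n;n;n;n;n;n;n;d;'                # other seds


# delete lines matching a pattern
sed '/pattern/d'                      # deletes lines containing pattern;
                                       # pattern can of course be replaced
                                       # with any valid regular expression


# delete all blank lines from a file (same effect as "grep '.' ")
sed '/^$/d'                           # method 1
sed '/./!d'                           # method 2


# keep only the first of several consecutive blank lines, and delete
# blank lines at the top and end of the file (emulates "cat -s")
sed '/./,/^$/!d'        # method 1, deletes blanks at top, allows 1 at EOF
sed '/^$/N;/\n$/D'      # method 2, allows 1 blank at top, 0 at EOF


# keep only the first 2 of several consecutive blank lines
sed '/^$/N;/\n$/N;//D'


# delete all blank lines at the top of a file
sed '/./,$!d'


# delete all blank lines at the end of a file
sed -e :a -e '/^\n*$/{$d;N;ba' -e '}'  # works on all seds
sed -e :a -e '/^\n*$/N;/\n$/ba'        # ditto, except for gsed 3.02.*


# delete the last line of each paragraph
sed -n '/^$/{p;h;};/./{x;/./p;}'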


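SPECIAL APPLICATIONS:
--------


# remove nroff overstrike marks from manual (man) pages. The 'echo'
# command may need the -e option under Unix System V or the bash shell.
sed "s/.`echo \\\b`//g"    # the outer double quotes are required (Unix)
sed 's/.^H//g'             # in bash or tcsh, press Ctrl-V then Ctrl-H
sed 's/.\x08//g'           # hex notation used by sed 1.5, GNU sed, ssed


# extract the header of a newsgroup or e-mail message
sed '/^$/q'                # deletes everything after the first blank line


# extract the body of a newsgroup or e-mail message
sed '1,/^$/d'              # deletes everything up to the first blank line


# get the "Subject" field from the header, removing the initial
# "Subject:" portion
sed '/^Subject: */!d; s///;q'


# get the reply address from the message header
sed '/^Reply-To:/q; /^From:/h; /./d;g;q'


# get the e-mail address itself. Starting from the one-line header
# produced by the previous script, strips everything that is not part
# of the e-mail address (see preceding script)
sed 's/ *(.*)//; s/>.*//; s/.*[:<] *//'


# add an angle bracket and a space to the start of each line (quote a
# message)
sed 's/^/> /'


# delete the angle bracket and space at the start of each line (unquote
# a message)
sed 's/^> //'


# remove most HTML tags (including tags that span lines)
sed -e :a -e 's/<[^>]*>//g;/</N;//ba'


# decode uuencoded files split into multiple parts, removing the header
# information so that only the uuencoded portion remains. The files
# must be passed to sed in the proper order. Version 1 can be entered
# directly on the command line; version 2 can be put into an executable
# shell script. (Modified from a script by Rahul Dhesi.)
sed '/^end/,/^begin/d' file1 file2 ... fileX | uudecode   # vers. 1
sed '/^end/,/^begin/d' "$@" | uudecode                    # vers. 2


# sort the paragraphs of a file alphabetically. Paragraphs are separated
# by one or more blank lines. GNU sed uses "\v" for the vertical tab,
# which serves here as a placeholder for the newline; any other
# character not used in the file will do as well.
sed '/./{H;d;};x;s/\n/={NL}=/g' file | sort | sed '1s/={NL}=//;s/={NL}=/\n/g'
gsed '/./{H;d};x;y/\n/\v/' file | sort | sed '1s/\v//;y/\v/\n/'


# zip up each .TXT file individually, deleting the source file and
# giving each .ZIP file the same name as the original (only the
# extension differs). (Under DOS: "dir /b" lists filenames without
# paths.)
echo @echo off >zipup.bat
dir /b *.txt | sed "s/^\(.*\)\.TXT/pkzip -mo \1 \1.TXT/" >>zipup.bat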




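USING SED: Sed accepts one or more editing commands and applies them,
in order, to each line as it is read. After the first line of input is
read, sed applies all the commands to it and outputs the result. It
then reads the second line of input, applies all the commands to it,
and so on, repeating the process. In the preceding examples sed takes
its input from the standard input device (i.e., the command
interpreter, usually in the form of piped input). When one or more
filenames are given on the command line as arguments, those files
replace standard input as sed's input. Sed's output is sent to
standard output (the screen). Thus:


cat filename | sed '10q'         # uses piped input
sed '10q' filename               # same effect, without piped input
sed '10q' filename > newfile     # redirects output to disk


For instructions on using sed commands, including how to run them from
a script file (rather than from the command line), see "sed & awk, 2nd
Edition," by Dale Dougherty and Arnold Robbins (O'Reilly, 1997;
http://www.ora.com), "UNIX Text Processing," by Dale Dougherty and Tim
O'Reilly (Hayden Books, 1987), or the tutorials written by Mike Arst,
distributed in an archive named "U-SEDIT2.ZIP" (found on many sites).
To tap sed's potential, one needs a solid understanding of "regular
expressions." For material on regular expressions, see "Mastering
Regular Expressions" by Jeffrey Friedl (O'Reilly, 1997). The manual
("man") pages provided by Unix systems may also help (try "man sed",
"man regexp", or see the part on regular expressions in "man ed"), but
the information in man pages is rather "abstract," which is what they
have long been criticized for. Then again, they were never meant as
teaching material for beginners learning sed or regular expressions,
only as a reference text for those already familiar with these tools.


QUOTING SYNTAX: The preceding examples mostly use single quotes
('...') rather than double quotes ("...") around sed commands, because
sed is usually used on a Unix platform. Inside single quotes, the Unix
shell (command interpreter) does not interpret or execute the dollar
sign ($) and backquotes (`...`). Inside double quotes, the dollar sign
is expanded to the value of a variable or parameter, and a command in
backquotes is executed and replaced by its output. When using the
exclamation mark (!) in "csh" and its derivative shells, it must be
preceded by an escaping backslash (like this: \!) for the examples
above to run properly (even within single quotes). DOS versions of sed
always use double quotes ("...") rather than single quotes to enclose
commands.


USE OF '\t': To keep this document concise, we use '\t' in the scripts
to represent a tab character. However, most versions of sed do not yet
recognize the '\t' shorthand, so when entering a tab in a script on
the command line, you should press the TAB key rather than type '\t'.
The following tools support '\t' as a regular expression metacharacter
for the tab: awk, perl, HHsed, sedmod, and GNU sed v3.02.80.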


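VERSIONS OF SED: Versions of sed differ from one another, and some
syntax variation between them is to be expected. In particular, most
of them do not support labels (:name) or branch commands (b,t) in the
middle of editing commands, only at the end of those commands. In this
document we have tried to use the more portable syntax so that users
of most versions of sed can use these scripts. GNU versions of sed,
however, allow a more succinct syntax. Imagine how the reader feels on
seeing a long command like this:


   sed -e '/AAA/b' -e '/BBB/b' -e '/CCC/b' -e d


The good news is that GNU sed lets you make the command more compact:


   sed '/AAA/b;/BBB/b;/CCC/b;d'      # or even
   sed '/AAA\|BBB\|CCC/b;d'


In addition, note that while many versions of sed accept a command
like "/one/ s/RE1/RE2/", with a space before the 's', some of them do
not accept a command like "/one/! s/RE1/RE2/". In that case, simply
remove the space.


OPTIMIZING FOR SPEED: When execution speed needs to be increased for
some reason (such as large input files or a slow processor or hard
disk), consider putting an address expression in front of the
substitution command ("s/.../.../") to speed it up. For example:


   sed 's/foo/bar/g' filename         # standard replace command
   sed '/foo/ s/foo/bar/g' filename   # faster
   sed '/foo/ s//bar/g' filename      # shorthand form


When you only need to print the first part of a file, or to delete what
comes after it, you can use the "q" (quit) command in the script. This
saves a great deal of time on large files. Thus:


   sed -n '45,50p' filename           # print lines 45 to 50
   sed -n '51q;45,50p' filename       # same, but much faster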


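If you have other one-line scripts to share, or if you find errors in
this document, please send e-mail to the author of this document (Eric
Pement). Remember to state the version of sed you used, the operating
system it runs on, and a suitable description of the problem. In this
document a one-line script means a sed script whose command line is 65
characters or less [Translator's note 1]. The scripts in this document
were written or contributed by the following authors:


Al Aab                               # founder of the "seders" mailing list
Edgar Allen                          # various
Yiorgos Adamopoulos                  # various
Dale Dougherty                       # author of "sed & awk"
Carlos Duarte                        # author of "do it with sed"
Eric Pement                          # author of this document
Ken Pizzini                          # author of GNU sed v3.02
S.G. Ravenhall                       # de-html script
Greg Ubben                           # many contributions & much help
-------------------------------------------------------------------------


Translator's note 1: In most cases a sed script, however long, can be
written on a single line (by means of the `-e' option and `;'), as
long as the command interpreter supports it; so the one-line scripts
meant here are restricted in length as well as being writable on one
line. The point of these one-liners is not that they appear on a
single line, but that users can conveniently use these compact scripts
from the command line.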

