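Shell Programming: Regular Expressions

I. Regular Expressions

1. What is a regular expression

A regular expression is a syntax for describing character sequences and matching patterns. It is mainly used to split, match, search, and replace strings by pattern.

In practice it is used mostly for fuzzy matching; splitting, searching, and replacing are somewhat less common.

2. Regular expressions versus wildcards

Regular expressions match strings inside files, and the match is a containing match: a line matches if any part of it fits the pattern. Commands such as grep, awk, and sed support regular expressions and match on text (data).
Wildcards match file names, and the match is an exact match: the pattern must cover the whole name. Commands such as ls, find, and cp do not support regular expressions here, so they can only use the shell's own wildcards, which match complete file names.

* matches any string of characters, including none

? matches any single character

[ ] matches one of the characters listed inside the brackets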

[root@localhost ~]# touch cangls

[root@localhost ~]# touch cangll

[root@localhost ~]# ls

anaconda-ks.cfg cangls cangll install.log

[root@localhost ~]# ls cangl?

cangls cangll

[root@localhost ~]# ls cang??

cangls cangll

[root@localhost ~]# ls cang*

cangls cangll

[root@localhost ~]# ls cangl[ls]

cangls cangll
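The three wildcards can also be tried without touching the home directory. The following is a minimal sketch, assuming a throwaway directory created with mktemp; the extra file canglong and the variable names are illustrative and not part of the original session.

```shell
#!/bin/sh
# Sketch: shell wildcards are expanded against whole file names.
demo_dir=$(mktemp -d)
touch "$demo_dir/cangls" "$demo_dir/cangll" "$demo_dir/canglong"

# ? stands for exactly one character, so canglong is too long to match.
one_char=$(cd "$demo_dir" && echo cangl?)
# * stands for any run of characters, including none.
any_run=$(cd "$demo_dir" && echo cang*)
# [ls] stands for exactly one of the listed characters (no comma needed).
bracket=$(cd "$demo_dir" && echo cangl[ls])

echo "cangl?    -> $one_char"
echo "cang*     -> $any_run"
echo "cangl[ls] -> $bracket"
rm -rf "$demo_dir"
```

The shell expands the pattern before the command runs, which is why echo can stand in for ls here.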

Telling regular expressions and wildcards apart:

[root@localhost ~]# touch abc

[root@localhost ~]# touch abcd

[root@localhost ~]# ls

anaconda-ks.cfg cangls cangll install.log abc abcd

[root@localhost ~]# find . -name abc

./abc

# abcd is not listed: a wildcard matches the whole name, not a substring

[root@localhost ~]# find . -name "abc?"

./abcd

[root@localhost ~]# find . -name "abc*"

./abc

./abcd
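The difference between a containing match and an exact match can be seen by applying the same text, abc, once as a regular expression and once as a wildcard. This is only a sketch: the sample names and the variables regex_hits and glob_hits are made up for illustration, and case is used because it applies wildcard matching to a string.

```shell
#!/bin/sh
# Sketch: "abc" as a regex (grep) versus "abc" as a wildcard (case).
names='abc
abcd
xabc'

# grep: a line matches if it CONTAINS abc anywhere.
regex_hits=$(printf '%s\n' "$names" | grep -c 'abc')

# case: the wildcard pattern must cover the WHOLE string.
glob_hits=0
for name in $names; do
    case $name in
        abc) glob_hits=$((glob_hits + 1)) ;;
    esac
done

echo "regex matches: $regex_hits, wildcard matches: $glob_hits"
```

To make a regular expression behave like an exact match, anchor it as ^abc$.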

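3. Basic regular expressions

Regular expressions are divided into basic regular expressions and extended regular expressions.

Basic regular expressions:

$: matches the end of a line and is always placed at the end of the pattern; for example, Hello$ matches lines that end with Hello

On Linux, ? and ( ) actually belong to the extended regular expressions.

Details:

1) "*" matches the preceding character zero or more times

"a*": matches every line, including blank lines

"aa*": matches lines that contain at least one a

"aaa*": matches strings that contain at least two consecutive a's

"aaaaa*": matches strings that contain at least four consecutive a's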
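The counts below illustrate the rules just listed. This is a small sketch on an assumed sample file (created with mktemp, so the name is arbitrary); grep -c prints the number of matching lines.

```shell
#!/bin/sh
# Sketch: how many lines each starred pattern matches.
sample=$(mktemp)
printf '%s\n' 'aaa' 'aaaa' 'aaaaa' 'bbb' 'aabb' '' > "$sample"

# a* may match zero a's, so all six lines match, even bbb and the empty line.
all_lines=$(grep -c 'a*' "$sample")
# aa* needs one a followed by zero or more a's.
one_or_more=$(grep -c 'aa*' "$sample")
# aaaaa* needs four a's followed by zero or more a's.
four_or_more=$(grep -c 'aaaaa*' "$sample")

echo "a*: $all_lines  aa*: $one_or_more  aaaaa*: $four_or_more"
rm -f "$sample"
```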

Example:

[root@localhost ~]# vim test.txt

#!/bin/bashrc

aaa

aaaa

aaaaa

bbb

bbbb

bbbbb

aabb

:wq

[root@localhost ~]# vi .bashrc

# .bashrc

# User specific aliases and function

alias rm='rm -i'

alias cp='cp -i'

alias mv='mv -i'

alias vi='vim'

alias grep='grep --color=auto'

…

# Define the aliases in this file

[root@localhost ~]# source .bashrc

# Make them take effect

[root@localhost ~]# grep "a*" test.txt

# A character followed by * means that character repeated zero or more times

aaa

aaaa

aaaaa

# The a characters above are highlighted in red

bbb

bbbb

bbbbb

aabb

[root@localhost ~]# grep "aa*" test.txt

# Matches at least one a followed by any number of a's; blank lines do not match

aaa

aaaa

aaaaa

aabb

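# Every a that appears is highlighted in red

[root@localhost ~]# grep "aaaaa*" test.txt

aaaa

aaaaa

2) "." matches any single character except a newline

"s..d": matches strings with exactly two characters between s and d

"s.*d": matches strings with any characters (or none) between s and d

".*": matches everything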

[root@localhost ~]# vi test.txt

#!/bin/bashrc

aaa

aaaa

aaaaa

said

soid

suud

sooooood

bbb

bbbb

bbbbb

aabb

:wq

[root@localhost ~]# grep "s..d" test.txt

said

soid

suud

[root@localhost ~]# grep "s.*d" test.txt

said

soid

suud

sooooood

[root@localhost ~]# grep ".*" test.txt

aaa

aaaa

aaaaa

said

soid

suud

sooooood

bbb

bbbb

bbbbb

aabb

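# Every character is highlighted in red; blank lines, of course, have nothing to color

3) "^" matches the start of a line, "$" matches the end of a line

"^M": matches lines that start with an uppercase M

"n$": matches lines that end with a lowercase n

"^$": matches blank lines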
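A short sketch of the three anchored patterns, run against an assumed five-line sample file (the file and variable names are illustrative):

```shell
#!/bin/sh
# Sketch: anchoring a pattern to the start or the end of a line.
sample=$(mktemp)
printf '%s\n' 'said' 'soid' '' 'bbb' 'aabb' > "$sample"

starts_with_s=$(grep -c '^s' "$sample")   # lines beginning with s
ends_with_b=$(grep -c 'b$' "$sample")     # lines ending with b
blank_lines=$(grep -n '^$' "$sample")     # -n prefixes each match with its line number

echo "start with s: $starts_with_s, end with b: $ends_with_b, blank at: $blank_lines"
rm -f "$sample"
```

Because a blank line has no text, grep -n shows only the line number and a colon for it.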

[root@localhost ~]# vim test.txt

#!/bin/bashrc

aaa

aaaa

aaaaa

said

soid

suud

sooooood

bbb

bbbb

bbbbb

aabb

:wq

[root@localhost ~]# grep "^s" test.txt

said

soid

suud

sooooood

[root@localhost ~]# grep "b$" test.txt

bbb

bbbb

bbbbb

aabb

[root@localhost ~]# grep "^$" test.txt

[root@localhost ~]# grep -n "^$" test.txt

11:

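4) "[]" matches any one of the characters listed inside the brackets, and only one character

"s[ao]id": matches s and id with either an a or an o between them

"[0-9]": matches any single digit

"^[a-z]": matches lines that start with a lowercase letter

[root@localhost ~]# grep -n "s[ao]id" test.txt

7:said

8:soid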

[root@localhost ~]# vim test.txt

aaa

aaaa

aaaaa

said

soid

suud

sooooood

1234567

890

2aa

cc22

bbb

bbbb

bbbbb

aabb

:wq

[root@localhost ~]# grep "[0-9]" test.txt

1234567

890

2aa

cc22

# Only part of each line matches, but grep still prints the whole line

[root@localhost ~]# grep "^[0-9]" test.txt

1234567

890

2aa

# cc22 is not listed

[root@localhost ~]# grep "[0-9]$" test.txt

1234567

890

cc22

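# 2aa is not matched

4) "[^]": matches any one character other than those listed inside the brackets

"^[^a-z]": matches lines that do not start with a lowercase letter

"^[^a-zA-Z]": matches lines that do not start with a letter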
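The position of ^ decides its meaning: outside the brackets it anchors to the start of the line, inside the brackets it negates the set. The sketch below shows both on an assumed sample file. Setting LC_ALL=C is an assumption made here so that ranges such as a-z follow plain ASCII order regardless of the system locale.

```shell
#!/bin/sh
# Sketch: bracket expressions with and without negation.
LC_ALL=C
export LC_ALL
sample=$(mktemp)
printf '%s\n' 'said' '1234567' '2aa' 'cc22' > "$sample"

has_digit=$(grep -c '[0-9]' "$sample")          # a digit anywhere in the line
starts_with_digit=$(grep -c '^[0-9]' "$sample") # ^ outside the brackets: start of line
ends_with_digit=$(grep -c '[0-9]$' "$sample")
not_lower_start=$(grep '^[^a-z]' "$sample")     # ^ inside the brackets: negation

echo "$has_digit $starts_with_digit $ends_with_digit"
echo "$not_lower_start"
rm -f "$sample"
```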

[root@localhost ~]# grep "[a-z]" test.txt

# The line must contain at least one lowercase letter

aaa

aaaa

aaaaa

said

soid

suud

sooooood

2aa

cc22

bbb

bbbb

bbbbb

aabb

[root@localhost ~]# grep "[^a-z]" test.txt

1234567

890

2aa

cc22

[root@localhost ~]# grep "^[^a-z]" test.txt

1234567

890

2aa

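# Note: in some regular expression dialects [A-z] is used to mean all letters, but on Linux write [a-zA-Z] to match all letters

5) "\" is the escape character

"\.$": matches lines that end with a literal "."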

[root@localhost ~]# vim test.txt

aaa

aaaa

aaaaa

said

soid

suud

sooooood.

1234567

890

2aa

cc22

bbb

bbbb

bbbbb

aabb

sab

saabb

[root@localhost ~]# grep ".$" test.txt

aaa

aaaa

aaaaa

said

soid

suud

sooooood.

1234567

890

2aa

cc22

bbb

bbbb

bbbbb

aabb

sab

saabb

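# Matches every line that ends with any character, that is, every non-empty line

[root@localhost ~]# grep "\.$" test.txt

sooooood.

6) "\{n\}" means the preceding character occurs exactly n times

"a\{3\}": matches strings in which a appears three times in a row

"[0-9]\{3\}": matches strings that contain three consecutive digits

[root@localhost ~]# grep "a\{3\}" test.txt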

aaa

aaaa

aaaaa

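# The result is the same as "more than two", because a longer run of a's still contains aaa; to tell them apart, add delimiters around the pattern

[root@localhost ~]# grep "a\{2\}b" test.txt

aabb

saabb

# saabb is listed too, because it also contains aab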

[root@localhost ~]# grep "sa\{2\}b" test.txt

saabb

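# aabb is not listed

7) "\{n,\}" means the preceding character occurs at least n times

"^[0-9]\{3,\}[a-z]": matches lines that start with at least three consecutive digits followed by a lowercase letter

8) "\{n,m\}" means the preceding character occurs at least n times and at most m times

"sa\{1,3\}b": matches at least one and at most three a's between the letters s and b

[root@localhost ~]# grep "sa\{1,2\}b" test.txt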

sab

saabb
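The three interval forms can be compared side by side once the pattern is delimited by s and b. A minimal sketch on an assumed sample file (names are illustrative):

```shell
#!/bin/sh
# Sketch: \{n\}, \{n,\} and \{n,m\} with delimiters on both sides.
sample=$(mktemp)
printf '%s\n' 'sb' 'sab' 'saab' 'saaab' 'saaaab' > "$sample"

exactly_two=$(grep 'sa\{2\}b' "$sample")         # only saab
two_or_more=$(grep -c 'sa\{2,\}b' "$sample")     # saab, saaab, saaaab
one_to_three=$(grep -c 'sa\{1,3\}b' "$sample")   # sab, saab, saaab

echo "exactly two: $exactly_two"
echo "two or more: $two_or_more, one to three: $one_to_three"
rm -f "$sample"
```

Without the surrounding s and b, a\{2\} would also match inside longer runs of a, which is why delimiters matter.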

# A regular expression like this can pick out a service's keyword to check whether the service is running normally

Other examples:

[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}: matches the date format YYYY-MM-DD

[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}: matches an IP address (dotted decimal)

[root@localhost ~]# vim date.txt

2015-08-10

20150810

192.168.3.10

127.0.0.1

123456789

:wq

[root@localhost ~]# grep "[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}" date.txt

2015-08-10

[root@localhost ~]# grep "[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}" date.txt

192.168.3.10

127.0.0.1
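When these patterns are used to check a single value rather than to search a file, it helps to anchor them with ^ and $ so that the whole string has to fit. The helper names looks_like_date and looks_like_ipv4 below are hypothetical, and the check is only about shape: it does not reject values such as 999.999.999.999 or month 13.

```shell
#!/bin/sh
# Sketch: shape checks with anchored basic regular expressions.
looks_like_date() {
    printf '%s\n' "$1" | grep -q '^[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}$'
}

looks_like_ipv4() {
    # \. is a literal dot; an unescaped . would accept any character.
    printf '%s\n' "$1" |
        grep -q '^[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}$'
}

for value in 2015-08-10 20150810 192.168.3.10 127.0.0.1 123456789; do
    if looks_like_date "$value"; then
        echo "$value: date"
    elif looks_like_ipv4 "$value"; then
        echo "$value: ip"
    else
        echo "$value: neither"
    fi
done
```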

II. Character extraction commands

An introductory example: extracting the ordinary users

[root@localhost ~]# useradd user1

[root@localhost ~]# useradd user2

[root@localhost ~]# grep "/bin/bash" /etc/passwd

root:x:0:0:root:/root:/bin/bash

user1:x:500:500::/home/user1:/bin/bash

user2:x:501:501::/home/user2:/bin/bash

[root@localhost ~]# grep "/bin/bash" /etc/passwd | grep -v "root"

user1:x:500:500::/home/user1:/bin/bash

user2:x:501:501::/home/user2:/bin/bash

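1. cut: the field extraction command

Format: [root@localhost ~]# cut [options] filename

Options:

-f field number: extract the given column

-d delimiter: split columns on the given delimiter

cut -d ":" -f 1 /etc/passwd

# Note: grep extracts rows and cut extracts columns. cut can split on a delimiter character, but it does not cope well with columns separated by runs of spaces

Example 1: basic cut usage

[root@localhost ~]# vi student.txt

ID Name gender Mark

1 furong F 85

2 fengj F 60

3 cang F 70

# The columns are separated by tab characters

[root@localhost ~]# cut -f 2 student.txt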

Name

furong

fengj

cang

[root@localhost ~]# cut -f 2,4 student.txt

Name Mark

furong 85

fengj 60

cang 70

[root@localhost ~]# grep "/bin/bash" /etc/passwd | grep -v "root"

user1:x:500:500::/home/user1:/bin/bash

user2:x:501:501::/home/user2:/bin/bash

[root@localhost ~]# grep "/bin/bash" /etc/passwd | grep -v "root" | cut -f 1 -d ":"

user1

user2

# Only the two user names are extracted

Example 2: using cut to look at the partitions

[root@localhost ~]# df

Filesystem 1K-blocks Used Available Use% Mounted on

/dev/sda5 17906468 1793848 15203004 11% /

tmpfs 515396 0 515396 0% /dev/shm

/dev/sda1 198337 26361 161736 15% /boot

/dev/sda2 2015824 35808 1877616 2% /home

[root@localhost ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/sda5 18G 1.8G 15G 11% /

tmpfs 504M 0 504M 0% /dev/shm

/dev/sda1 194M 26M 158M 15% /boot

/dev/sda2 2.0G 35M 1.8G 2% /home

# -h: human-readable output

[root@localhost ~]# df -h | cut -f 5

Filesystem Size Used Avail Use% Mounted on

/dev/sda5 18G 1.8G 15G 11% /

tmpfs 504M 0 504M 0% /dev/shm

/dev/sda1 194M 26M 158M 15% /boot

/dev/sda2 2.0G 35M 1.8G 2% /home

[root@localhost ~]# df -h | cut -f 1

Filesystem Size Used Avail Use% Mounted on

/dev/sda5 18G 1.8G 15G 11% /

tmpfs 504M 0 504M 0% /dev/shm

/dev/sda1 194M 26M 158M 15% /boot

/dev/sda2 2.0G 35M 1.8G 2% /home

# Why? The columns in this output are separated by spaces, not tabs, so cut treats each line as a single field
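One possible workaround, sketched here on a hard-coded sample line rather than on live df output, is to squeeze the runs of spaces with tr -s so that cut sees exactly one delimiter between columns. awk is shown as an alternative because it splits on runs of whitespace by default. The variable names are illustrative.

```shell
#!/bin/sh
# Sketch: extracting a space-aligned column.
df_line='/dev/sda5        18G  1.8G   15G  11% /'

# tr -s ' ' collapses repeated spaces into one, then cut can split on a single space.
with_cut=$(printf '%s\n' "$df_line" | tr -s ' ' | cut -d ' ' -f 5)

# awk treats any run of spaces or tabs as one separator.
with_awk=$(printf '%s\n' "$df_line" | awk '{print $5}')

echo "cut: $with_cut  awk: $with_awk"
```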

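2. The printf command

printf is not really a character extraction command, but it is used inside awk; it produces formatted output

Format: printf 'output-type output-format' output-content

Output types:

%ns: outputs a string; n is a number giving the field width in characters

%ni: outputs an integer; n is a number giving the field width in digits

%m.nf: outputs a floating-point number; m is the total field width and n is the number of decimal places. For example, %8.2f prints 8 characters in total: 2 decimal places, the decimal point, and 5 integer digits

Output formats:

\a: sounds the alert (bell)

\b: backspace, the Backspace key

\f: form feed (clears the screen)

\n: newline

\r: carriage return, the Enter key

\t: horizontal tab, the Tab key

\v: vertical tab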
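The width specifiers are easier to see with a visible marker after each field. In this sketch the trailing | is only such a marker, the values are made up, and LC_ALL=C is set as an assumption so that the decimal separator is a dot.

```shell
#!/bin/sh
# Sketch: field widths in printf output types.
LC_ALL=C
export LC_ALL

str_col=$(printf '%8s|' 'cang')       # right-aligned in 8 characters
left_col=$(printf '%-8s|' 'cang')     # a leading - left-aligns instead
int_col=$(printf '%5i|' 70)           # integer in 5 characters
float_col=$(printf '%8.2f|' 85.5)     # 8 characters in total, 2 decimals

printf '%s\n' "$str_col" "$left_col" "$int_col" "$float_col"
```

When a value is wider than the field, printf prints it in full rather than truncating it.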

[root@localhost ~]# echo 123456

123456

[root@localhost ~]# echo 1 2 3 4 5 6

1 2 3 4 5 6

[root@localhost ~]# printf 1 2 3 4 5 6

1[root@localhost ~]#

# Only the 1 is printed, because no output type was given

[root@localhost ~]# printf %s 1 2 3 4 5 6

123456[root@localhost ~]# printf %s %s %s 1 2 3 4 5 6

# The quotes cannot be omitted

%s%s123456[root@localhost ~]# printf '%s %s %s' 1 2 3 4 5 6

1 2 34 5 6[root@localhost ~]# printf '%s\t%s\t%s\t\n' 1 2 3 4 5 6

1 2 3

4 5 6

[root@localhost ~]# printf student.txt

student.txt

[root@localhost ~]# cat student.txt | printf

printf: usage: printf [-v var] format [arguments]

# printf does not read data from a pipe

[root@localhost ~]# cat student.txt

ID Name gender Mark

1 furong F 85

2 fengj F 60

3 cang F 70

[root@localhost ~]# printf $(cat student.txt)

ID[root@localhost ~]# printf '%s' $(cat student.txt)

IDNamegenderMark1furongF852fengjF603cangF70[root@localhost ~]#

[root@localhost ~]# printf '%s\t%s\t%s\t%s\n' $(cat student.txt)

ID Name gender Mark

1 furong F 85

2 fengj F 60

3 cang F 70

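Note: awk supports both print and printf for output

print: print automatically appends a newline after each output (Linux itself has no print command by default)

printf: printf is the standard formatted-output command and does not append a newline; add one in the format string when needed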
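The difference shows up as soon as the input has more than one line. A minimal sketch with made-up two-line input:

```shell
#!/bin/sh
# Sketch: print adds a newline, printf does not.
with_print=$(printf 'a b\nc d\n' | awk '{print $1}')
with_printf=$(printf 'a b\nc d\n' | awk '{printf $1}')
# Passing the field as an argument to a fixed format is the safer habit,
# because a % inside the data is then not interpreted as a format directive.
with_format=$(printf 'a b\nc d\n' | awk '{printf "%s\n", $1}')

echo "print:  $with_print"
echo "printf: $with_printf"
```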

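3. The awk command

1) Format: awk 'condition1 {action1} condition2 {action2} …' filename

Condition (pattern):

A relational expression is usually used as the condition

x > 10 tests whether the variable x is greater than 10

x >= 10 tests whether x is greater than or equal to 10

x <= 10 tests whether x is less than or equal to 10

Action:

Formatted output

Flow-control statements

Functions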

[root@localhost ~]# vi student.txt

ID Name gender Mark

1 furong F 85

2 fengj F 60

3 cang F 70

[root@localhost ~]# cut -f 2,4 student.txt

Name Mark

furong 85

fengj 60

cang 70

Example: selecting columns 2 and 4 with awk

[root@localhost ~]# awk '{printf $2 "\t" $4 "\n"}' student.txt

# With no condition the action runs for every line; $2 and $4 are columns 2 and 4

Name Mark

furong 85

fengj 60

cang 70

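# Note that \t and \n must be inside double quotes. What awk extracts is called a field: it assigns the second field of the first line to $2, and repeating this for every line produces a column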

Name

furong

fengj

cang

[root@localhost ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/sda5 18G 1.8G 15G 11% /

tmpfs 504M 0 504M 0% /dev/shm

/dev/sda1 194M 26M 158M 15% /boot

/dev/sda2 2.0G 35M 1.8G 2% /home

[root@localhost ~]# df -h | grep "/dev/sda5"

/dev/sda5 18G 1.8G 15G 11% /

[root@localhost ~]# df -h | grep "/dev/sda5" | awk '{print $1 "\t" $5}'

/dev/sda5 11%

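2) The BEGIN pattern

awk 'BEGIN{printf "This is a transcript \n"} {printf $2 "\t" $4 "\n"}' student.txt

# The first action has the condition BEGIN and runs once, before any input is read; the second has no condition and runs for every line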

[root@localhost ~]# cat student.txt

ID Name gender Mark

1 furong F 85

2 fengj F 60

3 cang F 70

[root@localhost ~]# awk 'BEGIN{printf "This is a transcript \n"} {printf $2 "\t" $4 "\n"}' student.txt

This is a transcript

Name Mark

furong 85

fengj 60

cang 70

3) END

awk 'END{printf "The End \n"} {printf $2 "\t" $4 "\n"}' student.txt

[root@localhost ~]# awk 'END{printf "The End \n"} {printf $2 "\t" $4 "\n"}' student.txt

Name Mark

furong 85

fengj 60

cang 70

The End

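4) The FS built-in variable

# By default awk splits fields on tabs or spaces; to extract fields from a file such as passwd, which is separated by ":", a different separator must be set

cat /etc/passwd | grep "/bin/bash" | awk 'BEGIN{FS=":"} {printf $1 "\t" $3 "\n"}'

Here the colon ":" is used as the separator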

[root@localhost ~]# cat /etc/passwd | grep /bin/bash

root:x:0:0:root:/root:/bin/bash

user1:x:500:500::/home/user1:/bin/bash

user2:x:501:501::/home/user2:/bin/bash

[root@localhost ~]# cat /etc/passwd | grep /bin/bash | awk '{FS=":"} {print $1 "\t" $3}'

root:x:0:0:root:/root:/bin/bash

user1 500

user2 501

# Reason: awk reads a line first and then runs the actions, so the first line had already been split before FS=":" was executed

[root@localhost ~]# cat /etc/passwd | grep /bin/bash | awk 'BEGIN{FS=":"} {print $1 "\t" $3}'

root 0

user1 500

user2 501
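Setting FS inside BEGIN can also be written with awk's standard -F option, which sets the separator before any input is read. The sketch below uses two hard-coded passwd-style lines instead of the real /etc/passwd; the variable names are illustrative.

```shell
#!/bin/sh
# Sketch: two equivalent ways to set the field separator early enough.
passwd_sample='root:x:0:0:root:/root:/bin/bash
user1:x:500:500::/home/user1:/bin/bash'

begin_fs=$(printf '%s\n' "$passwd_sample" | awk 'BEGIN{FS=":"} {print $1 "\t" $3}')
dash_f=$(printf '%s\n' "$passwd_sample" | awk -F: '{print $1 "\t" $3}')

echo "$begin_fs"
```

Both forms split the first line correctly, unlike assigning FS in an ordinary action.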

5) Relational operators

cat student.txt | grep -v Name | awk '$4 >= 70 {printf $2 "\n"}'

[root@localhost ~]# cat student.txt

ID Name gender Mark

1 furong F 85

2 fengj F 60

3 cang F 70

[root@localhost ~]# cat student.txt | grep -v Name

1 furong F 85

2 fengj F 60

3 cang F 70

[root@localhost ~]# cat student.txt | grep -v Name | awk '$4 >= 70 {print $2}'

furong

cang
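The header line can also be skipped inside awk itself. The following sketch assumes awk's built-in NR variable (the current line number) and rebuilds the sample table in a temporary file so it can run anywhere:

```shell
#!/bin/sh
# Sketch: combining two conditions instead of piping through grep -v.
students=$(mktemp)
# printf reuses the format for every group of four arguments, giving four lines.
printf '%s\t%s\t%s\t%s\n' ID Name gender Mark 1 furong F 85 2 fengj F 60 3 cang F 70 > "$students"

# NR > 1 skips the header; $4 >= 70 keeps marks of 70 and above.
passed=$(awk 'NR > 1 && $4 >= 70 {print $2}' "$students")

echo "$passed"
rm -f "$students"
```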

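4. The sed command

sed is a lightweight stream editor found on almost every UNIX platform, including Linux. It is mainly used to select, replace, delete, and insert data

Format: sed [options] '[action]' filename

Options:

-n: normally sed prints all the data to the screen; with this option only the lines explicitly processed by a sed action (such as p) are printed

-e: apply more than one sed action to the input data

-i: write sed's changes directly into the file being read instead of printing them to the screen; this modifies the original file

Without -i, sed does not modify the original data

Actions:

a: append; adds one or more lines after the current line

c: change; replaces the original line with the text that follows c

i: insert; adds one or more lines before the current line

p: print; outputs the specified lines

s: substitute; replaces one string with another. The format is "line-range s/old-string/new-string/g" (similar to the substitute command in vim)

d: delete; removes the specified lines from the output without modifying the file itself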

[root@localhost ~]# vi student.txt

ID Name gender Mark

1 furong F 85

2 fengj F 60

3 cang F 70

1) Row operations

sed '2p' student.txt

# Show the second line of the file

sed -n '2p' student.txt

[root@localhost ~]# sed '2p' student.txt

ID Name gender Mark

1 furong F 85

1 furong F 85

2 fengj F 60

3 cang F 70

# Without -n every line is still printed, and line 2 appears twice

[root@localhost ~]# sed -n '2p' student.txt

1 furong F 85

[root@localhost ~]# sed '2d' student.txt

ID Name gender Mark

2 fengj F 60

3 cang F 70

[root@localhost ~]# sed '2,4d' student.txt

ID Name gender Mark

# Lines 2 through 4 are deleted from the output, but the file itself is unchanged

[root@localhost ~]# sed '2a Life is Short' student.txt

ID Name gender Mark

1 furong F 85

Life is Short

2 fengj F 60

3 cang F 70

# Added after the original second line

[root@localhost ~]# sed '2i Life is Short' student.txt

ID Name gender Mark

Life is Short

1 furong F 85

2 fengj F 60

3 cang F 70

# Added before the original second line

[root@localhost ~]# sed '2c Life is Short' student.txt

ID Name gender Mark

Life is Short

2 fengj F 60

3 cang F 70

# The second line is replaced entirely
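The effect of -n, p, and d can be checked by counting lines. This is a sketch on a shortened copy of the table in a temporary file; the variable names are illustrative.

```shell
#!/bin/sh
# Sketch: sed row operations leave the input file untouched.
students=$(mktemp)
printf '%s\n' 'ID Name' '1 furong' '2 fengj' '3 cang' > "$students"

second_only=$(sed -n '2p' "$students")                   # -n: print line 2 and nothing else
furong_count=$(sed '2p' "$students" | grep -c 'furong')  # without -n line 2 is printed twice
after_delete=$(sed '2d' "$students" | grep -c '')        # lines left in the OUTPUT
still_in_file=$(grep -c '' "$students")                  # lines still in the FILE

echo "$second_only | $furong_count | $after_delete | $still_in_file"
rm -f "$students"
```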

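2) String substitution

Format: sed 'line-number s/old-string/new-string/g' filename

sed '3s/60/99/g' student.txt; replaces 60 with 99 on the third line

sed -i '3s/60/99/g' student.txt; writes the result of the sed operation directly into the file

sed -e 's/fengj//g;s/cang//g' student.txt; replaces both "fengj" and "cang" with nothing

[root@localhost ~]# sed '4s/70/100/g' student.txt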

ID Name gender Mark

1 furong F 85

2 fengj F 60

3 cang F 100

[root@localhost ~]# sed -i '4s/70/100/g' student.txt

[root@localhost ~]#

[root@localhost ~]# cat student.txt

ID Name gender Mark

1 furong F 85

2 fengj F 60

3 cang F 100

[root@localhost ~]# sed -e 's/furong//g;s/fengj//g' student.txt

ID Name gender Mark

1 F 85

2 F 60

3 cang F 100

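# furong and fengj are now empty. Note the line number: when none is given, the substitution applies to the whole file. A unique string can be targeted without one, but if the same text occurs several times in the file, use a line number to choose which occurrence to change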
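The role of the line number is clearest when the same text appears on two lines. In the sketch below the sample data is made up so that 60 occurs twice:

```shell
#!/bin/sh
# Sketch: a line address limits where the substitution applies.
students=$(mktemp)
printf '%s\n' 'ID Name Mark' '1 furong 85' '2 fengj 60' '3 cang 60' > "$students"

# With the address 3, only the third line is changed.
addressed=$(sed '3s/60/99/' "$students" | sed -n '3,4p')
# Without an address, every line containing 60 is changed.
changed_everywhere=$(sed 's/60/99/' "$students" | grep -c '99')

echo "$addressed"
echo "lines changed without an address: $changed_everywhere"
rm -f "$students"
```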

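III. Character processing commands

1. The sort command

sort [options] filename

Options:

-f: ignore case

-n: sort numerically; the default is to sort as strings

-r: reverse the order; the default is ascending

-t: specify the field separator; by default fields are separated by whitespace

-k n[,m]: sort on the given field range, from field n to field m (the end of the line by default)

Examples:

sort /etc/passwd; sorts the user information file

sort -r /etc/passwd; sorts in reverse order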
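The -t, -k, and -n options are usually combined. The sketch below sorts a few made-up passwd-style lines by their third field, once as strings and once as numbers; LC_ALL=C is set as an assumption to keep the string ordering predictable.

```shell
#!/bin/sh
# Sketch: string sort versus numeric sort on one field.
LC_ALL=C
export LC_ALL
sample='user1:x:500
root:x:0
uucp:x:10
daemon:x:2'

# As strings, "10" sorts before "2" because "1" comes before "2".
as_text=$(printf '%s\n' "$sample" | sort -t ':' -k 3,3 | cut -d ':' -f 1 | paste -s -d ' ' -)
# With -n the field is compared by numeric value.
as_number=$(printf '%s\n' "$sample" | sort -t ':' -k 3,3 -n | cut -d ':' -f 1 | paste -s -d ' ' -)

echo "text:    $as_text"
echo "numeric: $as_number"
```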

2. The wc command

wc [options] filename

Options:

-l: count lines only

-w: count words only

-m: count characters only, including newline characters
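A quick sketch of the three counters on a two-line string. The arithmetic expansion $(( )) is used here only to strip the padding that some wc implementations put in front of the number.

```shell
#!/bin/sh
# Sketch: counting lines, words, and characters.
LC_ALL=C
export LC_ALL

line_count=$(( $(printf 'one two\nthree\n' | wc -l) ))
word_count=$(( $(printf 'one two\nthree\n' | wc -w) ))
# 7 characters + newline + 5 characters + newline = 14
char_count=$(( $(printf 'one two\nthree\n' | wc -m) ))

echo "lines: $line_count, words: $word_count, characters: $char_count"
```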

Keep regular expressions close to heart.
