3: A simple example
std::string regstr = "a+";
boost::regex expression(regstr);
std::string testString = "aaa";
// Match one or more 'a' characters
if( boost::regex_match(testString, expression) )
{
    std::cout<< "Match" << std::endl;
}
else
{
    std::cout<< "Not Match" << std::endl;
}
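1 We often need to check whether a string is a valid IP address. A valid IP address has the following form:
xxx.xxx.xxx.xxx where each xxx is an integer no greater than 255
Finding strings of this shape with a regular expression is easy. Checking that xxx does not exceed 255 is harder, because a regular expression works on text, not on numbers.
OK, let's handle a single number first, that is, xxx: find an expression that matches it and guarantees that it is no greater than 255.
Case 1: x, a single digit, 0~9, written \d
Case 2: xx, two digits, 00~99, written \d\d
Case 3: xxx, which splits into two sub-cases. One is 1xx, written 1\d\d
The other is 2xx, which again splits in two: 200~249, written 2[0-4]\d
and 250~255, written 25[0-5]
Putting these together:
1?\d{1,2}|2[0-4]\d|25[0-5]
This matches a string of digits whose value is no greater than 255.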
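To make the single-number case concrete, here is a minimal sketch of a checker for one value in the range 0~255. It assumes std::regex (C++11, ECMAScript syntax) as a stand-in for boost::regex so that it compiles without Boost; the helper name is_octet is made up for this illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if s is a digit string whose value is 0..255.
// std::regex is used here in place of boost::regex (an assumption of this sketch).
bool is_octet(const std::string& s)
{
    // 1?\d{1,2} covers 0..199, 2[0-4]\d covers 200..249, 25[0-5] covers 250..255
    static const std::regex octet(R"(1?\d{1,2}|2[0-4]\d|25[0-5])");
    // regex_match requires the whole string to match, not just a prefix
    return std::regex_match(s, octet);
}
```

Note that, like the cases listed above, this accepts two-digit forms with a leading zero such as "00", and rejects three-digit forms such as "007".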
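Now we only need to repeat this four times, separated by dots:
(1?\d{1,2}|2[0-4]\d|25[0-5])\.(1?\d{1,2}|2[0-4]\d|25[0-5])\.(1?\d{1,2}|2[0-4]\d|25[0-5])\.(1?\d{1,2}|2[0-4]\d|25[0-5])
It is rather long. I tried to shorten it with the back-references that boost supports, but without success. Pointers from anyone who knows boost regular expressions well are welcome:
(1?\d{1,2}|2[0-4]\d|25[0-5])\.\1\.\1\.\1
(See back-references: http://www.boost.org/libs/regex/doc/syntax_perl.html
A back-reference only matches text identical to what the first sub-expression actually matched, not the sub-expression's pattern again, which is not what we need here.)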
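One possible way to shorten the expression, sketched here as a suggestion and not as the original author's solution, is to build it from a single octet string and repeat "octet followed by a dot" with a {3} quantifier. The sketch assumes std::regex in place of boost::regex, and the name is_ipv4 is made up for illustration. The trade-off is that the groups are non-capturing, so the four numbers are no longer available as separate sub-matches.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if s has the form xxx.xxx.xxx.xxx with each xxx <= 255.
// std::regex stands in for boost::regex (an assumption of this sketch).
bool is_ipv4(const std::string& s)
{
    // One octet, wrapped in a non-capturing group so it can be reused safely
    static const std::string octet = R"((?:1?\d{1,2}|2[0-4]\d|25[0-5]))";
    // (?:octet\.){3}octet : three "octet." prefixes followed by a final octet
    static const std::regex ipv4("(?:" + octet + R"(\.){3})" + octet);
    return std::regex_match(s, ipv4);
}
```

Calling is_ipv4("192.168.4.1") should return true, while a string such as "256.1.1.1" should be rejected.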
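Example:
// The four-number expression from this section, with each backslash escaped for a C++ string literal
std::string regstr = "(1?\\d{1,2}|2[0-4]\\d|25[0-5])\\.(1?\\d{1,2}|2[0-4]\\d|25[0-5])\\.(1?\\d{1,2}|2[0-4]\\d|25[0-5])\\.(1?\\d{1,2}|2[0-4]\\d|25[0-5])";
boost::regex expression(regstr);
std::string testString = "192.168.4.1";
if( boost::regex_match(testString, expression) )
{
    std::cout<< "This is ip address" << std::endl;
}
else
{
    std::cout<< "This is not ip address" << std::endl;
}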
2 Now let's look at the other regex_match prototypes, which take a match_results parameter
template <class ST, class SA, class Allocator, class charT, class traits>
bool regex_match(const basic_string<charT, ST, SA>& s,
                 match_results<typename basic_string<charT, ST, SA>::const_iterator, Allocator>& m,
                 const basic_regex <charT, traits>& e,
                 match_flag_type flags = match_default);
template <class BidirectionalIterator, class Allocator, class charT, class traits>
bool regex_match(BidirectionalIterator first, BidirectionalIterator last,
                 match_results<BidirectionalIterator, Allocator>& m,
                 const basic_regex <charT, traits>& e,
                 match_flag_type flags = match_default);
Note the parameter m. If the function returns false, the contents of m are undefined. If it returns true, m is set as follows:
Element | Value
m.size() | e.mark_count()
m.empty() | false
m.prefix().first | first
m.prefix().last | first
m.prefix().matched | false
m.suffix().first | last
m.suffix().last | last
m.suffix().matched | false
m[0].first | first
m[0].second | last
m[0].matched | true if a full match was found, false if it was a partial match (found because the match_partial flag was set)
m[n].first | For all integers n < m.size(), the start of the sequence that matched sub-expression n. Alternatively, if sub-expression n did not participate in the match, then last.
m[n].second | For all integers n < m.size(), the end of the sequence that matched sub-expression n. Alternatively, if sub-expression n did not participate in the match, then last.
m[n].matched | For all integers n < m.size(), true if sub-expression n participated in the match, false otherwise.
boost::regex expression(regstr);
std::string testString = "192.168.4.1";
boost::smatch what;
if( boost::regex_match(testString, what, expression) )
{
    std::cout<< "This is ip address" << std::endl;
    for(int i = 1;i <= 4;i++)
    {
        std::string msg(what[i].first, what[i].second);
        std::cout<< i << ":" << msg.c_str() << std::endl;
    }
}
else
{
    std::cout<< "This is not ip address" << std::endl;
}
Output:
This is ip address
1:192
2:168
3:4
4:1
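The m[n].matched field matters when a sub-expression is optional or sits in an alternative that was not taken. The following minimal sketch collects every sub-match and checks matched before reading it. It assumes std::regex in place of boost::regex (std::smatch exposes the same size(), operator[] and matched members), and capture_groups is a hypothetical helper name.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: text of sub-expressions 1..n after a full match.
// A sub-expression that did not participate in the match yields an empty string.
// Returns an empty vector when the string does not match at all.
std::vector<std::string> capture_groups(const std::string& s, const std::regex& re)
{
    std::vector<std::string> groups;
    std::smatch m;
    if (std::regex_match(s, m, re)) {
        // m[0] is the whole match, so sub-expressions start at index 1
        for (std::size_t i = 1; i < m.size(); ++i) {
            groups.push_back(m[i].matched ? m[i].str() : std::string());
        }
    }
    return groups;
}
```

For example, matching "b" against (a)|(b) leaves the first sub-expression unmatched, so the first entry of the result is empty and the second is "b".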
5: Learning regex_search
regex_search is much like regex_match, except that it does not require the whole string to match: a partial match (a find) is enough.
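The difference can be seen by running the same pattern through both functions. This is a minimal sketch that assumes std::regex as a stand-in for boost::regex; the pattern \d+ and both helper names are chosen only for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true only if the ENTIRE string is a run of digits.
bool whole_string_is_digits(const std::string& s)
{
    static const std::regex digits(R"(\d+)");
    return std::regex_match(s, digits);
}

// Hypothetical helper: true if a run of digits occurs ANYWHERE in the string.
bool contains_digits(const std::string& s)
{
    static const std::regex digits(R"(\d+)");
    return std::regex_search(s, digits);
}
```

With the input "192.168", whole_string_is_digits returns false because of the dot, while contains_digits returns true.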
A simple example:
boost::regex expression(regstr);
std::string testString = "192.168.4.1";
boost::smatch what;
if( boost::regex_search(testString, expression) )
{
    std::cout<< "Have digit" << std::endl;
}
The example above checks whether the given string contains any digits.
Here is another example, which prints every number in the string:
boost::regex expression(regstr);
std::string testString = "192.168.4.1";
boost::smatch what;
std::string::const_iterator start = testString.begin();
std::string::const_iterator end = testString.end();
while( boost::regex_search(start, end, what, expression) )
{
    std::cout<< "Have digit:" ;
    std::string msg(what[1].first, what[1].second);
    std::cout<< msg.c_str() << std::endl;
    start = what[0].second;
}
Output:
Have digit:192
Have digit:168
Have digit:4
Have digit:1
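The manual loop that advances start to what[0].second can also be written with a regex iterator, which performs the repeated search itself. The sketch below assumes std::regex and std::sregex_iterator in place of the Boost types (Boost provides boost::sregex_iterator with the same usage); the pattern \d+ and the name find_all_numbers are assumptions made for this illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every run of digits found in text, in order.
std::vector<std::string> find_all_numbers(const std::string& text)
{
    static const std::regex number(R"(\d+)");
    std::vector<std::string> found;
    // A default-constructed sregex_iterator acts as the end-of-sequence marker
    for (std::sregex_iterator it(text.begin(), text.end(), number), end; it != end; ++it) {
        found.push_back(it->str());  // it->str() is the whole match, i.e. (*it)[0]
    }
    return found;
}
```

For "192.168.4.1" this yields the four strings 192, 168, 4 and 1, matching the output above.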
Let's start with an example:
boost::regex expression(regstr);
std::string testString = "My age is 28 His age is 27";
boost::smatch what;
std::string::const_iterator start = testString.begin();
std::string::const_iterator end = testString.end();
while( boost::regex_search(start, end, what, expression) )
{
    std::string name(what[1].first, what[1].second);
    std::string age(what[4].first, what[4].second);
    std::cout<< "Name:" << name.c_str() << std::endl;
    std::cout<< "Age:" <<age.c_str() << std::endl;
    start = what[0].second;
}
We expected it to print each person's name followed by their age, but the result is disappointing:
Name:My age is 28 His
Age:27
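The cause is a side effect of repetition operators such as "+" and "*": they consume as much input as possible, which makes them "greedy". That is, the regular expression (.*) matches the longest possible string, not the shortest string that lets the match succeed.
To make a repetition operator non-greedy, append "?" to it.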
boost::regex expression(regstr);
std::string testString = "My age is 28 His age is 27";
boost::smatch what;
std::string::const_iterator start = testString.begin();
std::string::const_iterator end = testString.end();
while( boost::regex_search(start, end, what, expression) )
{
    std::string name(what[1].first, what[1].second);
    std::string age(what[4].first, what[4].second);
    std::cout<< "Name:" << name.c_str() << std::endl;
    std::cout<< "Age:" <<age.c_str() << std::endl;
    start = what[0].second;
}
Name:My
Age:28
Name: His
Age:27
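The effect of the extra "?" can be reproduced in isolation. The patterns in this sketch are hypothetical, since the expressions used in the examples above are not shown; they are only meant to behave similarly on the same test string. std::regex is assumed in place of boost::regex, and first_name is a made-up helper.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns sub-expression 1 (the "name") of the first match.
// lazy == false uses the greedy (.*), lazy == true uses the non-greedy (.*?).
std::string first_name(const std::string& text, bool lazy)
{
    // Illustrative patterns only; not the expressions from the original examples
    const std::regex re(lazy ? R"((.*?) age is (\d+))" : R"((.*) age is (\d+))");
    std::smatch m;
    return std::regex_search(text, m, re) ? m[1].str() : std::string();
}
```

On "My age is 28 His age is 27", the greedy form captures "My age is 28 His" as the name, because (.*) runs to the last " age is" it can find, while the non-greedy form stops at the first one and captures "My".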
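7: Learning regex_replace
Here is a regular expression I wrote to strip leading unwanted characters (spaces, carriage returns, line feeds, TABs) from the left side of a string.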
std::string TrimLeft = "([\\s\\r\\n\\t]*)(\\w*.*)";
boost::regex expression(TrimLeft);
// Keep only the second sub-expression, i.e. everything after the leading whitespace
testString = boost::regex_replace( testString, expression, "$2" );
std::cout<< "TrimLeft:" << testString <<std::endl;
TrimLeft:Hello World ! GoodBye World
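The same left trim can be sketched as a self-contained function. Note that this uses a simpler pattern than the one above (it anchors on the start of the string and deletes the leading whitespace, with no capture groups) and assumes std::regex in place of boost::regex; trim_left is a hypothetical name.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns s without its leading whitespace.
// \s already covers space, TAB, carriage return and line feed.
std::string trim_left(const std::string& s)
{
    static const std::regex leading_ws(R"(^\s+)");
    // format_first_only: replace only the first match, which is the one at the start
    return std::regex_replace(s, leading_ws, "",
                              std::regex_constants::format_first_only);
}
```

For example, trim_left("  \t\r\nHello World ! GoodBye World") should return "Hello World ! GoodBye World", and a string with no leading whitespace comes back unchanged.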