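Preface

******************************************************************
All algorithms in this series were compiled successfully in the following environment.
[Build environment] Fedora 8, linux 2.6.35.6-45.fc14.i686
[Processor] Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
[Memory] 2025272 kB
If you find a problem or an oversight, or have a suggestion or a better algorithm, please let me know.
*****************************************************************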

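Main text

String matching problems come up again and again in interviews. This post is not about KMP or the other classic string matching algorithms: those are already explained in detail, with illustrations, on many blogs and web pages, so I will not repeat them here. I have recently been working through the "100 Microsoft interview questions" collection, and the two problems below are taken from it.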

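Problem 1: string matching

[Problem] Implement a fairly advanced character matching algorithm: given a very long string, find the strings that satisfy the requirement.
[Example] If the target string is 123, then 1******3***2 and 12*****3 must both be found.
[Analysis] As mentioned above, string matching problems are common and there are many tricks for them. Using a hash table is one of them. What does that mean? The strings are ASCII encoded and a char can take only 256 different values, so we can define an array hash_table[256] indexed by character value. What the table is used for depends on the goal, and it takes some practice to get a feel for it. For this problem the example shows that order and position do not matter: a string matches as long as it contains every character of the target. The algorithm in words:
Step 1: Traverse the src string and set the corresponding entry of hash_table to 1 for each character, marking that the character occurs in src.
Step 2: Traverse the dest string and, for each character, use hash_table to check whether it occurs in src, that is, whether its entry is 1.
Step 3: If every character of dest occurs in src, the strings match; return the result.
Based on the steps above, we write the following code: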

#include <iostream>
#include <cstring>

bool string_match( const char * const src, const char * const dest )
{
   int srcLen = strlen( src );
   int destLen = strlen( dest );
   int hash_table[256] = { 0 };
   // mark every character that occurs in src.
   for( int i = 0; i < srcLen; i++ )
   {
      // cast to unsigned char so that bytes above 127
      // cannot produce a negative index.
      hash_table[ (unsigned char)src[i] ] = 1;
   }
   for( int i = 0; i < destLen; i++ )
   {
      // if a character of dest never occurs in src,
      // the strings do not match.
      if( 0 == hash_table[ (unsigned char)dest[i] ] )
      {
         return false;
      }
   }
   return true;
}

int main( int argc, char ** argv )
{
   char src[] = "1**2**3******";
   char dest[] = "123";
   bool isMatch = string_match( src, dest );
   if( isMatch )
   {
      std::cout << "match" << std::endl;
   }
   else
   {
      std::cout << "not match" << std::endl;
   }
   return 0;
}
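
A presence table records only whether a character occurs, so a target with repeated characters such as "112" would be reported as matching a source that contains a single '1'. If repeated characters should be honoured, one possible variant counts occurrences instead of marking them. The following is only a minimal sketch under that assumption; the names contains_all_counted and check_contains_all_counted are hypothetical and not part of the original code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns true when src contains every character of
// dest at least as many times as dest does. Order is ignored.
bool contains_all_counted(const char* src, const char* dest)
{
    int count[256] = { 0 };
    // Count how often each byte value occurs in src.
    for (std::size_t i = 0; src[i] != '\0'; ++i)
        ++count[static_cast<unsigned char>(src[i])];
    // Consume one occurrence per character of dest; running out means no match.
    for (std::size_t i = 0; dest[i] != '\0'; ++i)
        if (--count[static_cast<unsigned char>(dest[i])] < 0)
            return false;
    return true;
}

// Small self-check illustrating the intended behaviour.
void check_contains_all_counted()
{
    assert(contains_all_counted("1**2**3******", "123"));
    assert(!contains_all_counted("1**2**3", "112"));
}
```

The table still has 256 entries and both strings are still traversed once, so the running time stays linear in the combined length of the two strings.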

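Problem 2: shortest matching substring

[Problem] Given a very long string str and a character set such as {a,b,c}, find the shortest substring of str that contains {a,b,c}. The solution must run in O(n).
[Example] If the character set is a,b,c and the string is abdcaabcx, the shortest substring is abc.
[Analysis] This is still a string matching problem, but now we are asked for the shortest substring. The trick is again the hash_table used above. How do we approach it? The word "shortest" suggests the usual pattern: keep a minimum value min, compute a value for each candidate, and replace min whenever the new value is smaller. The one difference from the previous problem is which string is used to initialize hash_table: here it is dest, not src. The algorithm in words:
Step 1: Traverse the dest string and update the corresponding entries of hash_table, marking that those characters occur in dest.
Step 2: Traverse the src string. When the current character belongs to dest according to hash_table, increment the counter sum. Once sum equals the length of dest, a match has been found; say it lies between front and rear. Compute its length and compare it with the current minimum min, replacing min if the new length is smaller.
Step 3: Copy the characters between front and rear of the shortest match into the result and return it.
Based on the steps above, we write the following code. If the description is unclear, reading it together with the code should help. The algorithm itself is simple.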

#include <iostream>
#include <cstring>

void string_min_match( const char * const src,
   const char * const dest, char * result )
{
   /* srcLen is the length of src string. */
   int srcLen = strlen( src );
   /* destLen is the length of dest string. */
   int destLen = strlen( dest );
   /* front is the left edge of the current window (sliding window). */
   int front = 0;
   /* start is where the shortest window found so far begins. */
   int start = 0;
   /* hash_table[c] is how many more of character c the window still
      needs; it goes negative when the window holds surplus copies. */
   int hash_table[256] = { 0 };
   /* how many characters of dest the window currently covers. */
   int sum = 0;
   /* the minimum length of match string; srcLen + 1 means "none yet". */
   int min = srcLen + 1;
   result[0] = '\0';
   if( destLen == 0 )
   {
      return;
   }
   // init hash_table array.
   for( int i = 0; i < destLen; i++ )
   {
      hash_table[ (unsigned char)dest[i] ] ++;
   }
   // extend the window to the right, one character of src at a time.
   for( int rear = 0; rear < srcLen; rear++ )
   {
      // count src[rear] only if the window still needed it.
      if( hash_table[ (unsigned char)src[rear] ] -- > 0 )
      {
         sum ++;
      }
      // while the window covers dest, record it and shrink from the left.
      while( sum == destLen )
      {
         if( rear - front + 1 < min )
         {
            min = rear - front + 1;
            start = front;
         }
         if( ++ hash_table[ (unsigned char)src[front] ] > 0 )
         {
            sum --;
         }
         front ++;
      }
   }
   // copy the minimum string to result; leave it empty if nothing matched.
   if( min <= srcLen )
   {
      memcpy( result, &src[start], min );
      result[min] = '\0';
   }
}

int main( int argc, char ** argv )
{
   char src[] = "abdcaabcx";
   char dest[] = "abc";
   char result[10] = { 0 };
   string_min_match( src, dest, result );
   std::cout << result << std::endl;
}
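
One way to gain confidence in a linear-time solution to this problem is to compare its answers with a deliberately naive search on small inputs. The following sketch tries every starting position and grows the substring until it covers the character set; it is cubic in the worst case and meant only as a cross-check. The names covers and brute_min_match are hypothetical, and the use of std::string is an assumption made for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true when text[begin, end) contains every character
// of chars at least as many times as chars does.
static bool covers(const std::string& text, std::size_t begin,
                   std::size_t end, const std::string& chars)
{
    int count[256] = { 0 };
    for (std::size_t i = begin; i < end; ++i)
        ++count[static_cast<unsigned char>(text[i])];
    for (char c : chars)
        if (--count[static_cast<unsigned char>(c)] < 0)
            return false;
    return true;
}

// Hypothetical reference implementation: returns the shortest substring of
// text that covers chars (the leftmost one on ties), or "" if none exists.
std::string brute_min_match(const std::string& text, const std::string& chars)
{
    std::string best;
    bool found = false;
    if (chars.empty())
        return best;
    for (std::size_t begin = 0; begin < text.size(); ++begin) {
        for (std::size_t end = begin + 1; end <= text.size(); ++end) {
            if (covers(text, begin, end, chars)) {
                if (!found || end - begin < best.size()) {
                    best = text.substr(begin, end - begin);
                    found = true;
                }
                break;  // a longer substring from this begin cannot be shorter
            }
        }
    }
    return best;
}

// Small self-check using the example from the problem statement.
void check_brute_min_match()
{
    assert(brute_min_match("abdcaabcx", "abc") == "abc");
}
```

Inputs with repeated characters, such as the string aabc with the set a,b,c, are useful test cases, because a counter that increments on every occurrence of a set character can reach the target length before all characters have actually been seen.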

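Summary:

By writing up these algorithms myself I hope, first, to push myself to study every day; second, to practise my writing; and third, to share this material with everyone. Microsoft alone has a large number of interview questions, and the questions collected on the major sites are of every kind, yet nobody seems to have organized them carefully. Both problems in this post are about string matching. I hope they are of some help.

Author: Alex
Source: http://blog.csdn.net/hellotime
The copyright of this article belongs to the author. Reposting is welcome, but unless the author agrees otherwise this notice must be retained and a link to the original article must be given in a prominent place on the page; otherwise the author reserves the right to pursue legal liability.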
