Longest Palindromic Substring in Python: A Basic Solution and Manacher's Algorithm

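A palindrome is a string that reads the same in both directions, such as abcba or abccba.
Problem: find the longest palindromic substring of a given string.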
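
To make the problem statement concrete, here is a minimal brute-force sketch. The helper names `is_palindrome` and `longest_palindrome_bruteforce` are illustrative and not part of the solutions below; the approach is only practical for short strings, but it is handy as a reference when checking faster solutions.

```python
def is_palindrome(text: str) -> bool:
    """Return True if text reads the same forwards and backwards."""
    return text == text[::-1]


def longest_palindrome_bruteforce(s: str) -> str:
    """Try every substring, longest first; roughly O(n^3), small inputs only."""
    n = len(s)
    for length in range(n, 0, -1):
        for start in range(n - length + 1):
            candidate = s[start:start + length]
            if is_palindrome(candidate):
                return candidate
    return ""  # only reached for the empty string


print(is_palindrome("abcba"), is_palindrome("abccba"), is_palindrome("abcab"))
# True True False
print(longest_palindrome_bruteforce("xxabccbayz"))
# abccba
```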

answer = 'abc'*5600+'cedec'*5706 + 'cba'*5600 # the longest palindrome
question = 'qwesc'*1035 + answer + 'qwversaqe'*1204 # the input string

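Basic solution

The most direct idea is to scan the string from start to end and compare candidate substrings one by one.
Since only the longest palindrome is needed, we can keep max_len as a threshold and only test candidates longer than it, which gives the basic solution: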

class BasicSolution:
    def longestPalindrome(self, s: str) -> str:
        n = len(s)
        if n < 2 or s == s[::-1]:  # special cases
            return s
        start, max_len = 0, 1
        for i in range(1, n):
            left = i - max_len
            # try length max_len + 2: one extra character on each side
            if left - 1 >= 0 and s[left-1:i+1] == s[i-n:left-n-2:-1]:
                start = left - 1  # grew by two, so start moves back one position
                max_len += 2
            # otherwise try length max_len + 1: one extra character on the right
            elif left >= 0 and s[left:i+1] == s[i-n:left-n-1:-1]:
                start = left
                max_len += 1
        return s[start:start+max_len]
    
sol = BasicSolution()
%timeit sol.longestPalindrome(question)==answer

8.65 s ± 74.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
True

Run time: 8.65 s, memory usage: 0.4 MiB

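How the basic solution works:
1. Special cases that return the input itself:
  1. the string is empty or a single character
  2. the whole string is a palindrome
2. Initial values:
  1. n: length of the string
  2. start: start index of the longest palindrome found so far
  3. max_len: length of the longest palindrome found so far
3. Palindromes can have odd or even length, and a palindrome ending at the current position that beats max_len must have length max_len + 2 or max_len + 1. So at each position we first test the current window extended by one character on both sides, then the window extended by one character on the right only. Example with abbabb:
  1. There is no character to the left of a, so only test a plus one: ab. It is not a palindrome, so move on to the next step.
  2. Test b extended on both sides: abb is not a palindrome. Then test b plus one: bb is a palindrome, so max_len = 2.
  3. Test bb extended on both sides: abba is a palindrome, so max_len = 4.
  4. Test abba extended on both sides: left < 0, so it is skipped. Then test abba plus one: abbab is not a palindrome, so move on to the next step.
  5. Test bbab extended on both sides: abbabb is not a palindrome. Then test bbab plus one: bbabb is a palindrome, so max_len = 5.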
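
The walkthrough above can be replayed with a small tracing sketch. `trace_basic` is a hypothetical helper written for this illustration; it follows the same scan but checks palindromes with `[::-1]` on the candidate instead of the negative-index slices, and records each step where max_len grows.

```python
def trace_basic(s: str) -> list[tuple[int, str, int]]:
    """Replay the basic scan and record (i, accepted substring, new max_len)."""
    n = len(s)
    start, max_len = 0, 1
    steps = []
    for i in range(1, n):
        left = i - max_len
        # candidate of length max_len + 2 (extended on both sides), if it fits
        both = s[left - 1:i + 1] if left - 1 >= 0 else None
        # candidate of length max_len + 1 (extended on the right only)
        right_only = s[left:i + 1] if left >= 0 else None
        if both is not None and both == both[::-1]:
            start, max_len = left - 1, max_len + 2
            steps.append((i, both, max_len))
        elif right_only is not None and right_only == right_only[::-1]:
            start, max_len = left, max_len + 1
            steps.append((i, right_only, max_len))
    return steps


for step in trace_basic("abbabb"):
    print(step)
# (2, 'bb', 2)
# (3, 'abba', 4)
# (5, 'bbabb', 5)
```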
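This solution looks like O(n) because it has a single loop, but every iteration slices and compares two strings of length about max_len, and Python does both in time proportional to that length. The running time is therefore not O(n); in the worst case it grows as O(n^2).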
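
One rough way to see this is to count how many characters the slices copy. The sketch below is an instrumented copy of the scan; `sliced_chars` is a hypothetical helper, and the count is only a proxy for run time, assuming a slice costs time proportional to its length.

```python
def sliced_chars(s: str) -> int:
    """Count the characters copied by slicing during the basic scan."""
    n = len(s)
    max_len, work = 1, 0
    for i in range(1, n):
        left = i - max_len
        if left - 1 >= 0:
            work += 2 * (max_len + 2)  # forward slice plus reversed slice
            if s[left - 1:i + 1] == s[i - n:left - n - 2:-1]:
                max_len += 2
                continue
        if left >= 0:
            work += 2 * (max_len + 1)  # forward slice plus reversed slice
            if s[left:i + 1] == s[i - n:left - n - 1:-1]:
                max_len += 1
    return work


# 'a' * n + 'b' keeps max_len growing at almost every position
for n in (1000, 2000, 4000):
    print(n, sliced_chars("a" * n + "b"))
```

On this input, doubling n roughly quadruples the count, which is the quadratic growth described above.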
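Next comes Manacher's algorithm. Because "Manacher" sounds like the Chinese phrase 馬拉車 ("horse pulls cart"), it is commonly nicknamed the 馬拉車 algorithm.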

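Manacher's algorithm

An algorithm devised by the computer scientist Manacher.
It inserts # between the characters and at both ends, so the string becomes #a#b#c#b#a# or #a#b#c#c#b#a#. Odd-length and even-length palindromes then both have a single center, which makes the palindrome radius easier to express.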
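- string = #  a  #  b  #  c  #  c  #  b  #  c  #  c  #
- index  = 0  1  2  3  4  5  6  7  8  9  10 11 12 13 14
- When the pointer is at index = 1 $\to$ a, both neighbours are #, (# a #), so the radius of a is 1.
- When the pointer is at index = 6 $\to$ #, the two sides are #b#c and c#b#, (#b#c # c#b#), so the radius of this # is 4.
Because of the inserted #, the radius here equals the length of the palindrome in the original string.
Call the array of radii for all positions p:
- string = #  a  #  b  #  c  #  c  #  b  #  c  #  c  #
- index  = 0  1  2  3  4  5  6  7  8  9  10 11 12 13 14
- p      = 0  1  0  1  0  1  4  1  0  5  0  1  2  1  0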
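
The preprocessing and the radius array can be reproduced with a short sketch. `preprocess` and `radii_naive` are illustrative names; the radii here are computed by plain center expansion at every index, without any of Manacher's shortcuts, so this is only a way to check values of p by machine.

```python
def preprocess(s: str) -> str:
    """Insert '#' between characters and at both ends."""
    return "#" + "#".join(s) + "#"


def radii_naive(t: str) -> list[int]:
    """Palindrome radius at every index of t, by plain center expansion."""
    n = len(t)
    p = [0] * n
    for i in range(n):
        left, right = i - 1, i + 1
        # expand while the characters on both sides match
        while left >= 0 and right < n and t[left] == t[right]:
            p[i] += 1
            left -= 1
            right += 1
    return p


t = preprocess("abccbcc")
print(t)               # #a#b#c#c#b#c#c#
print(radii_naive(t))  # [0, 1, 0, 1, 0, 1, 4, 1, 0, 5, 0, 1, 2, 1, 0]
```

The largest radius, 5 at index 9, corresponds to ccbcc in the original string.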
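Manacher's algorithm is built on center expansion:
- Meaning of the symbols:
  - * unknown value
  - T_1 time step
  - ! the max_right pointer, the furthest position reached so far
  - | center, the center of the palindrome that reaches max_right
  - ) mirror, the position symmetric to T about center
Key point: when T < max_right, the mirror position lets us reuse what is already known about the left half of the palindrome and skip comparisons.
1. If p[mirror] < max_right - T, the mirrored palindrome lies strictly inside the current one, so its value is copied directly, e.g. P(T_7).
2. If p[mirror] >= max_right - T, the palindrome may extend beyond max_right, so expansion continues to the right from there, e.g. P(T_9).
3. If max_right has already reached the end of the string, take the minimum of max_right - T and the value from case 1, e.g. P(T_10).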
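
The effect of the mirror can be made visible with a tracing sketch on the same example string abccbcc. `manacher_trace` is a hypothetical helper for illustration; for every index it records the starting value taken from the mirror and the final radius after expansion.

```python
def manacher_trace(s: str) -> list[tuple[int, int, int]]:
    """Return (i, initial p[i] taken from the mirror, final p[i]) per index."""
    t = "#" + "#".join(s) + "#"
    n = len(t)
    p = [0] * n
    center = max_right = 0
    rows = []
    for i in range(n):
        if i < max_right:
            mirror = 2 * center - i
            p[i] = min(max_right - i, p[mirror])
        initial = p[i]  # value before any expansion at this index
        left, right = i - (1 + p[i]), i + (1 + p[i])
        while left >= 0 and right < n and t[left] == t[right]:
            p[i] += 1
            left -= 1
            right += 1
        if i + p[i] > max_right:
            max_right, center = i + p[i], i
        rows.append((i, initial, p[i]))
    return rows


rows = manacher_trace("abccbcc")
for i in (7, 9, 12):
    print(rows[i])
# (7, 1, 1)   copied from the mirror, no further expansion
# (9, 1, 5)   starts at the boundary, then keeps expanding
# (12, 2, 2)  capped by max_right - i
```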

# Manacher's algorithm
class ManacherSolution:
    def longestPalindrome(self, s: str) -> str:
        if len(s) < 2 or s == s[::-1]:  # special cases
            return s
        string = '#' + '#'.join(s) + '#'  # preprocess the string
        n = len(string)
        p = [0] * n  # initialise p
        max_right, center = 0, 0  # paired pointers, must be updated together
        start, max_len = 1, 1  # start index and length of the best palindrome so far, updated together
        for i in range(n):  # i -> index
            if i < max_right:
                mirror = 2 * center - i
                p[i] = min(max_right - i, p[mirror])
            left, right = i - (1 + p[i]), i + (1 + p[i])  # left and right pointers for expansion
            # left >= 0 and right < n keep the pointers in bounds
            # string[left] == string[right] means one more expansion step is possible
            while left >= 0 and right < n and string[left] == string[right]:
                p[i] += 1
                left -= 1
                right += 1
            # after expanding, p[i] is final
            # max_right is p[i] + i, the ! marker in the diagram
            # while i < max_right, palindrome information can be reused
            if i + p[i] > max_right:
                max_right, center = i + p[i], i  # max_right and center must be updated together
            if p[i] > max_len:
                max_len = p[i]
                start = (i - max_len) // 2  # halve, because preprocessing doubled the length
        return s[start: start + max_len]
    
sol = ManacherSolution()
%timeit sol.longestPalindrome(question)==answer

345 ms ± 7.94 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
True

Run time: 345 ms, memory usage: 2.7 MiB

Because it builds the array p, it uses more memory than the basic solution.
In exchange, the time complexity drops to O(n).
