Huffman Tree and Huffman Coding (STL array implementation)

Original post

osc_yny7gjj7

2021-12-25 21:29

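This post is mainly about Huffman coding. A Huffman tree can be built in many ways, for example with a priority queue or with a tree stored in an array, but they all rest on the same heap idea: repeatedly take out the two trees with the smallest weights.

Here I store the input in an array, build the Huffman tree following that heap idea, keep the tree nodes in a vector, and then generate the Huffman codes from the tree.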

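Steps:

1. Build the Huffman tree (take the tree with the smallest weight and the tree with the second smallest weight, join them into a new tree, re-sort, take the next two, and keep repeating).

2. Encode (following the rule "left is 0, right is 1").

3. Decode (the reverse of encoding; not implemented in this post yet, to be added when there is a chance).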
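Step 1 can be traced on the weights alone, before any tree nodes are involved. The following is a minimal sketch, not the code used in this post; the function name `mergeCost` is made up for illustration. It keeps the weights sorted in descending order, merges the last two, and adds up the weight of every newly formed tree.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: repeatedly merges the two smallest weights and
// returns the sum of all merged weights, which equals the weighted
// path length of the resulting Huffman tree.
int mergeCost(std::vector<int> weights)
{
    assert(!weights.empty());
    int total = 0;
    // Descending order keeps the two smallest weights at the back.
    std::sort(weights.begin(), weights.end(), std::greater<int>());
    while (weights.size() > 1)
    {
        int smallest = weights.back();
        weights.pop_back();
        int second = weights.back();
        weights.pop_back();

        int merged = smallest + second;   // weight of the new tree
        total += merged;

        weights.push_back(merged);        // put the new tree back
        std::sort(weights.begin(), weights.end(), std::greater<int>());
    }
    return total;
}
```

For the weights 1 2 3 4 5 the merges are 1+2=3, 3+3=6, 4+5=9 and 6+9=15, so `mergeCost({1, 2, 3, 4, 5})` returns 33.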

Test data in data.txt (the number of symbols, then their weights, then the symbols themselves):

5
1 2 3 4 5
abcde

Result: [screenshot of the program output]

The code is below:

#include <algorithm>
#include <array>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

using namespace std;

#define ARR_SIZE 100   // buffer size

typedef struct Tree
{
    int freq;
    char key;
    Tree *left, *right;
    Tree()
    {
        freq = 0;
        key = '\0';
        left = nullptr;
        right = nullptr;
    }
} Tree, *pTree;
union key_or_point
{
    char key;
    pTree point;
};
enum infor_type
{
    key_s,
    point_s
};
class infor
{
public:
    int freq;          // weight
    key_or_point kp;   // either the symbol or the address of a newly built tree
    infor_type type;   // records which member of key_or_point is in use
    infor()
    {
        freq = 0;
        kp.key = '\0';
        type = key_s;
    }
};

array<infor, ARR_SIZE> arr;   // holds the input data
vector<pTree> trees;          // every tree node that gets created is stored here

int num;   // number of symbols to process

bool cmp(const infor &a, const infor &b)
{
    return a.freq > b.freq;   // descending, so the smallest weights sit at the end
}

void Huffman()
{
    // Sort so that the smallest and second smallest weights are the last two entries
    sort(arr.begin(), arr.begin() + num, cmp);
    int cal = num - 1;
    while (cal > 0)
    {
        pTree pta = new Tree();   // "pt all": parent of the two smallest trees

        // Left subtree of pta: the smallest weight
        if (arr[cal].type == point_s)
        {
            // An address is stored, so this tree is already in the vector;
            // nothing needs to be created again
            pta->left = arr[cal].kp.point;
        }
        else
        {
            pTree ptl = new Tree();
            ptl->freq = arr[cal].freq;
            ptl->key = arr[cal].kp.key;
            trees.push_back(ptl);
            pta->left = ptl;
        }

        // Right subtree of pta: the second smallest weight
        if (arr[cal - 1].type == point_s)
        {
            // An address is stored, so this tree is already in the vector
            pta->right = arr[cal - 1].kp.point;
        }
        else
        {
            pTree ptr = new Tree();
            ptr->freq = arr[cal - 1].freq;
            ptr->key = arr[cal - 1].kp.key;
            trees.push_back(ptr);
            pta->right = ptr;
        }

        pta->freq = arr[cal].freq + arr[cal - 1].freq;
        trees.push_back(pta);   // the combined tree itself

        // Put the new tree back into the array in place of the second smallest
        // entry; its key_or_point now holds a pointer (type point_s)
        arr[cal - 1].kp.point = pta;
        arr[cal - 1].type = point_s;
        arr[cal - 1].freq = pta->freq;

        // In the first iteration three nodes enter the vector (two leaves and
        // their parent); after re-sorting, the new tree is not pushed again
        cal--;
        sort(arr.begin(), arr.begin() + cal + 1, cmp);
    }
}

void traversTree(pTree pt, string st = "")
{
    // Depth-first traversal of the binary tree,
    // following the rule "left is 0, right is 1"
    if (pt->left == nullptr && pt->right == nullptr)
    {
        cout.flags(ios::left);
        cout.width(10);
        cout << st << "  ";
        cout << pt->key << endl;
        return;
    }
    if (pt->left != nullptr)
    {
        st += '0';
        traversTree(pt->left, st);
        st.pop_back();   // drop the '0' again so the right branch does not carry an extra character
    }

    if (pt->right != nullptr)
    {
        st += '1';
        traversTree(pt->right, st);
    }
}

void printCode()
{
    if (trees.empty())
    {
        // With fewer than two symbols no tree is built
        cout << "no Huffman tree was built" << endl;
        return;
    }
    pTree pt = trees.back();   // the last node pushed is the root
    cout << "print HuffmanCode:" << endl;
    traversTree(pt);
}

int main()
{
    ifstream filein("data.txt");
    if (!filein)
    {
        cerr << "cannot open data.txt" << endl;
        return 1;
    }
    cin.rdbuf(filein.rdbuf());   // redirect cin to the file
    cin >> num;                  // number of symbols to process
    if (!cin || num < 1 || num > ARR_SIZE)
    {
        cerr << "invalid symbol count" << endl;
        return 1;
    }
    for (int i = 0; i < num; i++)
    {
        cin >> arr[i].freq;
    }
    for (int i = 0; i < num; i++)
    {
        cin >> arr[i].kp.key;
    }
    Huffman();
    printCode();
    for (pTree p : trees)   // every node lives in the vector, so free them here
    {
        delete p;
    }
    return 0;
}

Analysis:

[Figure: the tree generated from the test data above.]

Only leaf nodes represent actual symbols, so the traversal returns as soon as it reaches a leaf.
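Because only leaves carry symbols, decoding (step 3, which this post leaves unimplemented) can walk from the root, go left on '0' and right on '1', and emit a symbol whenever a leaf is reached. Below is a minimal sketch under that assumption; `Node` and `decode` are illustrative names, not part of the code above.

```cpp
#include <cassert>
#include <string>

// Illustrative node type (an assumption, simpler than the post's Tree).
struct Node
{
    char key;
    Node *left;
    Node *right;
    Node(char k = '\0', Node *l = nullptr, Node *r = nullptr)
        : key(k), left(l), right(r) {}
};

// Walks the tree bit by bit: '0' goes left, '1' goes right.
// Reaching a leaf emits its symbol and restarts from the root.
// Assumes the root is an internal node (at least two symbols).
std::string decode(const Node *root, const std::string &bits)
{
    assert(root != nullptr);
    std::string out;
    const Node *cur = root;
    for (char bit : bits)
    {
        cur = (bit == '0') ? cur->left : cur->right;
        if (cur == nullptr)
            return out;               // malformed input: stop early
        if (cur->left == nullptr && cur->right == nullptr)
        {
            out += cur->key;          // leaf reached: one symbol decoded
            cur = root;
        }
    }
    return out;
}
```

With a hand-built tree in which a = 00, b = 01 and c = 1, calling `decode` on the bit string "00011" yields "abc". No separator between codes is needed, since no code is a prefix of another.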

Summary:

1 Some small programming tricks:

    1.1 Printing debug output: it can be wrapped like this

            #ifdef DEBUG

                cout debug information ....

            #endif
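A small sketch of that pattern follows. The macro name `DEBUG_LOG` and the function `sumWeights` are illustrative, not from the post; building with `-DDEBUG` turns the trace on, otherwise the macro expands to nothing.

```cpp
#include <cassert>
#include <iostream>
#include <vector>

// Illustrative macro: prints only when the program is built with -DDEBUG.
#ifdef DEBUG
#define DEBUG_LOG(msg) (std::cerr << "[debug] " << msg << std::endl)
#else
#define DEBUG_LOG(msg) ((void)0)
#endif

// Hypothetical function used to show the macro in context.
int sumWeights(const std::vector<int> &weights)
{
    int total = 0;
    for (int w : weights)
    {
        assert(w >= 0);   // frequencies are assumed to be non-negative
        total += w;
        DEBUG_LOG("added " << w << ", running total " << total);
    }
    return total;
}
```

Wrapping the check in one macro keeps the `#ifdef` in a single place instead of around every trace statement.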

    1.2 When code needs to know which member of a union is currently in use, add an enum alongside it to record and flag the union's type.
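The idea in 1.2 in isolation, as a sketch with made-up names (`Slot`, `makeChar`, `makeInt`, `describe`): the enum tag is set whenever a member is written and checked before a member is read. In general, reading a union member other than the one last written is undefined behaviour in C++, which is why the tag is needed.

```cpp
#include <cassert>
#include <string>

enum SlotType { holds_char, holds_int };

struct Slot
{
    SlotType type;        // records which union member is currently valid
    union
    {
        char ch;
        int number;
    };
};

// Each factory writes one member and sets the matching tag.
Slot makeChar(char c) { Slot s; s.type = holds_char; s.ch = c; return s; }
Slot makeInt(int n) { Slot s; s.type = holds_int; s.number = n; return s; }

// Reads only the member that the tag says is active.
std::string describe(const Slot &s)
{
    switch (s.type)
    {
    case holds_char:
        return std::string("char ") + s.ch;
    case holds_int:
        return "int " + std::to_string(s.number);
    }
    assert(false && "unknown tag");
    return "";
}
```

`std::variant` (C++17) packages the same tag-plus-storage idea in the standard library; the code in this post does not use it.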

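2 Reflections on the approach:

    As the source shows, sort is called in two places (once before the loop and once in every iteration). That was the easy way out.

    The improvement I can think of for now is binary insertion, since the data is already sorted.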
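A sketch of that improvement on plain weights, assuming the array is kept in descending order as in the code above. `std::upper_bound` with `std::greater` finds the insertion point by binary search; `insertSorted` is a name made up here.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Inserts `weight` into a vector that is already sorted in descending
// order, keeping it sorted. Illustrative helper, not from the post.
void insertSorted(std::vector<int> &desc, int weight)
{
    assert(std::is_sorted(desc.begin(), desc.end(), std::greater<int>()));
    // First position whose element is smaller than `weight`.
    std::vector<int>::iterator pos =
        std::upper_bound(desc.begin(), desc.end(), weight, std::greater<int>());
    desc.insert(pos, weight);
}
```

The search needs O(log n) comparisons, although shifting the elements on insert is still O(n), so this saves the repeated full sort but not all of the work.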

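    By comparison, I think the priority-queue approach is easier to understand and more efficient, but this post is still a small exploration worth writing down.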
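For comparison, here is a sketch of the priority-queue variant. It is not this post's implementation: `PNode`, `ByFreq`, `collect` and `buildCodes` are illustrative names, and the function assumes at least two symbols. Each push and pop costs O(log n), so no re-sorting is needed between merges.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <queue>
#include <string>
#include <vector>

struct PNode
{
    int freq;
    char key;
    PNode *left;
    PNode *right;
};

// Orders the queue so that the smallest frequency is on top.
struct ByFreq
{
    bool operator()(const PNode *a, const PNode *b) const
    {
        return a->freq > b->freq;
    }
};

// Depth-first walk: append '0' going left, '1' going right.
void collect(const PNode *n, const std::string &prefix,
             std::map<char, std::string> &codes)
{
    if (n->left == nullptr && n->right == nullptr)
    {
        codes[n->key] = prefix;
        return;
    }
    collect(n->left, prefix + '0', codes);
    collect(n->right, prefix + '1', codes);
}

// Builds the tree with a min-heap and returns symbol -> code.
std::map<char, std::string> buildCodes(const std::string &keys,
                                       const std::vector<int> &freqs)
{
    assert(keys.size() == freqs.size() && keys.size() >= 2);
    std::vector<std::unique_ptr<PNode>> pool;   // owns every node
    std::priority_queue<PNode *, std::vector<PNode *>, ByFreq> heap;

    for (std::size_t i = 0; i < keys.size(); ++i)
    {
        pool.emplace_back(new PNode{freqs[i], keys[i], nullptr, nullptr});
        heap.push(pool.back().get());
    }
    while (heap.size() > 1)
    {
        PNode *a = heap.top(); heap.pop();   // smallest
        PNode *b = heap.top(); heap.pop();   // second smallest
        pool.emplace_back(new PNode{a->freq + b->freq, '\0', a, b});
        heap.push(pool.back().get());
    }
    std::map<char, std::string> codes;
    collect(heap.top(), "", codes);
    return codes;
}
```

For the test data (symbols abcde with weights 1 2 3 4 5) this gives 3-bit codes for a and b and 2-bit codes for c, d and e. The exact bits depend on how ties between equal weights are broken, so they may differ from the output of the array version.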

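3 Closing thoughts:

    This is my first post since joining cnblogs; if anything is lacking or wrong, please point it out.

That's all.

Original source: https://www.cnblogs.com/virgildevil/p/10349693.html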
