[C++] The MD5 Message-Digest Algorithm: Principles and Implementation

References:
1. R. Rivest, RFC 1321, "The MD5 Message-Digest Algorithm"
2. Web security course slides by 蔡國揚, Sun Yat-sen University

Algorithm Overview

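MD5 works in little-endian byte order. It accepts a message of arbitrary length, processes it in 512-bit blocks, maintains four 32-bit words of state, and finally concatenates those four words into a fixed 128-bit message digest.
The overall procedure is: pad the message, append its length, initialize the chaining variables, run the compression loop over every block, and output the result.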

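RFC 1321 divides the algorithm into five steps. I give an example for each step to make the details easier to follow. Note that, unless stated otherwise, all lengths below are in bits.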

[Figure: basic flowchart of MD5]

Step 1: Append Padding Bits

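Pad the end of the original message so that the padded length in bits is congruent to 448 modulo 512.
The rule is: append a single 1 bit, then fill the rest with 0 bits. Padding is mandatory: even if the original length is already congruent to 448 modulo 512, padding is still added (a full 512 bits in that case). The padding is therefore at least 1 bit and at most 512 bits long.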

For example, the message 12345678 is 8 * 8 = 64 bits long, so 384 bits of padding are needed: a single 1 bit followed by 383 zero bits.

In hexadecimal (0x prefix omitted), the padded message is
31 32 33 34 35 36 37 38 80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
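The padding rule can be sketched in C++ at byte granularity, assuming the message is a whole number of bytes (as in the example). The names `padding_bytes` and `append_padding` are hypothetical helpers for illustration, not part of RFC 1321 or of the implementations later in this post.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Number of padding bytes for a message of `len` bytes. The result is always
// between 1 and 64 and makes (len + padding) % 64 == 56, i.e. 448 bits mod 512.
std::size_t padding_bytes(std::size_t len) {
    std::size_t r = len % 64;
    return r < 56 ? 56 - r : 120 - r;
}

// Returns the message followed by the step 1 padding only: one 0x80 byte
// (a 1 bit and seven 0 bits), then zero bytes. The length field of step 2
// is not appended here.
std::vector<std::uint8_t> append_padding(const std::string& msg) {
    std::vector<std::uint8_t> out(msg.begin(), msg.end());
    out.push_back(0x80);
    out.resize(msg.size() + padding_bytes(msg.size()), 0x00);
    return out;
}
```

Calling `append_padding("12345678")` should give the 56 bytes listed above: the 8 message bytes, 0x80, and 47 zero bytes.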

Step 2: Append Length

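Compute the length of the original message (before padding) and represent it as a 64-bit value b. If the length is $2^{64}$ or more, so that 64 bits cannot hold it, only the low-order 64 bits are used. Append b to the end of the message produced by step 1.

The total length is now a multiple of 512 bits, that is, a whole number of groups of 16 32-bit words. Split the message into L blocks of 512 bits each: $Y_0, Y_1, \ldots, Y_{L-1}$.

Note that the length is not appended as a plain 64-bit binary number written from the most significant bit down. Instead, b is treated as two 32-bit words: the low-order word is appended first, then the high-order word, and each word is written in little-endian byte order.

Little-endian: the low-order byte is stored at the lowest memory address and the high-order byte at the highest.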

For example, the message 12345678 is 8 * 8 = 64 bits long, which as a 64-bit binary number is 00000000 00000000 00000000 00000000 00000000 00000000 00000000 01000000. Split into two 32-bit words:

High word: 00000000 00000000 00000000 00000000
Low word: 00000000 00000000 00000000 01000000

The low word in little-endian byte order is 01000000 00000000 00000000 00000000.

So the 64 bits to append are 01000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000.
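Writing the low word first, with each word little-endian, amounts to emitting the 64-bit length least significant byte first. A minimal sketch of that encoding follows; `encode_length` is a hypothetical name used only for this example.

```cpp
#include <array>
#include <cstdint>

// Encodes the message length in bits as 8 bytes, least significant byte first.
// std::uint64_t arithmetic wraps modulo 2^64, which matches the rule of keeping
// only the low-order 64 bits of the length.
std::array<std::uint8_t, 8> encode_length(std::uint64_t bit_len) {
    std::array<std::uint8_t, 8> out{};
    for (int i = 0; i < 8; ++i) {
        out[i] = static_cast<std::uint8_t>(bit_len >> (8 * i));
    }
    return out;
}
```

For the 64-bit example message, `encode_length(64)` yields the bytes 40 00 00 00 00 00 00 00.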

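Step 3: Initialize MD Buffer

Initialize a 128-bit MD buffer, viewed as four 32-bit registers (A, B, C, D), which holds the running digest during the iteration.

The four registers are initialized to the following hexadecimal values. The bytes are listed low-order first, so the value of each word is its byte sequence read in reverse:

Word	Bytes (low-order first)	Value as a 32-bit word
A	01 23 45 67	0x67452301
B	89 AB CD EF	0xEFCDAB89
C	FE DC BA 98	0x98BADCFE
D	76 54 32 10	0x10325476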

Step 4: Process Message in 16-Word Blocks

First, define four round functions. Each takes three 32-bit words as input and produces one 32-bit word.

Function	Result
F(X,Y,Z)	(X∧Y)∨(¬X∧Z)
G(X,Y,Z)	(X∧Z)∨(Y∧¬Z)
H(X,Y,Z)	X⊕Y⊕Z
I(X,Y,Z)	Y⊕(X∨¬Z)
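The table maps directly onto C++ bitwise operators. The following is a small sketch, assuming `std::uint32_t` as the word type; the alias `Word` is introduced only for this example.

```cpp
#include <cstdint>

using Word = std::uint32_t;  // one 32-bit word (alias assumed for this sketch)

// F: bitwise "if x then y else z".
constexpr Word F(Word x, Word y, Word z) { return (x & y) | (~x & z); }
// G: bitwise "if z then x else y".
constexpr Word G(Word x, Word y, Word z) { return (x & z) | (y & ~z); }
// H: bitwise parity of the three inputs.
constexpr Word H(Word x, Word y, Word z) { return x ^ y ^ z; }
// I: y XOR (x OR NOT z).
constexpr Word I(Word x, Word y, Word z) { return y ^ (x | ~z); }
```

Because every operation is bitwise, each output bit depends only on the three input bits at the same position.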

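Each 512-bit block $Y_q$ ($q = 0, 1, \ldots, L-1$) from step 2 is run through a four-round compression function, written $H_{MD5}$, which iteratively updates the MD buffer initialized in step 3. The initial buffer is $CV_0 = IV$; the buffer after processing block $Y_{q-1}$ is $CV_q = H_{MD5}(CV_{q-1}, Y_{q-1})$ for $q = 1, \ldots, L$; and the final output is $CV_L$.

Each 512-bit block is further split into 16 words of 32 bits, written X[0], X[1], ..., X[15].

Important: X[k] is not formed by reading 32 bits in sequence. The 4 bytes read must be assembled in little-endian order.

For example, for the message 12345678, after the padding of steps 1 and 2 the hexadecimal representation (0x prefix omitted) is
31 32 33 34 35 36 37 38 80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00

Reading X[0] as 32 sequential bits would give 0x31323334, but the value actually used in the computation is 0x34333231.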

X[0…7] = {0x34333231, 0x38373635, 0x00000080, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000}
X[8…15] = {0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000040, 0x00000000}
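The little-endian conversion can be sketched as follows. `decode_block` is a hypothetical helper name; it assembles each word with shifts, so it does not depend on the byte order of the machine it runs on.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Decodes one 64-byte block into X[0..15], reading every group of 4 bytes
// in little-endian order (first byte is the least significant).
std::array<std::uint32_t, 16> decode_block(const std::uint8_t* block) {
    std::array<std::uint32_t, 16> x{};
    for (std::size_t k = 0; k < 16; ++k) {
        const std::uint8_t* p = block + 4 * k;
        x[k] = static_cast<std::uint32_t>(p[0])
             | static_cast<std::uint32_t>(p[1]) << 8
             | static_cast<std::uint32_t>(p[2]) << 16
             | static_cast<std::uint32_t>(p[3]) << 24;
    }
    return x;
}
```

Applied to the padded block of the example, this should reproduce the X[0…15] values listed above, including X[14] = 0x00000040 for the length field.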

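The steps of $H_{MD5}$ are roughly:

1. Take as input the previous 128-bit result $CV_{q-1}$ and the current block $Y_{q-1}$.
2. Starting from $CV_{q-1}$, run 16 iterations using round function F, entries T[1…16], and $X[i]$.
3. On the result of step 2, run 16 iterations using round function G, entries T[17…32], and $X[\rho_2(i)]$.
4. On the result of step 3, run 16 iterations using round function H, entries T[33…48], and $X[\rho_3(i)]$.
5. On the result of step 4, run 16 iterations using round function I, entries T[49…64], and $X[\rho_4(i)]$.
6. Add the four 32-bit words of $CV_{q-1}$ to the four 32-bit words produced by step 5, word by word and modulo $2^{32}$, to obtain $CV_q$.

Addition modulo $2^{32}$ means adding two 32-bit words and discarding any carry into bit 33. For example, 0xFFFFFFFF + 0x00000001 = 0x00000000.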
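In C++ the modular addition needs no special handling, since unsigned 32-bit arithmetic already wraps around. The sketch below illustrates this together with the final word-by-word addition of the compression function; `State`, `add_mod32`, and `feed_forward` are names assumed for this example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using State = std::array<std::uint32_t, 4>;  // (A, B, C, D)

// Addition modulo 2^32: the carry out of bit 31 is simply dropped.
std::uint32_t add_mod32(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>(a + b);
}

// Final addition of the compression function: combines the previous chaining
// value with the output of the four rounds, word by word.
State feed_forward(const State& cv_prev, const State& rounds_out) {
    State cv{};
    for (std::size_t i = 0; i < 4; ++i) {
        cv[i] = add_mod32(cv_prev[i], rounds_out[i]);
    }
    return cv;
}
```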

[Figure: flowchart of H_MD5]

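In steps 2 to 5 of $H_{MD5}$, each of the 16 operations in a round has the form

$$a \leftarrow b + \bigl((a + g(b, c, d) + X[k] + T[i]) \lll s\bigr)$$

where:

a, b, c, d: the current values of the MD buffer (A, B, C, D)
g: the round function (one of F, G, H, I)
$\lll s$: rotate the 32-bit input left by s bits (written <<< s in RFC 1321)
X[k]: the k-th 32-bit word of the message block being processed
T[i]: the i-th element of the T table, a 32-bit word
+: addition modulo $2^{32}$

After each operation, the MD buffer is rotated right by one register. If the buffer after the operation is (AA, BB, CC, DD), the rotation sets A = DD, B = AA, C = BB, D = CC.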
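One operation plus the register rotation might look like the sketch below. The names `rotl` and `md5_step` are hypothetical, and the caller is assumed to pass the already evaluated round function value g(B, C, D) along with X[k], T[i], and the shift amount s.

```cpp
#include <array>
#include <cstdint>

using Word = std::uint32_t;

// Rotates a 32-bit word left by s bits; s is assumed to satisfy 0 < s < 32,
// which holds for every shift amount used by MD5.
Word rotl(Word v, unsigned s) {
    return (v << s) | (v >> (32 - s));
}

// Applies one operation to state = {A, B, C, D}.
// g_value is g(B, C, D), x is X[k], t is T[i]; all sums wrap modulo 2^32.
void md5_step(std::array<Word, 4>& state, Word g_value, Word x, Word t, unsigned s) {
    const Word a = state[0], b = state[1], c = state[2], d = state[3];
    const Word new_a = b + rotl(a + g_value + x + t, s);
    // Rotate the registers right: (A, B, C, D) <- (D, new A, B, C).
    state = {d, new_a, b, c};
}
```

Rotating the registers after every operation lets the same update expression be reused 64 times, instead of spelling out the four argument orders (abcd, dabc, cdab, bcda) by hand.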

[Figure: flowchart of a single operation]

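X[k] in Each Round

Round F: $X[i]$, i = 0, 1, …, 15
Round G: $X[\rho_2(i)]$, $\rho_2(i) = (1 + 5i) \bmod 16$, i = 0, 1, …, 15
Round H: $X[\rho_3(i)]$, $\rho_3(i) = (5 + 3i) \bmod 16$, i = 0, 1, …, 15
Round I: $X[\rho_4(i)]$, $\rho_4(i) = 7i \bmod 16$, i = 0, 1, …, 15

As a table, the values of k are, in order:

Round function	k in X[k], in order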
F	[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
G	[1, 6, 11, 0, 5, 10, 15, 4, 9, 14, 3, 8, 13, 2, 7, 12]
H	[5, 8, 11, 14, 1, 4, 7, 10, 13, 0, 3, 6, 9, 12, 15, 2]
I	[0, 7, 14, 5, 12, 3, 10, 1, 8, 15, 6, 13, 4, 11, 2, 9]
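The table can be regenerated from the three formulas, which is a handy cross-check when typing the indices by hand. `message_index` and `round_order` are hypothetical names for this sketch; rounds are numbered 0 to 3 for F, G, H, I.

```cpp
#include <array>

// Returns k for operation i (0..15) of the given round
// (0 = F, 1 = G, 2 = H, 3 = I).
int message_index(int round, int i) {
    switch (round) {
        case 0:  return i;
        case 1:  return (1 + 5 * i) % 16;
        case 2:  return (5 + 3 * i) % 16;
        default: return (7 * i) % 16;
    }
}

// Builds the 16 indices used by one round, in the order they are consumed.
std::array<int, 16> round_order(int round) {
    std::array<int, 16> order{};
    for (int i = 0; i < 16; ++i) {
        order[i] = message_index(round, i);
    }
    return order;
}
```

Since 5, 3, and 7 are all coprime to 16, each round visits every one of the 16 message words exactly once.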

Generating the T Table

$T[i] = \lfloor 2^{32} \cdot |\sin(i)| \rfloor$, for i = 1, …, 64, with i in radians

T[1…8] = {0xd76aa478, 0xe8c7b756, 0x242070db, 0xc1bdceee, 0xf57c0faf, 0x4787c62a, 0xa8304613, 0xfd469501}
T[9…16] = {0x698098d8, 0x8b44f7af, 0xffff5bb1, 0x895cd7be, 0x6b901122, 0xfd987193, 0xa679438e, 0x49b40821}
T[17…24] = {0xf61e2562, 0xc040b340, 0x265e5a51, 0xe9b6c7aa, 0xd62f105d, 0x02441453, 0xd8a1e681, 0xe7d3fbc8}
T[25…32] = {0x21e1cde6, 0xc33707d6, 0xf4d50d87, 0x455a14ed, 0xa9e3e905, 0xfcefa3f8, 0x676f02d9, 0x8d2a4c8a}
T[33…40] = {0xfffa3942, 0x8771f681, 0x6d9d6122, 0xfde5380c, 0xa4beea44, 0x4bdecfa9, 0xf6bb4b60, 0xbebfbc70}
T[41…48] = {0x289b7ec6, 0xeaa127fa, 0xd4ef3085, 0x04881d05, 0xd9d4d039, 0xe6db99e5, 0x1fa27cf8, 0xc4ac5665}
T[49…56] = {0xf4292244, 0x432aff97, 0xab9423a7, 0xfc93a039, 0x655b59c3, 0x8f0ccc92, 0xffeff47d, 0x85845dd1}
T[57…64] = {0x6fa87e4f, 0xfe2ce6e0, 0xa3014314, 0x4e0811a1, 0xf7537e82, 0xbd3af235, 0x2ad7d2bb, 0xeb86d391}
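The table can also be computed at start-up rather than typed in. The sketch below assumes IEEE 754 double precision, which is ample for this formula; `make_t_table` is a hypothetical name.

```cpp
#include <array>
#include <cmath>
#include <cstdint>

// Computes T[1..64]. Index 0 is left unused so that the numbering matches
// the 1-based indices of RFC 1321.
std::array<std::uint32_t, 65> make_t_table() {
    std::array<std::uint32_t, 65> t{};
    for (int i = 1; i <= 64; ++i) {
        // 4294967296.0 is 2^32 and i is an angle in radians.
        // The cast keeps the integer part of the product.
        const double v = 4294967296.0 * std::fabs(std::sin(static_cast<double>(i)));
        t[i] = static_cast<std::uint32_t>(v);
    }
    return t;
}
```

A hard-coded table avoids the floating-point dependency altogether, which is the choice made by the second implementation later in this post.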

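Step 5: Output

The digest is produced from the final MD buffer (A, B, C, D): output A through D in that order, each word from its low-order byte to its high-order byte.

For example, for the message 12345678 the steps above yield (A, B, C, D) = {0xd25ad525, 0x0a40aa83, 0x6dc764f4, 0xad073c71}

A: outputs 25 d5 5a d2
B: outputs 83 aa 40 0a
C: outputs f4 64 c7 6d
D: outputs 71 3c 07 ad

The final digest is 25d55ad283aa400af464c76d713c07ad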
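The output step can be sketched as a small serializer. `digest_hex` is a hypothetical helper, and printing lowercase digits is a choice made for this example (the implementation below prints uppercase).

```cpp
#include <array>
#include <cstdint>
#include <string>

// Serializes (A, B, C, D) as 32 lowercase hex digits, emitting the low-order
// byte of each word first.
std::string digest_hex(const std::array<std::uint32_t, 4>& state) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::uint32_t word : state) {
        for (int j = 0; j < 4; ++j) {
            const unsigned byte = (word >> (8 * j)) & 0xFFu;
            out += digits[byte >> 4];
            out += digits[byte & 0x0Fu];
        }
    }
    return out;
}
```

Passing the four example words {0xd25ad525, 0x0a40aa83, 0x6dc764f4, 0xad073c71} should return the digest string shown above.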

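C++ Implementation

1. My own implementation (cannot hash a message of unknown length)

My implementation follows the five steps of the RFC one by one and does not borrow from the reference code in the RFC's appendix. It has one precondition: the whole message and its length must be known up front, because the message is padded in the first two steps and only then processed block by block. It therefore cannot hash a message whose length is not known in advance, which is a limitation. For that reason, L. Peter Deutsch's implementation is given afterwards.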

MD5.hpp

/*
 *   file: md5.hpp
 *   author: Els-y
 *   time: 2017-10-16 21:08:21
*/
// Identifiers that begin with an underscore and an uppercase letter are reserved
#ifndef MD5_HPP
#define MD5_HPP

#include <string>
#include <vector>
#include <cstring>
#include <cmath>
#include <iostream>
#include <bitset>
using std::string;
using std::vector;
using std::bitset;
using std::cout;
using std::endl;
using std::sin;
using std::abs;

// default little-endian
class MD5 {
public:
    MD5();
    ~MD5();
    string encrypt(string plain);
    // Print the padded message
    void print_buff();

private:
    // 128-bit MD buffer, md[0...3] = {A, B, C, D}
    vector<unsigned int> md;
    // Holds the padded message
    unsigned char* buffer;
    // Length of the padded message, in bytes
    unsigned int buffer_len;
    // Array of pointers to the 4 round functions
    unsigned int (MD5::*round_funcs[4])(unsigned int, unsigned int, unsigned int);

    // Initialize the MD buffer
    void init_md();
    // Append the padding bits and the length
    void padding(string plain);
    void clear();
    void h_md5(int groupid);
    // The 4 round functions
    unsigned int f_rf(unsigned int x, unsigned int y, unsigned int z);
    unsigned int g_rf(unsigned int x, unsigned int y, unsigned int z);
    unsigned int h_rf(unsigned int x, unsigned int y, unsigned int z);
    unsigned int i_rf(unsigned int x, unsigned int y, unsigned int z);
    // Return the MD buffer converted to a hex digest string
    string md2str();
    // Return the word X formed little-endian from the four bytes buffer[pos .. pos + 3]
    unsigned int uchar2uint(int pos);
    // Return the hexadecimal string for an unsigned char
    string uchar2hex(unsigned char uch);
    // Return val rotated left by bits positions
    unsigned int cycle_left_shift(unsigned int val, int bits);
    // Return the index into X for step `step` of round `round`
    int get_x_index(int round, int step);
};

#endif

MD5.cpp

/*
 *   file: md5.cpp
 *   author: Els-y
 *   time: 2017-10-16 21:08:21
*/
#include "MD5.hpp"

/* -- public --*/
MD5::MD5() {
    buffer = NULL;
    round_funcs[0] = &MD5::f_rf;
    round_funcs[1] = &MD5::g_rf;
    round_funcs[2] = &MD5::h_rf;
    round_funcs[3] = &MD5::i_rf;
}

MD5::~MD5() {
    clear();
}

string MD5::encrypt(string plain) {
    init_md();
    clear();
    padding(plain);

    int group_len = buffer_len / 64;

    for (int i = 0; i < group_len; ++i) h_md5(i);

    return md2str();
}

void MD5::print_buff() {
    cout << "buffer_len = " << buffer_len << endl;
    for (int i = 0; i < buffer_len; ++i) {
        bitset<8> ch = buffer[i];
        cout << ch << " ";
    }
    cout << endl;
}

/* -- private --*/
// Initialize the MD buffer
void MD5::init_md() {
    md = vector<unsigned int>({0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476});
}

// Append the padding bits and the 64-bit length
void MD5::padding(string plain) {
    unsigned int plain_len = plain.size();
    unsigned long long plain_bits_len = plain.size() * 8ULL;
    unsigned int fill_bits_len = plain_bits_len % 512 == 448 ? 512 : (960 - plain_bits_len % 512) % 512;
    unsigned int fill_len = fill_bits_len / 8;
    buffer_len = plain_len + fill_len + 8;
    buffer = new unsigned char[buffer_len];

    // Copy the original message
    for (unsigned int i = 0; i < plain_len; ++i) buffer[i] = plain[i];

    // Padding: 0x80 (a 1 bit followed by seven 0 bits), then zero bytes
    buffer[plain_len] = 0x80;
    for (unsigned int i = 1; i < fill_len; ++i) buffer[plain_len + i] = 0;

    // Append the original length in bits, least significant byte first
    for (int i = 0; i < 8; ++i) {
        unsigned char ch = plain_bits_len;
        buffer[plain_len + fill_len + i] = ch;
        plain_bits_len >>= 8;
    }
}

void MD5::clear() {
    if (buffer != NULL) {
        delete []buffer;
        buffer = NULL;
    }
}

void MD5::h_md5(int groupid) {
    int buff_begin = 64 * groupid;

    unsigned int next;
    vector<unsigned int> last_md(md);

    const unsigned int CYCLE_BITS[4][4] = {
        {7, 12, 17, 22},
        {5, 9, 14, 20},
        {4, 11, 16, 23},
        {6, 10, 15, 21}
    };

    // round = [0, 1, 2, 3] corresponds to rounds [F, G, H, I]
    for (int round = 0; round < 4; ++round) {
        for (int i = 0; i < 16; ++i) {
            unsigned int x = uchar2uint(buff_begin + get_x_index(round, i) * 4);
            unsigned int t = 0x100000000UL * abs(sin(round * 16 + i + 1));
            next = md[1] + cycle_left_shift(md[0] + (this->*round_funcs[round])(md[1], md[2], md[3]) + x + t, CYCLE_BITS[round][i % 4]);
            // Rotate (A, B, C, D) right by one register
            md[0] = md[3];
            md[3] = md[2];
            md[2] = md[1];
            md[1] = next;
        }
    }

    for (int i = 0; i < 4; ++i) md[i] += last_md[i];
}

// The 4 round functions
unsigned int MD5::f_rf(unsigned int x, unsigned int y, unsigned int z) {
    return (x & y) | (~x & z);
}

unsigned int MD5::g_rf(unsigned int x, unsigned int y, unsigned int z) {
    return (x & z) | (y & ~z);
}

unsigned int MD5::h_rf(unsigned int x, unsigned int y, unsigned int z) {
    return x ^ y ^ z;
}

unsigned int MD5::i_rf(unsigned int x, unsigned int y, unsigned int z) {
    return y ^ (x | ~z);
}

// Return the MD buffer converted to a hex digest string
string MD5::md2str() {
    string res;

    for (int i = 0; i < 4; ++i) {
        unsigned int val = md[i];
        for (int j = 0; j < 4; ++j) {
            unsigned char ch = val;
            res += uchar2hex(ch);
            val >>= 8;
        }
    }

    return res;
}

// Return the word X formed little-endian from the four bytes buffer[pos .. pos + 3]
unsigned int MD5::uchar2uint(int pos) {
    unsigned int val = 0;
    int end = pos + 3;
    for (int i = end; i >= pos; --i) {
        val |= buffer[i];
        if (i != pos) val <<= 8;
    }
    return val;
}

// Return the hexadecimal string for an unsigned char
string MD5::uchar2hex(unsigned char uch) {
    string res;
    unsigned char mask = 0x0F;

    for (int i = 1; i >= 0; --i) {
        char ch = uch >> (i << 2) & mask;
        if (ch < 10) ch += '0';
        else ch += 'A' - 10;
        res += ch;
    }

    return res;
}

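// Return val rotated left by bits positions
unsigned int MD5::cycle_left_shift(unsigned int val, int bits) {
    bits %= 32;
    // Shifting a 32-bit value by 32 is undefined, so handle a zero rotation separately
    if (bits == 0) return val;
    return (val << bits) | (val >> (32 - bits));
}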

// Return the index into X for step `step` of round `round`
int MD5::get_x_index(int round, int step) {
    if (round == 0) {
        return step;
    } else if (round == 1) {
        return (1 + 5 * step) % 16;
    } else if (round == 2) {
        return (5 + 3 * step) % 16;
    } else {
        return (7 * step) % 16;
    }
}

main.cpp

#include <iostream>
#include "MD5.hpp"
using namespace std;

int main() {
    MD5 md5;

    string plain = "12345678";
    string cipher = md5.encrypt(plain);

    cout << "plain: " << plain << endl;
    cout << "cipher: " << cipher << endl;

    return 0;
}

Output:

plain: 12345678
cipher: 25D55AD283AA400AF464C76D713C07AD

2. An implementation that can hash a message of unknown length

md5.h

/*
  Copyright (C) 1999, 2002 Aladdin Enterprises.  All rights reserved.

  This software is provided 'as-is', without any express or implied
  warranty.  In no event will the authors be held liable for any damages
  arising from the use of this software.

  Permission is granted to anyone to use this software for any purpose,
  including commercial applications, and to alter it and redistribute it
  freely, subject to the following restrictions:

  1. The origin of this software must not be misrepresented; you must not
     claim that you wrote the original software. If you use this software
     in a product, an acknowledgment in the product documentation would be
     appreciated but is not required.
  2. Altered source versions must be plainly marked as such, and must not be
     misrepresented as being the original software.
  3. This notice may not be removed or altered from any source distribution.

  L. Peter Deutsch
  [email protected]

 */
/* $Id: md5.h,v 1.2 2007/12/24 05:58:37 lilyco Exp $ */
/*
  Independent implementation of MD5 (RFC 1321).

  This code implements the MD5 Algorithm defined in RFC 1321, whose
  text is available at
    http://www.ietf.org/rfc/rfc1321.txt
  The code is derived from the text of the RFC, including the test suite
  (section A.5) but excluding the rest of Appendix A.  It does not include
  any code or documentation that is identified in the RFC as being
  copyrighted.

  The original and principal author of md5.h is L. Peter Deutsch
  <[email protected]>.  Other authors are noted in the change history
  that follows (in reverse chronological order):

  2002-04-13 lpd Removed support for non-ANSI compilers; removed
    references to Ghostscript; clarified derivation from RFC 1321;
    now handles byte order either statically or dynamically.
  1999-11-04 lpd Edited comments slightly for automatic TOC extraction.
  1999-10-18 lpd Fixed typo in header comment (ansi2knr rather than md5);
    added conditionalization for C++ compilation from Martin
    Purschke <[email protected]>.
  1999-05-03 lpd Original version.
 */

#ifndef md5_INCLUDED
#  define md5_INCLUDED

/*
 * This package supports both compile-time and run-time determination of CPU
 * byte order.  If ARCH_IS_BIG_ENDIAN is defined as 0, the code will be
 * compiled to run only on little-endian CPUs; if ARCH_IS_BIG_ENDIAN is
 * defined as non-zero, the code will be compiled to run only on big-endian
 * CPUs; if ARCH_IS_BIG_ENDIAN is not defined, the code will be compiled to
 * run on either big- or little-endian CPUs, but will run slightly less
 * efficiently on either one than if ARCH_IS_BIG_ENDIAN is defined.
 */

typedef unsigned char md5_byte_t; /* 8-bit byte */
typedef unsigned int md5_word_t; /* 32-bit word */

/* Define the state of the MD5 Algorithm. */
typedef struct md5_state_s {
    md5_word_t count[2];    /* message length in bits, lsw first */
    md5_word_t abcd[4];        /* digest buffer */
    md5_byte_t buf[64];        /* accumulate block */
} md5_state_t;

#ifdef __cplusplus
extern "C" 
{
#endif

/* Initialize the algorithm. */
void md5_init(md5_state_t *pms);

/* Append a string to the message. */
void md5_append(md5_state_t *pms, const md5_byte_t *data, int nbytes);

/* Finish the message and return the digest. */
void md5_finish(md5_state_t *pms, md5_byte_t digest[16]);

#ifdef __cplusplus
}  /* end extern "C" */
#endif

#endif /* md5_INCLUDED */

md5.cpp

/*
  Copyright (C) 1999, 2000, 2002 Aladdin Enterprises.  All rights reserved.

  This software is provided 'as-is', without any express or implied
  warranty.  In no event will the authors be held liable for any damages
  arising from the use of this software.

  Permission is granted to anyone to use this software for any purpose,
  including commercial applications, and to alter it and redistribute it
  freely, subject to the following restrictions:

  1. The origin of this software must not be misrepresented; you must not
     claim that you wrote the original software. If you use this software
     in a product, an acknowledgment in the product documentation would be
     appreciated but is not required.
  2. Altered source versions must be plainly marked as such, and must not be
     misrepresented as being the original software.
  3. This notice may not be removed or altered from any source distribution.

  L. Peter Deutsch
  [email protected]

 */
/* $Id: md5.cpp,v 1.3 2008/01/20 22:52:04 lilyco Exp $ */
/*
  Independent implementation of MD5 (RFC 1321).

  This code implements the MD5 Algorithm defined in RFC 1321, whose
  text is available at
    http://www.ietf.org/rfc/rfc1321.txt
  The code is derived from the text of the RFC, including the test suite
  (section A.5) but excluding the rest of Appendix A.  It does not include
  any code or documentation that is identified in the RFC as being
  copyrighted.

  The original and principal author of md5.c is L. Peter Deutsch
  <[email protected]>.  Other authors are noted in the change history
  that follows (in reverse chronological order):

  2002-04-13 lpd Clarified derivation from RFC 1321; now handles byte order
    either statically or dynamically; added missing #include <string.h>
    in library.
  2002-03-11 lpd Corrected argument list for main(), and added int return
    type, in test program and T value program.
  2002-02-21 lpd Added missing #include <stdio.h> in test program.
  2000-07-03 lpd Patched to eliminate warnings about "constant is
    unsigned in ANSI C, signed in traditional"; made test program
    self-checking.
  1999-11-04 lpd Edited comments slightly for automatic TOC extraction.
  1999-10-18 lpd Fixed typo in header comment (ansi2knr rather than md5).
  1999-05-03 lpd Original version.
 */

#include "md5.h"
#include <string.h>

#undef BYTE_ORDER    /* 1 = big-endian, -1 = little-endian, 0 = unknown */
#ifdef ARCH_IS_BIG_ENDIAN
#  define BYTE_ORDER (ARCH_IS_BIG_ENDIAN ? 1 : -1)
#else
#  define BYTE_ORDER 0
#endif

#define T_MASK ((md5_word_t)~0)
#define T1 /* 0xd76aa478 */ (T_MASK ^ 0x28955b87)
#define T2 /* 0xe8c7b756 */ (T_MASK ^ 0x173848a9)
#define T3    0x242070db
#define T4 /* 0xc1bdceee */ (T_MASK ^ 0x3e423111)
#define T5 /* 0xf57c0faf */ (T_MASK ^ 0x0a83f050)
#define T6    0x4787c62a
#define T7 /* 0xa8304613 */ (T_MASK ^ 0x57cfb9ec)
#define T8 /* 0xfd469501 */ (T_MASK ^ 0x02b96afe)
#define T9    0x698098d8
#define T10 /* 0x8b44f7af */ (T_MASK ^ 0x74bb0850)
#define T11 /* 0xffff5bb1 */ (T_MASK ^ 0x0000a44e)
#define T12 /* 0x895cd7be */ (T_MASK ^ 0x76a32841)
#define T13    0x6b901122
#define T14 /* 0xfd987193 */ (T_MASK ^ 0x02678e6c)
#define T15 /* 0xa679438e */ (T_MASK ^ 0x5986bc71)
#define T16    0x49b40821
#define T17 /* 0xf61e2562 */ (T_MASK ^ 0x09e1da9d)
#define T18 /* 0xc040b340 */ (T_MASK ^ 0x3fbf4cbf)
#define T19    0x265e5a51
#define T20 /* 0xe9b6c7aa */ (T_MASK ^ 0x16493855)
#define T21 /* 0xd62f105d */ (T_MASK ^ 0x29d0efa2)
#define T22    0x02441453
#define T23 /* 0xd8a1e681 */ (T_MASK ^ 0x275e197e)
#define T24 /* 0xe7d3fbc8 */ (T_MASK ^ 0x182c0437)
#define T25    0x21e1cde6
#define T26 /* 0xc33707d6 */ (T_MASK ^ 0x3cc8f829)
#define T27 /* 0xf4d50d87 */ (T_MASK ^ 0x0b2af278)
#define T28    0x455a14ed
#define T29 /* 0xa9e3e905 */ (T_MASK ^ 0x561c16fa)
#define T30 /* 0xfcefa3f8 */ (T_MASK ^ 0x03105c07)
#define T31    0x676f02d9
#define T32 /* 0x8d2a4c8a */ (T_MASK ^ 0x72d5b375)
#define T33 /* 0xfffa3942 */ (T_MASK ^ 0x0005c6bd)
#define T34 /* 0x8771f681 */ (T_MASK ^ 0x788e097e)
#define T35    0x6d9d6122
#define T36 /* 0xfde5380c */ (T_MASK ^ 0x021ac7f3)
#define T37 /* 0xa4beea44 */ (T_MASK ^ 0x5b4115bb)
#define T38    0x4bdecfa9
#define T39 /* 0xf6bb4b60 */ (T_MASK ^ 0x0944b49f)
#define T40 /* 0xbebfbc70 */ (T_MASK ^ 0x4140438f)
#define T41    0x289b7ec6
#define T42 /* 0xeaa127fa */ (T_MASK ^ 0x155ed805)
#define T43 /* 0xd4ef3085 */ (T_MASK ^ 0x2b10cf7a)
#define T44    0x04881d05
#define T45 /* 0xd9d4d039 */ (T_MASK ^ 0x262b2fc6)
#define T46 /* 0xe6db99e5 */ (T_MASK ^ 0x1924661a)
#define T47    0x1fa27cf8
#define T48 /* 0xc4ac5665 */ (T_MASK ^ 0x3b53a99a)
#define T49 /* 0xf4292244 */ (T_MASK ^ 0x0bd6ddbb)
#define T50    0x432aff97
#define T51 /* 0xab9423a7 */ (T_MASK ^ 0x546bdc58)
#define T52 /* 0xfc93a039 */ (T_MASK ^ 0x036c5fc6)
#define T53    0x655b59c3
#define T54 /* 0x8f0ccc92 */ (T_MASK ^ 0x70f3336d)
#define T55 /* 0xffeff47d */ (T_MASK ^ 0x00100b82)
#define T56 /* 0x85845dd1 */ (T_MASK ^ 0x7a7ba22e)
#define T57    0x6fa87e4f
#define T58 /* 0xfe2ce6e0 */ (T_MASK ^ 0x01d3191f)
#define T59 /* 0xa3014314 */ (T_MASK ^ 0x5cfebceb)
#define T60    0x4e0811a1
#define T61 /* 0xf7537e82 */ (T_MASK ^ 0x08ac817d)
#define T62 /* 0xbd3af235 */ (T_MASK ^ 0x42c50dca)
#define T63    0x2ad7d2bb
#define T64 /* 0xeb86d391 */ (T_MASK ^ 0x14792c6e)


static void
md5_process(md5_state_t *pms, const md5_byte_t *data /*[64]*/)
{
    md5_word_t
    a = pms->abcd[0], b = pms->abcd[1],
    c = pms->abcd[2], d = pms->abcd[3];
    md5_word_t t;
#if BYTE_ORDER > 0
    /* Define storage only for big-endian CPUs. */
    md5_word_t X[16];
#else
    /* Define storage for little-endian or both types of CPUs. */
    md5_word_t xbuf[16];
    const md5_word_t *X;
#endif

    {
#if BYTE_ORDER == 0
    /*
     * Determine dynamically whether this is a big-endian or
     * little-endian machine, since we can use a more efficient
     * algorithm on the latter.
     */
    static const int w = 1;

    if (*((const md5_byte_t *)&w)) /* dynamic little-endian */
#endif
#if BYTE_ORDER <= 0        /* little-endian */
    {
        /*
         * On little-endian machines, we can process properly aligned
         * data without copying it.
         */
        if (!((data - (const md5_byte_t *)0) & 3)) {
        /* data are properly aligned */
        X = (const md5_word_t *)data;
        } else {
        /* not aligned */
        memcpy(xbuf, data, 64);
        X = xbuf;
        }
    }
#endif
#if BYTE_ORDER == 0
    else            /* dynamic big-endian */
#endif
#if BYTE_ORDER >= 0        /* big-endian */
    {
        /*
         * On big-endian machines, we must arrange the bytes in the
         * right order.
         */
        const md5_byte_t *xp = data;
        int i;

#  if BYTE_ORDER == 0
        X = xbuf;        /* (dynamic only) */
#  else
#    define xbuf X        /* (static only) */
#  endif
        for (i = 0; i < 16; ++i, xp += 4)
        xbuf[i] = xp[0] + (xp[1] << 8) + (xp[2] << 16) + (xp[3] << 24);
    }
#endif
    }

#define ROTATE_LEFT(x, n) (((x) << (n)) | ((x) >> (32 - (n))))

    /* Round 1. */
    /* Let [abcd k s i] denote the operation
       a = b + ((a + F(b,c,d) + X[k] + T[i]) <<< s). */
#define F(x, y, z) (((x) & (y)) | (~(x) & (z)))
#define SET(a, b, c, d, k, s, Ti)\
  t = a + F(b,c,d) + X[k] + Ti;\
  a = ROTATE_LEFT(t, s) + b
    /* Do the following 16 operations. */
    SET(a, b, c, d,  0,  7,  T1);
    SET(d, a, b, c,  1, 12,  T2);
    SET(c, d, a, b,  2, 17,  T3);
    SET(b, c, d, a,  3, 22,  T4);
    SET(a, b, c, d,  4,  7,  T5);
    SET(d, a, b, c,  5, 12,  T6);
    SET(c, d, a, b,  6, 17,  T7);
    SET(b, c, d, a,  7, 22,  T8);
    SET(a, b, c, d,  8,  7,  T9);
    SET(d, a, b, c,  9, 12, T10);
    SET(c, d, a, b, 10, 17, T11);
    SET(b, c, d, a, 11, 22, T12);
    SET(a, b, c, d, 12,  7, T13);
    SET(d, a, b, c, 13, 12, T14);
    SET(c, d, a, b, 14, 17, T15);
    SET(b, c, d, a, 15, 22, T16);
#undef SET

     /* Round 2. */
     /* Let [abcd k s i] denote the operation
          a = b + ((a + G(b,c,d) + X[k] + T[i]) <<< s). */
#define G(x, y, z) (((x) & (z)) | ((y) & ~(z)))
#define SET(a, b, c, d, k, s, Ti)\
  t = a + G(b,c,d) + X[k] + Ti;\
  a = ROTATE_LEFT(t, s) + b
     /* Do the following 16 operations. */
    SET(a, b, c, d,  1,  5, T17);
    SET(d, a, b, c,  6,  9, T18);
    SET(c, d, a, b, 11, 14, T19);
    SET(b, c, d, a,  0, 20, T20);
    SET(a, b, c, d,  5,  5, T21);
    SET(d, a, b, c, 10,  9, T22);
    SET(c, d, a, b, 15, 14, T23);
    SET(b, c, d, a,  4, 20, T24);
    SET(a, b, c, d,  9,  5, T25);
    SET(d, a, b, c, 14,  9, T26);
    SET(c, d, a, b,  3, 14, T27);
    SET(b, c, d, a,  8, 20, T28);
    SET(a, b, c, d, 13,  5, T29);
    SET(d, a, b, c,  2,  9, T30);
    SET(c, d, a, b,  7, 14, T31);
    SET(b, c, d, a, 12, 20, T32);
#undef SET

     /* Round 3. */
     /* Let [abcd k s t] denote the operation
          a = b + ((a + H(b,c,d) + X[k] + T[i]) <<< s). */
#define H(x, y, z) ((x) ^ (y) ^ (z))
#define SET(a, b, c, d, k, s, Ti)\
  t = a + H(b,c,d) + X[k] + Ti;\
  a = ROTATE_LEFT(t, s) + b
     /* Do the following 16 operations. */
    SET(a, b, c, d,  5,  4, T33);
    SET(d, a, b, c,  8, 11, T34);
    SET(c, d, a, b, 11, 16, T35);
    SET(b, c, d, a, 14, 23, T36);
    SET(a, b, c, d,  1,  4, T37);
    SET(d, a, b, c,  4, 11, T38);
    SET(c, d, a, b,  7, 16, T39);
    SET(b, c, d, a, 10, 23, T40);
    SET(a, b, c, d, 13,  4, T41);
    SET(d, a, b, c,  0, 11, T42);
    SET(c, d, a, b,  3, 16, T43);
    SET(b, c, d, a,  6, 23, T44);
    SET(a, b, c, d,  9,  4, T45);
    SET(d, a, b, c, 12, 11, T46);
    SET(c, d, a, b, 15, 16, T47);
    SET(b, c, d, a,  2, 23, T48);
#undef SET

     /* Round 4. */
     /* Let [abcd k s t] denote the operation
          a = b + ((a + I(b,c,d) + X[k] + T[i]) <<< s). */
#define I(x, y, z) ((y) ^ ((x) | ~(z)))
#define SET(a, b, c, d, k, s, Ti)\
  t = a + I(b,c,d) + X[k] + Ti;\
  a = ROTATE_LEFT(t, s) + b
     /* Do the following 16 operations. */
    SET(a, b, c, d,  0,  6, T49);
    SET(d, a, b, c,  7, 10, T50);
    SET(c, d, a, b, 14, 15, T51);
    SET(b, c, d, a,  5, 21, T52);
    SET(a, b, c, d, 12,  6, T53);
    SET(d, a, b, c,  3, 10, T54);
    SET(c, d, a, b, 10, 15, T55);
    SET(b, c, d, a,  1, 21, T56);
    SET(a, b, c, d,  8,  6, T57);
    SET(d, a, b, c, 15, 10, T58);
    SET(c, d, a, b,  6, 15, T59);
    SET(b, c, d, a, 13, 21, T60);
    SET(a, b, c, d,  4,  6, T61);
    SET(d, a, b, c, 11, 10, T62);
    SET(c, d, a, b,  2, 15, T63);
    SET(b, c, d, a,  9, 21, T64);
#undef SET

     /* Then perform the following additions. (That is increment each
        of the four registers by the value it had before this block
        was started.) */
    pms->abcd[0] += a;
    pms->abcd[1] += b;
    pms->abcd[2] += c;
    pms->abcd[3] += d;
}

void
md5_init(md5_state_t *pms)
{
    pms->count[0] = pms->count[1] = 0;
    pms->abcd[0] = 0x67452301;
    pms->abcd[1] = /*0xefcdab89*/ T_MASK ^ 0x10325476;
    pms->abcd[2] = /*0x98badcfe*/ T_MASK ^ 0x67452301;
    pms->abcd[3] = 0x10325476;
}

void
md5_append(md5_state_t *pms, const md5_byte_t *data, int nbytes)
{
    const md5_byte_t *p = data;
    int left = nbytes;
    int offset = (pms->count[0] >> 3) & 63;
    md5_word_t nbits = (md5_word_t)(nbytes << 3);

    if (nbytes <= 0)
    return;

    /* Update the message length. */
    pms->count[1] += nbytes >> 29;
    pms->count[0] += nbits;
    if (pms->count[0] < nbits)
    pms->count[1]++;

    /* Process an initial partial block. */
    if (offset) {
    int copy = (offset + nbytes > 64 ? 64 - offset : nbytes);

    memcpy(pms->buf + offset, p, copy);
    if (offset + copy < 64)
        return;
    p += copy;
    left -= copy;
    md5_process(pms, pms->buf);
    }

    /* Process full blocks. */
    for (; left >= 64; p += 64, left -= 64)
    md5_process(pms, p);

    /* Process a final partial block. */
    if (left)
    memcpy(pms->buf, p, left);
}

void
md5_finish(md5_state_t *pms, md5_byte_t digest[16])
{
    static const md5_byte_t pad[64] = {
    0x80, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
    };
    md5_byte_t data[8];
    int i;

    /* Save the length before padding. */
    for (i = 0; i < 8; ++i)
    data[i] = (md5_byte_t)(pms->count[i >> 2] >> ((i & 3) << 3));
    /* Pad to 56 bytes mod 64. */
    md5_append(pms, pad, ((55 - (pms->count[0] >> 3)) & 63) + 1);
    /* Append the length. */
    md5_append(pms, data, 8);
    for (i = 0; i < 16; ++i)
    digest[i] = (md5_byte_t)(pms->abcd[i >> 2] >> ((i & 3) << 3));
}

main.cpp

#include <cstdio>
#include <cstring>
#include "md5.h"

int main() {
    md5_state_t s;
    char ss[] = "12345678";
    unsigned char result[16];
    md5_init(&s);
    md5_append(&s, (const unsigned char *)ss, strlen(ss));
    md5_finish(&s, (unsigned char *)result);
    for (int i = 0; i < 16; ++i) {
        printf("%x%x", (result[i] >> 4) & 0x0f, result[i] & 0x0f);
    }
    printf("\n");
    return 0;
}

撲街中的二娃
