Expert Day 195

对称加密 — AES设计与GCM模式深度解析

AES 设计原理（SubBytes/ShiftRows/MixColumns/AddRoundKey）、CBC/CTR/GCM 模式对比、IV 重用导致的灾难性后果

2026-11-12

Phase 4 - 经典密码学 (Day 195-208)

密码学对称加密AESGCMIV重用

日期: 2026-11-12 方向: 密码学 / 经典原语阶段: Phase 4 - 经典密码学 (Day 195-208) 标签: #密码学 #对称加密 #AES #GCM #IV重用

今日目标

类型	内容
学习	AES 设计原理（SubBytes/ShiftRows/MixColumns/AddRoundKey）、CBC/CTR/GCM 模式对比、IV 重用导致的灾难性后果
实操	手写 AES-128 块加密 + GCM 认证加密（仅依赖 Python 内置 int 运算）、复现 IV 重用攻击
产出	`sym.py` — 完整对称加密原语库

一、AES 设计与原理

1.1 设计哲学 — Rijndael 的胜出

1997 NIST 公开征集 AES，1998 提交 15 个候选，2000 选定 Rijndael（Daemen & Rijmen）。

胜出原因:

代数结构清晰：基于 GF(2^8) 上的有限域运算
抗差分/线性密码分析：S-box 设计满足 max DDT entry = 4/256，最优分支数
硬件友好：可以用查表（256 字节 S-box）或位切片实现
可证 lower bound：4 轮后至少 25 个 active S-box

1.2 AES 结构 — SPN (Substitution-Permutation Network)

参数	AES-128	AES-192	AES-256
Key size	128 bit	192 bit	256 bit
Block size	128 bit (4×4 字节状态)	128 bit	128 bit
Rounds	10	12	14
Round key	128 bit × 11	128 bit × 13	128 bit × 15

轮函数（除最后一轮）:

state = AddRoundKey(state, k0)
for i in 1..Nr-1:
    state = SubBytes(state)        # 非线性层
    state = ShiftRows(state)       # 字节级置换
    state = MixColumns(state)      # 列混淆（GF(2^8) 上的矩阵乘）
    state = AddRoundKey(state, ki)
# 最后一轮无 MixColumns
state = SubBytes(state)
state = ShiftRows(state)
state = AddRoundKey(state, kNr)

1.3 SubBytes — S-box 设计

S-box 由两步构成：

GF(2^8) 上的乘法逆：$y = x^{-1}$（约定 $0 \mapsto 0$），不可约多项式 $m(x) = x^8 + x^4 + x^3 + x + 1$
仿射变换：$z = Ay + b$，其中 $A$ 是 GF(2) 上的固定 8×8 矩阵，$b = 0x63$

为什么这样设计：

求逆运算的非线性度极高（差分均匀性 = 4，最优）
仿射变换破坏代数简单性（防止线性密码分析）
没有不动点（除被仿射变换打破的）

1.4 ShiftRows / MixColumns / AddRoundKey

ShiftRows: 第 $i$ 行循环左移 $i$ 字节（$i=0,1,2,3$），保证扩散到所有列
MixColumns: 每列乘以 GF(2^8) 上的 MDS 矩阵 $\begin{pmatrix} 2 & 3 & 1 & 1 \ 1 & 2 & 3 & 1 \ 1 & 1 & 2 & 3 \ 3 & 1 & 1 & 2 \end{pmatrix}$
AddRoundKey: 状态 XOR 轮密钥（GF(2) 加法）

MDS 性质：任意 4 字节输入差分被扩散到 ≥ 5 字节输出差分（"分支数 = 5"）。

1.5 Key Schedule — 轮密钥派生

W[0..3] = K (初始密钥分成 4 个 32-bit word)
for i in 4..(4*(Nr+1)):
    temp = W[i-1]
    if i % 4 == 0:
        temp = SubWord(RotWord(temp)) XOR Rcon[i/4]
    W[i] = W[i-4] XOR temp

Rcon[j] = $(2^{j-1}, 0, 0, 0)$ in GF(2^8).

二、加密模式 — 从 ECB 到 GCM

2.1 ECB (Electronic Codebook) — 永远不要用

C_i = AES_K(P_i)

致命问题：相同明文 → 相同密文。著名 Tux 企鹅图加密后仍能看出轮廓。

2.2 CBC (Cipher Block Chaining)

C_0 = IV
C_i = AES_K(P_i XOR C_{i-1})

问题:

串行加密（不可并行）
Padding Oracle Attack（CVE-2013-0169 Lucky 13、POODLE 2014）
IV 必须不可预测（CBC IV 可预测 → BEAST 2011）

2.3 CTR (Counter Mode)

C_i = P_i XOR AES_K(IV || counter_i)

优点: 可并行、流式、无 padding。 关键约束: $(K, IV, \text{counter})$ 三元组绝不能重复。

2.4 GCM (Galois/Counter Mode) — 当代标杆

GCM = CTR 加密 + GHASH 认证（在 GF(2^128) 上的多项式 MAC）。

算法:

$H = AES_K(0^{128})$（哈希子密钥）
$J_0 = IV | 0^{31} | 1$（96-bit IV 时；否则用 GHASH 派生）
CTR 模式加密：$C_i = P_i \oplus AES_K(\text{inc}(J_0, i))$
GHASH：$T = \text{GHASH}_H(A | C | \text{lens}) \oplus AES_K(J_0)$

GHASH 多项式（$X_0 = 0$）: $$X_i = (X_{i-1} \oplus B_i) \cdot H \pmod{x^{128} + x^7 + x^2 + x + 1}$$

GHASH 是 $\epsilon$-XOR-universal（$\epsilon = m/2^{128}$）。

三、安全性论证

3.1 AES 的安全假设

AES 没有"标准困难问题"归约——它是启发式安全：经过 25 年公开分析没有破解。

最佳已知攻击（学术）:

Biclique Attack (2011)：AES-128 复杂度 $2^{126.1}$（vs brute force $2^{128}$）—— 仅理论
Related-key on AES-256 (2009)：$2^{99.5}$ —— 不影响实际使用

3.2 GCM 的安全归约

定理（McGrew-Viega 2004）: 若 AES 是 PRF，GCM 是 IND-CPA + INT-CTXT 安全的，区分优势: $$\text{Adv}^{\text{AE}}{\text{GCM}} \le \text{Adv}^{\text{PRF}}{\text{AES}} + \frac{q^2 \cdot \ell^2}{2^{128}} + \frac{q_v \cdot \ell}{2^{128}}$$

其中 $q$ 是查询数，$\ell$ 是最大消息长度（块），$q_v$ 是验证查询数。

关键边界: $q \cdot \ell$ 不能超过 $2^{32}$（即 64 GB），否则 birthday bound 触发。

3.3 IV 重用的灾难性

GCM IV 重用 → 完全破坏机密性 + 认证性:

机密性: 两条消息 $(M_1, M_2)$ 共享 IV → $C_1 \oplus C_2 = M_1 \oplus M_2$（XOR 流泄漏）
认证性: 两条 IV 重用密文可恢复 GHASH 密钥 $H$，进而伪造任意密文

Joux 论文 (2006): 给定 2 条 IV 重用消息，求解一元多项式方程恢复 $H$。

四、代码实现 — `sym.py`

"""
sym.py - AES-128 + GCM 完整实现（教学用，不要在生产中使用）

设计目标：
- 不依赖 cryptography/pycryptodome 高层 API
- 使用纯 Python int 运算实现 AES 块
- GHASH 在 GF(2^128) 上手写
- 演示 IV 重用攻击
"""
from __future__ import annotations
from typing import Tuple

# --------------------------- AES-128 核心 ---------------------------

# Rijndael S-box (256 entries)
SBOX = bytes.fromhex(
    "637c777bf26b6fc53001672bfed7ab76ca82c97dfa5947f0add4a2af9ca472c0"
    "b7fd9326363ff7cc34a5e5f171d8311504c723c31896059a071280e2eb27b275"
    "09832c1a1b6e5aa0523bd6b329e32f8453d100ed20fcb15b6acbbe394a4c58cf"
    "d0efaafb434d338545f9027f503c9fa851a3408f929d38f5bcb6da2110fff3d2"
    "cd0c13ec5f974417c4a77e3d645d197360814fdc222a908846eeb814de5e0bdb"
    "e0323a0a4906245cc2d3ac629195e479e7c8376d8dd54ea96c56f4ea657aae08"
    "ba78252e1ca6b4c6e8dd741f4bbd8b8a703eb5664803f60e613557b986c11d9e"
    "e1f8981169d98e949b1e87e9ce5528df8ca1890dbfe6426841992d0fb054bb16"
)
RCON = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1b, 0x36]

def gmul(a: int, b: int) -> int:
    """GF(2^8) multiplication, irreducible poly x^8+x^4+x^3+x+1 = 0x11b"""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        hi = a & 0x80
        a = (a << 1) & 0xff
        if hi:
            a ^= 0x1b
        b >>= 1
    return p

def sub_bytes(s: bytearray) -> None:
    for i in range(16):
        s[i] = SBOX[s[i]]

def shift_rows(s: bytearray) -> None:
    # state is column-major: s[r + 4*c]
    s[1], s[5], s[9], s[13] = s[5], s[9], s[13], s[1]
    s[2], s[6], s[10], s[14] = s[10], s[14], s[2], s[6]
    s[3], s[7], s[11], s[15] = s[15], s[3], s[7], s[11]

def mix_columns(s: bytearray) -> None:
    for c in range(4):
        a = [s[c*4 + r] for r in range(4)]
        s[c*4 + 0] = gmul(2, a[0]) ^ gmul(3, a[1]) ^ a[2] ^ a[3]
        s[c*4 + 1] = a[0] ^ gmul(2, a[1]) ^ gmul(3, a[2]) ^ a[3]
        s[c*4 + 2] = a[0] ^ a[1] ^ gmul(2, a[2]) ^ gmul(3, a[3])
        s[c*4 + 3] = gmul(3, a[0]) ^ a[1] ^ a[2] ^ gmul(2, a[3])

def add_round_key(s: bytearray, k: bytes) -> None:
    for i in range(16):
        s[i] ^= k[i]

def key_expand(key: bytes) -> list[bytes]:
    assert len(key) == 16
    W = [bytearray(key[i*4:(i+1)*4]) for i in range(4)]
    for i in range(4, 44):
        temp = bytearray(W[i-1])
        if i % 4 == 0:
            # RotWord + SubWord + Rcon
            temp = bytearray([temp[1], temp[2], temp[3], temp[0]])
            temp = bytearray([SBOX[b] for b in temp])
            temp[0] ^= RCON[i // 4 - 1]
        W.append(bytearray(W[i-4][j] ^ temp[j] for j in range(4)))
    # group into 11 round keys
    rks = []
    for r in range(11):
        rk = bytearray()
        for c in range(4):
            rk += W[r*4 + c]
        rks.append(bytes(rk))
    return rks

def aes_encrypt_block(pt: bytes, key: bytes) -> bytes:
    assert len(pt) == 16 and len(key) == 16
    rks = key_expand(key)
    s = bytearray(pt)
    add_round_key(s, rks[0])
    for r in range(1, 10):
        sub_bytes(s)
        shift_rows(s)
        mix_columns(s)
        add_round_key(s, rks[r])
    sub_bytes(s)
    shift_rows(s)
    add_round_key(s, rks[10])
    return bytes(s)

# --------------------------- GHASH (GF(2^128)) ---------------------------

def gf128_mul(x: int, y: int) -> int:
    """GHASH multiplication: GF(2^128) with poly x^128 + x^7 + x^2 + x + 1
    GCM uses 'reflected' bit ordering — bit 0 is MSB.
    """
    R = 0xe1 << 120  # reduction polynomial in reflected form
    z = 0
    v = y
    for i in range(128):
        if (x >> (127 - i)) & 1:
            z ^= v
        if v & 1:
            v = (v >> 1) ^ R
        else:
            v >>= 1
    return z

def ghash(H: bytes, data: bytes) -> bytes:
    h = int.from_bytes(H, 'big')
    y = 0
    for i in range(0, len(data), 16):
        block = data[i:i+16].ljust(16, b'\x00')
        y ^= int.from_bytes(block, 'big')
        y = gf128_mul(y, h)
    return y.to_bytes(16, 'big')

# --------------------------- AES-GCM ---------------------------

def inc32(block: bytes) -> bytes:
    """Increment last 32 bits of a 16-byte block."""
    prefix, ctr = block[:12], int.from_bytes(block[12:], 'big')
    ctr = (ctr + 1) & 0xffffffff
    return prefix + ctr.to_bytes(4, 'big')

def aes_gcm_encrypt(key: bytes, iv: bytes, plaintext: bytes, aad: bytes = b'') -> Tuple[bytes, bytes]:
    """AES-GCM encrypt. Returns (ciphertext, tag).
    IV must be 12 bytes (96 bit) — the standard case.
    """
    assert len(key) == 16 and len(iv) == 12
    H = aes_encrypt_block(b'\x00'*16, key)
    J0 = iv + b'\x00\x00\x00\x01'

    # CTR encryption starting from inc32(J0)
    ct = bytearray()
    counter = inc32(J0)
    for i in range(0, len(plaintext), 16):
        block = plaintext[i:i+16]
        ks = aes_encrypt_block(counter, key)
        ct += bytes(b ^ k for b, k in zip(block, ks[:len(block)]))
        counter = inc32(counter)

    # GHASH over AAD || CT || lengths
    def pad16(d: bytes) -> bytes:
        if len(d) % 16 == 0:
            return d
        return d + b'\x00' * (16 - len(d) % 16)

    lens = (len(aad)*8).to_bytes(8, 'big') + (len(ct)*8).to_bytes(8, 'big')
    ghash_input = pad16(aad) + pad16(bytes(ct)) + lens
    S = ghash(H, ghash_input)
    tag = bytes(s ^ k for s, k in zip(S, aes_encrypt_block(J0, key)))
    return bytes(ct), tag

def aes_gcm_decrypt(key: bytes, iv: bytes, ct: bytes, tag: bytes, aad: bytes = b'') -> bytes:
    """AES-GCM decrypt with authentication."""
    H = aes_encrypt_block(b'\x00'*16, key)
    J0 = iv + b'\x00\x00\x00\x01'

    # verify tag first (constant-time compare in real code)
    def pad16(d):
        return d + b'\x00' * ((-len(d)) % 16)
    lens = (len(aad)*8).to_bytes(8, 'big') + (len(ct)*8).to_bytes(8, 'big')
    S = ghash(H, pad16(aad) + pad16(ct) + lens)
    expected = bytes(s ^ k for s, k in zip(S, aes_encrypt_block(J0, key)))
    if expected != tag:
        raise ValueError("GCM authentication failed")

    pt = bytearray()
    counter = inc32(J0)
    for i in range(0, len(ct), 16):
        block = ct[i:i+16]
        ks = aes_encrypt_block(counter, key)
        pt += bytes(b ^ k for b, k in zip(block, ks[:len(block)]))
        counter = inc32(counter)
    return bytes(pt)

# --------------------------- IV 重用攻击演示 ---------------------------

def iv_reuse_demo():
    """演示：IV 重用 → XOR 流泄漏明文关系"""
    key = b'YELLOW SUBMARINE'
    iv = b'A' * 12  # 故意重用！

    msg1 = b"Attack at dawn, 06:00 hours"
    msg2 = b"Attack at dusk, 18:00 hours"

    ct1, _ = aes_gcm_encrypt(key, iv, msg1)
    ct2, _ = aes_gcm_encrypt(key, iv, msg2)

    # 攻击者计算
    leaked = bytes(a ^ b for a, b in zip(ct1, ct2))
    expected = bytes(a ^ b for a, b in zip(msg1, msg2))
    assert leaked == expected
    print("[IV reuse] Attacker recovers M1 XOR M2:", leaked.hex())
    return leaked

# --------------------------- 自检 ---------------------------

if __name__ == "__main__":
    # NIST AES-128 test vector
    key = bytes.fromhex("000102030405060708090a0b0c0d0e0f")
    pt = bytes.fromhex("00112233445566778899aabbccddeeff")
    expected_ct = bytes.fromhex("69c4e0d86a7b0430d8cdb78070b4c55a")
    assert aes_encrypt_block(pt, key) == expected_ct
    print("AES-128 NIST vector: PASS")

    # GCM round-trip
    key = bytes(16)
    iv = bytes(12)
    ct, tag = aes_gcm_encrypt(key, iv, b"hello world", aad=b"v1")
    assert aes_gcm_decrypt(key, iv, ct, tag, aad=b"v1") == b"hello world"
    print("AES-GCM round-trip: PASS")

    iv_reuse_demo()

五、真实漏洞 / 事件

年份	事件	根因
2008	WPA TKIP "Beck-Tews" attack	流式加密 + 弱 MIC
2011	BEAST (TLS 1.0 CBC)	CBC IV 可预测，链接
2013	Lucky 13 (TLS CBC)	CBC padding oracle 时序泄漏
2014	POODLE (SSL 3.0)	CBC padding oracle
2016	"Forbidden Attack" on TLS GCM	IV 重用 → 服务器伪造
2020	GCM IV 重用 in Bluetooth Mesh	实现错误，nonce 计数器溢出

Forbidden Attack 细节 (Böck-Zauner 2016): 扫描 184 个 HTTPS 服务器，发现 184 个 GCM IV 重用案例（包括 VISA 部分子站）→ 可伪造 TLS 记录。

六、协议应用 — Web3 中的对称加密

场景	算法	用途
Ethereum Keystore (Geth/MetaMask)	AES-128-CTR + Scrypt KDF + Keccak MAC	加密本地私钥
TLS 1.3 (RPC connections)	AES-GCM / ChaCha20-Poly1305	安全连接
libp2p secio/noise	AES-GCM, ChaCha20-Poly1305	P2P 加密信道
Whisper / Waku	AES-256-GCM	链下消息加密
EthereumJS keystore v3	AES-128-CTR + scrypt	浏览器钱包
WalletConnect v2	ChaCha20-Poly1305	端到端会话

为什么 AES-GCM 在 Web3 没有 ChaCha20-Poly1305 流行: 移动端无 AES 硬件加速时（部分 ARM 旧核），GCM 慢；ChaCha20 全软件高性能 + 无侧信道（无查表）。

七、常见陷阱

IV 重用: GCM/CTR 同密钥 + 同 IV 加密两条不同消息 → 立即破解
MAC-then-encrypt vs encrypt-then-MAC: TLS 1.0 CBC 用 MAC-then-encrypt → padding oracle。永远 encrypt-then-MAC（GCM/CCM 已内置）
ECB 模式: 看似加密但保留模式 → 永远不要用
Key + IV 派生混乱: 用 PBKDF2 单独派生 key，IV 用 RNG。不要从口令直接派生 IV
填充神谕 (Padding Oracle): 解密失败时区分"密文错误"和"填充错误" → 信息泄漏
Tag 截断: GCM tag < 128 bit 时安全性指数下降，96 bit 仍可接受，64 bit 危险
Nonce 计数器溢出: GCM 96-bit IV + 32-bit counter，单 key 加密超 64GB 后 counter 重叠

八、关键速查

模式对比

模式	并行	认证	IV 要求	推荐?
ECB	✓	✗	无	❌ 禁用
CBC	加密串行/解密并行	✗	不可预测	⚠️ 仅遗留
CTR	✓	✗	唯一	⚠️ 需配 MAC
CFB/OFB	加密串行	✗	唯一	❌ 已淘汰
GCM	✓	✓	唯一 (96-bit)	✅ TLS 1.3
CCM	串行	✓	唯一	✅ 嵌入式
ChaCha20-Poly1305	✓	✓	唯一 (96-bit)	✅ 移动端
AES-SIV (RFC 5297)	✓	✓	可重用 (耐误用)	✅ 备份场景
AES-GCM-SIV	✓	✓	抗重用	✅ 高安全

性能（AES-NI 启用）

算法	速度 (Skylake)
AES-128-GCM	~5 GB/s
ChaCha20-Poly1305	~2 GB/s (软件)
AES-256-GCM	~3.5 GB/s

九、面试题

Q1: 为什么 GCM 的 IV 可以是 96 bit 而不是 128 bit？ A: 因为 GCM 内部把 96-bit IV 与 32-bit 计数器拼接成 128-bit 块。96 是 birthday bound 的最佳折衷（$2^{32}$ 个唯一 IV 可在 $2^{96}$ 空间内安全使用）。如果 IV 不是 96 bit，需用 GHASH 派生 J0，开销大且更易出错。

Q2: AES-128 vs AES-256 选哪个？ A: AES-128 在抗经典攻击下安全（$2^{128}$ brute force）。AES-256 提供 PQ 时代安全余量（Grover 算法把对称密钥强度减半 → 128 bit 后量子安全）。对长期数据用 AES-256，瞬态会话用 AES-128 即可。

Q3: 为什么不直接用 H(K || M) 做 MAC，而要用 HMAC？ A: 因为 Merkle-Damgård 哈希（如 SHA-256）受长度扩展攻击：知道 H(K||M) 和 |M|，可计算 H(K||M||pad||M')。HMAC 用 H(K_outer || H(K_inner || M)) 结构防御。GCM 用 GHASH (universal hash) 在 GF(2^128) 中也是为防止此问题。

Q4: Web3 钱包 keystore 为什么用 scrypt/argon2 而不是 PBKDF2？ A: PBKDF2 是 CPU-bound，GPU/ASIC 加速比极高（10^4 倍）。Scrypt 是 memory-hard，需 RAM；Argon2 进一步抗 ASIC + 抗 side-channel。攻击者从泄漏的 keystore 字典攻击口令时，scrypt 慢 1000 倍。

Q5: GCM 在 nonce 唯一前提下能保证什么安全？ A: IND-CPA（区分性）+ INT-CTXT（密文完整性）= 认证加密。但若 nonce 重用，两个安全性同时崩塌——攻击者可恢复 GHASH 子密钥 H 后伪造任意密文。这就是 GCM-SIV 被发明的动机（nonce 重用安全）。

十、明日预告

Day 196: 哈希函数 — 深入 SHA-256 状态机（512-bit 块、64 轮压缩、Merkle-Damgård 结构）、Keccak/SHA-3 的海绵结构差异、Blake2/3 的并行哈希设计。手写 SHA-256 完整状态机，对比 SHA-3 和 Blake3 的性能/安全/PQ 抗性。