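Preface

There was a face-swapping app called ZAO a while ago. After looking into it, I am fairly convinced that it is built on 3D face reconstruction rather than a deepfake approach, so I found a 3D reconstruction paper and its source code to study. Below I rewrite some functions from that source according to my own understanding.

As usual, reference blogs:

This post is mainly about understanding the output of PRNet.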

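Theory overview

This blog post gives a fairly systematic introduction to 3D face reconstruction methods. In my own limited understanding, there are two schools: 1. Estimate the parameters of a 3DMM algorithmically. The idea of a 3DMM is that there is a mean face, and deforming this mean face can produce any face, so the algorithm has to compute the parameters of that deformation. 2. Drop the mean-face constraint entirely and use a neural network to estimate the 3D parameters of the face directly.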
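The "mean face plus deformation" idea of the first school can be sketched as a linear model. This is only a toy illustration under my own assumptions: the sizes, the random `mean_face` and `basis`, and the name `reconstruct_face` are all hypothetical and do not come from any real 3DMM.

```python
import numpy as np

rng = np.random.default_rng(0)

n_vertices = 5      # toy mesh size; real models use tens of thousands of vertices
n_components = 3    # number of deformation modes (assumed for illustration)

# Hypothetical toy model: a mean face and a linear deformation basis.
mean_face = rng.normal(size=(n_vertices * 3,))            # flattened (x, y, z) per vertex
basis = rng.normal(size=(n_vertices * 3, n_components))   # one column per deformation mode

def reconstruct_face(params):
    """Return an (n_vertices, 3) array: mean face plus a weighted sum of basis modes."""
    flat = mean_face + basis @ params
    return flat.reshape(n_vertices, 3)

neutral = reconstruct_face(np.zeros(n_components))       # all-zero parameters give the mean face
deformed = reconstruct_face(np.array([0.5, -1.0, 0.2]))  # non-zero parameters deform it
```

A fitting algorithm of the first school would search for the `params` vector that best explains a given photo; the second school skips this parameterization.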

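PRNet belongs to the second school: given an input image, the network directly outputs a map called the UV position map. The purpose of this post is to understand that output thoroughly. In short, it is a three-dimensional array of shape $(256,256,3)$: the first two dimensions are the dimensions of the output texture map, and the last dimension stores the position in 3D space of each pixel of the texture map.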
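To make the layout concrete, here is a small sketch with a synthetic array standing in for the network output. The values are made up; only the shape and the indexing pattern are the point.

```python
import numpy as np

# Synthetic stand-in for a UV position map; real values would come from PRNet.
pos = np.zeros((256, 256, 3), dtype=np.float32)
pos[100, 120] = [64.0, 80.5, 12.25]   # pretend texture pixel (row 100, col 120) maps here

v, u = 100, 120          # row and column in the UV (texture) plane
x, y, z = pos[v, u]      # 3D position stored for that texture pixel

xy_map = pos[..., :2]    # where each texture pixel lands in the image plane
depth_map = pos[..., 2]  # depth of each texture pixel
```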

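Any 3D face reconstruction method, 3DMM included, needs to produce a vertex map and a texture map. This is very common in computer graphics: the game characters we see, for example, consist of skeleton information and texture information.

Understanding the code

First import the required libraries: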

import numpy as np
import os
from skimage.transform import estimate_transform, warp
import cv2
from predictor import PosPrediction
import matplotlib.pyplot as plt

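There is an extra predictor module here, which holds the PRNet network structure; download it directly from here.

One more folder needs to be downloaded, from here. It defines the facial keypoint information of the UV map (uv_kpt_ind), the predefined face vertex information (face_ind), and the triangle mesh information (triangles). Their roles are analyzed below.

Face cropping

The source code uses dlib to detect facial keypoints, but the real purpose is just to find the face box and then crop the face. Installing dlib on a Mac is a bit troublesome, and my earlier face-swapping post already used OpenCV for face detection, so I use OpenCV here. The code for detecting the face box is: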

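## Pre-detect the face box (or keypoints); the goal is to crop the face
cas = cv2.CascadeClassifier('./Data/cv-data/haarcascade_frontalface_alt2.xml')
img = plt.imread('./images/zly.jpg')
# plt.imread returns RGB, so convert from RGB (not BGR) to grayscale for the detector
img_gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
faces = cas.detectMultiScale(img_gray, scaleFactor=2, minNeighbors=3, flags=0, minSize=(30, 30))
# Use the first detected face: (x, y, w, h) -> (left, top, right, bottom)
bbox = np.array([faces[0,0],faces[0,1],faces[0,0]+faces[0,2],faces[0,1]+faces[0,3]])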

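Visualize it:

# Cast to plain Python ints; some OpenCV versions reject NumPy integer coordinates
plt.imshow(cv2.rectangle(img.copy(),(int(bbox[0]),int(bbox[1])),(int(bbox[2]),int(bbox[3])),(0,255,0),2))
plt.axis('off')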

Crop the face

left = bbox[0]; top = bbox[1]; right = bbox[2]; bottom = bbox[3]
old_size = (right - left + bottom - top)/2
center = np.array([right - (right - left) / 2.0, bottom - (bottom - top) / 2.0])
size = int(old_size*1.6)

src_pts = np.array([[center[0]-size/2, center[1]-size/2], 
                    [center[0] - size/2, center[1]+size/2], 
                    [center[0]+size/2, center[1]-size/2]])
DST_PTS = np.array([[0,0], [0,255], [255, 0]]) # output image size is 256*256
tform = estimate_transform('similarity', src_pts, DST_PTS)

img = img/255.
cropped_img = warp(img, tform.inverse, output_shape=(256, 256))

Visualize it

plt.imshow(cropped_img)
plt.axis('off')
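The cropping step maps a square around the face onto a 256*256 image. Because the three source points and the three destination points differ only by a uniform scale and a shift, the similarity transform can also be written down by hand. The sketch below does that in plain NumPy; `crop_transform` is a hypothetical helper name of mine, and it is meant to illustrate what the estimated matrix looks like, not to replace estimate_transform.

```python
import numpy as np

def crop_transform(bbox, resolution=256, margin=1.6):
    """Build the 3x3 matrix that maps an enlarged square around bbox to the crop."""
    left, top, right, bottom = bbox
    old_size = (right - left + bottom - top) / 2
    center = np.array([(left + right) / 2.0, (top + bottom) / 2.0])
    size = int(old_size * margin)            # enlarge the box, as in the post
    scale = (resolution - 1) / size          # the square's side maps to 0..255
    corner = center - size / 2.0             # top-left corner of the square crop
    return np.array([[scale, 0.0, -scale * corner[0]],
                     [0.0, scale, -scale * corner[1]],
                     [0.0, 0.0, 1.0]])

T = crop_transform((100, 120, 300, 320))
top_left = T @ np.array([40.0, 60.0, 1.0])        # corner of the enlarged square
bottom_right = T @ np.array([360.0, 380.0, 1.0])  # opposite corner
```

With this bounding box the enlarged square runs from (40, 60) to (360, 380), and those corners land on (0, 0) and (255, 255) of the crop.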

Network inference

Load the network structure

pos_predictor = PosPrediction(256, 256)
pos_predictor.restore('./Data/net-data/256_256_resfcn256_weight')

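Feed the cropped image directly into the network to infer the UV position map

cropped_pos = pos_predictor.predict(cropped_img) # network inference

Because this result is a reconstruction of the cropped image, it has to be mapped back to the coordinate frame and scale of the original image:

# Map the result for the cropped image back to the original image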
cropped_vertices = np.reshape(cropped_pos, [-1, 3]).T
z = cropped_vertices[2,:].copy()/tform.params[0,0]
cropped_vertices[2,:] = 1
vertices = np.dot(np.linalg.inv(tform.params), cropped_vertices)
vertices = np.vstack((vertices[:2,:], z))
pos = np.reshape(vertices.T, [256, 256, 3])
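The mapping back to the original image can be checked on a toy example. The sketch below wraps the same steps in a hypothetical function `uncrop_vertices` and uses a made-up transform (scale 0.5 plus a shift) instead of the real `tform.params`: x and y go through the inverse 2D transform, while z is only divided by the scale.

```python
import numpy as np

def uncrop_vertices(cropped_pos, T):
    """Map an (H, W, 3) position map from crop coordinates back to image coordinates."""
    h, w, _ = cropped_pos.shape
    pts = cropped_pos.reshape(-1, 3).T.astype(np.float64)   # shape (3, N): rows are x, y, z
    z = pts[2] / T[0, 0]        # depth is only rescaled, never translated
    pts[2] = 1.0                # homogeneous coordinate for the 2D transform
    xy = (np.linalg.inv(T) @ pts)[:2]
    return np.vstack((xy, z)).T.reshape(h, w, 3)

# Made-up similarity transform: scale by 0.5, then shift by (-10, -20)
T = np.array([[0.5, 0.0, -10.0],
              [0.0, 0.5, -20.0],
              [0.0, 0.0, 1.0]])

# The original point (100, 60, 8) becomes (40, 10, 4) in crop coordinates
cropped_pos = np.array([[[40.0, 10.0, 4.0]]])
restored = uncrop_vertices(cropped_pos, T)
```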

This is not easy to visualize as a whole, so just look at the depth information, that is, the third channel:

plt.imshow(pos[...,2],cmap='gray')
plt.axis('off')

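Different parts of the face clearly show up in different shades. The nose is the highest point, so it appears whitest.

Face keypoints

Note that the texture of every face generated by the paper's method follows the facial layout of uv_face.png. The nose, for example, always lands at the nose position of the texture, whether the face is in profile or frontal. uv_kpt_ind.txt defines the facial keypoint positions in the texture, and these are fixed.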

uv_kpt_ind = np.loadtxt('./Data/uv-data/uv_kpt_ind.txt').astype(np.int32)
uv_face = plt.imread('./Data/uv-data/uv_face.png')
# draw_kps is the helper defined in the next code snippet; define it before running this line
plt.imshow(draw_kps(uv_face,uv_kpt_ind.T))
plt.axis('off')

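Remember: every face texture follows this layout, and every facial part always appears at the corresponding position in the figure above. How to obtain the texture is covered later.

As mentioned, in the UV position map output by the network, the first two dimensions $(256,256)$ index positions in the texture, and the last dimension gives the position of each texture pixel in 3D space. So uv_kpt_ind together with the UV position map locates the keypoints on the face image (not on the texture map)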

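def draw_kps(img,kps,point_size=2):
    # img is expected as float in [0, 1]; convert to uint8 for OpenCV drawing
    img = np.array(img*255,np.uint8)
    for i in range(kps.shape[0]):
        cv2.circle(img,(int(kps[i,0]),int(kps[i,1])),point_size,(0,255,0),-1)
    return img
# uv_kpt_ind[0] holds columns (u) and uv_kpt_ind[1] holds rows (v) of the UV map
face_kps = pos[uv_kpt_ind[1,:],uv_kpt_ind[0,:],:]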

Visualize it

plt.imshow(draw_kps(img.copy(),face_kps))
plt.axis('off')
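The keypoint lookup is plain NumPy fancy indexing: a list of rows and a list of columns pick one 3D point per keypoint. The sketch below uses a synthetic position map whose entry at (row, col) is simply (col, row, 0), plus a made-up three-keypoint table in the same layout as uv_kpt_ind, so the result is easy to verify by eye.

```python
import numpy as np

# Synthetic position map: the entry at (row, col) is (col, row, 0)
rows, cols = np.mgrid[0:256, 0:256]
pos = np.stack([cols, rows, np.zeros_like(rows)], axis=-1).astype(np.float32)

# Hypothetical keypoint table in the uv_kpt_ind layout: row 0 = u (column), row 1 = v (row)
uv_kpt_ind = np.array([[10, 200, 128],
                       [20,  30, 250]])

# One (x, y, z) triple per keypoint, shape (3, 3) here
face_kps = pos[uv_kpt_ind[1, :], uv_kpt_ind[0, :], :]
```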

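Face point cloud

Having visualized the facial keypoints, let's also visualize all the vertices defined in face_ind.

Read all the required vertices directly via face_ind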

face_ind = np.loadtxt('./Data/uv-data/face_ind.txt').astype(np.int32)
all_vertices = np.reshape(pos, [256*256, -1])
vertices = all_vertices[face_ind, :]
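The indexing above treats face_ind as flat indices into the row-major flattened 256*256 map, which is what the reshape implies. A minimal sketch with made-up indices shows that a flat index i corresponds to row i // 256 and column i % 256:

```python
import numpy as np

resolution = 256
pos = np.arange(resolution * resolution * 3, dtype=np.float32).reshape(resolution, resolution, 3)

# Hypothetical subset of flat indices, in the same spirit as face_ind
face_ind = np.array([0, 257, 65535])

all_vertices = pos.reshape(resolution * resolution, -1)
vertices = all_vertices[face_ind]

# The same selection written with explicit (row, col) pairs
rows, cols = np.divmod(face_ind, resolution)
same_vertices = pos[rows, cols]
```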

Using the positions defined for the texture, draw the vertices on the original face image:

plt.figure(figsize=(8,8))
plt.imshow(draw_kps(img.copy(),vertices[:,:2],1))
plt.axis('off')

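While we are at it, we can also look at the 3D plot

from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax1 = plt.axes(projection='3d')
# Scatter plot; color by depth so that cmap actually takes effect
ax1.scatter3D(vertices[:,2],vertices[:,0],vertices[:,1], c=vertices[:,2], cmap='Blues')
# The labels name the data that is actually plotted on each axis
ax1.set_xlabel('z (depth)')
ax1.set_ylabel('x')
ax1.set_zlabel('y')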

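The points blur together, but the face model is roughly recognizable.

Extracting the texture map

As mentioned above, the texture that the network yields for any face follows the facial layout of uv_face.png.

How do we obtain the texture from the UV position map? A single function, remap: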

texture = cv2.remap(img, pos[:,:,:2].astype(np.float32), None, interpolation=cv2.INTER_NEAREST, borderMode=cv2.BORDER_CONSTANT,borderValue=(0))
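What the remap call does can be sketched in plain NumPy: for every texture pixel, read the (x, y) stored in the position map and copy the image pixel found there. The function `remap_nearest` below is a hypothetical stand-in of mine for nearest-neighbour remapping with a constant border; its rounding at exact half-pixel positions may differ from OpenCV's.

```python
import numpy as np

def remap_nearest(img, xy_map, border_value=0.0):
    """out[v, u] = img[round(y), round(x)], where (x, y) = xy_map[v, u]."""
    h, w = img.shape[:2]
    x = np.rint(xy_map[..., 0]).astype(int)
    y = np.rint(xy_map[..., 1]).astype(int)
    inside = (x >= 0) & (x < w) & (y >= 0) & (y < h)
    out = np.full(xy_map.shape[:2] + img.shape[2:], border_value, dtype=img.dtype)
    out[inside] = img[y[inside], x[inside]]   # positions outside the image keep border_value
    return out

# Toy image whose pixel at (row y, col x) holds (x, y, x + y)
ys, xs = np.mgrid[0:4, 0:5]
img = np.stack([xs, ys, xs + ys], axis=-1).astype(np.float64)

# Toy 2x2 "position map" holding only the (x, y) channels
xy_map = np.array([[[1.2, 0.0], [4.0, 3.0]],
                   [[-1.0, 0.0], [2.6, 1.4]]])
texture = remap_nearest(img, xy_map)
```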

Visualize the texture together with the fixed uv_kpt_ind:

plt.imshow(draw_kps(texture,uv_kpt_ind.T))
plt.axis('off')

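Because the picture used is a frontal shot of Zhao Liying (趙麗穎), the sides of the texture are not sharp, but the frontal facial features do appear at the fixed positions as expected.

Rendering the texture map / 3D face

Since the texture map can be obtained in a single line, we can use the texture and the vertex positions to rebuild the texture map into a 3D image. The principle is to use the mesh information defined in triangles.txt, obtain the color of each triangle, and paste that color at the corresponding 3D position.

First find the skin color of each vertex in the texture: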

# Look up the color of every face vertex in the texture
triangles = np.loadtxt('./Data/uv-data/triangles.txt').astype(np.int32)
all_colors = np.reshape(texture, [256*256, -1])
colors = all_colors[face_ind, :]

print(vertices.shape) # 3D coordinates of each selected texture pixel (vertex)
print(triangles.shape) # vertex indices of each triangle
print(colors.shape) # color of each vertex
'''
(43867, 3)
(86906, 3)
(43867, 3)
'''

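Get the depth and the texture color of each triangle:

# Depth of each triangle: the mean depth of its three vertices
tri_depth = (vertices[triangles[:,0],2 ] + vertices[triangles[:,1],2] + vertices[triangles[:,2],2])/3. 
# Color of each triangle: the mean color of its three vertices
tri_tex = (colors[triangles[:,0] ,:] + colors[triangles[:,1],:] + colors[triangles[:,2],:])/3.
tri_tex = tri_tex*255

Next, paint every triangle. Unlike the source code, I use OpenCV's drawing function to fill each triangle with its color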

img_3D = np.zeros_like(img,dtype=np.uint8)
for i in range(triangles.shape[0]):
    # 2D (x, y) corners of triangle i, as integer pixel coordinates
    cnt = np.array([(vertices[triangles[i,0],0],vertices[triangles[i,0],1]),
           (vertices[triangles[i,1],0],vertices[triangles[i,1],1]),
           (vertices[triangles[i,2],0],vertices[triangles[i,2],1])],dtype=np.int32)
    # Pass the color as a plain Python list; some OpenCV versions reject NumPy arrays here
    img_3D = cv2.drawContours(img_3D,[cnt],0,tri_tex[i].tolist(),-1)
plt.imshow(img_3D/255.0)
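The loop above paints triangles in file order and never uses tri_depth. One common way to put the depth to use, shown here only as a sketch and not as what the post or the PRNet source does, is the painter's algorithm: draw the farthest triangles first so that nearer ones overwrite them. The sketch assumes that a larger z means closer to the viewer; `fill_triangle` and `render_painter` are hypothetical helpers written in plain NumPy.

```python
import numpy as np

def fill_triangle(canvas, tri_xy, color):
    """Fill one 2D triangle on an (H, W, 3) canvas using a barycentric inside test."""
    h, w = canvas.shape[:2]
    (x0, y0), (x1, y1), (x2, y2) = tri_xy
    det = (y1 - y2) * (x0 - x2) + (x2 - x1) * (y0 - y2)
    if det == 0:                      # degenerate triangle, nothing to draw
        return
    ys, xs = np.mgrid[0:h, 0:w]
    a = ((y1 - y2) * (xs - x2) + (x2 - x1) * (ys - y2)) / det
    b = ((y2 - y0) * (xs - x2) + (x0 - x2) * (ys - y2)) / det
    c = 1 - a - b
    canvas[(a >= 0) & (b >= 0) & (c >= 0)] = color

def render_painter(vertices, triangles, tri_colors, height, width):
    """Draw triangles from far to near (assumes larger z is closer to the viewer)."""
    canvas = np.zeros((height, width, 3), dtype=np.uint8)
    tri_depth = vertices[triangles, 2].mean(axis=1)
    for i in np.argsort(tri_depth):   # ascending depth: farthest triangles first
        fill_triangle(canvas, vertices[triangles[i], :2], tri_colors[i])
    return canvas

# Two overlapping toy triangles: the near one (z = 5) is listed first, the far one (z = 1) second
vertices = np.array([[0, 0, 1], [8, 0, 1], [0, 8, 1],
                     [0, 0, 5], [8, 0, 5], [0, 8, 5]], dtype=np.float64)
triangles = np.array([[3, 4, 5], [0, 1, 2]])
tri_colors = [[255, 0, 0], [0, 0, 255]]   # near = red, far = blue
canvas = render_painter(vertices, triangles, tri_colors, 10, 10)
```

Even though the red triangle is listed first, it is drawn last and therefore stays visible where the two overlap.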

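Rotating the face

Since what we have is a 3D face, we can of course rotate it, about the x, y, and z axes separately. The principle is to rotate the 3D information of every vertex, that is, the coordinates stored in the last dimension of the UV position map.

The rotation matrix is computed from the rotation angles as follows: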

# Build the rotation matrix, see https://github.com/YadiraF/face3d
# np.math was removed in NumPy 2.0, so np.cos / np.sin are used instead
def angle2matrix(angles):
    x, y, z = np.deg2rad(angles[0]), np.deg2rad(angles[1]), np.deg2rad(angles[2])
    # x
    Rx=np.array([[1,         0,          0],
                 [0, np.cos(x), -np.sin(x)],
                 [0, np.sin(x),  np.cos(x)]])
    # y
    Ry=np.array([[ np.cos(y), 0, np.sin(y)],
                 [         0, 1,         0],
                 [-np.sin(y), 0, np.cos(y)]])
    # z
    Rz=np.array([[np.cos(z), -np.sin(z), 0],
                 [np.sin(z),  np.cos(z), 0],
                 [        0,          0, 1]])

    # Apply the x rotation first, then y, then z
    R=Rz.dot(Ry.dot(Rx))
    return R.astype(np.float32)

To rotate 30 degrees about the vertical axis, call it like this

trans_mat = angle2matrix((0,30,0))
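A quick sanity check for a rotation matrix of this kind: it should be orthonormal with determinant 1, and rotating the unit z vector by 30 degrees about the y axis should give (sin 30°, 0, cos 30°). The sketch below rebuilds only the y rotation in a hypothetical helper `rotation_y` so that it runs on its own.

```python
import numpy as np

def rotation_y(deg):
    """Rotation matrix about the y axis, same convention as the Ry block above."""
    t = np.deg2rad(deg)
    return np.array([[ np.cos(t), 0.0, np.sin(t)],
                     [ 0.0,       1.0, 0.0      ],
                     [-np.sin(t), 0.0, np.cos(t)]])

R = rotation_y(30)
forward = np.array([0.0, 0.0, 1.0])
rotated = R @ forward   # expected: (0.5, 0, sqrt(3)/2)
```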

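Rotate the vertex positions

# Rotate the coordinates
rotated_vertices = vertices.dot(trans_mat.T)

Because the rotation is about the origin, the face may well rotate out of the canvas, so the position has to be corrected

# Shift the rotated face back onto the canvas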
ori_x = np.min(vertices[:,0])
ori_y = np.min(vertices[:,1])
rot_x = np.min(rotated_vertices[:,0])
rot_y = np.min(rotated_vertices[:,1])
shift_x = ori_x-rot_x
shift_y = ori_y-rot_y
rotated_vertices[:,0]=rotated_vertices[:,0]+shift_x
rotated_vertices[:,1]=rotated_vertices[:,1]+shift_y
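An alternative to shifting after the fact, given here only as a sketch and not as what the post does, is to rotate about the centroid of the point cloud instead of the origin: subtract the centroid, rotate, and add it back. The face then stays roughly where it was on the canvas. `rotate_about_centroid` is a hypothetical helper name.

```python
import numpy as np

def rotate_about_centroid(vertices, R):
    """Rotate an (N, 3) point cloud about its own centroid instead of the origin."""
    center = vertices.mean(axis=0)
    return (vertices - center) @ R.T + center

# Toy point cloud far away from the origin
vertices = np.array([[100.0, 200.0, 10.0],
                     [110.0, 200.0, 10.0],
                     [100.0, 220.0, 15.0]])

# 90 degree rotation about the z axis
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])

rotated = rotate_about_centroid(vertices, R)
```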

As before, render the texture onto the rotated vertices:

img_3D = np.zeros_like(img,dtype=np.uint8)
mask = np.zeros_like(img,dtype=np.uint8)
fill_area=0
for i in range(triangles.shape[0]):
    cnt = np.array([(rotated_vertices[triangles[i,0],0],rotated_vertices[triangles[i,0],1]),
           (rotated_vertices[triangles[i,1],0],rotated_vertices[triangles[i,1],1]),
           (rotated_vertices[triangles[i,2],0],rotated_vertices[triangles[i,2],1])],dtype=np.int32)
    mask = cv2.drawContours(mask,[cnt],0,(255,255,255),-1)
    # Paint the triangle only if it covers pixels that no earlier triangle has covered
    area = np.sum(mask[...,0])
    if area > fill_area:
        fill_area = area
        img_3D = cv2.drawContours(img_3D,[cnt],0,tri_tex[i].tolist(),-1)
plt.imshow(img_3D)

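Visually, the face has indeed been rotated.

Afterword

This post mainly verified what the various pieces of information output by the PRNet network mean.

Follow-up work will probably split into:

Study of the network structure
Face swapping

And of course, the source code for this post

Link: https://pan.baidu.com/s/18z2b6Sut6qFecOpGqNc8YA

Extraction code: ad77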

風翼冰舟


3D Face Reconstruction: Understanding the PRNet Network Output
