This is the tenth article in the tf.keras series. It covers how to save and serialize models with Keras.

Code environment:

python version: 3.7.6
tensorflow version: 2.1.0

Import the required packages:

import numpy as np
import tensorflow as tf
from tensorflow import keras

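Note: all code in this article was written and tested in Jupyter Notebook.

A Keras model consists of several components:

An architecture, or configuration, which specifies the layers the model contains and how they are connected.
A set of weight values (the "state of the model").
An optimizer (defined by compiling the model).
A set of losses and metrics (defined by compiling the model or by calling add_loss() or add_metric()).

The Keras API makes it possible to save all of these pieces to disk at once, or to save only some of them selectively:

Save everything into a single archive in the TensorFlow SavedModel format (or the older Keras H5 format). This is the standard practice.
Save only the architecture/configuration, typically as a JSON file.
Save only the weight values. This is generally used when training the model.

Saving a Keras model: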

model = ...  # Get model (Sequential, Functional Model, or Model subclass)
model.save('path/to/location')

Reloading the model:

from tensorflow import keras
model = keras.models.load_model('path/to/location')

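1. Saving and loading the whole model

The whole model can be saved to a single artifact, which includes:

The model's architecture/configuration
The model's weight values
The model's compilation information
The optimizer and its state (so that training can resume where it left off)

Common APIs:

model.save()
tf.keras.models.save_model()
tf.keras.models.load_model()

There are two formats for saving a whole model to disk:

TensorFlow SavedModel format (the default, recommended by the official documentation)
Keras H5 format (selected by giving the file name a '.h5' suffix)

1.1 TensorFlow SavedModel format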

def get_model():
    # Create a simple model
    inputs = keras.Input(shape=(32,))
    outputs = keras.layers.Dense(1)(inputs)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

model = get_model()

# Train the model
test_input = np.random.random((128, 32))
test_target = np.random.random((128, 1))
model.fit(test_input, test_target)

# With no file extension, the model is saved in the TensorFlow SavedModel format, in a folder named my_model
model.save('my_model')

# Load the saved model
reconstructed_model = keras.models.load_model('my_model')

Inspect the saved model:

! dir my_model # use this command on Windows
# ! ls my_model # use this command on Linux

Output:

 Volume in drive D is Local Disk
 Volume Serial Number is A094-EFBB

 Directory of D:\01 TF.Keras Tutorial\my_model

2020/04/30  15:36    <DIR>          .
2020/04/30  15:36    <DIR>          ..
2020/04/30  15:36    <DIR>          assets
2020/04/30  15:36            42,168 saved_model.pb
2020/04/30  15:36    <DIR>          variables
               1 File(s)         42,168 bytes
               4 Dir(s)  223,605,207,040 bytes free

saved_model.pb stores the model architecture and the training configuration (including the optimizer, losses, and metrics); the weights are saved in the variables/ folder.

1.2 Keras H5 format

Keras also supports saving a single HDF5 file containing the model's architecture, weight values, and compile() information.

model = get_model()

test_input = np.random.random((128, 32))
test_target = np.random.random((128, 1))
model.fit(test_input, test_target)

model.save('my_h5_model.h5')
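
To check that the H5 file round-trips, the model can be reloaded and its predictions compared with the original. This is a minimal sketch that rebuilds the same toy model; the temporary directory and the names h5_path and restored are illustrative assumptions, not part of the original example.

```python
import os
import tempfile

import numpy as np
from tensorflow import keras

# Same toy model as in the article: one Dense layer on 32 features.
inputs = keras.Input(shape=(32,))
outputs = keras.layers.Dense(1)(inputs)
model = keras.Model(inputs, outputs)
model.compile(optimizer='adam', loss='mean_squared_error')

x = np.random.random((128, 32))
y = np.random.random((128, 1))
model.fit(x, y, verbose=0)

# Hypothetical location: a temporary directory keeps the example self-contained.
h5_path = os.path.join(tempfile.mkdtemp(), 'my_h5_model.h5')
model.save(h5_path)

# Reload the full model (architecture, weights, compile information).
restored = keras.models.load_model(h5_path)

# The restored model should reproduce the original predictions.
np.testing.assert_allclose(
    model.predict(x, verbose=0),
    restored.predict(x, verbose=0),
    rtol=1e-5, atol=1e-6,
)
```

If the comparison passes silently, the weights and architecture survived the save/load cycle.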

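1.3 Comparing the two formats

Compared with the SavedModel format, an H5 file lacks two things:

External losses and metrics added via model.add_loss() and model.add_metric() are not saved. If the model has such losses and metrics and you want to resume training, you need to add them back after loading the model. Note: this does not apply to losses or metrics created inside layers via self.add_loss() and self.add_metric(). As long as the layer is loaded, these losses and metrics are kept, because they are part of the layer's call method.
The computation graph of custom objects (such as custom layers) is not included in the saved file. At loading time, Keras needs access to the Python classes/functions of these objects in order to reconstruct the model.

2. Saving the model architecture

The model's configuration (or architecture) specifies which layers the model contains and how these layers are connected. Given a model's configuration, the model can be re-created with freshly initialized weights and no compilation information.

Note: this only applies to models defined with the Functional API or the Sequential API, not to subclassed models.

2.1 Configuration of models defined with the Functional API or the Sequential API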

API:

get_config() and from_config()
model.to_json() and tf.keras.models.model_from_json()

get_config() and from_config()
Calling config = model.get_config() returns a Python dictionary containing the model's configuration. The same model can then be rebuilt via Sequential.from_config(config) (for a Sequential model) or Model.from_config(config) (for a Functional API model).

The same workflow also works for any serializable layer.

1. Layer example:

layer = keras.layers.Dense(3, activation='relu')
layer_config = layer.get_config()
new_layer = keras.layers.Dense.from_config(layer_config)

2. Sequential model example:

model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])
config = model.get_config()
new_model = keras.Sequential.from_config(config)

3. Functional model example:

inputs = keras.Input((32,))
outputs = keras.layers.Dense(1)(inputs)
model = keras.Model(inputs, outputs)
config = model.get_config()
new_model = keras.Model.from_config(config)
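
A quick way to see what a config does and does not carry is to compare the rebuilt model with the original. The sketch below assumes the same small functional model; the variable names old_shapes, new_shapes and fresh_differs are only illustrative.

```python
import numpy as np
from tensorflow import keras

inputs = keras.Input((32,))
outputs = keras.layers.Dense(1)(inputs)
model = keras.Model(inputs, outputs)

# Rebuild the architecture from its config; the weights are re-initialized.
config = model.get_config()
new_model = keras.Model.from_config(config)

# Same structure: the weight shapes match one by one.
old_shapes = [w.shape for w in model.get_weights()]
new_shapes = [w.shape for w in new_model.get_weights()]
print('same shapes:', old_shapes == new_shapes)

# Different values: the kernel of the rebuilt model is a fresh random draw.
fresh_differs = not np.allclose(model.get_weights()[0], new_model.get_weights()[0])
print('fresh weights differ:', fresh_differs)

# The config does not carry weight values, so copy them explicitly if needed.
new_model.set_weights(model.get_weights())
for a, b in zip(model.get_weights(), new_model.get_weights()):
    np.testing.assert_allclose(a, b)
```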

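API:

to_json() and tf.keras.models.model_from_json()

This is similar to get_config / from_config, except that it turns the model into a JSON string, which can then be loaded without the original model class. It is also specific to models and is not meant for layers.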

model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])
json_config = model.to_json()
new_model = keras.models.model_from_json(json_config)
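
Because the result is an ordinary JSON string, it can be inspected with the json module or written to a file. The following is a small sketch under that assumption; the file name model_config.json and the temporary directory are hypothetical choices for illustration.

```python
import json
import os
import tempfile

from tensorflow import keras

model = keras.Sequential([keras.Input((32,)), keras.layers.Dense(1)])
json_config = model.to_json()

# The JSON string is plain text, so it can be parsed and inspected.
parsed = json.loads(json_config)
print(parsed['class_name'])

# Hypothetical file name, placed in a temporary directory.
json_path = os.path.join(tempfile.mkdtemp(), 'model_config.json')
with open(json_path, 'w') as f:
    f.write(json_config)

# Later (possibly in another program), rebuild the architecture from the file.
with open(json_path) as f:
    new_model = keras.models.model_from_json(f.read())

# Only the architecture is restored; the weights are freshly initialized.
print(new_model.output_shape)
```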

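2.2 Custom objects

The architecture of subclassed models and layers is defined in the __init__ and call methods. These are treated as Python bytecode, which cannot be serialized into a JSON-compatible config.

To save/load a model with custom layers, or a subclassed model, you should override get_config and, optionally, from_config. In addition, you should register the custom object so that Keras is aware of it.

You could try serializing the bytecode (for example via pickle), but this is completely unsafe and means the model cannot be loaded on a different system.

Custom functions:
Custom functions (for example an activation, loss, or initializer) do not need a get_config method. The function name is enough for loading, as long as it is registered as a custom object.

1. Defining the config methods:

get_config should return a JSON-serializable dictionary in order to be compatible with the Keras architecture-saving and model-saving APIs.
from_config(config) (a classmethod) should return a new layer or model object created from the config. The default implementation returns cls(**config).

Example: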

class CustomLayer(keras.layers.Layer):
    def __init__(self, a):
        # Initialize the base Layer before assigning attributes.
        super(CustomLayer, self).__init__()
        self.var = tf.Variable(a, name='var_a')
    def call(self, inputs, training=False):
        if training:
            return inputs * self.var
        else:
            return inputs
    
    def get_config(self):
        # .item() converts the NumPy scalar to a plain Python number (JSON-serializable).
        return {'a': self.var.numpy().item()}

    # There's actually no need to define `from_config` here, since returning
    # `cls(**config)` is the default behavior.
    @classmethod
    def from_config(cls, config):
        return cls(**config)

layer = CustomLayer(5)
layer.var.assign(2)

serialized_layer = keras.layers.serialize(layer)
new_layer = keras.layers.deserialize(serialized_layer, custom_objects={'CustomLayer': CustomLayer})

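2. Registering the custom object
Keras keeps a record of which class generated the config. In the example above, tf.keras.layers.serialize generates a serialized form of the custom layer:

{'class_name': 'CustomLayer', 'config': {'a': 2} }

Keras keeps a list of all built-in layer, model, optimizer, and metric classes, which is used to find the correct class on which to call from_config. If the class cannot be found, an error is raised (ValueError: Unknown layer). There are several ways to register custom classes to this list:

Setting the custom_objects argument in the loading function.
tf.keras.utils.custom_object_scope or tf.keras.utils.CustomObjectScope
tf.keras.utils.register_keras_serializable

Custom layer and function example: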

class CustomLayer(keras.layers.Layer):
    def __init__(self, units=32, **kwargs):
        super(CustomLayer, self).__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,), initializer="random_normal", trainable=True
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):
        config = super(CustomLayer, self).get_config()
        config.update({"units": self.units})
        return config

def custom_activation(x):
  return tf.nn.tanh(x) ** 2


# Make a model with the CustomLayer and custom_activation
inputs = keras.Input((32,))
x = CustomLayer(32)(inputs)
outputs = keras.layers.Activation(custom_activation)(x)
model = keras.Model(inputs, outputs)

# Retrieve the config
config = model.get_config()

# At loading time, register the custom objects with a `custom_object_scope`:
custom_objects = {'CustomLayer': CustomLayer,
                  'custom_activation': custom_activation}
with keras.utils.custom_object_scope(custom_objects):
    new_model = keras.Model.from_config(config)
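
The third registration option listed earlier, tf.keras.utils.register_keras_serializable, can be sketched as a class decorator. The layer ScaleLayer and the package name 'Demo' below are made up for illustration; the point is only that, once the class is registered, no custom_objects dictionary or scope is needed when rebuilding from the config.

```python
from tensorflow import keras


# 'Demo' and 'ScaleLayer' are illustrative names, not from the article.
@keras.utils.register_keras_serializable(package='Demo')
class ScaleLayer(keras.layers.Layer):
    def __init__(self, factor=2.0, **kwargs):
        super(ScaleLayer, self).__init__(**kwargs)
        self.factor = factor

    def call(self, inputs):
        return inputs * self.factor

    def get_config(self):
        # Extend the base config with the constructor argument.
        config = super(ScaleLayer, self).get_config()
        config.update({'factor': self.factor})
        return config


inputs = keras.Input((4,))
outputs = ScaleLayer(factor=3.0)(inputs)
model = keras.Model(inputs, outputs)

# No custom_objects argument or scope: the class was registered when defined.
new_model = keras.Model.from_config(model.get_config())
restored_layer = new_model.layers[-1]
print(type(restored_layer).__name__, restored_layer.factor)
```

The trade-off is that the module defining the class must be imported before loading, so that the decorator has run.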

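2.3 Cloning a model in memory

A model can be cloned in memory via tf.keras.models.clone_model(). This is equivalent to getting the config and then re-creating the model from it (so it preserves neither the compilation information nor the layer weight values).

with keras.utils.custom_object_scope(custom_objects):
    new_model = keras.models.clone_model(model)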
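
Since a clone starts with its own freshly initialized weights, the values have to be copied separately if an exact copy is wanted. A minimal sketch with built-in layers only (the layer name 'head' is an arbitrary choice for this example):

```python
import numpy as np
from tensorflow import keras

inputs = keras.Input((32,))
outputs = keras.layers.Dense(1, name='head')(inputs)
model = keras.Model(inputs, outputs)

# Clone the architecture; the clone owns new layer objects and new weights.
clone = keras.models.clone_model(model)
print(clone.get_layer('head') is model.get_layer('head'))

# The weight shapes match, so the values can be copied over explicitly.
shapes_match = (
    [w.shape for w in clone.get_weights()] == [w.shape for w in model.get_weights()]
)
clone.set_weights(model.get_weights())
for a, b in zip(model.get_weights(), clone.get_weights()):
    np.testing.assert_allclose(a, b)
```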

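3. Saving and loading model weights

You can choose to save and load only a model's weights. This can be useful when:

You only need the model for inference: in this case there is no need to resume training, so the compilation information and optimizer state are not needed.
You are doing transfer learning: in this case a new model is trained reusing the state of an existing model, so the compilation information of the previous model is not needed.

3.1 APIs for in-memory weight transfer

Weights can be copied between different objects using get_weights and set_weights:

tf.keras.layers.Layer.get_weights(): returns a list of NumPy arrays.
tf.keras.layers.Layer.set_weights(): sets the weights to the values in the weights argument.

1. Transferring weights from one layer to another, in memory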

def create_layer():
    layer = keras.layers.Dense(64, activation='relu', name='dense_2')
    layer.build((None, 784))
    return layer

layer_1 = create_layer()
layer_2 = create_layer()

# Copy the weights from the first layer to the second
layer_2.set_weights(layer_1.get_weights())

2. Transferring weights from one model to another model with a compatible architecture, in memory

# Create a simple functional model
inputs = keras.Input(shape=(784,), name='digits')
x = keras.layers.Dense(64, activation='relu', name='dense_1')(inputs)
x = keras.layers.Dense(64, activation='relu', name='dense_2')(x)
outputs = keras.layers.Dense(10, name='predictions')(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name='3_layer_mlp')

# Define a subclassed model with the same architecture
class SubclassedModel(keras.Model):
    def __init__(self, output_dim, name=None):
        super(SubclassedModel, self).__init__(name=name)
        self.output_dim = output_dim
        self.dense_1 = keras.layers.Dense(64, activation='relu', name='dense_1')
        self.dense_2 = keras.layers.Dense(64, activation='relu', name='dense_2')
        self.dense_3 = keras.layers.Dense(output_dim, name='predictions')

    def call(self, inputs):
        x = self.dense_1(inputs)
        x = self.dense_2(x)
        x = self.dense_3(x)
        return x
    
    def get_config(self):
        return {'output_dim': self.output_dim, 'name': self.name}

subclassed_model = SubclassedModel(10)
# Call the subclassed model once to create the weights.
subclassed_model(tf.ones((1, 784)))

# Copy weights from functional_model to subclassed_model.
subclassed_model.set_weights(functional_model.get_weights())

assert len(functional_model.weights) == len(subclassed_model.weights)
for a, b in zip(functional_model.weights, subclassed_model.weights):
    np.testing.assert_allclose(a.numpy(), b.numpy())

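3. Handling stateless layers
Because stateless layers do not change the order or the number of weights, models can have compatible architectures even if there are extra or missing stateless layers.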

inputs = keras.Input(shape=(784,), name='digits')
x = keras.layers.Dense(64, activation='relu', name='dense_1')(inputs)
x = keras.layers.Dense(64, activation='relu', name='dense_2')(x)
outputs = keras.layers.Dense(10, name='predictions')(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name='3_layer_mlp')

inputs = keras.Input(shape=(784,), name='digits')
x = keras.layers.Dense(64, activation='relu', name='dense_1')(inputs)
x = keras.layers.Dense(64, activation='relu', name='dense_2')(x)

# Add a dropout layer, which does not contain any weights.
x = keras.layers.Dropout(.5)(x)
outputs = keras.layers.Dense(10, name='predictions')(x)
functional_model_with_dropout = keras.Model(inputs=inputs, outputs=outputs, name='3_layer_mlp')

functional_model_with_dropout.set_weights(functional_model.get_weights())
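
To confirm that the extra Dropout layer really is harmless here, the two models can be compared after the copy. The sketch below rebuilds both models through a hypothetical helper, build_mlp, and relies on Dropout being inactive at inference time.

```python
import numpy as np
from tensorflow import keras


def build_mlp(use_dropout):
    # Hypothetical helper: the same MLP as above, optionally with a Dropout layer.
    inputs = keras.Input(shape=(784,), name='digits')
    x = keras.layers.Dense(64, activation='relu', name='dense_1')(inputs)
    x = keras.layers.Dense(64, activation='relu', name='dense_2')(x)
    if use_dropout:
        x = keras.layers.Dropout(.5)(x)
    outputs = keras.layers.Dense(10, name='predictions')(x)
    return keras.Model(inputs=inputs, outputs=outputs)


plain_model = build_mlp(use_dropout=False)
dropout_model = build_mlp(use_dropout=True)

# Dropout has no weights, so both models hold the same number of weight arrays.
print(len(plain_model.get_weights()), len(dropout_model.get_weights()))

dropout_model.set_weights(plain_model.get_weights())

# Dropout does nothing at inference time, so predictions agree after the copy.
x = np.random.random((8, 784)).astype('float32')
np.testing.assert_allclose(
    plain_model.predict(x, verbose=0),
    dropout_model.predict(x, verbose=0),
    rtol=1e-5, atol=1e-6,
)
```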

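3.2 APIs for saving and loading weights

Weights can be saved to disk by calling model.save_weights, in one of the following formats:

TensorFlow checkpoint
HDF5

The default format for model.save_weights is TensorFlow checkpoint. There are two ways to specify the save format:

The save_format argument: set the value to save_format="tf" or save_format="h5".
The path argument: if the path ends with .h5 or .hdf5, the HDF5 format is used. Other suffixes result in a TensorFlow checkpoint.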
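
The selection rule just described can be written down as a few lines of plain Python. Note that infer_weights_format is a hypothetical helper written for this article, not a Keras function, and it does not model how Keras itself reacts when save_format and the file suffix disagree.

```python
def infer_weights_format(path, save_format=None):
    """Hypothetical helper mirroring the rule described above; not part of Keras."""
    if save_format is not None:
        # An explicit save_format replaces suffix-based inference.
        return save_format.lower()
    if path.endswith(('.h5', '.hdf5')):
        return 'h5'
    # Any other suffix (or none) means a TensorFlow checkpoint.
    return 'tf'


for path in ['weights.h5', 'weights.hdf5', 'ckpt', 'weights.ckpt']:
    print(path, '->', infer_weights_format(path))
```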

There is also the option of retrieving weights as in-memory NumPy arrays. Each API has its pros and cons, which are detailed below.

3.2.1 TF Checkpoint format

sequential_model = keras.Sequential(
    [keras.Input(shape=(784,), name='digits'),
     keras.layers.Dense(64, activation='relu', name='dense_1'), 
     keras.layers.Dense(64, activation='relu', name='dense_2'),
     keras.layers.Dense(10, name='predictions')])

sequential_model.save_weights('ckpt')
load_status = sequential_model.load_weights('ckpt')

# assert_consumed can be used to verify that all variable values have been restored from the checkpoint. See tf.train.Checkpoint.restore for the other methods of the Status object.
load_status.assert_consumed()

Transfer learning example:
Essentially, as long as two models have the same architecture, they can share the same checkpoint.

inputs = keras.Input(shape=(784,), name='digits')
x = keras.layers.Dense(64, activation='relu', name='dense_1')(inputs)
x = keras.layers.Dense(64, activation='relu', name='dense_2')(x)
outputs = keras.layers.Dense(10, name='predictions')(x)
functional_model = keras.Model(inputs=inputs, outputs=outputs, name='3_layer_mlp')

# Extract a portion of the functional model defined in the Setup section.
# The following lines produce a new model that excludes the final output
# layer of the functional model.
pretrained = keras.Model(functional_model.inputs, 
                            functional_model.layers[-1].input,
                            name='pretrained_model')
# Randomly assign "trained" weights.
for w in pretrained.weights:
    w.assign(tf.random.normal(w.shape))
pretrained.save_weights('pretrained_ckpt')
pretrained.summary()

# Assume this is a separate program where only 'pretrained_ckpt' exists.
# Create a new functional model with a different output dimension.
inputs = keras.Input(shape=(784,), name='digits')
x = keras.layers.Dense(64, activation='relu', name='dense_1')(inputs)
x = keras.layers.Dense(64, activation='relu', name='dense_2')(x)
outputs = keras.layers.Dense(5, name='predictions')(x)
model = keras.Model(inputs=inputs, outputs=outputs, name='new_model')

# Load the weights from pretrained_ckpt into model. 
model.load_weights('pretrained_ckpt')

# Check that all of the pretrained weights have been loaded.
for a, b in zip(pretrained.weights, model.weights):
    np.testing.assert_allclose(a.numpy(), b.numpy())

print('\n','-'*50)
model.summary()

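It is generally recommended to stick to the same API for building models. If you switch between Sequential and Functional, or between Functional and subclassed, you should always rebuild the pre-trained model and load the pre-trained weights into that model.

What if the model architectures are quite different: how can weights be saved and loaded into different models? The solution is to use tf.train.Checkpoint to save and restore the exact layers/variables.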
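
As a rough illustration of that idea, the sketch below saves two variables under explicit keys and restores only one of them into a new variable. Plain tf.Variable objects stand in for layer weights here, and the keys 'kernel' and 'bias' as well as the temporary path are assumptions made for the example.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Plain variables stand in for the weights of a layer in the source model.
kernel = tf.Variable(tf.random.normal((784, 64)), name='kernel')
bias = tf.Variable(tf.zeros((64,)), name='bias')

# The keyword names ('kernel', 'bias') become the keys stored in the checkpoint.
ckpt = tf.train.Checkpoint(kernel=kernel, bias=bias)
prefix = os.path.join(tempfile.mkdtemp(), 'ckpt')
save_path = ckpt.save(prefix)

# Another program could restore only the variable it needs, using the same key.
new_kernel = tf.Variable(tf.zeros((784, 64)), name='new_kernel')
status = tf.train.Checkpoint(kernel=new_kernel).restore(save_path)
status.expect_partial()  # 'bias' is intentionally left unrestored

np.testing.assert_allclose(kernel.numpy(), new_kernel.numpy())
```

Because matching is done by the keys given to tf.train.Checkpoint rather than by model structure, the receiving side is free to have a completely different architecture around the restored variable.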

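3.2.2 HDF5 format

The HDF5 format contains weights grouped by layer name. The weights are lists ordered by concatenating the list of trainable weights with the list of non-trainable weights (the same order as layer.weights). A model can therefore use an HDF5 checkpoint if it has the same layers and trainable statuses as those saved in the checkpoint.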

sequential_model = keras.Sequential(
    [keras.Input(shape=(784,), name='digits'),
     keras.layers.Dense(64, activation='relu', name='dense_1'), 
     keras.layers.Dense(64, activation='relu', name='dense_2'),
     keras.layers.Dense(10, name='predictions')])
sequential_model.save_weights('weights.h5')
sequential_model.load_weights('weights.h5')

Note that changing layer.trainable may result in a different layer.weights ordering when the model contains nested layers.

class NestedDenseLayer(keras.layers.Layer):
    def __init__(self, units, name=None):
        super(NestedDenseLayer, self).__init__(name=name)
        self.dense_1 = keras.layers.Dense(units, name='dense_1')
        self.dense_2 = keras.layers.Dense(units, name='dense_2')

    def call(self, inputs):
        return self.dense_2(self.dense_1(inputs))

nested_model = keras.Sequential([keras.Input((784,)), NestedDenseLayer(10, 'nested')])
variable_names = [v.name for v in nested_model.weights]
print('variables: {}'.format(variable_names))

print('\nChanging trainable status of one of the nested layers...')
nested_model.get_layer('nested').dense_1.trainable = False

variable_names_2 = [v.name for v in nested_model.weights]
print('\nvariables: {}'.format(variable_names_2))
print('variable ordering changed:', variable_names != variable_names_2)

Output:

variables: ['nested/dense_1/kernel:0', 'nested/dense_1/bias:0', 'nested/dense_2/kernel:0', 'nested/dense_2/bias:0']

Changing trainable status of one of the nested layers...

variables: ['nested/dense_2/kernel:0', 'nested/dense_2/bias:0', 'nested/dense_1/kernel:0', 'nested/dense_1/bias:0']
variable ordering changed: True

Transfer learning example:

def create_functional_model():
    inputs = keras.Input(shape=(784,), name='digits')
    x = keras.layers.Dense(64, activation='relu', name='dense_1')(inputs)
    x = keras.layers.Dense(64, activation='relu', name='dense_2')(x)
    outputs = keras.layers.Dense(10, name='predictions')(x)
    return keras.Model(inputs=inputs, outputs=outputs, name='3_layer_mlp')

functional_model = create_functional_model()  
functional_model.save_weights('pretrained_weights.h5')

pretrained_model = create_functional_model()
pretrained_model.load_weights('pretrained_weights.h5')

extracted_layers = pretrained_model.layers[:-1]
extracted_layers.append(keras.layers.Dense(5, name='dense_3'))
model = keras.Sequential(extracted_layers)
model.summary()

Reference: https://www.tensorflow.org/guide/keras/save_and_serialize#introduction
