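Rasa Core Development Guide

Note: this article was written against an older Rasa release, and newer versions of Rasa Core have changed. For current material on Rasa, see:

The RASA Chinese development guide series:

Rasa Chinese Chatbot Development Guide (1): Getting Started
Rasa Chinese Chatbot Development Guide (2): NLU
Rasa Chinese Chatbot Development Guide (3): Core
Rasa Chinese Chatbot Development Guide (4): RasaX and Model Evaluation
Rasa Chinese Chatbot Development Guide (5): A Brief Look at the Mitie, spaCy, and CRF Entity Extractors
Rasa Chinese Chatbot Development Guide (6): A Brief Look at the Mitie, Sklearn, and Embedding Intent Classifiers

Note: this series is translated from the official Rasa documentation, combined with my own understanding and hands-on project experience, and it expands on some of the technical points in the documentation so that Rasa's working mechanism is easier to understand. The companion project for the series is on GitHub: ChitChatAssistant. Stars and issues are welcome; let's discuss and learn together!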

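1. Introduction to Rasa Core

Rasa Core is the dialogue management module of the Rasa framework. It acts as the chatbot's brain: its main tasks are to maintain and update the dialogue state and to select actions, and then to respond to the user's input. The dialogue state is a machine-processable representation of the conversation data; it contains all the information that may affect the next decision, such as the output of the natural language understanding module and features of the user. Action selection means choosing a suitable next action based on the current dialogue state, for example asking the user for missing information or carrying out the action the user requested. As a concrete example, suppose the user says "Book a bouquet of flowers for my mom". The dialogue state then includes the NLU output, the user's location, past behaviour, and other features. In this state, the system's next action might be to:

ask the user for an acceptable price, e.g. "What price range do you have in mind?";
confirm an acceptable price with the user, e.g. "Shall I order flowers worth 200 like last time?";
place the order for the user directly.

Below is a conversation scenario from the Rasa Core documentation: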

[Figure: mood_bot.png (http://rasa.com/docs/rasa/_images/mood_bot.png)]

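1.1 Rasa Core Message Handling Flow

As described above, the dialogue management module coordinates the chatbot's other modules and maintains the structure and state of the human-machine conversation. Key techniques involved in dialogue management include dialogue act recognition, dialogue state recognition, dialogue policy learning, action prediction, and dialogue rewards. The Rasa Core message handling flow is as follows: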

[Figure: rasa-message-processing.png (http://rasa.com/docs/rasa/_images/rasa-message-processing.png)]

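First, the user's Message is passed to the Interpreter (the NLU module), which identifies the intent of the Message and extracts all entity data;
Second, Rasa Core passes the intent and entities extracted by the Interpreter to the Tracker object, whose main job is to track the conversation state;
Third, the policy receives the current state of the Tracker and chooses the action to execute; the chosen action is also logged in the Tracker;
Finally, the result returned by the action is sent to the user, which completes one round of interaction.

Note: the whole process is handled by the rasa_core.agent.Agent class in the Rasa Core framework.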
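The four steps above can be sketched in plain Python. This is a toy illustration only, not Rasa Core's implementation: `ToyTracker`, `toy_interpreter`, `toy_policy`, `handle_message`, and the keyword rule are all hypothetical names invented for this sketch.

```python
# Toy illustration of the message flow; none of these names come from Rasa Core.

class ToyTracker:
    """Keeps the conversation state: latest intent, entities, and past events."""

    def __init__(self):
        self.latest_intent = None
        self.entities = []
        self.events = []

    def update(self, parsed):
        self.latest_intent = parsed["intent"]
        self.entities = parsed["entities"]
        self.events.append(("user", parsed["intent"]))


def toy_interpreter(text):
    """Step 1: stand-in for the NLU module (a hypothetical keyword rule)."""
    intent = "greet" if "你好" in text else "goodbye"
    return {"intent": intent, "entities": []}


def toy_policy(tracker):
    """Step 3: choose the next action from the current tracker state."""
    return {"greet": "utter_answer_greet",
            "goodbye": "utter_answer_goodbye"}[tracker.latest_intent]


def handle_message(text, tracker):
    parsed = toy_interpreter(text)              # step 1: intent and entities
    tracker.update(parsed)                      # step 2: track conversation state
    action = toy_policy(tracker)                # step 3: pick an action
    tracker.events.append(("action", action))   # the action is logged on the tracker
    return action                               # step 4: result goes back to the user


tracker = ToyTracker()
print(handle_message("你好", tracker))
```

In the real framework the Agent class wires these pieces together; the sketch only shows the order in which they are consulted.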

1.2 Installing Rasa Core

pip install rasa_core

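2. Training the Dialogue Model

2.1 Story Sample Data

Story data is the training data for the Rasa Core dialogue system. It describes the storylines that may occur during a conversation between user and bot; training on the stories together with the domain produces the dialogue model the system needs. Every story has the same basic format and differs only in content: a line starting with ## marks the beginning of a story, and the text after it is purely descriptive; a line starting with * gives an intent and the slots it fills; an indented line starting with - gives the action Rasa Core should execute once Rasa NLU has recognized that intent. Part of a stories.md file is shown below: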

## story1：greet only
* greet
    - utter_answer_greet
> check_greet

## story2：
* goodbye
	- utter_answer_goodbye

## story3:thanks
* thanks
    - utter_answer_thanks
    
## story4:change address or date-time with greet
> check_greet
* weather_address_date-time{"address": "上海", "date-time": "明天"}
    - action_report_weather

## story5:change address or date-time with greet
> check_greet
* weather_address_date-time{"address": "上海", "date-time": "明天"}
    - action_report_weather
    - utter_report_weather
* weather_address{"address": "北京"} OR weather_date-time{"date-time": "明天"}
    - action_report_weather
    - utter_report_weather
...
...

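Here, > check_* lines are checkpoints, used to modularize and simplify the training data, i.e. to reuse parts of stories. OR statements handle the case where a story can continue with two or more different intents at the same point. They keep the story file short, but training takes as long as if two or more separate stories had been written, so they should not be used too densely.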
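To make the line markers concrete, here is a minimal sketch of a reader for this Markdown story format. It is not Rasa Core's own story reader; `parse_stories` is a hypothetical helper, and it deliberately ignores OR statements and slot annotations.

```python
def parse_stories(text):
    """Parse the Markdown story format into a list of dicts (simplified sketch)."""
    stories = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("##"):
            # "##" opens a new story; the rest of the line is only a description
            current = {"name": line[2:].strip(), "steps": []}
            stories.append(current)
        elif current is None or not line:
            continue
        elif line.startswith("*"):
            current["steps"].append(("intent", line[1:].strip()))
        elif line.startswith("-"):
            current["steps"].append(("action", line[1:].strip()))
        elif line.startswith(">"):
            current["steps"].append(("checkpoint", line[1:].strip()))
    return stories


sample = """
## story1: greet only
* greet
    - utter_answer_greet
> check_greet

## story3: thanks
* thanks
    - utter_answer_thanks
"""

for story in parse_stories(sample):
    print(story["name"], story["steps"])
```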

Visualizing Stories

Rasa Core provides the rasa_core.visualize module for visualizing stories, which makes it easier to understand and design story flows.
The command is:

python -m rasa_core.visualize -d domain.yml -s data/stories.md -o graph.html -c config.yml

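Here, -m specifies the module to run; -d the path to domain.yml; -s the story path; -o the output file name; -c the Policy configuration file. The result is a graph.html file in the project root, which can be opened in a browser: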

[Figure: interactive_learning_graph.gif (http://rasa.com/docs/rasa/_images/interactive_learning_graph.gif)]

Besides generating the story graph from the command line, we can also write a visualize.py script that does the same thing.

from rasa_core.agent import Agent
from rasa_core.policies.keras_policy import KerasPolicy
from rasa_core.policies.memoization import MemoizationPolicy

if __name__ == '__main__':
    agent = Agent("domain.yml",
                  policies=[MemoizationPolicy(), KerasPolicy()])

    agent.visualize("data/stories.md",
                    output_file="graph.html", max_history=2)

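2.2 Domain

domain.yml defines everything the bot needs to know. It is the skeleton of the bot's brain and specifies the intents, entities, slots, and actions. The intents and entities should match those annotated in the NLU training samples, the slots should match the annotated entities, and the actions are what the bot does in response to user requests. In addition, the templates section of domain.yml defines message templates for the utter_ actions, so the bot can reply automatically for those actions. Suppose we are building a weather information bot with actions that query the weather and the pollution level. A sample domain.yml: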

intents:
  - greet
  - goodbye
  - thanks
  - whoareyou
  - whattodo
  - whereyoufrom
  - search_weather
  - search_weather_quality
  - search_datetime
  - search_city

slots:
  city:
    type: text
  datetime:
    type: text
  matches:
    type: unfeaturized

entities:
  - city
  - datetime

actions:
  - utter_answer_greet
  - utter_answer_goodbye
  - utter_answer_thanks
  - utter_introduce_self
  - utter_introduce_selfcando
  - utter_introduce_selffrom
  - action_search_weather
  - action_search_weather_quality

templates:
  utter_answer_goodbye:
    - text: "再見"
    - text: "拜拜"
    - text: "雖然我有萬般捨不得，但是天下沒有不散的宴席~祝您安好！"
    - text: "期待下次再見！"
    - text: "嗯嗯，下次需要時隨時記得我喲~"
    - text: "88"

  utter_answer_thanks:
    - text: "嗯呢。不用客氣~"
    - text: "這是我應該做的，主人~"
    - text: "嗯嗯，合作愉快！"

  utter_introduce_self:
    - text: "您好！我是您的AI機器人呀~"

  utter_introduce_selfcando:
    - text: "我能幫你查詢天氣信息"

  utter_introduce_selffrom:
    - text: "我來自xxx"

  utter_ask_city:
    - text: "請問您要查詢哪裏的天氣？"

  utter_ask_datetime:
    - text: "請問您要查詢哪天的天氣"

  utter_report_search_result:
    - text: "{matches}"

  utter_default:
    - text: "小x還在學習中，請換種說法吧~"
    - text: "小x正在學習中，等我升級了您再試試吧~"
    - text: "對不起，主人，您要查詢的功能小x還沒學會呢~"

Explanation of each section:

`intents`	things you expect users to say. See Rasa NLU
`actions`	things your bot can do and say
`templates`	template strings for the things your bot can say
`entities`	pieces of info you want to extract from messages. See Rasa NLU
`slots`	information to keep track of during a conversation (e.g. a user's age) - see Using Slots

2.2.0 intents

The intents section lists every intent annotated in the NLU training samples.

intents:
  - greet
  - goodbye
  - thanks
  - search_weather
  - search_weather_quality
  - search_datetime
  - search_city

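2.2.1 actions

Once Rasa NLU has recognized the intent of the user's Message, the Rasa Core dialogue manager responds to it, and the component that carries out the response is the action. Rasa Core supports three kinds of action: default actions, utter actions, and custom actions. Their roles and differences are as follows:

1. default actions

Default actions are a set of actions built into Rasa Core. We do not need to define them and can use them directly in stories and the domain. There are three:

action_listen: wait for user input; Rasa Core usually calls this action automatically during a conversation;
action_restart: reset the conversation state, for example re-initializing the slot values;
action_default_fallback: executed by default when the confidence Rasa Core obtains falls below the configured threshold;

2. utter actions

An utter action is an action whose name starts with utter_ and which only sends a message back to the user. Defining one is simple: list an action starting with utter_ under actions: in domain.yml, and put the actual reply text in the templates: section, which is covered in detail below. Example utter action definitions: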

actions:
  - utter_answer_greet
  - utter_answer_goodbye
  - utter_answer_thanks
  - utter_introduce_self
  - utter_introduce_selfcando
  - utter_introduce_selffrom

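3. custom actions

A custom action lets the developer run arbitrary code and report back to the user: simply returning a string, controlling home appliances, checking a bank balance, and so on. Unlike a default action, a custom action must first be declared in the actions section of domain.yml and then implemented in a separate web server. The server's URL is specified in endpoints.yml, and the server can be written in any language, although Python is the obvious first choice because Rasa Core ships rasa-core-sdk specifically for implementing custom actions. Setting up the action server and implementing actions are covered in detail later; here we only look at what is needed in the Rasa Core project. If our weather bot offers two services, weather lookup and air quality lookup, we declare the corresponding actions in domain.yml: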

actions:
  ...	
  - action_search_weather
  - action_search_weather_quality

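2.2.2 templates

As mentioned above, the templates: section of domain.yml holds the actual reply text for utter actions, and each utter action can define several messages. When the user expresses an intent, for example "你好！" (hello), and utter_answer_greet is triggered, Rasa Core automatically picks one of that action's templates and sends it to the user. Example templates section: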

templates:
  utter_answer_greet:
    - text: "您好！請問我可以幫到您嗎？"
    - text: "您好！請說出您要查詢的具體業務，比如跟我說'查詢身份證號碼'"
    - text: "您好！"

  utter_answer_goodbye:
    - text: "再見"
    - text: "拜拜"
    - text: "雖然我有萬般捨不得，但是天下沒有不散的宴席~祝您安好！"
    - text: "期待下次再見！"
    - text: "嗯嗯，下次需要時隨時記得我喲~"
    - text: "88"

  utter_answer_thanks:
    - text: "嗯呢。不用客氣~"
    - text: "這是我應該做的，主人~"
    - text: "嗯嗯，合作愉快！"

  utter_introduce_self:
    - text: "您好！我是您的AI機器人呀~"

  utter_introduce_selfcando:
    - text: "我能幫你查詢天氣信息"

  utter_introduce_selffrom:
    - text: "我來自xxx"

  utter_ask_city:
    - text: "請問您要查詢哪裏的天氣？"

  utter_ask_datetime:
    - text: "請問您要查詢哪天的天氣"

  utter_report_search_result:
    - text: "{matches}"

  utter_default:
    - text: "小x還在學習中，請換種說法吧~"
    - text: "小x正在學習中，等我升級了您再試試吧~"
    - text: "對不起，主人，您要查詢的功能小x還沒學會呢~"

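Note: utter_default is the template sent by Rasa Core's built-in action_default_fallback. When the confidence of the recognized intent is below the configured threshold, one of the utter_default templates is sent by default.
Besides plain text messages, Rasa Core supports adding buttons and images to a text message, and referencing slot values (if the slot has been filled; otherwise None is returned). For example: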

  utter_introduce_self:
    - text: "您好！我是您的AI機器人呀~"
      image: "https://i.imgur.com/sayhello.jpg"
  utter_introduce_selfcando:
    - text: "我能幫你查詢天氣信息"
      buttons:
        - title: "好的"
          payload: "ok"
        - title: "不了"
          payload: "no"
  utter_ask_city:
    - text: "請問您要查詢{datetime}哪裏的天氣？"
  utter_ask_datetime:
    - text: "請問您要查詢{city}哪天的天氣"
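The behaviour described above (pick one variation at random, then substitute slot values) can be sketched with the standard library alone. This is an assumption-laden illustration, not Rasa Core's template code: `render` and `_NoneForMissing` are hypothetical names, and the `templates` dict simply copies two entries from the domain.

```python
import random

# Two templates copied from the domain above; the dict layout is an assumption.
templates = {
    "utter_answer_thanks": ["嗯呢。不用客氣~", "這是我應該做的，主人~", "嗯嗯，合作愉快！"],
    "utter_ask_city": ["請問您要查詢{datetime}哪裏的天氣？"],
}


class _NoneForMissing(dict):
    """Yield None for unfilled slots, mirroring the behaviour described above."""

    def __missing__(self, key):
        return None


def render(template_name, slots, rng=random):
    text = rng.choice(templates[template_name])      # pick one variation at random
    return text.format_map(_NoneForMissing(slots))   # fill {slot_name} placeholders


print(render("utter_ask_city", {"datetime": "明天"}))
print(render("utter_answer_thanks", {}))
```

Note that Python's formatting treats `{ datetime }` (with spaces) as a different key from `{datetime}`, which is one reason to write placeholders without surrounding spaces.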

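The template responses above are sent when the trained bot recognizes an intent and the corresponding utter action is triggered automatically, picking one template text as the reply. We can also generate a message from a template inside a custom action by calling dispatcher.utter_template. Sample code: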

from rasa_core_sdk.actions import Action

class ActionGreet(Action):
  def name(self):
      return 'action_greet'

  def run(self, dispatcher, tracker, domain):
      dispatcher.utter_template("utter_greet", tracker)
      return []

2.2.3 entities

The entities section lists every entity annotated in the NLU training samples. In general, the slots should be a superset of the entities.

entities:
  - city
  - datetime

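2.2.4 slots

Slots are the bot's memory. They store, as key-value pairs, information supplied by the user (entities) or data fetched from elsewhere, such as database query results. In the weather example, the bot must know the date and the place before it can run a query, so in domain.yml we define two slots, city and datetime, in the slots section, plus matches to hold the final query result. Here type is the type of data the slot stores, and initial_value is the slot's initial value, which is optional. Example: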

slots:
  city:
    type: text
    initial_value: "北京"
  datetime:
    type: text
    initial_value: "明天"
  matches:
    type: unfeaturized
    initial_value: "none"

(1) Slot types

For the type of a rasa_core.slots.Slot, Rasa Core provides several options to suit different needs:

text: stores text;
bool: stores a boolean, True or False;

categorical: accepts one of an enumerated list of values, for example:

slots:
   risk_level:
      type: categorical
      values:
      - low
      - medium
      - high

float: stores a continuous floating-point value. The defaults are max_value=1.0 and min_value=0.0. A value greater than max_value is stored as max_value, and a value less than min_value is stored as min_value. Example:
```
slots:
   temperature:
      type: float
      min_value: -100.0
      max_value:  100.0
```
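list: stores a list; the length of the list does not influence the dialogue;
unfeaturized: stores data that should not influence the conversation flow. The slot is not featurized, so its value does not affect the dialogue flow and is ignored when predicting the bot's next action.
Custom slot types: see Custom Slot Types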
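The clamping rule described for float slots can be written as a one-line helper. This is only a sketch of the rule as stated above; `clamp_float_slot` is a hypothetical function, not part of Rasa Core.

```python
def clamp_float_slot(value, min_value=0.0, max_value=1.0):
    """Clamp a value into [min_value, max_value], as described for float slots."""
    return max(min_value, min(max_value, float(value)))


# With the temperature slot configured above (min -100.0, max 100.0):
print(clamp_float_slot(150.0, min_value=-100.0, max_value=100.0))
print(clamp_float_slot(-5.0))  # default range 0.0 to 1.0
```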

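(2) Filling slot values

Within a conversation there are several ways to fill slots. We look at each in turn.

Slots Set from NLU

If the NLU training data annotates an entity called name, and the Rasa Core domain.yml contains a slot with the same name, then when the user sends a Message the NLU model extracts the name entity and its value is automatically stored in the name slot. Example: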

# story_01
* greet{"name": "老蔣"}
  - slot{"name": "老蔣"}
  - utter_greet

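Note: in this case name is filled automatically even if the - slot{"name": "老蔣"} line is left out.

Slots Set By Clicking Buttons

As mentioned earlier, templates in domain.yml can attach buttons to a text message. When a button is clicked, Rasa Core sends a Message starting with / to the RegexInterpreter. The RegexInterpreter expects the same format as used in stories, i.e. /intent{entities}. Suppose the bot asks whether the user wants to look up the weather: we annotate a choose intent and a decision entity in the NLU training data and declare decision as a slot in domain.yml. When a button is clicked, "好的" (yes) or "不了" (no) is automatically stored in the decision slot. In other words, when Rasa Core receives /choose{"decision": "好的"}, it directly recognizes the intent choose and extracts the value of the decision entity.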

templates:
  utter_introduce_selfcando:
    - text: "我能幫你查詢天氣信息"
      buttons:
        - title: "好的"
          payload: '/choose{"decision": "好的"}'
        - title: "不了"
          payload: '/choose{"decision": "不了"}'
  ...
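To illustrate the /intent{entities} payload format, the sketch below splits such a message into an intent name and an entity dict. It is not the RegexInterpreter's actual code; `parse_payload` and its regular expression are assumptions made for this example.

```python
import json
import re


def parse_payload(message):
    """Split '/intent{"entity": "value"}' into an intent name and an entity dict."""
    match = re.match(r"^/([^{\s]+)\s*(\{.*\})?\s*$", message)
    if match is None:
        return None  # not a "/intent" style message
    intent, raw_entities = match.group(1), match.group(2)
    entities = json.loads(raw_entities) if raw_entities else {}
    return intent, entities


print(parse_payload('/choose{"decision": "好的"}'))
print(parse_payload("/greet"))
```

Because the payload itself contains double quotes, the YAML value that carries it has to be wrapped in single quotes.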

Slots Set by Actions

Taking the air quality lookup as an example, first look at domain.yml and stories.md in the Rasa Core project:

# domain.yml
...
actions:
  - action_search_weather_quality

slots:
   weather_quality:
      type: categorical
      values:
      - 優
      - 良
      - 差
...

# stories.md
* greet
  - action_search_weather_quality
  - slot{"weather_quality" : "優"}
  - utter_answer_high

* greet
  - action_search_weather_quality
  - slot{"weather_quality" : "良"}
  - utter_answer_middle

* greet
  - action_search_weather_quality
  - slot{"weather_quality" : "差"}
  - utter_answer_low
# Note: the official documentation seems to say that when a slot is categorical, setting its value with - slot lines in the stories helps the correct action get predicted.

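In the custom action, we first query the weather database, which returns JSON, then take the value of the weather_quality field and return it as the new value of the weather_quality slot. The code: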

from rasa_core_sdk.actions import Action
from rasa_core_sdk.events import SlotSet
import requests

class ActionSearchWeatherQuality(Action):
    def name(self):
        return "action_search_weather_quality"

    def run(self, dispatcher, tracker, domain):
        url = "http://myprofileurl.com"
        data = requests.get(url).json()
        # parse the JSON and fill the slot
        return [SlotSet("weather_quality", data["weather_quality"])]

3. Training and Using the Dialogue Model

3.1 Training the Dialogue Model

python -m rasa_core.train -d domain.yml -s data/stories.md -o models/current/dialogue -c config.yml

Command reference:

usage: train.py default [-h] [--augmentation AUGMENTATION] [--dump_stories]
                        [--debug_plots] [-v] [-vv] [--quiet] [-c CONFIG] -o
                        OUT (-s STORIES | --url URL) -d DOMAIN

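-m mod: the module to run;
-d or --domain: path to the bot's domain.yml file;
-s or --stories: path to the stories.md file. Stories can also be split by category across several .md files, for example all stored in a stories directory under data, in which case the argument becomes -s data/stories/;
-o or --out: output path for the dialogue model; the trained model is saved there automatically;

-c or --config: the Policy configuration file. A config.yml with default parameters looks like this: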

policies:
  - name: KerasPolicy
    epochs: 100
    max_history: 5
  - name: FallbackPolicy
    fallback_action_name: 'action_default_fallback'
  - name: MemoizationPolicy
    max_history: 5
  - name: FormPolicy

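--augmentation AUGMENTATION: enabled by default. Rasa Core creates longer stories by randomly gluing together stories from the story file. If we want the bot to execute the same action for a given reply regardless of the preceding conversation history, augmentation can be turned off with --augmentation 0. (Optional during training.)
--url URL: download a stories.md file from a URL and use it for training.

Next we focus on the Policy module. Policies are Rasa Core's strategy module, the class rasa_core.policies.Policy. Their job is to predict the next action in a conversation; which action is chosen depends on the prediction confidence, and the action with the highest confidence is executed. Below is debug output from my own testing: when I sent the bot "幫我查手機號碼12345" (look up mobile number 12345), the Policies module predicted action_search_num_business: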

2019-04-15 16:46:07 DEBUG    rasa_core.processor  - Received user message '幫我查手機號碼12345' with intent '{'name': 'search_plate_number', 'confidence': 0.4848141755831445}' and entities '[{'entity': 'item', 'value': '手機號碼12345', 'start': 3, 'end': 12, 'confidence': None, 'extractor': 'ner_mitie'}]'


2019-04-15 16:46:07 DEBUG    rasa_core.policies.memoization  - There is no memorised next action
2019-04-15 16:46:07 DEBUG    rasa_core.policies.form_policy  - There is no active form
2019-04-15 16:46:07 DEBUG    rasa_core.policies.ensemble  - Predicted next action using policy_0_KerasPolicy
2019-04-15 16:46:07 DEBUG    rasa_core.processor  - Predicted next action 'action_search_num_business' with prob 0.98.
2019-04-15 16:46:07 DEBUG    rasa_core.actions.action  - Calling action endpoint to run action 'action_search_num_business'.
2019-04-15 16:46:08 DEBUG    rasa_core.processor  - Action 'action_search_num_business' ended with events '[]'
2019-04-15 16:46:08 DEBUG    rasa_core.processor  - Bot utterance 'BotUttered(text:  ['手機號碼12345', '123556', None] 所屬人爲張三，這是他的業務信息, data: {
  "elements": null,
  "buttons": null,
  "attachment": null
})'
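The selection rule visible in the log (the prediction with the highest confidence wins) can be sketched as follows. This is a simplification rather than the ensemble code in rasa_core.policies.ensemble; `pick_action` and the tuple layout are assumptions, and the numbers are taken from the log above.

```python
def pick_action(predictions):
    """predictions: list of (policy_name, action_name, confidence) tuples.

    Returns the tuple with the highest confidence.
    """
    return max(predictions, key=lambda p: p[2])


predictions = [
    ("MemoizationPolicy", None, 0.0),  # "There is no memorised next action"
    ("KerasPolicy", "action_search_num_business", 0.98),
]

policy, action, confidence = pick_action(predictions)
print(policy, action, confidence)
```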

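DPL (Dialogue Policy Learning), also called dialogue policy optimization, decides which system action to take next given the current dialogue state. Like a user intent, a system action consists of an intent and slots. The input to the DPL module is the current dialogue state produced by the DST (dialogue state tracker), and its output is the system action selected by the configured dialogue policy. Rasa Core has several policies, and a policy configuration file can include more than one at the same time.

Memoization Policy

MemoizationPolicy simply memorizes the conversations in the training data. If the current conversation matches one in the training data, it predicts the next action with confidence 1.0; otherwise it predicts None with confidence 0.0. The following shows how to configure MemoizationPolicy in the policy configuration file config.yml; the max_history hyperparameter determines how many steps of dialogue history the model looks at when deciding the next action.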

policies:
  - name: "MemoizationPolicy"
    max_history: 5

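Note: a larger max_history produces a larger model and longer training time. To decide what value to use, consider this example: there is an out_of_scope intent for off-topic user messages, and after the user triggers out_of_scope three times in a row we want the bot to proactively tell the user what it can help with. For Rasa Core to learn this pattern, max_history must be at least 3. In stories.md it looks like this: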

* out_of_scope
   - utter_default
* out_of_scope
   - utter_default
* out_of_scope
   - utter_help_message
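The effect of max_history can be seen in a toy memoization table. This is a heavy simplification and not MemoizationPolicy itself: the real policy featurizes tracker states, whereas here a turn is just an (intent, action) pair, and `build_lookup` and `predict` are hypothetical helpers.

```python
def build_lookup(stories, max_history):
    """Memorize: (previous max_history-1 turns, current intent) -> action."""
    lookup = {}
    for turns in stories:
        for i, (intent, action) in enumerate(turns):
            past = tuple(turns[max(0, i - (max_history - 1)):i])
            lookup[(past, intent)] = action
    return lookup


def predict(lookup, past_turns, intent, max_history):
    """Return (action, 1.0) for a memorized history, else (None, 0.0)."""
    past = tuple(past_turns[-(max_history - 1):]) if max_history > 1 else ()
    action = lookup.get((past, intent))
    return (action, 1.0) if action is not None else (None, 0.0)


# The out_of_scope story above, written as (intent, action) turns.
story = [
    ("out_of_scope", "utter_default"),
    ("out_of_scope", "utter_default"),
    ("out_of_scope", "utter_help_message"),
]

lookup = build_lookup([story], max_history=3)
print(predict(lookup, story[:2], "out_of_scope", max_history=3))
print(predict(lookup, [], "greet", max_history=3))
```

With max_history=2 in this toy model, the second and third turns map to the same key, so the table can no longer tell them apart; that is the intuition behind needing at least 3.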

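Keras Policy

KerasPolicy uses a neural network implemented in Keras to predict the next action. Its default architecture is based on an LSTM (Long Short-Term Memory) network, but we can override the KerasPolicy.model_architecture method to implement our own architecture. The default KerasPolicy model is very simple, just LSTM + Dense + softmax, so we need to keep adding stories that cover the various situations. The following shows how to configure KerasPolicy in config.yml; epochs is the number of training epochs, and max_history is as described above.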

policies:
  - name: KerasPolicy
    epochs: 100
    max_history: 5

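Embedding Policy

Machine-learning-based dialogue management can learn complex behaviour to complete tasks, but extending it to new domains is not straightforward, especially with respect to how well different policies handle uncooperative user behaviour, and how knowledge from one task (such as restaurant booking) can be reused when learning a new task (such as hotel booking). EmbeddingPolicy, the Recurrent Embedding Dialogue Policy (REDP), achieves good results by embedding actions and dialogue states into the same vector space. REDP includes a memory component based on a modified Neural Turing Machine and an attention mechanism, and on this task it significantly outperforms a baseline LSTM classifier. EmbeddingPolicy consists of the following steps: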

(1) apply dense layers to create embeddings for user intents, entities and system actions, including previous actions and slots;
(2) use the embeddings of previous user inputs as a user memory and the embeddings of previous system actions as a system memory;
(3) concatenate the user input, previous system action and slot embeddings for the current time step into an input vector for the RNN;
(4) using the user and previous system action embeddings from the input vector, calculate attention probabilities over the user and system memories;
(5) sum the user embedding and the user attention vector, and feed the result together with the slot embeddings as input to an LSTM cell;
(6) apply a dense layer to the output of the LSTM to get a raw recurrent embedding of the dialogue;
(7) sum this raw recurrent embedding with the system attention vector to create the dialogue-level embedding; this step allows the algorithm to repeat a previous system action by copying its embedding vector directly to the current time output;
(8) weight the previous LSTM states with the system attention probabilities to get the embedding of the previous action that the policy most likely paid attention to;
(9) if the similarity between this previous action embedding and the current dialogue embedding is high, overwrite the current LSTM state with the one from the time when that action happened;
(10) for each LSTM time step, calculate the similarity between the dialogue embedding and the embedded system actions.
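Step (10) boils down to comparing one vector against a set of action vectors and taking the best match. The sketch below shows that idea with cosine similarity, which is an assumption made for illustration; the two-dimensional embeddings are made-up numbers, and `cosine` and `most_similar_action` are hypothetical helpers rather than EmbeddingPolicy internals.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar_action(dialogue_embedding, action_embeddings):
    """Return the action whose embedding is closest to the dialogue embedding."""
    return max(action_embeddings,
               key=lambda name: cosine(dialogue_embedding, action_embeddings[name]))


# Made-up embeddings, for illustration only.
action_embeddings = {
    "utter_ask_city": [1.0, 0.0],
    "utter_ask_datetime": [0.0, 1.0],
}

print(most_similar_action([0.9, 0.1], action_embeddings))
```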

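So EmbeddingPolicy tends to give better results, but it is slow, and the official source code in particular does not use the GPU and does not make full use of CPU resources. The figure below compares KerasPolicy and EmbeddingPolicy; EmbeddingPolicy clearly performs better:

EmbeddingPolicy configuration: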

policies:
  - name: EmbeddingPolicy
    epochs: 20
    featurizer:
    - name: FullDialogueTrackerFeaturizer
      state_featurizer:
        - name: LabelTokenizerSingleStateFeaturizer

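Note: for details, see Embedding Policy

Form Policy

FormPolicy is an extension of MemoizationPolicy that handles form filling. Once a FormAction has been called, FormPolicy keeps predicting the form action until all of the form's slots are filled, and only then does the FormAction carry out its final logic. For details, see Slot Filling.

Mapping Policy

MappingPolicy maps an intent directly to an action, so that the mapped action is always executed. The mapping is set with the triggers property. For example (in domain.yml):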

intents:
 - greet: {triggers: utter_goodbye}

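Here greet is the intent and utter_goodbye is the action. An intent can be mapped to at most one action. As soon as the bot receives a message with a mapped intent, it executes the corresponding action and then goes back to listening for the next message. Note that for such a mapping we should also add a story to stories.md in which the intent is followed by its mapped action; otherwise any machine learning policy may be confused by the mapped action suddenly appearing in the dialogue history.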

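Fallback Policy

If the intent recognition confidence is below nlu_threshold, or no dialogue policy predicts an action with confidence above core_threshold, FallbackPolicy executes the fallback action. Put simply, when intent recognition or action prediction does not reach its threshold, this policy makes the bot execute the specified default action. The config.yml configuration: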

policies:
  - name: "FallbackPolicy"
    # confidence threshold for intent recognition
    nlu_threshold: 0.3
    # confidence threshold for action prediction
    core_threshold: 0.3
    # fallback action
    fallback_action_name: 'action_default_fallback'

Here action_default_fallback is a default action in Rasa Core that sends the utter_default template message to the user, so make sure this template is defined in domain.yml. We can also set fallback_action_name to a custom action, for example my_fallback_action:

policies:
  - name: "FallbackPolicy"
    nlu_threshold: 0.4
    core_threshold: 0.3
    fallback_action_name: "my_fallback_action"
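The threshold rule described above can be expressed as a small function. This is a sketch of the stated rule only, not FallbackPolicy's code; `choose_action` is a hypothetical name, and its defaults mirror the first configuration shown.

```python
def choose_action(nlu_confidence, core_prediction, core_confidence,
                  nlu_threshold=0.3, core_threshold=0.3,
                  fallback_action_name="action_default_fallback"):
    """Fall back when either confidence is below its threshold."""
    if nlu_confidence < nlu_threshold or core_confidence < core_threshold:
        return fallback_action_name
    return core_prediction


print(choose_action(0.48, "action_search_num_business", 0.98))
print(choose_action(0.20, "action_search_num_business", 0.98))
print(choose_action(0.90, "utter_ask_city", 0.10,
                    nlu_threshold=0.4,
                    fallback_action_name="my_fallback_action"))
```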

3.2 Using the Dialogue Model

Command to run the Rasa Core module:

python -m rasa_core.run -d models/dialogue -u models/nlu/current  --port 5002 --credentials credentials.yml --endpoints endpoints.yml --debug -o out.log

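Parameters:

-m mod: the module to run;
-d modeldir: path to the dialogue model;
-u modeldir: path to the NLU model;
--port: port on which the Rasa Core web application runs;
--credentials credentials.yml: settings for the input channels;
--endpoints endpoints.yml: the URLs of other web servers Rasa Core connects to, such as the NLU server or the custom action server;
-o file: path of the output log file;
--debug: print debugging information. The output shows whether the NLU module extracted entities and the intent (with its confidence) from the user's Message, how the slots were filled, and which policy was used to predict the next action. If this exact story is already in the training data and MemoizationPolicy is part of the ensemble, it will be used to predict the next action with probability 1. Important: if all slot values and NLU information are as expected but the wrong action is still predicted, check which policy made the decision. If it was MemoizationPolicy, there is a problem in the stories we wrote in stories.md; if it was KerasPolicy, the model's prediction is wrong, and it is advisable to use interactive learning to create relevant stories and add them to our stories.

Next we look more closely at credentials.yml and endpoints.yml.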

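(1) credentials.yml

When our bot needs to talk to the outside world, for example to connect to facebook, slack, telegram, mattermost, or twilio, credentials.yml stores the corresponding access credentials. To connect to these channels, Rasa Core reads the settings from this YAML credentials file. Examples: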

twilio:
  account_sid: "ACbc2dxxxxxxxxxxxx19d54bdcd6e41186"
  auth_token: "e231c197493a7122d475b4xxxxxxxxxx"
  twilio_number: "+440123456789"

slack:
  slack_token: "xoxb-286425452756-safjasdf7sl38KLls"
  slack_channel: "@my_channel"

telegram:
  access_token: "490161424:AAGlRxinBRtKGb21_rlOEMtDFZMXBl6EC0o"
  verify: "your_bot"
  webhook_url: "your_url.com/webhook"

mattermost:
  url: "https://chat.example.com/api/v4"
  team: "community"
  user: "[email protected]"
  pw: "password"

facebook:
  verify: "rasa-bot"
  secret: "3e34709d01ea89032asdebfe5a74518"
  page-access-token: "EAAbHPa7H9rEBAAuFk4Q3gPKbDedQnx4djJJ1JmQ7CAqO4iJKrQcNT0wtD"

webexteams:
  access_token: "ADD-YOUR-BOT-ACCESS-TOKEN"
  room: "YOUR-WEBEXTEAMS-ROOM-ID"
    
rocketchat:
  user: "yourbotname"
  password: "YOUR_PASSWORD"
  server_url: "https://demo.rocket.chat"

# socket channel
# the first two values define the event names Rasa Core uses when sending or receiving messages over socket.io
socketio:
  user_message_evt: user_uttered
  bot_message_evt: bot_uttered
  session_persistence: true/false
    
# rest channel
rest:
  # you don't need to provide anything here - this channel doesn't
  # require any credentials
 
# CallbackInput channel
callback:
  # URL to which Core will send the bot responses
  url: "http://localhost:5034/bot"

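To reach the bot from a client we develop ourselves, we can use the socket or HTTP channels, i.e. the socketio and rest input channels. The callback channel is similar to rest in that it also uses HTTP, but instead of returning the bot's messages in the response to the HTTP request that delivered the user message, it calls a URL that we specify and sends the bot's messages there. Since HTTP is the more common case, the credentials.yml for it is: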

rest:
  # you don't need to provide anything here - this channel doesn't
  # require any credentials
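As a rough sketch of how a home-grown client might talk to the rest channel, the code below builds (but does not send) an HTTP POST using only the standard library. The path /webhooks/rest/webhook and the sender/message field names are what I understand the rest channel to expect, so verify them against your installed version; the port 5002 matches the --port used in the run command above, and `build_rest_request` is a hypothetical helper.

```python
import json
from urllib import request


def build_rest_request(sender_id, text, base_url="http://localhost:5002"):
    """Build a POST request for the rest input channel (assumed path and fields)."""
    payload = json.dumps({"sender": sender_id, "message": text}).encode("utf-8")
    return request.Request(
        base_url + "/webhooks/rest/webhook",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_rest_request("user-1", "你好")
print(req.get_method(), req.full_url)

# To actually send it, a running Rasa Core server is required:
# with request.urlopen(req) as resp:
#     print(json.loads(resp.read().decode("utf-8")))
```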

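(2) endpoints.yml

Create an endpoints.yml file in the Rasa Core project. It specifies the custom action server and the NLU server that Rasa Core will call: when Rasa Core needs intent and entity extraction or needs to execute an action, it uses the URL to find the right server. Here we assume the custom action server and the NLU server are separate projects. localhost means they are deployed on the local machine; if they are deployed elsewhere, replace it with the corresponding IP address. 5055 is the default port of the custom action server, and 5000 is the default port of the NLU server.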

# URL of the custom action server
action_endpoint:
  url: "http://localhost:5055/webhook"
# URL of the NLU server
nlu:
  url: "http://localhost:5000"
  # you can also specify additional parameters, if you need them:
  # headers:
  #   my-custom-header: value
  # token: "my_authentication_token"    # will be passed as a get parameter
  # basic_auth:
  #   username: user
  #   password: pass
# URL of the model server, i.e. fetch model data from another server
models:
  url: http://my-server.com/models/default_core@latest
  wait_time_between_pulls:  10   # [optional](default: 100)

Note: the URL is called with an HTTP POST. An example POST body:

{
"tracker": {
 "latest_message": {
   "text": "/greet",
   "intent_ranking": [
     {
       "confidence": 1.0,
       "name": "greet"
     }
   ],
   "intent": {
     "confidence": 1.0,
     "name": "greet"
   },
   "entities": []
 },
 "sender_id": "22ae96a6-85cd-11e8-b1c3-f40f241f6547",
 "paused": false,
 "latest_event_time": 1531397673.293572,
 "slots": {
   "name": null
 },
 "events": [
   {
     "timestamp": 1531397673.291998,
     "event": "action",
     "name": "action_listen"
   },
   {
     "timestamp": 1531397673.293572,
     "parse_data": {
       "text": "/greet",
       "intent_ranking": [
         {
           "confidence": 1.0,
           "name": "greet"
         }
       ],
       "intent": {
         "confidence": 1.0,
         "name": "greet"
       },
       "entities": []
     },
     "event": "user",
     "text": "/greet"
   }
 ]
},
"arguments": {},
"template": "utter_greet",
"channel": {
 "name": "collector"
}
}

An example response from the endpoint:

{
 "text": "hey there",
 "buttons": [],
 "image": null,
 "elements": [],
 "attachments": []
}

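4. Setting Up a Custom Action Server

As mentioned above, the business logic of a custom action is implemented in a separate web server, although it can also be written directly in the Rasa Core project. For maintainability and modularity, a separate web server project is still recommended. Rasa places almost no restriction on the language of this server, but I recommend Python, because Rasa Core provides an SDK for it, rasa-core-sdk, that makes it quick to build an action server. The basic steps are:

Step 1: create the action server project and install rasa-core-sdk

pip install rasa_core_sdk

Note: the latest version at the time of writing is 0.13.0.

Step 2: create an actions.py file in the project root to hold the business logic of the actions. The file can have any name and need not live in the root directory; only the argument used when starting the server changes. Continuing with the weather and air quality example, actions.py looks like this: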

from rasa_core_sdk import Action
from rasa_core_sdk.events import SlotSet


# Action that looks up the weather
class ActionSearchWeather(Action):
    def name(self):
        # type: () -> Text
        return "action_search_weather"

    def run(self, dispatcher, tracker, domain):
        city = tracker.get_slot('city')
        datetime = tracker.get_slot('datetime')
        # Run the weather lookup business logic here (elided in the original)
        result = None  # placeholder for the query result

        # Reply option 1: use the dispatcher
        # dispatcher.utter_message(str(result) if result is not None else "")
        # return []
        # Reply option 2: use SlotSet
        return [SlotSet("matches", result if result is not None else [])]


# Action that looks up the air quality
class ActionSearchWeatherQuality(Action):
    def name(self):
        # type: () -> Text
        return "action_search_weather_quality"

    def run(self, dispatcher, tracker, domain):
        city = tracker.get_slot('city')
        datetime = tracker.get_slot('datetime')
        # Run the air quality lookup business logic here (elided in the original)
        result = None  # placeholder for the query result

        # Reply option 1: use the dispatcher
        # dispatcher.utter_message(str(result) if result is not None else "")
        # return []
        # Reply option 2: use SlotSet
        return [SlotSet("matches", result if result is not None else [])]

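When writing the code for an action, three rules must be followed:

Create a class that inherits from Action. The class can have any name, but naming it after the action in CamelCase is recommended;
Override Action's name method. The return value must be the action's name, because that is how Rasa Core locates the action class;
Override Action's run method. This is where the business logic lives: once Rasa Core has matched the action, it calls this method to do the work and return the response to the user. run takes four parameters, self, dispatcher, tracker, and domain, of which the last three matter most.
Signature: Action.run(dispatcher, tracker, domain)

(1) Parameters
- dispatcher: used to send messages back to the user, via dispatcher.utter_message(), dispatcher.utter_template(), or any other method of rasa_core_sdk.executor.CollectingDispatcher.
- tracker: describes the state of the current conversation. tracker.get_slot(slot_name) returns the value of a given slot, and tracker.latest_message["text"] gives the text of the most recent user message;
- domain: the bot's domain, i.e. the content of domain.yml
(2) Return value

Returns a list [] that may contain several rasa_core_sdk.events.Event objects. The SlotSet object in the code above is one such event; it represents the event of setting a slot value.

As the signature of Action.run() shows, the return value is a list of rasa_core_sdk.events.Event objects, so let us look at Event in more detail. Event is the base class Rasa Core uses to describe everything that happens in a conversation and to tell rasa_core.trackers.DialogueStateTracker how to update its state. It is therefore not used directly; instead we use the concrete events derived from it: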

General purpose events

(1) SlotSet: set the value of a slot

Class	rasa_core.events.SlotSet(key, value=None, timestamp=None)
Description	Sets the value of a slot in the conversation tracker; key is the slot name and value is the value to set.
JSON	{ 'event': 'slot', 'name': 'departure_airport', 'value': 'BER' }

(2) Restarted: reset the tracker

Class	rasa_core.events.Restarted(timestamp=None)
Description	Resets the tracker to its initial state, including the whole conversation history and all slot values.
JSON	{ 'event': 'restart' }

(3) AllSlotsReset: reset all slots in a conversation

Class	rasa_core.events.AllSlotsReset(timestamp=None)
Description	Re-initializes all slots in the conversation. Use AllSlotsReset when you want to keep the conversation history and only reset the slot values.
JSON	{ 'event': 'reset_slots' }

(4) ReminderScheduled: run an action at a scheduled time

Class	rasa_core.events.ReminderScheduled(action_name, trigger_date_time, name=None, kill_on_user_message=True, timestamp=None)
Description	Schedules an action to run at a given time; action_name is the name of the action to run and trigger_date_time is the time.
JSON	{ 'event': 'reminder', 'action': 'my_action', 'date_time': '2018-09-03T11:41:10.128172', 'name': 'my_reminder', 'kill_on_user_msg': True }

(5) ConversationPaused: pause the conversation

Class	rasa_core.events.ConversationPaused(timestamp=None)
Description	Pauses the conversation: the bot ignores the user's Messages and does not execute predicted actions until a resume event occurs. This lets a human take over and respond to the user.
JSON	{ 'event': 'pause' }

(6) ConversationResumed: resume the conversation

Class	rasa_core.events.ConversationResumed
Description	Resumes a previously paused conversation; the bot goes back to responding to user input.
JSON	{ 'event': 'resume' }

(7) FollowupAction: force the next action to be a given action

Class	rasa_core.events.FollowupAction(name, timestamp=None)
Description	Forces Rasa Core's next action to be a fixed action instead of determining it by prediction.
JSON	{ 'event': 'followup', 'name': 'my_action' }

Automatically tracked events

(1) UserUttered: the user sends a Message to the bot

Class	rasa_core.events.UserUttered(text, intent=None, entities=None, parse_data=None, timestamp=None, input_channel=None, message_id=None)
Description	Represents the user sending a Message to the bot. When this event is applied, a `Turn` is automatically created in the Tracker (note: I am not sure what this Turn refers to).
JSON	{ 'event': 'user', 'text': 'Hey', 'parse_data': { 'intent': {'name': 'greet', 'confidence': 0.9}, 'entities': [] } }

(2) BotUttered: the bot sends a Message to the user

Class	rasa_core.events.BotUttered(text=None, data=None, timestamp=None)
Description	Represents the bot sending a Message to the user. Note that BotUttered does not need to be trained on; it is covered by the ActionExecuted class, and the Tracker has an entry for it by default.
JSON	{ 'event': 'bot', 'text': 'Hey there!', 'data': {} }

(3) UserUtteranceReverted: undo all side effects of the last UserUttered

Class	rasa_core.events.UserUtteranceReverted(timestamp=None)
Description	Undoes all side effects that happened after the last UserUttered.
JSON	{ 'event': 'rewind' }

(4) ActionReverted: undo all side effects of the last action

Class	rasa_core.events.ActionReverted(timestamp=None)
Description	The bot undoes its last action.
JSON	{ 'event': 'undo' }

(5) ActionExecuted: logs an action the bot executed to the conversation. Events that the action created are logged separately.

Class	rasa_core.events.ActionExecuted(action_name, policy=None, confidence=None, timestamp=None)
Description	An operation describes an action taken and its result. It comprises an action and a list of events. Operations will be appended to the latest Turn in Tracker.turns.
JSON	{ 'event': 'action', 'name': 'my_action' }
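To show how serialized events like the ones listed above change conversation state, here is a minimal sketch that applies the slot-related ones to a plain dict. It is not the DialogueStateTracker implementation: `apply_events` is a hypothetical helper, it models slots only (a real restart also clears the conversation history), and other event types are simply ignored.

```python
def apply_events(slots, events, initial_slots=None):
    """Return a new slot dict after applying serialized events in order."""
    initial_slots = dict(initial_slots or {})
    slots = dict(slots)
    for event in events:
        kind = event["event"]
        if kind == "slot":
            # SlotSet: store the value under the slot name
            slots[event["name"]] = event["value"]
        elif kind in ("reset_slots", "restart"):
            # AllSlotsReset / Restarted: every slot goes back to its initial value
            slots = {name: initial_slots.get(name) for name in slots}
    return slots


state = apply_events(
    {"city": None},
    [{"event": "slot", "name": "city", "value": "上海"}],
)
print(state)
print(apply_events(state, [{"event": "reset_slots"}], initial_slots={"city": "北京"}))
```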

Step 3: start the action server. Here bot refers to bot.py, the file that implements the actions.

python -m rasa_core_sdk.endpoint --actions bot

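5. Interactive Learning

Although story data is easy to write by hand, hand-written stories often miss cases or contain mistakes. For this reason Rasa Core provides interactive learning as a way to obtain the data we need. In interactive learning mode you give the bot feedback as you talk to it. This is a powerful way to explore what your bot can do and the easiest way to fix the mistakes it makes. One advantage of machine-learning-based dialogue is that when your bot does not yet know how to do something, you can simply teach it.
First, we create some story samples.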

## story1：greet only
* greet
    - utter_answer_greet
> check_greet

## story2：
* goodbye
	- utter_answer_goodbye

## story3:thanks
* thanks
    - utter_answer_thanks
    
## story4:change address or date-time with greet
> check_greet
* weather_address_date-time{"address": "上海", "date-time": "明天"}
    - action_report_weather

## story5:change address or date-time with greet
> check_greet
* weather_address_date-time{"address": "上海", "date-time": "明天"}
    - action_report_weather
    - utter_report_weather
* weather_address{"address": "北京"} OR weather_date-time{"date-time": "明天"}
    - action_report_weather
    - utter_report_weather
...
...

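Next, start interactive learning. The first command starts the custom action server, and the second starts interactive learning mode. In interactive mode, the bot asks you to confirm every prediction made by NLU and Core.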

python -m rasa_core_sdk.endpoint --actions actions&

python -m rasa_core.train \
  interactive -o models/dialogue \
  -d domain.yml -c policy_config.yml \
  -s data/stories.md \
  --nlu models/current/nlu \
  --endpoints endpoints.yml
    
# Or use an already trained dialogue model directly
# python -m rasa_core.train \
#  interactive --core models/dialogue \
#  --nlu models/current/nlu \
#  --endpoints endpoints.yml

During interactive learning we can also follow the conversation in the visualization Rasa Core provides, by opening http://localhost:5005/visualization.html in a browser.

[Figure: interactive_learning_graph.gif (http://rasa.com/docs/rasa/_images/interactive_learning_graph.gif)]

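6. Evaluating Models

6.1 Evaluating the Dialogue Model

Rasa Core provides the rasa_core.evaluate module for evaluating a trained dialogue model. Running the command below stores every story in which an action was predicted incorrectly in results/failed_stories.md. The command also generates a confusion matrix file, results/story_confmat.pdf, which shows, for each action in domain.yml, how often it was predicted correctly and how often a wrong action was predicted instead.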

python -m rasa_core.evaluate --core models/dialogue --stories test_stories.md -o results

Parameters:

usage: evaluate.py default [-h] [-m MAX_STORIES] [-u NLU] [-o OUTPUT] [--e2e]
                           [--endpoints ENDPOINTS]
                           [--fail_on_prediction_errors] [--core CORE]
                           (-s STORIES | --url URL) [-v] [-vv] [--quiet]

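-m module: an option of the python interpreter itself; runs the given module, here rasa_core.evaluate;
--core dir: path to the dialogue model;
--stories file or -s file: path to the story file used for testing;
-o dir: output directory for the evaluation results;
-m number or --max_stories number: maximum number of stories to test;
-u NLU or --nlu NLU: path to the NLU model;
--url URL: URL from which to fetch the story file (an alternative to -s);
--endpoints ENDPOINTS: the endpoints file;
--e2e, --end-to-end: run an end-to-end evaluation of combined action and intent prediction.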
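
To make the confusion matrix from 6.1 more concrete, here is a minimal standard-library sketch that counts how often each expected action was predicted as each action. The function name action_confusion and the sample pairs are hypothetical illustrations; this is not the code rasa_core.evaluate uses internally.

```python
from collections import Counter, defaultdict


def action_confusion(pairs):
    """Count (expected, predicted) action pairs into a nested mapping."""
    matrix = defaultdict(Counter)
    for expected, predicted in pairs:
        matrix[expected][predicted] += 1
    return matrix


# Hypothetical evaluation output: (action in the test story, action the model predicted)
pairs = [
    ("utter_answer_greet", "utter_answer_greet"),
    ("action_report_weather", "action_report_weather"),
    ("action_report_weather", "utter_answer_greet"),
    ("utter_answer_goodbye", "utter_answer_goodbye"),
]

matrix = action_confusion(pairs)
for expected, row in matrix.items():
    total = sum(row.values())
    # The diagonal entry row[expected] is the number of correct predictions.
    print(expected, "correct:", row[expected], "of", total, dict(row))
```

Each row is an expected action; the diagonal entry counts hits and the other entries show which wrong actions were predicted instead.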

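6.2 Evaluating NLU and Core together

Suppose the bot uses a dialogue model together with a Rasa NLU model to parse intents, and we want to evaluate how the two perform together over whole conversations. With the --e2e option, rasa_core.evaluate combines Rasa NLU intent prediction with Rasa Core action prediction and evaluates the dialogue model end to end. The command is: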

python -m rasa_core.evaluate default --core models/dialogue --nlu models/nlu/current  --stories e2e_stories.md --e2e

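Note that the story format for end-to-end evaluation differs slightly from the standard Rasa Core story format: each user message must be given in natural language, not just as its intent, in the form * <intent>:<Rasa NLU example>.
An example e2e_stories.md: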

## end-to-end story 1
* greet: hello
   - utter_ask_howcanhelp
* inform: show me [chinese](cuisine) restaurants
   - utter_ask_location
* inform: in [Paris](location)
   - utter_ask_price

## end-to-end story 2
...
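
The user lines in the end-to-end format above can be pulled apart with two regular expressions. The sketch below is only meant to show the structure of a line (intent, plain text, annotated entities); the names USER_LINE, ENTITY and parse_user_line are hypothetical and this is not Rasa's own story reader.

```python
import re

# "* <intent>: <text>" user line of an end-to-end story
USER_LINE = re.compile(r"^\*\s*(?P<intent>[\w-]+)\s*:\s*(?P<text>.+)$")
# "[value](entity_type)" annotation inside the text
ENTITY = re.compile(r"\[(?P<value>[^\]]+)\]\((?P<type>[^)]+)\)")


def parse_user_line(line):
    """Return intent, plain text and entities, or None for non-user lines."""
    match = USER_LINE.match(line.strip())
    if match is None:
        return None
    raw = match.group("text")
    entities = [(m.group("type"), m.group("value")) for m in ENTITY.finditer(raw)]
    # Strip the annotation markup so only the words the user typed remain.
    plain = ENTITY.sub(lambda m: m.group("value"), raw)
    return {"intent": match.group("intent"), "text": plain, "entities": entities}


parsed = parse_user_line("* inform: show me [chinese](cuisine) restaurants")
print(parsed)
```

Action lines such as "- utter_ask_location" do not match the pattern and return None, so a caller can tell user turns from bot turns.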

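6.3 Comparing Policies

Evaluating a model requires a set of test conversations, usually drawn from the training dialogue data. For a project that has just started, however, training dialogues are scarce, and setting some aside for testing is costly. To ease this, Rasa Core provides scripts for comparing several policy configurations so that you can pick the one that performs best on your data. The steps are:
(1) Create several policy configuration files, one per candidate, and run the following script: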

python -m rasa_core.train compare -c policy_config1.yml policy_config2.yml -d domain.yml -s stories_folder -o comparison_models --runs 3 --percentages 0 5 25 50 70 90 95

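Here -c file1 file2 ... lists the policy configurations to compare; -o dir is the output directory; --runs num is the number of runs, repeated to make sure the results are consistent; --percentages lists the percentages of training stories to exclude, so with the command above Rasa Core trains each configuration with 0, 5, 25, 50, 70, 90 and 95% of the stories held out.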

(2) Evaluate the trained comparison models to determine which policy configuration works better:

python -m rasa_core.evaluate compare --stories stories_folder --core comparison_models -o comparison_results

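Here --core is the path to the comparison models and -o is the output directory for the results. If you are unsure which policies to compare, the official recommendation is to start with EmbeddingPolicy and KerasPolicy and see which suits your data better.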
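
The effect of the --runs and --percentages options from step (1) can be pictured with the sketch below. It assumes that each percentage is the share of training stories left out in a training round, and it only illustrates the idea: exclusion_splits and the placeholder story names are hypothetical, and how Rasa Core actually samples stories internally is not shown here.

```python
import random


def exclusion_splits(stories, percentages, runs, seed=0):
    """Yield (run, percentage, kept_stories) for every training round."""
    rng = random.Random(seed)
    for run in range(runs):
        shuffled = list(stories)
        rng.shuffle(shuffled)  # a fresh order per run, so each run drops different stories
        for pct in percentages:
            n_excluded = len(shuffled) * pct // 100
            yield run, pct, shuffled[n_excluded:]


stories = ["story{}".format(i) for i in range(1, 21)]  # 20 placeholder stories
rounds = list(exclusion_splits(stories, [0, 5, 25, 50, 70, 90, 95], runs=3))

for run, pct, kept in rounds:
    print("run", run, "exclude", pct, "% ->", len(kept), "stories kept")
```

With three runs and seven percentages, each policy configuration is trained 21 times, which is why the comparison can take a while on larger story sets.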

7. FormAction

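In Rasa Core, when an action needs several slots filled at once, you can use a FormAction. A FormAction iterates over all the slots it manages; whenever it finds one that is unfilled, it asks the user for it, and only after all slots are filled does it run the subsequent business logic.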

7.1 Adding a forms section to the domain

Add a forms: section to the domain file and list the names of all forms in use under it:

intents:
  - request_restaurant
  - request_weather

slots:
  cuisine:
    type: unfeaturized
  num_people:
    type: unfeaturized
  outdoor_seating:
    type: unfeaturized
  preferences:
    type: unfeaturized
  feedback:
    type: unfeaturized

actions:
  ...

# form actions
forms:
  - restaurant_form
  - weather_form
  ...

templates:
  # template names only; the message bodies are omitted here
  - utter_ask_cuisine
  - utter_ask_num_people
  - utter_ask_outdoor_seating
  - utter_ask_preferences
  - utter_ask_feedback
  - utter_ask_continue
  ...

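7.2 Rewriting the stories

A story must cover not only the case where the user provides valid information exactly as we designed, but also the case where the user changes intent partway through or enters invalid input. When a FormAction cannot obtain the information it expects, it raises an ActionExecutionRejection exception. We call these two cases the happy path and the unhappy path. For example: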

## happy path for restaurant
* request_restaurant
    - restaurant_form
    - form{"name": "restaurant_form"}
    - form{"name": null}

## unhappy path for restaurant,chitchat exclude ActionExecutionRejection
* request_restaurant
    - restaurant_form
    - form{"name": "restaurant_form"}
* chitchat
    - utter_chitchat
    - restaurant_form
    - form{"name": null}
 
## unhappy path for restaurant,change mind suddenly
* request_restaurant
    - restaurant_form
    - form{"name": "restaurant_form"}
* stop
    - utter_ask_continue
* deny
    - action_deactivate_form
    - form{"name": null}
...

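Note: * request_restaurant is the intent; - restaurant_form is the form action; - form{"name": "restaurant_form"} activates the form; - form{"name": null} deactivates it; - action_deactivate_form is a default action for the case where the user changes their mind partway through the form and decides not to continue with the original request: it deactivates (cancels) the form and resets all requested slots.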

7.3 Implementing the custom action

from typing import Any, Dict, List, Text

from rasa_core_sdk import Tracker
from rasa_core_sdk.executor import CollectingDispatcher
from rasa_core_sdk.forms import FormAction


class RestaurantForm(FormAction):
    """Example of a custom form action"""

    def name(self):
        # type: () -> Text
        """Unique identifier of the form"""

        return "restaurant_form"

    @staticmethod
    def required_slots(tracker):
        # type: (Tracker) -> List[Text]
        """A list of required slots that the form has to fill"""

        return ["cuisine", "num_people", "outdoor_seating",
                "preferences", "feedback"]

    def submit(self, dispatcher, tracker, domain):
        # type: (CollectingDispatcher, Tracker, Dict[Text, Any]) -> List[Dict]
        """Define what the form has to do
            after all required slots are filled"""

        # utter submit template
        dispatcher.utter_template('utter_submit', tracker)
        return []

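When the form action is called for the first time, the form is activated and the FormPolicy takes over. Every time the form action runs, required_slots is called; if a required slot is still unfilled, the form utters the template named utter_ask_{slot_name} (defined in the templates section of domain.yml) to ask for it. Once all slots are filled, the submit method is called and the form is deactivated.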
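
The ask-until-filled behaviour described above can be simulated without Rasa installed. The sketch below is a toy stand-in: MiniForm, its methods and the sample answers are hypothetical and only mimic the control flow, not the rasa_core_sdk FormAction implementation (which also validates input and emits events).

```python
class MiniForm:
    """Toy stand-in for a form action; not the rasa_core_sdk FormAction class."""

    def __init__(self, required_slots):
        self.required_slots = required_slots

    def next_slot_to_request(self, slots):
        """Return the first required slot that is still empty, or None."""
        for name in self.required_slots:
            if slots.get(name) is None:
                return name
        return None

    def run(self, slots):
        """Return the template to utter next, or 'utter_submit' when done."""
        missing = self.next_slot_to_request(slots)
        if missing is not None:
            return "utter_ask_{}".format(missing)
        return "utter_submit"


form = MiniForm(["cuisine", "num_people", "outdoor_seating"])
slots = {}
# Hypothetical user replies, keyed by the slot they fill
answers = {"cuisine": "chinese", "num_people": "4", "outdoor_seating": "no"}
asked = []

while True:
    template = form.run(slots)
    asked.append(template)
    if template == "utter_submit":
        break  # all slots filled: the form would be deactivated here
    slot_name = template[len("utter_ask_"):]
    slots[slot_name] = answers[slot_name]

print(asked)
```

The form is re-run after every user reply, so slots that the user already supplied are skipped and only the missing ones are requested.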

7.4 Policy configuration

Edit the Rasa Core policy configuration file configs.yml and add the FormPolicy:

policies:
  - name: KerasPolicy
    epochs: 500
    max_history: 5
  - name: FallbackPolicy
    fallback_action_name: 'action_default_fallback'
  - name: MemoizationPolicy
    max_history: 5
  - name: "FormPolicy"