Rm@i, P@i, MAP, MRR

Rm@i

Let {(ci, ri), 1 <= i <= n} be the list of n context-response pairs from the test set. For each context ci, we create a set of m alternative responses, one response being the actual response ri, and the m-1 other responses being sampled at random from the same corpus. The m alternative responses are then ranked based on the output from the conversational model, and Recall_m@i measures how often the correct response appears in the top i results of this ranked list (here i is the rank cutoff, not the index of the pair). The Recall_m@i metric, abbreviated Rm@i, is often used for the evaluation of retrieval models, as several responses may be equally “correct” given a particular context.
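The procedure above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: `recall_m_at_i`, `pairs`, `corpus_responses`, and `score` are hypothetical names, `score(context, response)` stands in for whatever the conversational model outputs, and ties are broken by candidate order.

```python
import random


def recall_m_at_i(pairs, corpus_responses, score, m=10, i=1, seed=0):
    """Fraction of contexts whose true response is ranked in the top i of m candidates.

    pairs            -- list of (context, true_response) tuples from the test set
    corpus_responses -- pool from which the m - 1 distractors are drawn
    score            -- hypothetical callable (context, response) -> float,
                        higher meaning "better match"
    """
    rng = random.Random(seed)
    hits = 0
    for context, true_response in pairs:
        # Draw m - 1 random distractors from the same corpus.
        pool = [r for r in corpus_responses if r != true_response]
        candidates = [true_response] + rng.sample(pool, m - 1)
        # Rank all m candidates by model score, best first.
        ranked = sorted(candidates, key=lambda r: score(context, r), reverse=True)
        if true_response in ranked[:i]:
            hits += 1
    return hits / len(pairs)
```

With m = 10 and i = 1 this corresponds to what is usually written R10@1. Note that when i = m the metric is always 1, since the true response is always somewhere in the list.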

Precision@K

Set a rank threshold K
Compute % relevant in top K
Ignores documents ranked lower than K

Ex: [figure missing]
Prec@3 of 2/3
Prec@4 of 2/4
Prec@5 of 3/5
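A small sketch of the computation, assuming the ranked results are given as a list of 0/1 relevance flags. The function name `precision_at_k` and the list `ranked` are hypothetical; `ranked` is simply one ranking that reproduces the three fractions above, not necessarily the one in the original example.

```python
def precision_at_k(relevance, k):
    """Share of relevant items among the top k results.

    relevance -- list of 0/1 flags in ranked order (1 = relevant)
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Everything ranked below position k is ignored.
    return sum(relevance[:k]) / k


# Hypothetical ranking: relevant, not relevant, relevant, not relevant, relevant.
ranked = [1, 0, 1, 0, 1]
for k in (3, 4, 5):
    print(f"Prec@{k} = {sum(ranked[:k])}/{k} = {precision_at_k(ranked, k):.3f}")
```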

Mean Average Precision

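As a minimal sketch of the usual definition: the average precision of one query is the mean of the precision values taken at each rank where a relevant document appears, and MAP is the mean of that value over all queries. The names below are hypothetical, and the sketch assumes every relevant document for a query appears somewhere in its ranked list (otherwise the divisor should be the total number of relevant documents, not the number found).

```python
def average_precision(relevance):
    """Average of Precision@K over the ranks K that hold a relevant item.

    relevance -- list of 0/1 flags in ranked order (1 = relevant)
    """
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this rank
    # Assumption: a query with no relevant item scores 0.
    return total / hits if hits else 0.0


def mean_average_precision(runs):
    """Mean of average_precision over a list of per-query relevance lists."""
    return sum(average_precision(r) for r in runs) / len(runs)
```

For the flags `[1, 0, 1, 0, 1]`, the relevant items sit at ranks 1, 3 and 5, so the average precision is (1/1 + 2/3 + 3/5) / 3.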

MRR

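A minimal sketch of the usual definition of MRR (Mean Reciprocal Rank): for each query take 1/rank of the first relevant result, then average over all queries. The function names are hypothetical, and scoring a query with no relevant result as 0 is an assumption of this sketch.

```python
def reciprocal_rank(relevance):
    """1 / rank of the first relevant item, or 0.0 if none is relevant.

    relevance -- list of 0/1 flags in ranked order (1 = relevant)
    """
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Mean of reciprocal_rank over a list of per-query relevance lists."""
    return sum(reciprocal_rank(r) for r in runs) / len(runs)
```

For three queries whose first relevant results are at ranks 3, 2 and 1, this gives (1/3 + 1/2 + 1) / 3 = 11/18.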
