Maximum Likelihood Estimation, Maximum A Posteriori Estimation, and Related Probability Concepts

Original

2020-06-25 22:26

1. What is the likelihood function

The likelihood of a set of parameter values, θ, given outcomes x, is equal to the probability of those observed outcomes given those parameter values, that is

$\mathcal{L}(\theta |x) = P(x | \theta)$

The likelihood function is defined differently for discrete and continuous probability distributions.

Discrete probability distribution

Let X be a random variable with a discrete probability distribution p depending on a parameter θ. Then the function

$\mathcal{L}(\theta |x) = p_\theta (x) = P_\theta (X=x),$

considered as a function of θ, is called the likelihood function (of θ, given the outcome x of the random variable X). Sometimes the probability of the value x of X for the parameter value θ is written as $P(X=x|\theta)$; it is often written as $P(X=x;\theta)$ to emphasize that this differs from $\mathcal{L}(\theta |x)$, which is not a conditional probability, because θ is a parameter and not a random variable.

Continuous probability distribution

Let X be a random variable following an absolutely continuous probability distribution with density function f depending on a parameter θ. Then the function

$\mathcal{L}(\theta |x) = f_{\theta} (x),$

considered as a function of θ, is called the likelihood function (of θ, given the outcome x of X). Sometimes the density function for the value x of X for the parameter value θ is written as $f(x|\theta)$; this should not be confused with $\mathcal{L}(\theta |x)$, which should not be considered a conditional probability density.

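In short: when the parameter θ of a probability model has not yet been determined and we are given a sample X that has already been observed (the outcome is fixed), the likelihood L(θ|X) of the parameter θ is defined as:

the probability (or, for a continuous model, the probability density) of the sample X when the parameter is θ.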
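To make the "function of θ for a fixed sample" reading concrete, here is a minimal sketch. It assumes a binomial coin-toss model and a made-up observation (7 heads in 10 tosses); the helper name `binomial_likelihood` is hypothetical and not part of the original post.

```python
from math import comb


def binomial_likelihood(theta, heads, tosses):
    """P(X = heads | theta) under a Binomial(tosses, theta) model, read as a function of theta."""
    return comb(tosses, heads) * theta**heads * (1 - theta) ** (tosses - heads)


# The observation is fixed (made-up data); only theta varies.
heads, tosses = 7, 10
for theta in (0.3, 0.5, 0.7, 0.9):
    value = binomial_likelihood(theta, heads, tosses)
    print(f"theta={theta:.1f}  L(theta|x)={value:.4f}")
```

The same sample gets a different likelihood under each candidate θ, and the values over all θ do not need to sum or integrate to 1, which is one reason the likelihood is not a probability distribution over θ.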

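2. Steps of maximum likelihood estimation

2.1 Discrete variables

Suppose we have a sample of size n: X1, X2, X3, X4, ..., Xn.

Our probability model has k parameters θ1, ..., θk, written collectively as θall.

(1) Write down the expression

For discrete random variables we usually assume the observations are mutually independent draws from the same distribution, so the likelihood function is

L(θall | X1, X2, ..., Xn) = P(X1 | θall) * P(X2 | θall) * ... * P(Xn | θall)

L (the likelihood value) = the product of the probabilities of the individual observations under the parameter values θ1, ..., θk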

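(2) Find the maximum

This is a function of the k variables θ1, ..., θk. Because this sample has already occurred, the larger the probability the parameters assign to it, the better.

We want the maximum of this function, so the task becomes an optimization problem.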

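On solving the discrete case: the observations are discrete, but the parameters θ1, ..., θk are usually continuous, so differentiating with respect to the parameters still works whenever P(X | θall) is differentiable in them, exactly as in 2.2. If a parameter can only take discrete values, derivatives do not apply and the candidate values have to be compared directly.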
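The two steps above can be sketched for a simple discrete model. The example assumes independent Bernoulli (0/1) observations with a single parameter θ and uses made-up data; `bernoulli_likelihood` is a hypothetical helper. It maximizes the product likelihood by a plain grid search and compares the result with the value obtained by setting the derivative of ln L to zero.

```python
import math


def bernoulli_likelihood(theta, samples):
    """Product of P(X_i | theta) over independent 0/1 observations."""
    return math.prod(theta if x == 1 else 1.0 - theta for x in samples)


samples = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]  # made-up data: 7 ones out of 10

# Numerical route: evaluate the likelihood on a grid over (0, 1) and keep the best point.
grid = [i / 1000 for i in range(1, 1000)]
theta_grid = max(grid, key=lambda t: bernoulli_likelihood(t, samples))

# Analytical route: with s ones among n observations,
# d/dtheta ln L = s/theta - (n - s)/(1 - theta) = 0  gives  theta = s / n.
theta_closed = sum(samples) / len(samples)

print(theta_grid, theta_closed)
```

For this model both routes agree on the sample proportion of ones. The grid search is only a stand-in for a real optimizer and is shown because it needs no derivative at all.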

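2.2 Continuous variables

(1) The continuous case follows the same reasoning; simply replace the probability P with the probability density function f.

L(θall | X1, X2, ..., Xn) = f(X1 | θall) * f(X2 | θall) * ... * f(Xn | θall)

(2) Take the logarithm of both sides. Because the logarithm is strictly increasing, the location of the maximum is unchanged.

ln L(θall | X1, X2, ..., Xn) = ln f(X1 | θall) + ln f(X2 | θall) + ... + ln f(Xn | θall)

(3) Take the partial derivatives of ln L with respect to θ1, θ2, ..., θk and set each of them to 0. This gives k equations, whose solutions are the stationary points of the function and therefore the candidates for the maximum.

(4) If the equations cannot be solved, or the derivatives do not exist, other methods (for example numerical optimization) have to be considered.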
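Steps (1) to (3) can be sketched for a normal model with k = 2 parameters, the mean and the variance. Everything specific here is an assumption for illustration: the synthetic data, the names `normal_log_likelihood` and `partial`, and the use of a finite-difference check. Setting the two partial derivatives of ln L to zero gives the sample mean and the variance with divisor n; the code checks numerically that the gradient vanishes at that point.

```python
import math
import random


def normal_log_likelihood(mu, sigma2, xs):
    """ln L(mu, sigma2 | x_1..x_n): sum of ln f(x_i | mu, sigma2) for a normal density."""
    n = len(xs)
    squared_error = sum((x - mu) ** 2 for x in xs)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - squared_error / (2 * sigma2)


random.seed(0)
xs = [random.gauss(5.0, 2.0) for _ in range(1000)]  # synthetic data for illustration

# Closed-form solution of the two equations d ln L / d mu = 0 and d ln L / d sigma2 = 0.
mu_hat = sum(xs) / len(xs)
sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / len(xs)  # divides by n, not n - 1


def partial(f, args, i, h=1e-6):
    """Central finite-difference approximation of the i-th partial derivative of f."""
    up = list(args)
    down = list(args)
    up[i] += h
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)


def log_lik(mu, sigma2):
    return normal_log_likelihood(mu, sigma2, xs)


grad = [partial(log_lik, (mu_hat, sigma2_hat), i) for i in range(2)]
print(mu_hat, sigma2_hat, grad)  # both gradient entries should be close to zero
```

When no closed form exists, the same log-likelihood function could instead be handed to a numerical optimizer, which is the situation step (4) refers to.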

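3. Bayes' formula

Bayes' theorem, developed by the English mathematician Thomas Bayes (1702-1761), describes the relationship between two conditional probabilities,

such as P(A|B) and P(B|A). From the multiplication rule we immediately get:

P(A∩B) = P(A)*P(B|A) = P(B)*P(A|B).

When P(A) > 0, the last two terms of this equation can be rearranged as:

P(B|A) = P(A|B)*P(B) / P(A).

Bayes' formula simply characterizes how the two conditional probabilities relate to each other; there is nothing mysterious about it.

In posterior estimation the parameter is treated as a random variable, so the parameter and the sample are linked by two conditional probabilities, one in each direction.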
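A small numeric check of the multiplication rule and Bayes' formula above. The events and all probabilities are hypothetical numbers chosen for illustration (B = "has the condition", A = "test is positive"); the function name `bayes` is likewise an assumption.

```python
def bayes(p_a_given_b, p_b, p_a):
    """P(B|A) = P(A|B) * P(B) / P(A)."""
    return p_a_given_b * p_b / p_a


# Hypothetical inputs.
p_b = 0.01              # P(B)
p_a_given_b = 0.95      # P(A|B)
p_a_given_not_b = 0.05  # P(A|not B)

# The law of total probability gives P(A).
p_a = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)

p_b_given_a = bayes(p_a_given_b, p_b, p_a)
print(round(p_b_given_a, 4))
```

With these numbers P(B|A) comes out at about 0.16, far from P(A|B) = 0.95, which shows how different the two conditional probabilities can be even though one formula links them.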

4. Posterior probability

The posterior probability is the probability of the parameters $\theta$ given the evidence $X$: $p(\theta |X)$.

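Note that the posterior probability treats the parameter as a random variable: it is the probability that the parameter equals $\theta$ given that the sample has occurred.

This differs from the likelihood of the parameter. The likelihood of $\theta$ is still the probability that the given sample occurs when the parameter is $\theta$.

The posterior probability can be obtained from the prior probability and the likelihood function, that is, through Bayes' formula:

P(A|B) = P(B|A)*P(A) / P(B)

Interpretation:

(1) P(parameters | sample) = P(sample | parameters) * P(parameters) / P(sample), where P(parameters) is the prior probability of the parameters and P(sample) is the marginal probability of the sample.

(2) posterior probability = (likelihood * prior probability) / normalizing constant. In other words, the posterior probability is proportional to the product of the prior probability and the likelihood.

Clearly, P(sample | parameters) is the likelihood of the parameters, so we have the proportionality: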

${\text{Posterior probability}}\propto {\text{Likelihood}}\times {\text{Prior probability}}$ .
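The proportionality can be sketched on a grid of candidate parameter values. The setup is assumed for illustration only: 0/1 observations with made-up counts, an unnormalized Beta(2, 2) prior chosen arbitrarily, and hypothetical helper names. Dividing by the sum over the grid plays the role of the normalizing constant.

```python
def likelihood(theta, ones, n):
    """P(sample | theta) for n independent 0/1 observations containing `ones` ones."""
    return theta**ones * (1 - theta) ** (n - ones)


def prior(theta):
    """Unnormalized Beta(2, 2) prior density; an arbitrary choice for illustration."""
    return theta * (1 - theta)


ones, n = 7, 10  # made-up data
grid = [i / 1000 for i in range(1, 1000)]

# posterior is proportional to likelihood * prior; normalize so the grid values sum to 1.
unnormalized = [likelihood(t, ones, n) * prior(t) for t in grid]
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]

# Maximizing the likelihood alone versus maximizing the posterior.
theta_mle = max(grid, key=lambda t: likelihood(t, ones, n))
theta_map = grid[posterior.index(max(posterior))]
print(theta_mle, theta_map)
```

The maximizer of the posterior sits between the maximizer of the likelihood (0.7) and the peak of the prior (0.5). The normalizing constant rescales every grid value by the same factor, so it has no effect on where the maximum is.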
