20180713 DL Day 1 Course Notes

1 Introduction

  • History: 1950-1970 logic rules; 1980-1990 knowledge acquisition; 2010- machine learning

  • Deep Learning ⊂ Machine Learning ⊂ Artificial Intelligence
  • machine learning
    • uses statistical techniques to “learn” from data
    • extracts features automatically, instead of relying on domain experts to design them
    • learns automatically, instead of being explicitly programmed
  • Big Data + Big Computation + Big Model: why deep learning now
  • usage

2 Probability

  • Bayes’ Theorem (see the sketch below)

    • $p(Y \mid X) = \dfrac{p(X \mid Y)\, p(Y)}{p(X)}$, where $p(X) = \sum_{Y} p(X \mid Y)\, p(Y)$
    • posterior $\propto$ likelihood $\times$ prior
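
A minimal sketch of the theorem for a discrete $Y$, in Python; the spam/ham labels and all numbers are made-up illustration values, not from the course:

```python
# Bayes' theorem for a discrete Y: p(Y|X) = p(X|Y) p(Y) / p(X),
# where the evidence p(X) = sum over Y of p(X|Y) p(Y).
# All numbers below are made-up illustration values.

prior = {"spam": 0.3, "ham": 0.7}        # p(Y)
likelihood = {"spam": 0.8, "ham": 0.1}   # p(X = "word present" | Y)

evidence = sum(likelihood[y] * prior[y] for y in prior)              # p(X)
posterior = {y: likelihood[y] * prior[y] / evidence for y in prior}  # p(Y|X)

print(posterior)  # posterior is proportional to likelihood * prior
```
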
  • random variables: expectation and variance (see the sketch below)

    • $E[f]$ := the average value of $f(x)$ under the distribution $p(x)$
    • $E[f] = \sum_{x} p(x)\, f(x)$
    • variance $V[f]$ and covariance $\mathrm{cov}[x, y]$
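
A small sketch of these definitions over a discrete $p(x)$, assuming NumPy; the values of $x$, $p(x)$, and the functions $f$, $g$ are made up:

```python
import numpy as np

# A small discrete distribution p(x) over a few values of x (made-up numbers).
x = np.array([0.0, 1.0, 2.0])
p = np.array([0.2, 0.5, 0.3])           # probabilities sum to 1

f = x ** 2                              # some function f(x)
E_f = np.sum(p * f)                     # E[f] = sum_x p(x) f(x)
V_f = np.sum(p * (f - E_f) ** 2)        # V[f] = E[(f - E[f])^2]

# Covariance of two functions of x under the same p(x).
g = 3.0 * x + 1.0
E_g = np.sum(p * g)
cov_fg = np.sum(p * (f - E_f) * (g - E_g))   # cov[f, g]

print(E_f, V_f, cov_fg)
```
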
  • distributions

    • binomial distribution (see the sketch below)
    • $\mathrm{Bin}(m \mid N, \mu) = \binom{N}{m} \mu^{m} (1-\mu)^{N-m}$
    • $E[m] = N\mu$, $\mathrm{var}[m] = N\mu(1-\mu)$
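
A quick numerical check of the binomial mean and variance formulas, assuming NumPy and SciPy; the parameters $N$ and $\mu$ below are arbitrary:

```python
import numpy as np
from scipy.stats import binom

N, mu = 10, 0.3                         # made-up parameters

m = np.arange(N + 1)                    # all possible counts m = 0..N
pmf = binom.pmf(m, N, mu)               # Bin(m | N, mu)

# Check E[m] = N*mu and var[m] = N*mu*(1 - mu) directly against the pmf.
E_m = np.sum(m * pmf)
var_m = np.sum((m - E_m) ** 2 * pmf)
print(E_m, N * mu)                      # both ~3.0
print(var_m, N * mu * (1 - mu))         # both ~2.1
```
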
  • multinomial variables (see the sketch below)

    • $x$ can take one of $K$ states; $x = (0, 0, 1, 0, 0, 0)^{T}$ means $x$ is in the third of six possible states (1-of-K encoding)
    • $\mu = (\mu_1, \mu_2, \ldots, \mu_K)^{T}$, where $\mu_k$ is the probability that the $k$-th position of $x$ equals 1
    • the probability of a particular $x$ is then $p(x \mid \mu) = \prod_{k=1}^{K} \mu_k^{x_k}$ (which reduces to $\mu_k$ for the chosen state)
    • $E[x \mid \mu] = \sum_{x} p(x \mid \mu)\, x = (\mu_1, \mu_2, \ldots, \mu_K)^{T} = \mu$
    • maximum likelihood estimation: $\mu_k = \dfrac{m_k}{N}$, where $m_k = \sum_{n=1}^{N} x_{nk}$ counts the observations in state $k$
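
A sketch of the 1-of-K representation, $p(x \mid \mu)$, and the maximum likelihood estimate from counts, assuming NumPy; the vector $\mu$ and the sample size are made up:

```python
import numpy as np

K = 6
mu = np.array([0.1, 0.2, 0.3, 0.1, 0.2, 0.1])   # made-up probabilities, sum to 1

# 1-of-K (one-hot) x picking the third of the K states: x = (0,0,1,0,0,0)^T
x = np.zeros(K)
x[2] = 1.0

# p(x | mu) = prod_k mu_k^{x_k}, which reduces to mu_k for the chosen state.
p_x = np.prod(mu ** x)
print(p_x)                                       # 0.3

# Maximum likelihood estimate from N one-hot observations: mu_k = m_k / N.
rng = np.random.default_rng(0)
X = rng.multinomial(1, mu, size=1000)            # N = 1000 rows, each one-hot
m = X.sum(axis=0)                                # m_k: count of state k
mu_ml = m / X.shape[0]
print(mu_ml)                                     # close to mu
```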

  • univariate Gaussian (normal) distribution

    • multivariate Gaussian distribution
    • maximum likelihood estimation
    • mixture of Gaussians: can model various other distributions (see the sketch below)
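
A sketch of sampling from a mixture of Gaussians and evaluating its density, assuming NumPy; the weights, means, and standard deviations are made up, chosen only to show that a mixture can be multi-modal:

```python
import numpy as np

rng = np.random.default_rng(0)

# A mixture of Gaussians with made-up weights, means, and standard deviations.
weights = np.array([0.5, 0.3, 0.2])
means   = np.array([-2.0, 0.0, 3.0])
stds    = np.array([0.5, 1.0, 0.8])

# Sampling: pick a component for each draw, then sample that Gaussian.
n = 10000
comp = rng.choice(len(weights), size=n, p=weights)
samples = rng.normal(means[comp], stds[comp])

# Mixture density: p(x) = sum_k w_k * N(x | mu_k, sigma_k^2)
def mixture_pdf(x):
    return np.sum(weights * np.exp(-0.5 * ((x - means) / stds) ** 2)
                  / (stds * np.sqrt(2.0 * np.pi)))

print(samples.mean(), mixture_pdf(0.0))   # a multi-modal density, not a single Gaussian
```
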
  • gradient descent (see the sketch below)

    • a way to minimize an objective function $J(\theta)$
    • $\eta$: learning rate, which determines the size of the steps we take toward a local minimum
    • update equation: $\theta = \theta - \eta \nabla_{\theta} J(\theta)$
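
A minimal sketch of this update rule on a toy quadratic objective; the objective $J(\theta) = (\theta - 3)^2$, the starting point, and the learning rate are all arbitrary choices for illustration:

```python
# Gradient descent on a toy objective J(theta) = (theta - 3)^2 (made-up).
def grad_J(theta):
    return 2.0 * (theta - 3.0)            # dJ/dtheta

theta = 0.0                               # initial parameter value
eta = 0.1                                 # learning rate: size of each step

for _ in range(100):
    theta = theta - eta * grad_J(theta)   # theta <- theta - eta * grad_theta J(theta)

print(theta)                              # converges to the minimizer, theta = 3
```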
