generalization & overfitting

overfitting occurs when a model tries to fit the training data so closely that it does not generalize well to new data.

the following basic assumptions guide generalization:

  • we draw examples independently and identically (i.i.d) at random from the distribution. in other words, examples do not influence each other. i.i.d is a way of referring to the randomness of variables.
  • the distribution is stationary; that is the distribution doesn’t change within the data set.
  • we draw examples from partitions from the same distribution.

examples violate above assumptions:

  • consider a model that chooses ads to display. the i.i.d assumption would be violate if the model bases its choice of ads, in part, on what ads the user has previously seen.
  • consider a data set that contains retail sales information for a year. user’s purchases change seasonally, which would violate stationarity.
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章