L1 and L2 Norms
Suppose the objective function to be minimized is:
E(x) = f(x) + r(x)
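where f(x) is the loss function, which measures the training loss and is usually assumed to be a differentiable convex function, and r(x) is the regularization term, which constrains the model. The form of r(x) follows from the prior assumed on the model parameters: an L1-norm penalty corresponds to a Laplace prior, and an L2-norm penalty corresponds to a Gaussian prior. Other penalties are usually combinations of the two.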
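The L1-norm penalty usually takes the form:
r(x) = λ ||x||_1 = λ Σ_i |x_i|
The L2-norm penalty usually takes the form:
r(x) = λ ||x||_2^2 = λ Σ_i x_i^2
where λ > 0 is the regularization weight.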
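To make the two penalties concrete, here is a minimal sketch that evaluates E(x) = f(x) + r(x). It assumes the usual forms λ Σ|x_i| and λ Σ x_i^2; the names `l1_penalty`, `l2_penalty`, `objective`, and the quadratic example loss are illustrative and do not come from the original post.

```python
def l1_penalty(x, lam):
    """L1-norm penalty: lam * sum(|x_i|)."""
    return lam * sum(abs(xi) for xi in x)


def l2_penalty(x, lam):
    """Squared L2-norm penalty: lam * sum(x_i ** 2)."""
    return lam * sum(xi * xi for xi in x)


def objective(x, loss, penalty, lam):
    """E(x) = f(x) + r(x) for a given loss f and penalty r."""
    return loss(x) + penalty(x, lam)


# Hypothetical loss used only for illustration: squared distance to a target.
target = [1.0, 0.0]
loss = lambda x: sum((xi - ti) ** 2 for xi, ti in zip(x, target))

x = [3.0, -4.0]
e_l1 = objective(x, loss, l1_penalty, 0.5)
e_l2 = objective(x, loss, l2_penalty, 0.5)
```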
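The L1 norm tends to produce sparse solutions, so it performs a degree of feature selection, which is useful when solving problems in high-dimensional feature spaces. The L2 norm is used mainly to prevent overfitting.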
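One way to see why the L1 penalty yields sparse solutions is to compare the two penalties on a one-dimensional problem. The sketch below assumes a simple quadratic loss 0.5*(x - z)^2 per component, which is an illustrative choice and not taken from the post. Under that assumption the L1 minimizer is the soft-thresholding operator, which sets small components exactly to zero, while the L2 minimizer only shrinks them.

```python
import math


def soft_threshold(z, lam):
    """Per-component minimizer of 0.5*(x - z_i)**2 + lam*|x| (L1 penalty)."""
    return [math.copysign(max(abs(zi) - lam, 0.0), zi) for zi in z]


def ridge_shrink(z, lam):
    """Per-component minimizer of 0.5*(x - z_i)**2 + lam*x**2 (L2 penalty)."""
    return [zi / (1.0 + 2.0 * lam) for zi in z]


z = [3.0, -0.5, 0.2, -2.0]
sparse = soft_threshold(z, 1.0)  # components with |z_i| <= 1 become zero
dense = ridge_shrink(z, 1.0)     # every component is scaled, none is zeroed
```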
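Sparseness Constraints
The paper Non-negative Matrix Factorization With Sparseness Constraints combines the L1 and L2 norms into a new constraint, using a sparseness measure to express the relationship between the two:
sparseness(x) = (sqrt(n) - (Σ_i |x_i|) / sqrt(Σ_i x_i^2)) / (sqrt(n) - 1)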
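Here n is the dimension of the vector x. The sparseness is 1 when x has exactly one non-zero component, and 0 when all components are non-zero and equal in magnitude.
(Figure: example vectors at different sparseness levels.)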
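The two boundary cases are easy to check numerically. This is a minimal sketch of the sparseness measure, assuming n > 1 and a non-zero vector; the function name `sparseness` is chosen here for illustration.

```python
import math


def sparseness(x):
    """(sqrt(n) - L1/L2) / (sqrt(n) - 1); requires len(x) > 1 and x != 0."""
    n = len(x)
    l1 = sum(abs(xi) for xi in x)
    l2 = math.sqrt(sum(xi * xi for xi in x))
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1)


one_hot = sparseness([0.0, 0.0, 1.0, 0.0])   # single non-zero component
uniform = sparseness([1.0, 1.0, 1.0, 1.0])   # all components equal
between = sparseness([0.9, 0.5, 0.3, 0.1])   # somewhere in between
```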
NMF with Sparseness Constraint
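Objective function:
minimize E(W, H) = ||V - WH||^2
subject to W >= 0, H >= 0, sparseness(w_i) = S_w for every column w_i of W, and sparseness(h_i) = S_h for every row h_i of H, where S_w and S_h are the desired sparseness levels.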
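The algorithm, as described in the paper, proceeds as follows:
1. Initialize W and H to random positive matrices.
2. If a sparseness constraint applies to W, project each column of W so that it is non-negative, keeps its L2 norm, and has the L1 norm that gives the desired sparseness.
3. If a sparseness constraint applies to H, project each row of H so that it is non-negative, has unit L2 norm, and has the L1 norm that gives the desired sparseness.
4. Iterate:
   (a) If W is constrained, take a gradient step W := W - μ_W (WH - V) H^T and project each column of W as in step 2; otherwise take the standard multiplicative step W := W ⊗ (V H^T) ⊘ (W H H^T).
   (b) If H is constrained, take a gradient step H := H - μ_H W^T (WH - V) and project each row of H as in step 3; otherwise take the standard multiplicative step H := H ⊗ (W^T V) ⊘ (W^T W H).
Here ⊗ and ⊘ denote element-wise multiplication and division, and μ_W and μ_H are small positive step sizes.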
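When no sparseness constraint is placed on a factor, the method falls back to the standard multiplicative NMF update. The sketch below shows one such step in plain Python on nested lists. It assumes strictly positive W and H so that no denominator is zero; the helper names (`matmul`, `transpose`, `sq_error`, `multiplicative_step`) and the toy matrices are illustrative only, and the sparseness projection is deliberately left out.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def sq_error(V, W, H):
    """Squared reconstruction error ||V - WH||^2."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for vr, rr in zip(V, WH) for v, r in zip(vr, rr))


def scale(M, num, den):
    """Element-wise M * num / den."""
    return [[m * n / d for m, n, d in zip(mr, nr, dr)]
            for mr, nr, dr in zip(M, num, den)]


def multiplicative_step(V, W, H):
    """One unconstrained update of H, then of W."""
    Wt = transpose(W)
    H = scale(H, matmul(Wt, V), matmul(matmul(Wt, W), H))
    Ht = transpose(H)
    W = scale(W, matmul(V, Ht), matmul(W, matmul(H, Ht)))
    return W, H


V = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
W0 = [[1.0, 0.5], [0.5, 1.0]]
H0 = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
before = sq_error(V, W0, H0)
W1, H1 = multiplicative_step(V, W0, H0)
after = sq_error(V, W1, H1)
```

The multiplicative form keeps every entry non-negative without an explicit projection, which is why it is the natural choice for an unconstrained factor.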
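A key step of the algorithm is the projection operator: given a vector x and target L1 and L2 norms, find the non-negative vector s closest to x (in Euclidean distance) that has those norms and therefore the desired sparseness. The projection works as follows:
1. Set s_i := x_i + (L1 - Σ x_i) / dim(x) for all i.
2. Set Z := {} (the set of components fixed at zero).
3. Iterate:
   (a) Set m_i := L1 / (dim(x) - size(Z)) if i is not in Z, and m_i := 0 otherwise.
   (b) Set s := m + α (s - m), where α >= 0 is chosen so that s satisfies the L2 constraint (this requires solving a quadratic equation).
   (c) If all components of s are non-negative, return s.
   (d) Set Z := Z ∪ {i : s_i < 0} and s_i := 0 for all i in Z.
   (e) Compute c := (Σ s_i - L1) / (dim(x) - size(Z)) and set s_i := s_i - c for all i not in Z, then go back to (a).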
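The only non-trivial arithmetic in the projection is choosing the step length α so that the point on the line from the midpoint m through s reaches the target L2 norm. Expanding ||m + α(s - m)||^2 = L2^2 gives a quadratic in α, and the larger root is taken. The sketch below illustrates just this step; the name `step_to_l2` and the parameter `l2sq` (the squared target L2 norm) are assumptions made for the example.

```python
import math


def step_to_l2(s, m, l2sq):
    """Return (alpha, p) with p = m + alpha*(s - m) and sum(p_i**2) == l2sq."""
    w = [si - mi for si, mi in zip(s, m)]
    a = sum(wi * wi for wi in w)                   # coefficient of alpha**2
    b = 2 * sum(mi * wi for mi, wi in zip(m, w))   # coefficient of alpha
    c = sum(mi * mi for mi in m) - l2sq            # constant term
    # Larger root; clamp the discriminant at 0 to guard against rounding.
    alpha = (-b + math.sqrt(max(b * b - 4 * a * c, 0.0))) / (2 * a)
    return alpha, [mi + alpha * wi for mi, wi in zip(m, w)]


alpha1, p1 = step_to_l2([3.0, 1.0], [2.0, 2.0], 10.0)  # s already on the sphere
alpha2, p2 = step_to_l2([3.0, 1.0], [2.0, 2.0], 16.0)  # must move further out
```

Because the components of s - m sum to zero over the free coordinates, moving along this line changes the L2 norm without changing the sum of the components.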
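The algorithm terminates after at most dim(x) iterations, because every iteration that does not return fixes at least one more component at zero, so it is fast in practice. The MATLAB code is available at http://www.cs.helsinki.fi/patrik.hoyer/, and the projection step in Python is as follows: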
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import math


def l1sparse(dimension, desiredsparseness):
    """L1 norm that a unit-L2-norm vector needs for the desired sparseness.

    desiredsparseness lies in [0, 1], e.g. 0.1, 0.2, 0.3, 0.4 or 0.5.
    """
    root = math.sqrt(dimension)
    return root - (root - 1) * desiredsparseness


def vsum(vector):
    """Sum of the elements."""
    return sum(vector)


def v2sum(vector):
    """Sum of the squared elements (squared L2 norm)."""
    return sum(v * v for v in vector)


def vadd(vector, factor):
    """Add a scalar to every element."""
    return [v + factor for v in vector]


def vmultip(vector, factor):
    """Multiply every element by a scalar."""
    return [v * factor for v in vector]


def ones(dimension, num):
    """Vector of the given dimension filled with num."""
    return [num] * dimension


def vdec(svector, dvector):
    """Element-wise difference svector - dvector."""
    return [s - d for s, d in zip(svector, dvector)]


def vaddv(svector, dvector):
    """Element-wise sum svector + dvector."""
    return [s + d for s, d in zip(svector, dvector)]


def vmultipv(svector, dvector):
    """Inner product of two vectors of equal length."""
    return sum(s * d for s, d in zip(svector, dvector))


def checknon(svector):
    """True if no element is negative."""
    return all(v >= 0 for v in svector)


def findne(svector):
    """Indices of the non-positive elements.

    Zeros are included so that components fixed at zero in earlier
    iterations stay in the zero set.
    """
    return [i for i, v in enumerate(svector) if v <= 0]


def projfuc(svector, l1norm, l2norm, nn):
    """Solve the following problem:

    Given a vector svector, find a vector k with
    sum(abs(k)) = l1norm and sum(k**2) = l2norm (the squared L2 norm)
    that is closest to svector in Euclidean distance.
    If nn is set to 1, the elements of k are restricted to be non-negative.
    Returns the projected vector and the number of iterations used.
    """
    N = len(svector)
    if not nn:
        # Work on absolute values and restore the signs at the end.
        isneg = [s < 0 for s in svector]
        svector = [abs(s) for s in svector]
    # Start by projecting onto the hyperplane sum(v) = l1norm.
    v = vadd(svector, (l1norm - vsum(svector)) / N)
    zerov = []  # indices of components fixed at zero
    j = 0       # iteration counter
    while True:
        # Midpoint of the remaining simplex face.
        midpoint = ones(N, l1norm / (N - len(zerov)))
        for vp in zerov:
            midpoint[vp] = 0
        w = vdec(v, midpoint)
        # Solve a*alpha^2 + b*alpha + c = 0 so that ||v + alpha*w||^2 = l2norm.
        a = v2sum(w)
        b = vmultipv(w, v) * 2
        c = v2sum(v) - l2norm
        # Clamp the discriminant at 0 to guard against rounding errors.
        alphap = (-b + math.sqrt(max(b * b - 4 * a * c, 0))) / (2 * a)
        v = vaddv(vmultip(w, alphap), v)
        j += 1
        if checknon(v):
            break
        # Fix the non-positive components at zero and re-project onto the hyperplane.
        zerov = findne(v)
        for vp in zerov:
            v[vp] = 0
        v = vadd(v, (l1norm - vsum(v)) / (N - len(zerov)))
        for vp in zerov:
            v[vp] = 0
    if not nn:
        v = [-vi if neg else vi for vi, neg in zip(v, isneg)]
    return v, j
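As a quick sanity check of the projection, the following condensed, self-contained variant of the non-negative case verifies that the result has the requested norms. It is a sketch under stated assumptions: `project` is a hypothetical name, `l2sq` is the squared target L2 norm, and the target L1 norm 1.5 is what sparseness 0.5 requires for a 4-dimensional vector with unit L2 norm.

```python
import math


def project(x, l1, l2sq):
    """Closest non-negative vector to x with sum == l1 and sum of squares == l2sq."""
    n = len(x)
    v = [xi + (l1 - sum(x)) / n for xi in x]
    zero = set()  # indices fixed at zero
    while True:
        m = [0.0 if i in zero else l1 / (n - len(zero)) for i in range(n)]
        w = [vi - mi for vi, mi in zip(v, m)]
        a = sum(wi * wi for wi in w)
        b = 2 * sum(wi * vi for wi, vi in zip(w, v))
        c = sum(vi * vi for vi in v) - l2sq
        alpha = (-b + math.sqrt(max(b * b - 4 * a * c, 0.0))) / (2 * a)
        v = [vi + alpha * wi for vi, wi in zip(v, w)]
        if all(vi >= 0 for vi in v):
            return v
        zero = {i for i, vi in enumerate(v) if vi <= 0}
        v = [0.0 if i in zero else vi for i, vi in enumerate(v)]
        shift = (l1 - sum(v)) / (n - len(zero))
        v = [0.0 if i in zero else vi + shift for i, vi in enumerate(v)]


x = [0.9, 0.5, 0.3, 0.1]
target_l1 = math.sqrt(len(x)) - (math.sqrt(len(x)) - 1) * 0.5  # sparseness 0.5
p = project(x, target_l1, 1.0)
l1_out = sum(p)
l2sq_out = sum(pi * pi for pi in p)
```

In this example the smallest component is driven to zero in the first pass and the second pass already satisfies both constraints, which matches the claim that the loop needs at most dim(x) iterations.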