[Date] 2020.01.07
[Title] [SciPy library] L-BFGS optimization with scipy.optimize.fmin_l_bfgs_b
For detailed usage, see the official documentation: scipy.optimize.fmin_l_bfgs_b
x,min_val,info=scipy.optimize.fmin_l_bfgs_b(func, x0, fprime=None, args=(), approx_grad=0, bounds=None, m=10, factr=10000000.0, pgtol=1e-05, epsilon=1e-08, iprint=-1, maxfun=15000, disp=None)
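1. Parameters: the main ones are the loss function func, the initial value x0 of the parameters being optimized, the gradient function fprime, and maxfun (the maximum number of function evaluations, which is not the same as the number of iterations).
Note that the gradient must be a flattened 1-D vector. If x is a multi-dimensional array (for example a 3-D array), flatten it first.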
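The flattening note can be sketched as follows. This is a minimal illustration, not the only way to do it: the 3-D shape, the `target` array and the quadratic loss are hypothetical and chosen only so that the minimum is known in advance.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

shape = (2, 3, 4)  # hypothetical 3-D parameter shape
target = np.arange(24, dtype=np.float64).reshape(shape)  # hypothetical target

def loss_and_grad(x_flat):
    # The optimizer hands over a 1-D vector; restore the 3-D view for the model.
    x = x_flat.reshape(shape)
    diff = x - target
    loss = 0.5 * np.sum(diff ** 2)
    grad = diff  # gradient of the quadratic loss, same 3-D shape as x
    # Flatten the gradient back to 1-D before returning it.
    return loss, grad.ravel()

x0 = np.zeros(shape).ravel()  # flatten the initial guess
x, min_val, info = fmin_l_bfgs_b(loss_and_grad, x0)
x_opt = x.reshape(shape)  # back to the original 3-D layout
```

Reshaping inside the objective keeps the model code in its natural layout while the optimizer only ever sees 1-D vectors.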
func : callable f(x,*args)
Function to minimise, i.e. the optimization objective, usually a loss function.
x0 : ndarray
Initial guess, i.e. the initial value of the parameters to be updated.
fprime : callable fprime(x,*args)
The gradient of func.
If None, then func returns the function value and the gradient (f, g = func(x, *args)), unless approx_grad is True in which case func returns only f.
args : sequence
Arguments to pass to func and fprime.
approx_grad : bool
Whether to approximate the gradient numerically (in which case func returns only the function value).
bounds : list
(min, max) pairs for each element in x, defining the bounds on that parameter. Use None for one of min or max when there is no bound in that direction.
m : int
The maximum number of variable metric corrections used to define the limited memory matrix. (The limited memory BFGS method does not store the full hessian but uses this many terms in an approximation to it.)
factr : float
The iteration stops when (f^k - f^{k+1})/max{|f^k|,|f^{k+1}|,1} <= factr * eps, where eps is the machine precision, which is automatically generated by the code. Typical values for factr are: 1e12 for low accuracy; 1e7 for moderate accuracy; 10.0 for extremely high accuracy.
pgtol : float
The iteration will stop when max{|proj g_i| : i = 1, ..., n} <= pgtol, where proj g_i is the i-th component of the projected gradient.
epsilon : float
Step size used when approx_grad is True, for numerically calculating the gradient
iprint : int
Controls the frequency of output. iprint < 0 means no output.
disp : int, optional
If zero, then no output. If positive number, then this over-rides iprint.
maxfun : int
Maximum number of function evaluations.
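The parameters above can be combined in several ways. The sketch below shows the three ways of supplying the gradient (fprime, a func that returns (f, g), and approx_grad), how args is forwarded, how bounds uses None for a missing limit, and what tolerance the default factr corresponds to. The objective, the arrays `a` and `b`, and the bound values are hypothetical and were picked so that the minimum is known.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

# Hypothetical objective: sum((a*x - b)^2), unconstrained minimum at x = b / a.
a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 2.0, 9.0])

def loss(x, a, b):
    return np.sum((a * x - b) ** 2)

def grad(x, a, b):
    return 2.0 * a * (a * x - b)

def loss_and_grad(x, a, b):
    return loss(x, a, b), grad(x, a, b)

x0 = np.zeros(3)

# 1) Separate gradient function passed as fprime; args goes to both callables.
x_fp, f_fp, d_fp = fmin_l_bfgs_b(loss, x0, fprime=grad, args=(a, b))

# 2) fprime=None (the default): func itself must return (f, g).
x_fg, f_fg, d_fg = fmin_l_bfgs_b(loss_and_grad, x0, args=(a, b))

# 3) approx_grad=True: func returns only f; the gradient is estimated
#    numerically with step size epsilon.
x_ap, f_ap, d_ap = fmin_l_bfgs_b(loss, x0, args=(a, b), approx_grad=True)

# 4) bounds: one (min, max) pair per element; None means no bound on that side.
bounds = [(0.0, 1.5), (None, None), (None, 2.5)]
x_bd, f_bd, d_bd = fmin_l_bfgs_b(loss, x0, fprime=grad, args=(a, b), bounds=bounds)

# The default factr=1e7 corresponds to this relative-reduction tolerance:
ftol = 1e7 * np.finfo(float).eps
```

Because this objective is separable, the bounded solution simply clips the first and third components of b / a to their limits.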
2. Return values
x : array_like
Estimated position of the minimum, i.e. the x at which the loss is smallest.
f : float
Value of func at the minimum, i.e. the minimal loss value.
d : dict
Information dictionary.
- d['warnflag'] is
  - 0 if converged,
  - 1 if too many function evaluations,
  - 2 if stopped for another reason, given in d['task']
- d['grad'] is the gradient at the minimum (should be 0 ish)
- d['funcalls'] is the number of function calls made, i.e. how many times func was evaluated. This is not the iteration count, which is reported separately in d['nit'].
Example of info:
{'grad': array([-7.65604162, -2.14013386, 3.16267967, ..., -1.03821039, -4.23868084, -3.17428398]), 'task': b'STOP: TOTAL NO. of f AND g EVALUATIONS EXCEEDS LIMIT', 'funcalls': 51, 'nit': 47, 'warnflag': 1}
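A dictionary like the one above, with warnflag 1 and a gradient far from zero, means the evaluation budget ran out before convergence. The following sketch reproduces that situation on purpose and shows one way to check the result. The Rosenbrock test function (scipy.optimize.rosen and rosen_der) and the tiny maxfun=5 are illustrative choices, and the bytes check on d['task'] is a defensive assumption in case the value is returned as str rather than bytes.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b, rosen, rosen_der

x0 = np.array([-1.2, 1.0])

# Deliberately tiny evaluation budget so the run stops before converging.
x, f, d = fmin_l_bfgs_b(rosen, x0, fprime=rosen_der, maxfun=5)

task = d['task']
if isinstance(task, bytes):  # shown as bytes in the example above
    task = task.decode()

print('warnflag:', d['warnflag'])
print('task    :', task)
print('funcalls:', d['funcalls'], 'nit:', d['nit'])

if d['warnflag'] != 0:
    print('not converged; consider a larger maxfun or a better x0')

# Same problem with the default budget, for comparison.
x_ok, f_ok, d_ok = fmin_l_bfgs_b(rosen, x0, fprime=rosen_der)
```

Checking warnflag before using x avoids silently accepting a result that stopped early.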