[SciPy library] L-BFGS optimization with scipy.optimize.fmin_l_bfgs_b

[Date] 2020.01.07

For detailed usage, see the official documentation: scipy.optimize.fmin_l_bfgs_b

x,min_val,info=scipy.optimize.fmin_l_bfgs_b(func, x0, fprime=None, args=(), approx_grad=0, bounds=None, m=10, factr=10000000.0, pgtol=1e-05, epsilon=1e-08, iprint=-1, maxfun=15000, disp=None)

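1. Parameters: the main ones are the loss function func, the initial value x0 of the parameters being optimized, the gradient function fprime, and maxfun (the maximum number of function evaluations, which is not the same as the number of iterations).

Note that the gradient must be a flattened 1-D vector. If x is a multi-dimensional array (for example a 3-D array), flatten it first and reshape it inside func.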
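The flattening requirement can be sketched as follows. This is a minimal illustration, not code from the original post: the `shape`, `target`, and `loss_and_grad` names are hypothetical, and the loss is a simple squared distance chosen only so the example runs quickly.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

# Hypothetical 3-D parameter shape and target, for illustration only.
shape = (2, 3, 4)
target = np.arange(24, dtype=np.float64).reshape(shape)

def loss_and_grad(x_flat):
    # The optimizer always hands us a 1-D vector; restore the 3-D shape here.
    x = x_flat.reshape(shape)
    diff = x - target
    loss = 0.5 * np.sum(diff ** 2)
    # The gradient has the same shape as x, so flatten it before returning.
    return loss, diff.ravel().astype(np.float64)

x0 = np.zeros(shape).ravel()           # flatten the initial guess
x_opt, min_val, info = fmin_l_bfgs_b(loss_and_grad, x0)
x_opt = x_opt.reshape(shape)           # reshape the result back to 3-D
```

Keeping the reshape inside the objective means the rest of the code can continue to work with the original array shape.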

func : callable f(x,*args)   

Function to minimise. This is the objective, usually the loss function.

x0 : ndarray

Initial guess, i.e. the initial value of the parameters being optimized.

fprime : callable fprime(x,*args)   

The gradient of func.

If None, then func returns the function value and the gradient (f, g = func(x, *args)), unless approx_grad is True in which case func returns only f.
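The two ways of supplying an analytic gradient might look like this. The sketch uses SciPy's built-in Rosenbrock helpers `rosen` and `rosen_der` as a stand-in objective; the wrapper name `rosen_with_grad` is hypothetical.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b, rosen, rosen_der

x0 = np.array([-1.2, 1.0])

# Convention A: func returns only f, and the gradient is passed as fprime.
x_a, f_a, d_a = fmin_l_bfgs_b(rosen, x0, fprime=rosen_der)

# Convention B: fprime is None, so func must return the pair (f, g).
def rosen_with_grad(x):
    return rosen(x), rosen_der(x)

x_b, f_b, d_b = fmin_l_bfgs_b(rosen_with_grad, x0)
```

Convention B is convenient when the loss and gradient share intermediate computations, since both come out of a single call.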

args : sequence

Arguments to pass to func and fprime.
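A short sketch of how args is forwarded: every extra positional argument in the tuple is passed to func after x. The function `scaled_quadratic` and its `center` and `scale` parameters are made up for this example.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def scaled_quadratic(x, center, scale):
    # Called by the optimizer as scaled_quadratic(x, *args).
    diff = x - center
    f = scale * np.sum(diff ** 2)
    g = 2.0 * scale * diff
    return f, g

center = np.array([1.0, 2.0, 3.0])
x_opt, f_opt, d = fmin_l_bfgs_b(scaled_quadratic, np.zeros(3),
                                args=(center, 0.5))
```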

approx_grad : bool

Whether to approximate the gradient numerically (in which case func returns only the function value).
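When no analytic gradient is available, approx_grad can be used as in the sketch below. The objective `shifted_sphere` is a hypothetical example; note that it returns only the function value.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def shifted_sphere(x):
    # Returns only f; the gradient is estimated by finite differences.
    return np.sum((x - 3.0) ** 2)

x0 = np.zeros(3)
x_opt, f_opt, d = fmin_l_bfgs_b(shifted_sphere, x0, approx_grad=True)
```

Numerical differentiation needs extra function evaluations for every gradient, so it tends to be much slower than an analytic gradient when x has many elements.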

bounds : list

(min, max) pairs for each element in x, defining the bounds on that parameter. Use None for one of min or max when there is no bound in that direction.
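A minimal sketch of box constraints, using a made-up quadratic whose unconstrained minimum at (2, -1) lies outside the allowed region. The second variable has only a lower bound, so its upper bound is None.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def f_and_g(x):
    f = (x[0] - 2.0) ** 2 + (x[1] + 1.0) ** 2
    g = np.array([2.0 * (x[0] - 2.0), 2.0 * (x[1] + 1.0)])
    return f, g

# One (min, max) pair per element of x; None means "no bound on that side".
bounds = [(0.0, 1.0), (0.0, None)]
x_opt, f_opt, d = fmin_l_bfgs_b(f_and_g, np.array([0.5, 0.5]), bounds=bounds)
```

The solution ends up on the boundary, at the feasible point closest to the unconstrained minimum.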

m : int

The maximum number of variable metric corrections used to define the limited memory matrix. (The limited memory BFGS method does not store the full hessian but uses this many terms in an approximation to it.)

factr : float

The iteration stops when (f^k - f^{k+1})/max{|f^k|,|f^{k+1}|,1} <= factr * eps, where eps is the machine precision, which is automatically generated by the code. Typical values for factr are: 1e12 for low accuracy; 1e7 for moderate accuracy; 10.0 for extremely high accuracy.
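To make the factr criterion concrete, the arithmetic can be reproduced by hand. The helpers `factr_tolerance` and `should_stop` below are hypothetical and only mirror the formula quoted above; they are not part of SciPy.

```python
import numpy as np

eps = np.finfo(float).eps  # machine precision for float64, about 2.22e-16

def factr_tolerance(factr):
    # Right-hand side of the stopping test: factr * eps.
    return factr * eps

def should_stop(f_k, f_k1, factr):
    # Left-hand side: relative decrease of f between two iterations.
    rel_decrease = (f_k - f_k1) / max(abs(f_k), abs(f_k1), 1.0)
    return rel_decrease <= factr_tolerance(factr)

for factr in (1e12, 1e7, 10.0):
    print(f"factr={factr:g} -> tolerance={factr_tolerance(factr):.3e}")
```

With the default factr of 1e7, the optimizer stops once the relative decrease of f falls below roughly 2.2e-9.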

pgtol : float

The iteration will stop when max{|proj g_i| : i = 1, ..., n} <= pgtol, where proj g_i is the i-th component of the projected gradient.
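The quantity compared against pgtol can be sketched as follows, assuming the usual L-BFGS-B definition of the projected gradient: take a step along the negative gradient, clip it back into the box, and measure how far the point actually moved. The helper `projected_grad_norm` is hypothetical and is not a SciPy function.

```python
import numpy as np

def projected_grad_norm(x, g, lower, upper):
    # Step along -g, then project the result back onto the box [lower, upper].
    projected_point = np.clip(x - g, lower, upper)
    proj_g = x - projected_point
    # Infinity norm: the largest absolute component.
    return np.max(np.abs(proj_g))

x = np.array([1.0, 0.0])
g = np.array([-2.0, 2.0])

# Without bounds the projected gradient is just the gradient.
no_bounds = projected_grad_norm(x, g, -np.inf, np.inf)

# With active bounds the components pushing outward are zeroed.
with_bounds = projected_grad_norm(x, g,
                                  np.array([0.0, 0.0]),
                                  np.array([1.0, np.inf]))
```

This is why a bounded problem can count as converged even though the raw gradient at the solution is far from zero.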

epsilon : float

Step size used when approx_grad is True, for numerically calculating the gradient

iprint : int

Controls the frequency of output. iprint < 0 means no output.

disp : int, optional

If zero, then no output. If positive number, then this over-rides iprint.

maxfun : int

Maximum number of function evaluations.

2. Return values

x : array_like

Estimated position of the minimum, i.e. the x at which the loss is smallest.

f : float

Value of func at the minimum, i.e. the final loss value.

d : dict

Information dictionary.

  • d['warnflag'] is
    • 0 if converged,
    • 1 if too many function evaluations,
    • 2 if stopped for another reason, given in d['task']
  • d['grad'] is the gradient at the minimum (should be 0 ish)
  • d['funcalls'] is the number of function calls made, i.e. how many times func was evaluated. This is not the number of iterations.
  • d['nit'] is the number of iterations.
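A small sketch of how the three return values and the information dictionary might be inspected after a run. The objective `simple_quadratic` is a hypothetical example chosen so that the run converges.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def simple_quadratic(x):
    # Returns (f, g) so that no separate fprime is needed.
    return np.sum((x - 1.0) ** 2), 2.0 * (x - 1.0)

x, min_val, info = fmin_l_bfgs_b(simple_quadratic, np.zeros(4))

if info['warnflag'] == 0:
    print("converged")
elif info['warnflag'] == 1:
    print("stopped: evaluation limit reached")
else:
    print("stopped for another reason:", info['task'])

print("largest gradient component:", np.max(np.abs(info['grad'])))
print("function calls:", info['funcalls'], "iterations:", info['nit'])
```

Since a single iteration can evaluate func several times during the line search, funcalls is normally at least as large as nit.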

 

Example of info:

{'grad': array([-7.65604162, -2.14013386,  3.16267967, ..., -1.03821039,
       -4.23868084, -3.17428398]),
 'task': b'STOP: TOTAL NO. of f AND g EVALUATIONS EXCEEDS LIMIT',
 'funcalls': 51,
 'nit': 47,
 'warnflag': 1}
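An outcome like the one above, with warnflag equal to 1, can be reproduced by deliberately setting maxfun too low. The sketch below uses SciPy's `rosen` and `rosen_der` helpers with a limit of 5 evaluations; the value 5 is an arbitrary choice for illustration.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b, rosen, rosen_der

x0 = np.array([-1.2, 1.0])

# Far too few evaluations for the Rosenbrock function to converge.
x_opt, f_opt, d = fmin_l_bfgs_b(rosen, x0, fprime=rosen_der, maxfun=5)

print(d['warnflag'])   # 1 signals that the evaluation limit stopped the run
print(d['task'])       # message describing why the optimizer stopped
print(d['funcalls'], d['nit'])
```

In this situation the gradient in d['grad'] is still far from zero, as in the example output, so the returned x should be treated as an intermediate result rather than a minimum.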

Supplement: more on the SciPy library:

Python machine learning and analysis tools: SciPy

Scipy Lecture Notes (Chinese documentation)
