A Detailed Guide to Building Pivot Tables in Python with pivot_table

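A pivot table is an interactive table that performs calculations such as sums and counts. Which calculation is applied to which data depends on how the fields are arranged in the pivot table.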

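It is called a pivot table because its layout can be changed dynamically to analyze the data in different ways: row labels, column labels, and page fields can all be rearranged. Each time the layout changes, the pivot table immediately recalculates the data for the new arrangement. If the underlying data changes, the pivot table can be refreshed as well.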
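
To make the idea of re-arranging a layout concrete, here is a minimal sketch in pandas. The `sales` table and its column names are invented for illustration. It also assumes that "refreshing" in pandas simply means calling `pivot_table` again, since the result is a new DataFrame rather than a live view.

```python
import pandas as pd

# Hypothetical sales records, invented for illustration only.
sales = pd.DataFrame({
    "region": ["east", "east", "west", "west"],
    "product": ["pen", "ink", "pen", "ink"],
    "amount": [10, 20, 30, 40],
})

# Layout 1: regions as rows, products as columns.
by_region = sales.pivot_table(values="amount", index="region",
                              columns="product", aggfunc="sum")

# Layout 2: swap the two fields; pandas recomputes the table from scratch.
by_product = sales.pivot_table(values="amount", index="product",
                               columns="region", aggfunc="sum")

# The result is not a live view: after the source data changes,
# call pivot_table again to get the refreshed numbers.
sales.loc[0, "amount"] = 15
refreshed = sales.pivot_table(values="amount", index="region",
                              columns="product", aggfunc="sum")
```

Both layouts hold the same numbers; only the roles of rows and columns differ.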

Function details

df.pivot_table(values=None, index=[column names], columns=[column names], aggfunc='mean', fill_value=None, dropna=True, margins=False, margins_name='All')

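#df: the dataset to summarize, comparable to selecting the data range for an Excel pivot table; all calculations run over this range
#values: the column(s) to aggregate, comparable to "Values" in an Excel pivot table
#index: the row labels of the pivot table, comparable to "Row Labels" in an Excel pivot table
#columns: the column labels of the pivot table, comparable to "Column Labels" in an Excel pivot table
#aggfunc='mean': the aggregation to apply, comparable to choosing Sum or Average after selecting a column in Excel
#margins: whether to add totals over the results; the default is False. When True, a totals row and a totals column are added, computed with aggfunc (so they are sums only when aggfunc is a sum)
#margins_name='All': the label for the totals row and column; the default is 'All'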
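
As a rough illustration of how these parameters fit together, the sketch below builds a small report with totals. The `orders` table, its column names, and the label "Total" are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical order data, invented for illustration only.
orders = pd.DataFrame({
    "city": ["Taipei", "Taipei", "Tainan", "Tainan"],
    "item": ["tea", "cake", "tea", "tea"],
    "qty": [2, 3, 4, 1],
})

report = orders.pivot_table(
    values="qty",          # column to aggregate
    index="city",          # row labels
    columns="item",        # column labels
    aggfunc="sum",         # how to combine values that land in the same cell
    fill_value=0,          # shown instead of NaN for empty cells
    margins=True,          # append a totals row and a totals column
    margins_name="Total",  # label for those totals (the default is 'All')
)
print(report)
```

Because `aggfunc` is "sum" here, the "Total" row and column hold sums; with the default `aggfunc='mean'` they would hold means instead.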

Example

    Examples
    --------
    >>> df
       A   B   C      D
    0  foo one small  1
    1  foo one large  2
    2  foo one large  2
    3  foo two small  3
    4  foo two small  3
    5  bar one large  4
    6  bar one small  5
    7  bar two small  6
    8  bar two large  7
    
    >>> import numpy as np
    >>> import pandas as pd
    >>> table = pd.pivot_table(df, values='D', index=['A', 'B'],
    ...                        columns=['C'], aggfunc=np.sum)
    >>> table
    C        large  small
    A   B
    bar one    4.0    5.0
        two    7.0    6.0
    foo one    4.0    1.0
        two    NaN    6.0
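
The example above shows `df` only as printed output. The sketch below is one way to construct the same nine rows so the pivot can be run end to end. It uses `aggfunc="sum"`, the string equivalent of `np.sum`, and the variable names `foo_one_large` and `foo_two_large` are introduced here purely for illustration.

```python
import pandas as pd

# The nine rows shown in the example above.
df = pd.DataFrame({
    "A": ["foo"] * 5 + ["bar"] * 4,
    "B": ["one", "one", "one", "two", "two", "one", "one", "two", "two"],
    "C": ["small", "large", "large", "small", "small",
          "large", "small", "small", "large"],
    "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
})

# Group by A and B on the rows, by C on the columns, and sum D in each cell.
table = pd.pivot_table(df, values="D", index=["A", "B"],
                       columns=["C"], aggfunc="sum")

# Cells are addressed by (row-label tuple, column label).
foo_one_large = table.loc[("foo", "one"), "large"]  # rows 1 and 2: 2 + 2
foo_two_large = table.loc[("foo", "two"), "large"]  # no matching rows, so NaN
```

Passing `fill_value=0` to the same call would replace the missing (foo, two, large) cell with 0.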

From the source code:

pivot_table(data, values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All')

values : column to aggregate, optional

index : column, Grouper, array, or list of the previous
    If an array is passed, it must be the same length as the data. The list can contain any of the other types (except list). Keys to group by on the pivot table index. If an array is passed, it is being used as the same manner as column values.

columns : column, Grouper, array, or list of the previous
    If an array is passed, it must be the same length as the data. The list can contain any of the other types (except list). Keys to group by on the pivot table column. If an array is passed, it is being used as the same manner as column values.

aggfunc : function or list of functions, default numpy.mean
    If list of functions passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves)

fill_value : scalar, default None
    Value to replace missing values with

margins : boolean, default False
    Add all row / columns (e.g. for subtotal / grand totals)

dropna : boolean, default True
    Do not include columns whose entries are all NaN

margins_name : string, default 'All'
    Name of the row / column that will contain the totals when margins is True.
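
The note on `aggfunc` about hierarchical columns is easier to see in a small run. The following is a minimal sketch, assuming a made-up `data` table; the function names are passed as strings, which pandas also accepts.

```python
import pandas as pd

# Hypothetical measurements, invented for illustration only.
data = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "kind": ["x", "y", "x", "x"],
    "value": [1.0, 3.0, 5.0, 7.0],
})

# A list of aggregations yields one sub-table per function, side by side.
multi = pd.pivot_table(data, values="value", index="group",
                       columns="kind", aggfunc=["sum", "mean"])

# The top column level holds the function names, the second the 'kind' labels.
print(multi.columns.tolist())

# Select one function's sub-table by its top-level name.
means = multi["mean"]
```

Group "b" has no rows of kind "y", so those cells are NaN unless `fill_value` is set.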
