pandas: DataFrame

Creating a DataFrame
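
As a minimal sketch of this step, a DataFrame can be built from a dict of columns or from a list of row dicts. The sample values below are hypothetical; they only mirror some of the column names of the data.csv file used in the example further down.

```python
import pandas as pd

# Hypothetical sample data; column names follow data.csv.
df = pd.DataFrame({
    'user': ['a01', 'a01', 'a02'],
    'vol': [42, 42, 42],
    'prc': [72, 72, 72],
    'indc': ['B', 'S', 'B'],
})

# The same table built from a list of row dicts.
rows = [
    {'user': 'a01', 'vol': 42, 'prc': 72, 'indc': 'B'},
    {'user': 'a01', 'vol': 42, 'prc': 72, 'indc': 'S'},
    {'user': 'a02', 'vol': 42, 'prc': 72, 'indc': 'B'},
]
df_from_rows = pd.DataFrame(rows)
```

Reading from a file with pd.read_csv, as in the example further down, is the third common route.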

Accessing DataFrame data

  • df.loc[row, col], df.iloc[row, col]
  • df[col][row]
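
The access styles listed above could be compared like this. It is a small sketch on a hypothetical frame whose row labels r0..r2 are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'vol': [10, 20, 30], 'prc': [71, 72, 73]},
    index=['r0', 'r1', 'r2'],  # hypothetical row labels
)

by_label = df.loc['r1', 'prc']   # label-based: row 'r1', column 'prc'
by_position = df.iloc[1, 1]      # position-based: second row, second column
chained = df['prc']['r1']        # column first, then row label

# For assignment, prefer a single .loc call; chained assignment
# (df['prc']['r1'] = ...) may not modify df.
df.loc['r1', 'prc'] = 99
```

All three reads return the same value here; the difference matters mainly when writing, where a single .loc or .iloc call is the safer choice.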

Adding rows and columns
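
One possible way to add columns and rows is sketched below. The amount column and the sample values are hypothetical and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({'vol': [10, 20], 'prc': [71, 72]})

# New column: assign to a column name that does not exist yet.
df['amount'] = df['vol'] * df['prc']

# New row: assign to an index label that does not exist yet ...
df.loc[len(df)] = [30, 73, 30 * 73]

# ... or concatenate another DataFrame with the same columns.
extra = pd.DataFrame({'vol': [40], 'prc': [74], 'amount': [40 * 74]})
df = pd.concat([df, extra], ignore_index=True)
```

Appending rows one at a time copies data on each call, so for many rows it is usually cheaper to collect them first and concatenate once.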

Example

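import numpy as np
import pandas as pd

# Load the records and order them by user, then by date.
df = pd.read_csv('data.csv')
df = df.sort_values(['user', 'date'])

# Select the 'B' and 'S' rows with boolean masks.
df_B = df[df['indc'] == 'B']
df_S = df[df['indc'] == 'S']

# Signed volume: +vol for 'B' rows, -vol for all other rows.
df['vol-sign'] = np.where(df['indc'] == 'B', df['vol'], -df['vol'])

# Cumulative signed volume within each user.
df['cde'] = df.groupby('user')['vol-sign'].cumsum()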

Contents of data.csv:
user,vol,prc,date,indc,cde
a01,42,72,2019.07.22,B,
a01,42,72,2019.07.20,B,
a01,42,72,2019.07.22,S,
a01,42,72,2019.07.22,B,
a02,42,72,2019.07.22,B,
a02,42,72,2019.07.22,B,
a02,42,72,2019.07.20,S,
a03,42,72,2019.07.22,B,
a03,42,72,2019.07.20,B,
a03,42,72,2019.07.22,S,
a03,42,72,2019.07.22,B,

Note

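A DataFrame is best suited to whole-column (vectorized) operations. When a computation has to proceed row by row, it becomes very slow.
In that case, convert the data back to NumPy arrays and operate on those.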

For example, consider a problem like this:
for i in range(1, 10000000):
    df.iloc[i, 3] = df.iloc[i-1, 3] * df.iloc[i, 1] + df.iloc[i, 2]
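
A rough sketch of the NumPy route for the recurrence above: pull the three columns out as arrays once, run the loop on the arrays, and write the result back in a single assignment. The small frame and its column names (id, a, b, x) are hypothetical stand-ins for columns 0 to 3.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; columns 1, 2 and 3 play the roles used in the loop.
df = pd.DataFrame({
    'id': [0, 1, 2, 3],
    'a': [1.0, 2.0, 0.5, 3.0],
    'b': [0.0, 1.0, 2.0, 1.0],
    'x': [1.0, 0.0, 0.0, 0.0],
})

# Extract plain NumPy arrays once, outside the loop.
a = df.iloc[:, 1].to_numpy()
b = df.iloc[:, 2].to_numpy()
x = df.iloc[:, 3].to_numpy(copy=True)

# Same recurrence, but each step indexes an ndarray instead of a DataFrame.
for i in range(1, len(x)):
    x[i] = x[i - 1] * a[i] + b[i]

# Write the whole column back in one assignment.
df['x'] = x
```

The loop itself is still sequential because each value depends on the previous one, but it avoids the per-element overhead of df.iloc. How much faster it runs depends on the data and the pandas version.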
