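09, df column operations: handling nulls, null count per column, null count for a single column, dropping rows and columns that contain nulls, default values for nulls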

1, Number of nulls in each column: data.isnull().sum()

  1. Code:
import pandas as pd

if __name__ == '__main__':
    # Read the CSV file
    data = pd.read_csv("titanic_train.csv")
    # Count the nulls in every column
    res = data.isnull().sum()
    print(res)
================================
PassengerId      0
Survived         0
Pclass           0
Name             0
Sex              0
Age            177
SibSp            0
Parch            0
Ticket           0
Fare             0
Cabin          687
Embarked         2
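
The same count can be tried without the CSV file. The sketch below is only an illustration: the `toy` frame and its values are made up for the example and are not taken from titanic_train.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for titanic_train.csv
toy = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, np.nan],
    "Cabin": [None, "C85", None, None],
    "Fare": [7.25, 71.28, 7.92, 8.05],
})

# isnull() yields a boolean frame; sum() counts the True values per column
null_counts = toy.isnull().sum()
print(null_counts)

# Summing the per-column counts gives the total number of nulls in the frame
total_nulls = int(null_counts.sum())
print(total_nulls)
```

Both `None` and `np.nan` are counted as nulls here, which is why Cabin reports missing values even though it holds text.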

2, Number of nulls in the Age column: data["Age"].isnull().sum()

  1. Code:
import pandas as pd

if __name__ == '__main__':
    # Read the CSV file
    data = pd.read_csv("titanic_train.csv")
    # Count the nulls in the Age column only
    res = data["Age"].isnull().sum()
    print(res)
=========================
177
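
As a small cross-check, the null count of a single column plus its non-null count should equal the column length. This is a sketch on a made-up Series named `ages`, not on the real Age column.

```python
import numpy as np
import pandas as pd

# Hypothetical ages; two of the five entries are missing
ages = pd.Series([22.0, np.nan, 26.0, np.nan, 35.0], name="Age")

# Number of missing entries in this one column
missing = int(ages.isnull().sum())

# count() returns the number of non-null entries
present = int(ages.count())

print(missing, present, len(ages))
```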

3, Drop rows that contain nulls: res = data.dropna()

  1. Code:
import pandas as pd

if __name__ == '__main__':
    # Read the CSV file
    data = pd.read_csv("titanic_train.csv")
    # Shape before dropping
    print(data.shape)
    # Drop every row that contains at least one null
    res = data.dropna()
    print(res.shape)
=========================
(891, 12)
(183, 12)
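
By default dropna() removes a row if any of its columns is null, which is why so few rows survive above. A minimal sketch of two narrower variants follows, using the `subset` and `how` parameters of dropna(); the `toy` frame is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the third row has no nulls at all
toy = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, np.nan],
    "Cabin": [None, "C85", "E46", None],
    "Fare": [7.25, 71.28, 7.92, 8.05],
})

# Default: drop a row if any column in it is null
any_null_dropped = toy.dropna()

# Only look at Age when deciding which rows to drop
age_only = toy.dropna(subset=["Age"])

# Drop a row only when every column in it is null
all_null_dropped = toy.dropna(how="all")

print(any_null_dropped.shape, age_only.shape, all_null_dropped.shape)
```

Restricting the check with `subset` keeps rows whose only gap is in a column that does not matter for the analysis at hand.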

4, Drop columns that contain nulls: res = data.dropna(axis=1)

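  1. Code:
import pandas as pd

if __name__ == '__main__':
    # Read the CSV file
    data = pd.read_csv("titanic_train.csv")
    # Shape before dropping
    print(data.shape)
    # Drop every column that contains at least one null
    res = data.dropna(axis=1)
    print(res.shape)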
==================================
(891, 12)
(891, 9)
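
With axis=1 a column is dropped as soon as it has a single null. If that is too strict, the `thresh` parameter of dropna() keeps columns that have at least a given number of non-null values. The sketch below uses a hypothetical `toy` frame and an arbitrary threshold of 3.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: Age has 3 non-null values, Cabin 1, Fare 4
toy = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, 35.0],
    "Cabin": [None, "C85", None, None],
    "Fare": [7.25, 71.28, 7.92, 8.05],
})

# Drop every column that contains at least one null
strict = toy.dropna(axis=1)

# Keep columns that have at least 3 non-null values
lenient = toy.dropna(axis=1, thresh=3)

print(list(strict.columns))
print(list(lenient.columns))
```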

5, Fill every null with a default value: data.fillna(0)

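  1. Code: fill with 0
import pandas as pd

if __name__ == '__main__':
    # Read the CSV file
    data = pd.read_csv("titanic_train.csv")
    # Count the nulls before filling
    res = data.isnull().sum()
    print(res)
    # Fill every null with 0; fillna returns a new DataFrame
    data02 = data.fillna(0)
    # Count the nulls after filling
    res = data02.isnull().sum()
    print("=======================================")
    print(res)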
===========================================================
PassengerId      0
Survived         0
Pclass           0
Name             0
Sex              0
Age            177
SibSp            0
Parch            0
Ticket           0
Fare             0
Cabin          687
Embarked         2
dtype: int64
=======================================
PassengerId    0
Survived       0
Pclass         0
Name           0
Sex            0
Age            0
SibSp          0
Parch          0
Ticket         0
Fare           0
Cabin          0
Embarked       0
dtype: int64
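
As the output above shows, fillna(0) also fills text columns such as Cabin and Embarked, so they end up holding the number 0. One possible alternative, sketched here on a hypothetical `toy` frame, is to pass fillna() a dict that maps each column to its own default. The defaults 0 and "Unknown" are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one numeric and one text column
toy = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0],
    "Cabin": [None, "C85", None],
})

# A dict gives each column its own fill value
filled = toy.fillna({"Age": 0, "Cabin": "Unknown"})

print(filled)
print(filled.isnull().sum())
```

Columns that are not listed in the dict are left untouched, and `toy` itself is not modified.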

6, Fill nulls in a specific column: data["Age"] = data["Age"].fillna(0)

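  1. Code:
import pandas as pd

if __name__ == '__main__':
    # Read the CSV file
    data = pd.read_csv("titanic_train.csv")
    # Count the nulls before filling
    res = data.isnull().sum()
    print(res)
    # Fill nulls with 0 in the Age column only and assign the result back
    data["Age"] = data["Age"].fillna(0)
    # Count the nulls after filling
    res = data.isnull().sum()
    print("=======================================")
    print(res)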
=======================================================
PassengerId      0
Survived         0
Pclass           0
Name             0
Sex              0
Age            177
SibSp            0
Parch            0
Ticket           0
Fare             0
Cabin          687
Embarked         2
dtype: int64
=======================================
PassengerId      0
Survived         0
Pclass           0
Name             0
Sex              0
Age              0
SibSp            0
Parch            0
Ticket           0
Fare             0
Cabin          687
Embarked         2
dtype: int64
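
The assignment back to data["Age"] matters because fillna() returns a new object and leaves the original alone. A minimal sketch of the difference, on a hypothetical `toy` frame:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one null in Age, two in Cabin
toy = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0],
    "Cabin": [None, "C85", None],
})

# Without assignment the result is discarded and toy is unchanged
toy["Age"].fillna(0)
nulls_without_assignment = int(toy["Age"].isnull().sum())

# Assigning the result back replaces only the Age column
toy["Age"] = toy["Age"].fillna(0)
age_nulls_after = int(toy["Age"].isnull().sum())
cabin_nulls_after = int(toy["Cabin"].isnull().sum())

print(nulls_without_assignment, age_nulls_after, cabin_nulls_after)
```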