Python: fetching data from MongoDB, converting it to a DataFrame, and processing it with pandas

I have been playing with Python lately. My data comes from MongoDB, and I process it with pandas.

At first I used a rather cumbersome conversion. Here is the code:

Method 1:

import pandas as pd
from pymongo import MongoClient

# Connect to MongoDB and fetch every document in the collection
client = MongoClient('192.168.1.5', 10070)
db = client.dbtest
collection = db.data_table
items = collection.find()

dateId = []
ai_type = []
ai_name = []
quorum = []
priceUSD = []
ai_disageform = []
country = []
continent  = []
company = []
ai_cap_tr = []
n = 0
for i in items:
     n= n+1
     print("Processing record %s" % n)
     keys = i.keys()
     if 'ai_disageform' in keys:
         ai_disageform.append(i['ai_disageform'])
     else:
         ai_disageform.append('')
     if 'date' in keys:
         t = str(i['date'])
         dateId.append(t[:10])
     else:
         dateId.append('')
     if 'ai_type' in keys:
         ai_type.append(i['ai_type'])
     else:
         ai_type.append('')
     if 'continent' in keys:
         continent.append(i['continent'])
     else:
         continent.append('')
     if 'quorum' in keys:
         quorum.append(i['quorum'])
     else:
         quorum.append('')
     if 'priceUSD' in keys:
         priceUSD.append(i['priceUSD'])
     else:
         priceUSD.append('')
     if 'country' in keys:
         country.append(i['country'])
     else:
         country.append('')
     if 'ai_name' in keys:
         ai_name.append(i['ai_name'])
     else:
         ai_name.append('')
     if 'company' in keys:
         company.append(i['company'])
     else:
         company.append('')
     if 'ai_cap_tr' in keys:
         ai_cap_tr.append(i['ai_cap_tr'])
     else:
         ai_cap_tr.append('')

df = pd.DataFrame({'dateId':dateId,
                   'ai_type':ai_type,
                   'ai_name':ai_name,
                   'quorum':quorum,
                   'priceUSD':priceUSD,
                   'ai_disageform':ai_disageform,
                   'country':country,
                   'continent':continent,
                   'ai_cap_tr':ai_cap_tr,
                   'company':company})

# Write the result to CSV without the row index
df.to_csv('../ncbdata/b.csv', encoding="utf-8", index=False)



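The idea: testing shows that each record is a dict. The value under each key is appended to its own list (an empty string is used when the key is missing), and the lists are then used to build a DataFrame.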
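The same idea can be written more compactly. The following is only a minimal sketch, not the code above: it assumes the records are plain dicts, a hard-coded sample list stands in for the `collection.find()` cursor, and `FIELDS` and `records_to_columns` are hypothetical names introduced for illustration.

```python
import pandas as pd

# Field names taken from Method 1; 'date' is handled separately below.
FIELDS = ['ai_type', 'ai_name', 'quorum', 'priceUSD', 'ai_disageform',
          'country', 'continent', 'ai_cap_tr', 'company']


def records_to_columns(records):
    """Collect each key's values into its own list, using '' for missing keys."""
    columns = {'dateId': []}
    columns.update({name: [] for name in FIELDS})
    for record in records:
        # Keep only the first 10 characters of the date, e.g. '2020-01-31'
        columns['dateId'].append(str(record['date'])[:10] if 'date' in record else '')
        for name in FIELDS:
            columns[name].append(record.get(name, ''))
    return columns


# Sample data standing in for the MongoDB cursor (assumed values)
sample = [
    {'date': '2020-01-31 00:00:00', 'ai_type': 'A', 'country': 'CN'},
    {'ai_type': 'B', 'company': 'X'},
]
df = pd.DataFrame(records_to_columns(sample))
print(df[['dateId', 'ai_type', 'country', 'company']])
```

Using `dict.get` with a default replaces each `if ... in keys / else` pair, so adding a field only means adding its name to the list.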

Method 2:

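import json
from io import StringIO

import pandas as pd
from pymongo import MongoClient


# Connect to MongoDB and return a cursor over every document in the collection
def connectMongdb():
    client = MongoClient('192.168.1.5', 10070)
    db = client.dbtest
    collection = db.data_table
    items = collection.find()
    return items


# Convert the documents to a DataFrame and write it to CSV
def tran_df():
    items = connectMongdb()
    temp = []
    for doc in items:
        # ObjectId is not JSON serializable, so drop the _id field
        doc.pop('_id', None)
        # datetime is not JSON serializable either; keep only the date part
        if 'date' in doc:
            doc['date'] = doc['date'].strftime("%Y-%m-%d")
        temp.append(doc)
    # Wrap the JSON text in StringIO: recent pandas versions deprecate
    # passing a literal JSON string to read_json
    data_employee = pd.read_json(StringIO(json.dumps(temp)))
    # Keep only the columns of interest, in this order
    data_employee_ri = data_employee.reindex(columns=['date', 'ai_type', 'ai_name'])
    data_employee_ri.to_csv('data/a.csv')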


def main():
    tran_df()

if __name__ == "__main__":
    main()

The idea: append every dict to a list, serialize the list to JSON, and then convert it to a DataFrame with read_json().
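To try this idea without a MongoDB server, the sketch below builds the list of dicts by hand (the values are assumed sample data), serializes it with `json.dumps`, and reads it back with `read_json`. For comparison it also passes the same list straight to the `pd.DataFrame` constructor; that is an alternative route, not what Method 2 does.

```python
import json
from io import StringIO

import pandas as pd

# Hand-written records standing in for the MongoDB documents (assumed values)
temp = [
    {'date': '2020-01-31', 'ai_type': 'A', 'ai_name': 'foo', 'quorum': 3},
    {'date': '2020-02-01', 'ai_type': 'B'},  # 'ai_name' and 'quorum' are missing
]

# Route used in Method 2: list of dicts -> JSON text -> DataFrame
via_json = pd.read_json(StringIO(json.dumps(temp)))

# Alternative route: pass the list of dicts to the constructor directly
direct = pd.DataFrame(temp)

# reindex keeps only the requested columns, in the requested order
wanted = ['date', 'ai_type', 'ai_name']
print(via_json.reindex(columns=wanted))
print(direct.reindex(columns=wanted))
```

In both routes a key that is missing from a record shows up as a missing value in that row. The direct route skips the JSON round trip, so it may be worth trying when the documents contain values that `json.dumps` cannot serialize.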
