Cleaning, Aggregating, and Scatter-Plotting PM2.5 Data (with Links to the Source Data)

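Related data:

Beijing air quality (2012-2018)

Click to open link

National historical air quality data | Beijing historical air quality data (updated weekly)

Click to open link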

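Related programs:
Program: fills in the missing values in each daily CSV, using each station's median for that day, and writes the result to a CSV with three columns (date, type, mean)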
# -*- coding: UTF-8 -*-
import pandas as pd
import datetime
import csv

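def writer_data_extra(date, kind, mean):
    # Append one row to the output CSV; the with-block closes the file even on error.
    # `kind` holds the value of the 'type' column (renamed to avoid shadowing the builtin).
    with open('./beijing_20180101-20180324/aqi1.csv', 'a', newline='') as csvfile:
        csv.writer(csvfile).writerow([date, kind, mean])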

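def run_extra():
    begin = datetime.date(2018, 1, 1)
    end = datetime.date(2018, 3, 24)
    d = begin
    delta = datetime.timedelta(days=1)
    q = 0
    while d <= end:
        num = d.strftime('%m%d')
        df = pd.read_csv('./beijing_20180101-20180324/beijing_extra_2018' + num + '.csv')
        # Stride slicing: x[::2] selects the 1st, 3rd, 5th ... rows and x[1::2] the 2nd, 4th ...
        # Here df[j::8] takes every 8th row starting at row j, for j = 0, 2, 4, 6.
        for j in range(0, 8, 2):
            # .copy() so that filling values does not write into a slice of df
            nf = df[j::8].copy()
            # The first three columns are skipped; the rest are per-station readings.
            stations = nf.columns[3:]
            # Fill each station's gaps with that station's own median.
            # (fillna(a) on the whole frame would fill every column with one value.)
            for col in stations:
                nf[col] = nf[col].fillna(nf[col].median())
            date = nf['date'].iloc[0]
            kind = nf['type'].iloc[0]
            # Average the per-station means into a single daily value.
            total = 0
            for col in stations:
                total += nf[col].mean()
            mean = round(total / len(stations), 1)
            # print('date:{} type:{} val:{}'.format(date, kind, mean))
            writer_data_extra(date, kind, mean)
            q += 1
            if q % 10 == 0:
                print("Transcribing...")
        d += delta
    print("********** Transcription complete **************")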

if __name__ == '__main__':
    run_extra()
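
The fill-and-average step can be tried on a small in-memory table before running it over the real files. The sketch below is only an illustration: the station names, the values, and the helper `daily_mean` are made up, and it assumes two pollutant types interleaved row by row instead of the stride of 8 used above. One thing it relies on is that `DataFrame.fillna` called with a single scalar fills every empty cell in the frame with that value, so the medians are passed per column instead.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of one daily file: two pollutant types interleaved
# row by row, with two made-up station columns after date/hour/type.
df = pd.DataFrame({
    "date": [20180101] * 6,
    "hour": [0, 0, 1, 1, 2, 2],
    "type": ["SO2", "NO2"] * 3,
    "station_a": [1.0, 10.0, np.nan, 20.0, 3.0, 30.0],
    "station_b": [100.0, 5.0, 200.0, np.nan, np.nan, 7.0],
})


def daily_mean(frame, offset, stride):
    """Take every `stride`-th row from `offset`, fill gaps per station, average."""
    part = frame[offset::stride].copy()
    stations = part.columns[3:]
    # DataFrame.median() returns one median per column; passing that Series
    # to fillna fills each column with its own median.
    part[stations] = part[stations].fillna(part[stations].median())
    # Mean per station first, then the mean of those station means.
    return part["type"].iloc[0], round(part[stations].mean().mean(), 1)


so2 = daily_mean(df, 0, 2)
no2 = daily_mean(df, 1, 2)
print(so2, no2)
```

For the SO2 rows, station_a becomes 1, 2, 3 and station_b becomes 100, 200, 150 after filling, so the station means are 2 and 150 and the daily value is 76.0.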

Program: combines the two tables and sorts the rows by their shared column (date)

import pandas as pd
import csv

def writer_data_all(date, kind, val):
    # Append one row to the combined CSV (not called by main() below).
    with open('./beijing_20180101-20180324/aqi_all.csv', 'a', newline='') as csvfile:
        writer = csv.writer(csvfile)
        # writer.writerow(('date', 'type', 'val'))
        writer.writerow([date, kind, val])

def main():
    # Both files need a header row containing 'date'; otherwise sort_values raises KeyError.
    df1 = pd.read_csv('./beijing_20180101-20180324/aqi1.csv')
    df2 = pd.read_csv('./beijing_20180101-20180324/aqi2.csv')
    # Stack the two tables, then order the rows by date.
    combined = pd.concat([df1, df2])
    combined = combined.sort_values(by='date', ascending=True).reset_index(drop=True)
    print(combined.T)


if __name__ == '__main__':
    main()
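
The scatter-plot script in this post reads a file with one column per pollutant, while the tables combined here hold one row per (date, type) pair. The source does not show how that wide file was produced; the sketch below is one possible way, assuming both inputs have `date`, `type`, and `val` columns. All values are made up for illustration.

```python
import pandas as pd

# Made-up rows standing in for the two input tables (date, type, val).
extra = pd.DataFrame({
    "date": [20180102, 20180101],
    "type": ["SO2", "SO2"],
    "val": [4.0, 6.0],
})
pm = pd.DataFrame({
    "date": [20180101, 20180102],
    "type": ["PM2.5", "PM2.5"],
    "val": [35.0, 80.0],
})

# Stack the tables and sort: still one row per (date, type).
stacked = (
    pd.concat([extra, pm])
    .sort_values(by=["date", "type"])
    .reset_index(drop=True)
)

# Pivot to one row per date and one column per pollutant type.
wide = stacked.pivot(index="date", columns="type", values="val").reset_index()
print(wide)
```

`pivot` raises an error if the same (date, type) pair appears twice, which doubles as a check that the daily aggregation produced exactly one value per pollutant per day.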

Program: draws the scatter plots

import pandas as pd
import matplotlib.pyplot as plt

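def main():
    df = pd.read_csv('./data.csv')
    pollutants = ['NO2', 'SO2', 'O3', 'CO', 'PM10', 'AQI']
    for name in pollutants:
        # One figure per pollutant: pollutant on the x-axis, PM2.5 on the y-axis.
        plt.figure()
        plt.scatter(df[name], df['PM2.5'])
        plt.title(name + ' And PM2.5')
        plt.xlabel(name)
        plt.ylabel('PM2.5')
        # Save before show(); after the window is closed the figure may be empty.
        plt.savefig('./' + name + 'AndPM2.5.png')
        plt.show()
        plt.close()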

if __name__ == '__main__':
    main()
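
A scatter plot shows the shape of each relationship; a correlation coefficient puts a single number next to it. The sketch below is not part of the original workflow. It assumes a table with the same column layout as `data.csv` and uses made-up values, computing the Pearson correlation of each pollutant column with PM2.5 via `DataFrame.corrwith`.

```python
import pandas as pd

# Made-up readings; the real input would be the table read from data.csv.
df = pd.DataFrame({
    "PM2.5": [10.0, 20.0, 30.0, 40.0],
    "PM10": [20.0, 40.0, 60.0, 80.0],
    "O3": [80.0, 60.0, 40.0, 20.0],
})

# Pearson correlation of every other column with the PM2.5 column.
corr = df.drop(columns="PM2.5").corrwith(df["PM2.5"])
print(corr.sort_values(ascending=False))
```

In this toy table PM10 rises in step with PM2.5 and O3 falls in step with it, so the coefficients sit at the two extremes; real data would fall somewhere in between.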

Results:





