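Our data does need some wrangling: the user must ensure a 1:1 mapping between the geoJSON keys and the Pandas DataFrame. Below is a brief example of the DataFrame wrangling required for the previous example. Our county-level data is a CSV file with FIPS code, county name, and economic information (column names omitted):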
    00000,US,United States,154505871,140674478,13831393,9,50502,100
    01000,AL,Alabama,2190519,1993977,196542,9,41427,100
    01001,AL,Autauga County,25930,23854,2076,8,48863,117.9
    01003,AL,Baldwin County,85407,78491,6916,8.1,50144,121
    01005,AL,Barbour County,9761,8651,1110,11.4,30117,72.7
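In the geoJSON, the county shapes are identified by FIPS code (thanks to Trifacta, from whom the relevant data was forked). The actual shapes are abbreviated here for brevity; the full dataset can be found in the example data: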
    {"type": "FeatureCollection", "features": [
    {"type": "Feature", "id": "1001", "properties": {"name": "Autauga"}
    {"type": "Feature", "id": "1003", "properties": {"name": "Baldwin"}
    {"type": "Feature", "id": "1005", "properties": {"name": "Barbour"}
    {"type": "Feature", "id": "1007", "properties": {"name": "Bibb"}
    {"type": "Feature", "id": "1009", "properties": {"name": "Blount"}
    {"type": "Feature", "id": "1011", "properties": {"name": "Bullock"}
    {"type": "Feature", "id": "1013", "properties": {"name": "Butler"}
    {"type": "Feature", "id": "1015", "properties": {"name": "Calhoun"}
    {"type": "Feature", "id": "1017", "properties": {"name": "Chambers"}
    {"type": "Feature", "id": "1019", "properties": {"name": "Cherokee"}
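Note that the CSV writes FIPS codes with a leading zero (01001), while the geoJSON ids have none ("1001"). The following is a minimal sketch, not part of the original example, of why the two can still line up: with default settings pandas parses the column as integers, which drops the leading zero. The two rows are copied from the CSV sample above; the header names are borrowed from the `merged.head()` output shown later, since the sample omits them.

```python
import io
import pandas as pd

# Two rows from the CSV sample; header names are assumed from the
# merged.head() output, because the sample itself omits them.
sample = io.StringIO(
    "FIPS_Code,State,Area_name\n"
    "01001,AL,Autauga County\n"
    "01003,AL,Baldwin County\n"
)

df = pd.read_csv(sample)

# pandas infers an integer column, so 01001 is read as 1001;
# casting to str afterwards yields the same form the geoJSON ids use.
fips_as_str = df['FIPS_Code'].astype(str).tolist()
print(fips_as_str)  # ['1001', '1003']
```

If the codes were read with `dtype=str` instead, the leading zeros would be kept and the ids would no longer match the geoJSON without further cleanup.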
We need the FIPS codes to match exactly; otherwise Vega cannot zip the data to the shapes correctly:
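    import json
    import pandas as pd

    # county_geo and county_data are assumed to be the paths to the geoJSON
    # and CSV files defined for the previous example.
    with open(county_geo, 'r') as f:
        get_id = json.load(f)

    # Collect the FIPS codes used as ids in the geoJSON into a DataFrame.
    county_codes = [x['id'] for x in get_id['features']]
    county_df = pd.DataFrame({'FIPS_Code': county_codes}, dtype=str)

    # Read the CSV and cast the FIPS codes to strings so both sides match.
    df = pd.read_csv(county_data, na_values=[' '])
    df['FIPS_Code'] = df['FIPS_Code'].astype(str)

    # The inner join keeps only rows that have a matching shape. Missing
    # values are then forward-filled from the preceding row (equivalent to
    # fillna(method='pad'), which newer pandas versions deprecate).
    merged = pd.merge(df, county_df, on='FIPS_Code', how='inner')
    merged = merged.ffill()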

    >>> merged.head()
      FIPS_Code State       Area_name  Civilian_labor_force_2011  Employed_2011  \
    0      1001    AL  Autauga County                      25930          23854
    1      1003    AL  Baldwin County                      85407          78491
    2      1005    AL  Barbour County                       9761           8651
    3      1007    AL     Bibb County                       9216           8303
    4      1009    AL   Blount County                      26347          24156

       Unemployed_2011  Unemployment_rate_2011  Median_Household_Income_2011  \
    0             2076                     8.0                         48863
    1             6916                     8.1                         50144
    2             1110                    11.4                         30117
    3              913                     9.9                         37347
    4             2191                     8.3                         41940

       Med_HH_Income_Percent_of_StateTotal_2011
    0                                     117.9
    1                                     121.0
    2                                      72.7
    3                                      90.2
    4                                     101.2
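Before handing the result to Vega, it can be worth confirming that the mapping really is 1:1. The sketch below is one possible check, not part of the original example; `geo_ids` and the small `df` are hypothetical stand-ins for the ids read from the geoJSON and the rows read from the CSV.

```python
import pandas as pd

# Hypothetical stand-ins: a few geoJSON ids and a few CSV rows
# (FIPS codes already cast to strings, as in the code above).
geo_ids = ['1001', '1003', '1005']
df = pd.DataFrame({
    'FIPS_Code': ['0', '1000', '1001', '1003', '1005'],
    'Area_name': ['United States', 'Alabama', 'Autauga County',
                  'Baldwin County', 'Barbour County'],
})
county_df = pd.DataFrame({'FIPS_Code': geo_ids})

# Same inner join as in the example: national and state rows drop out.
merged = pd.merge(df, county_df, on='FIPS_Code', how='inner')

# Shape ids with no data row, and data rows that appear more than once.
missing = set(geo_ids) - set(merged['FIPS_Code'])
duplicated = merged['FIPS_Code'][merged['FIPS_Code'].duplicated()].tolist()

# The mapping is 1:1 when nothing is missing and nothing is duplicated.
one_to_one = not missing and not duplicated
print(one_to_one)  # True
```

A non-empty `missing` set would point at shapes that end up with no data bound to them; a non-empty `duplicated` list would point at repeated rows in the CSV.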
Now we can quickly generate different choropleths:
    vis.tabular_data(merged, columns=['FIPS_Code', 'Civilian_labor_force_2011'])
    vis.to_json(path)