How can we find the top n rows that contribute 80% of the Total?
Item Number Item Amount State
1 Agriculture, forestry and fishing 308507 Oregon
--
10 Gross State Domestic Product
More data is in a file on gdrive:
https://drive.google.com/open?id=10l84MVcIDIwyWyKa_ftEYrNDwB0C3HWS
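I have the following code to calculate the top n contributing rows:
# cols holds the value columns of result_pivot (defined earlier, not shown here)
for col in cols:
    # Express each column as a percentage of Gross State Domestic Product
    PercentageCol = col + ' %'
    result_pivot[PercentageCol] = round((result_pivot[col] / result_pivot['Gross State Domestic Product']) * 100, 2)

# Keep only the percentage columns and transpose them into rows
cols = result_pivot.columns[result_pivot.columns.str.contains('%')]
result_pivot = result_pivot[cols].T

# Average each row, then keep the five rows with the largest average
result_pivot['Avg'] = round(result_pivot.mean(axis=1), 2)
result_pivot = result_pivot.sort_values(['Avg'], ascending=False)
result_pivot = result_pivot.nlargest(5, columns='Avg')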
Is there another way to do this?
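If I understand correctly, there are two scenarios: with and without sorting.
We can compute each row's share as Amount / sum of Amount. We then take the cumulative sum of these shares and use boolean indexing to keep all rows whose cumulative share is below 0.8, i.e. 80%: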
df_80 = df[(df['Amount'] / df['Amount'].sum()).cumsum() < 0.8]
print(df_80)
Item Number Item \
0 1.0 Agriculture, forestry and fishing
1 1.1 Crops
2 1.2 Livestock
3 1.3 Forestry and logging
4 1.4 Fishing and aquaculture
5 2.0 Mining and quarrying
6 3.0 Manufacturing
7 4.0 Electricity, gas, water supply & other utility...
8 5.0 Construction
9 6.0 Trade, repair, hotels and restaurants
10 6.1 Trade & repair services
11 6.2 Hotels & restaurants
12 7.0 Transport, storage, communication & services r...
13 7.1 Railways
14 7.2 Road transport
15 7.3 Water transport
16 7.4 Air transport
17 7.5 Services incidental to transport
18 7.6 Storage
19 7.7 Communication & services related to broadcasting
20 8.0 Financial services
21 9.0 Real estate, ownership of dwelling & professio...
22 10.0 Public administration
23 11.0 Other services
24 12.0 TOTAL GSVA at basic prices
25 13.0 Taxes on Products
26 14.0 Subsidies on products
27 15.0 Gross State Domestic Product
28 16.0 Population ('00)
29 17.0 Per Capita GSDP (Rs.)
.. ... ...
54 12.0 TOTAL GSVA at basic prices
55 13.0 Taxes on Products
56 14.0 Subsidies on products
57 15.0 Gross State Domestic Product
58 16.0 Population ('00)
59 17.0 Per Capita GSDP (Rs.)
60 1.0 Agriculture, forestry and fishing
61 1.1 Crops
62 1.2 Livestock
63 1.3 Forestry and logging
64 1.4 Fishing and aquaculture
65 2.0 Mining and quarrying
66 3.0 Manufacturing
67 4.0 Electricity, gas, water supply & other utility...
68 5.0 Construction
69 6.0 Trade, repair, hotels and restaurants
70 6.1 Trade & repair services*
71 6.2 Hotels & restaurants
72 7.0 Transport, storage, communication & services r...
73 7.1 Railways
74 7.2 Road transport**
75 7.3 Water transport
76 7.4 Air transport
77 7.5 Services incidental to transport
78 7.6 Storage
79 7.7 Communication & services related to broadcasting
80 8.0 Financial services
81 9.0 Real estate, ownership of dwelling & professio...
82 10.0 Public administration
83 11.0 Other services
Amount State
0 308507.0 Oregon
1 140421.0 Oregon
2 30141.0 Oregon
3 15744.0 Oregon
4 122201.0 Oregon
5 3622.0 Oregon
6 1177608.0 Oregon
7 204110.0 Oregon
8 165819.0 Oregon
9 380927.0 Oregon
10 343492.0 Oregon
11 37434.0 Oregon
12 189656.0 Oregon
13 15649.0 Oregon
14 46171.0 Oregon
15 17820.0 Oregon
16 46359.0 Oregon
17 19272.0 Oregon
18 357.0 Oregon
19 44028.0 Oregon
20 233618.0 Oregon
21 407099.0 Oregon
22 346486.0 Oregon
23 180431.0 Oregon
24 3597882.0 Oregon
25 527279.0 Oregon
26 61854.0 Oregon
27 4063307.0 Oregon
28 14950.0 Oregon
29 271793.0 Oregon
.. ... ...
54 39828404.0 Washington
55 4985670.0 Washington
56 1067867.0 Washington
57 43746207.0 Washington
58 266620.0 Washington
59 164077.0 Washington
60 5930617.0 Idaho
61 3070386.0 Idaho
62 1656104.0 Idaho
63 499808.0 Idaho
64 704319.0 Idaho
65 558824.0 Idaho
66 4273567.0 Idaho
67 482470.0 Idaho
68 7314003.0 Idaho
69 8557345.0 Idaho
70 7763847.0 Idaho
71 793498.0 Idaho
72 4020934.0 Idaho
73 147897.0 Idaho
74 2761427.0 Idaho
75 26956.0 Idaho
76 125029.0 Idaho
77 71567.0 Idaho
78 3290.0 Idaho
79 884767.0 Idaho
80 2010306.0 Idaho
81 7287633.0 Idaho
82 2068915.0 Idaho
83 5728645.0 Idaho
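To make the mechanics easier to check by hand, here is a minimal sketch of the same unsorted filter on a tiny frame. The frame `toy` and its values are made up for illustration and are not taken from the linked file.

```python
import pandas as pd

# Hypothetical toy data, chosen so the shares are easy to verify by hand.
toy = pd.DataFrame({
    "Item": ["A", "B", "C", "D"],
    "Amount": [40, 30, 20, 10],
})

# Each row's share of the overall total: 0.4, 0.3, 0.2, 0.1
share = toy["Amount"] / toy["Amount"].sum()

# Running total of the shares in the existing row order: 0.4, 0.7, 0.9, 1.0
running = share.cumsum()

# Boolean indexing keeps the rows whose running share is still below 80%.
toy_80 = toy[running < 0.8]
print(toy_80)
```

Only A and B are kept here, which together account for 70%. The strict `< 0.8` comparison drops the row that actually crosses the 80% mark, so the selected rows can add up to less than 80%.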
With sorting, the shares are ordered from largest to smallest before the cumulative sum is taken:
df_80 = df[(df['Amount'] / df['Amount'].sum()).sort_values(ascending=False).cumsum() < 0.8]
print(df_80)
Item Number Item \
30 1.0 Agriculture, forestry and fishing
36 3.0 Manufacturing
39 6.0 Trade, repair, hotels and restaurants
51 9.0 Real estate, ownership of dwelling & professio...
54 12.0 TOTAL GSVA at basic prices
55 13.0 Taxes on Products
57 15.0 Gross State Domestic Product
60 1.0 Agriculture, forestry and fishing
68 5.0 Construction
69 6.0 Trade, repair, hotels and restaurants
70 6.1 Trade & repair services*
81 9.0 Real estate, ownership of dwelling & professio...
83 11.0 Other services
84 12.0 TOTAL GSVA at basic prices
85 13.0 Taxes on Products
87 15.0 Gross State Domestic Product
Amount State
30 8015238.0 Washington
36 7756921.0 Washington
39 4986319.0 Washington
51 6970183.0 Washington
54 39828404.0 Washington
55 4985670.0 Washington
57 43746207.0 Washington
60 5930617.0 Idaho
68 7314003.0 Idaho
69 8557345.0 Idaho
70 7763847.0 Idaho
81 7287633.0 Idaho
83 5728645.0 Idaho
84 48233259.0 Idaho
85 5189352.0 Idaho
87 52600230.0 Idaho
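Note that both results above take shares of the sum over all states and include aggregate rows such as TOTAL GSVA and Gross State Domestic Product. If the goal is instead the top contributors within each state, one possible variant is sketched below. It is only a sketch under assumptions: the helper `top_contributors`, the frame `toy`, and its values are hypothetical, and the frame is assumed to contain only individual items, with totals and sub-totals removed beforehand.

```python
import pandas as pd

# Hypothetical toy data; assumed to hold only individual items, no totals rows.
toy = pd.DataFrame({
    "State":  ["X", "X", "X", "X", "Y", "Y", "Y"],
    "Item":   ["a", "b", "c", "d", "a", "b", "c"],
    "Amount": [10, 60, 25, 5, 45, 5, 50],
})

def top_contributors(df, threshold=0.8):
    """Per State, return the largest rows needed to reach `threshold` of that State's total."""
    # Largest amounts first within each state.
    ordered = df.sort_values(["State", "Amount"], ascending=[True, False])
    # Share of each row in its own state's total.
    share = ordered["Amount"] / ordered.groupby("State")["Amount"].transform("sum")
    # Running share within each state, in sorted order.
    running = share.groupby(ordered["State"]).cumsum()
    # Share accumulated before each row; filtering on this value also keeps
    # the row that crosses the threshold.
    before = running - share
    return ordered[before < threshold].assign(CumShare=running.round(4))

result = top_contributors(toy)
print(result)
```

Filtering on the share accumulated before each row, rather than on the running share itself, means the selected rows always reach at least the threshold. Whether that or the strict `< 0.8` filter is the right choice depends on how "contributes 80%" is meant to be read.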