I ran a principal component analysis (PCA) in Stata.
My dataset contains eight financial indicators for nine countries.
For example:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str7 Country double(Investment Profit Income Tax Repayment Leverage Interest Liquidity) int Year
"France" -.1916055239385184 .046331346724579184 .16438012750896466 .073106839282063 30.373216652548326 4.116650784492168 3.222219873614461 .01453109309122077 2010
"UK" -.09287803170279468 .10772082765154019 .19475363707485557 .05803923583546618 31.746409646181174 9.669982727208433 1.2958094802269167 .014273374324088752 2010
"US" -.06262935107629553 .08674901201182428 .1241593221865416 .13387194413811226 25.336612638526013 11.14330064161111 1.954785887176916 .008355601163285917 2010
"Italy" -.038025847122363045 .1523162032749684 .23885658237030563 .2057478638900476 31.02007902336988 2.9660938817562292 6.12544787693943 .011694993164234125 2010
"Germany" -.05454795914578491 .06287079763890834 .09347194572148769 .08730237262847926 35.614342337621174 12.03770488195981 1.1958205191308358 .012467084153714813 2010
"Spain " -.09133982259799572 .1520056836126315 .20905656056324853 .21054797530580743 30.133833346916546 2.0623245902645073 5.122615899157435 .013545432336873187 2010
"Sweden" -.05403262462960799 .20463787181576967 .22924827352771968 .05655833155565016 20.30540887860061 10.392313613725324 .8634381995636089 .008030624504967313 2010
"Norway " -.07560184571862992 .08383822093909514 .15469418498932822 .06569716455818478 29.568228705840234 14.383460621594622 1.5561013535825234 .012843159364225464 2010
"Algeria" -.0494187835163535 .056252436429004446 .09174672864585759 .08143181185307143 34.74103858167055 15.045254276254616 1.2074942921860699 .011578038401820303 2010
"France" -.03831442432584342 .14722819896988698 .22035417794604084 .12183886462162773 28.44763045286005 12.727100288710087 1.405629911115614 .011186908059399987 2011
"UK" -.05002189329928202 .16833493262244398 .2288402623558823 .04977050186975224 27.640103129372747 11.17376089844228 1.1764542835994092 .008386726178729322 2011
"US" -.0871005985124144 .10270482619857023 .1523559355903486 .06775742210623094 26.840586700880362 10.783899184031576 1.454011947763254 .013501919089967212 2011
"Italy" -.1069324103590126 -.5877872620957578 -.47469302172710803 .2004436360021364 23.133243742952658 5.3936761686065875 4.532771849692548 .012586313916956204 2011
"Germany" -.05851794344524515 .09960345907923154 .136805115392161 .1373407846168154 32.6182637042919 14.109738344526052 1.5077699357228835 .013200993625042274 2011
"Spain " -.10650743527105216 -.015785638597076792 .1808727613216441 .05038848927405154 28.22206251292902 10.839614113486853 1.5021425852392374 .012076771099482617 2011
"Sweden" -.09678946710644694 .11801761803893955 .18569993056826523 .1481844716617448 27.439283362903794 5.771154420635893 5.493437819181101 .013820243145673811 2011
"Norway " -.04263379351591438 .09931719473864983 .14469611775596314 .0796835513869996 26.68561168581991 14.06385602832082 1.5200488174887825 .01029136242440406 2011
"Algeria" -.04871983526465598 .2139061303228528 .2728647845448156 .056537570099712456 22.50263575072073 16.919641035094685 .7539881754626142 .009734650338902404 2011
end
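After rotation, I named my first component "Indebtedness" and my second component "Profitability".
I have the same data for 2011, 2012, 2013, 2014, and so on. I want to take the weights matrix that Stata computed for 2010 and apply it separately to 2011, 2012, and 2013. My goal is to compare indebtedness and profitability across countries.
To do this, I use the estimates save and estimates use commands (Chapter 20 of the Stata manual, on estimation, and the help for the PCA postestimation commands).
However, I cannot tell what Stata is actually saving. Is it the scores computed for 2010, or the eigenvalues and eigenvectors?
Here is the code I used: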
tempfile pca
save `pca'
use `pca' if Year==2010
global xlist Investment Profit Income Tax Repayment Leverage Interest Liquidity
pca $xlist, components(2)
estimates save pcaest, replace
predict score
summarize score
use `pca' if Year==2011, clear
estimates use pcaest
predict score
summarize score
Z = b[1,1]*Investment + ...
Using your toy example for 2010:
clear
input str7 Country double(Investment Profit Income Tax Repayment Leverage Interest Liquidity) int Year
"France" -.1916055239385184 .046331346724579184 .16438012750896466 .073106839282063 30.373216652548326 4.116650784492168 3.222219873614461 .01453109309122077 2010
"UK" -.09287803170279468 .10772082765154019 .19475363707485557 .05803923583546618 31.746409646181174 9.669982727208433 1.2958094802269167 .014273374324088752 2010
"US" -.06262935107629553 .08674901201182428 .1241593221865416 .13387194413811226 25.336612638526013 11.14330064161111 1.954785887176916 .008355601163285917 2010
"Italy" -.038025847122363045 .1523162032749684 .23885658237030563 .2057478638900476 31.02007902336988 2.9660938817562292 6.12544787693943 .011694993164234125 2010
"Germany" -.05454795914578491 .06287079763890834 .09347194572148769 .08730237262847926 35.614342337621174 12.03770488195981 1.1958205191308358 .012467084153714813 2010
"Spain " -.09133982259799572 .1520056836126315 .20905656056324853 .21054797530580743 30.133833346916546 2.0623245902645073 5.122615899157435 .013545432336873187 2010
"Sweden" -.05403262462960799 .20463787181576967 .22924827352771968 .05655833155565016 20.30540887860061 10.392313613725324 .8634381995636089 .008030624504967313 2010
"Norway " -.07560184571862992 .08383822093909514 .15469418498932822 .06569716455818478 29.568228705840234 14.383460621594622 1.5561013535825234 .012843159364225464 2010
"Algeria" -.0494187835163535 .056252436429004446 .09174672864585759 .08143181185307143 34.74103858167055 15.045254276254616 1.2074942921860699 .011578038401820303 2010
end
I get the following results:
local xlist Investment Profit Income Tax Repayment Leverage Interest Liquidity
pca `xlist', components(2)
Principal components/correlation                 Number of obs    =          9
                                                 Number of comp.  =          2
                                                 Trace            =          8
    Rotation: (unrotated = principal)            Rho              =     0.7468

    --------------------------------------------------------------------------
       Component |   Eigenvalue   Difference         Proportion   Cumulative
    -------------+------------------------------------------------------------
           Comp1 |      3.43566      .896796             0.4295       0.4295
           Comp2 |      2.53887      1.23215             0.3174       0.7468
           Comp3 |      1.30672      .750756             0.1633       0.9102
           Comp4 |      .555959      .472866             0.0695       0.9797
           Comp5 |     .0830926     .0181769             0.0104       0.9900
           Comp6 |     .0649157     .0526462             0.0081       0.9982
           Comp7 |     .0122695    .00975098             0.0015       0.9997
           Comp8 |    .00251849            .             0.0003       1.0000
    --------------------------------------------------------------------------

Principal components (eigenvectors)

    ------------------------------------------------
        Variable |    Comp1     Comp2 | Unexplained
    -------------+--------------------+-------------
      Investment |   0.0004   -0.3837 |       .6262
          Profit |   0.3896   -0.3794 |       .1131
          Income |   0.4621   -0.1162 |        .232
             Tax |   0.4146    0.1236 |       .3706
       Repayment |  -0.1829    0.4747 |       .3131
        Leverage |  -0.4685   -0.2596 |      .07464
        Interest |   0.4580    0.2625 |       .1045
       Liquidity |  -0.0082    0.5643 |       .1913
    ------------------------------------------------
To see the kinds of items that the pca command returns:
ereturn list
scalars:
                  e(N) =  9
                  e(f) =  2
                e(rho) =  .7468162625387222
              e(trace) =  8
              e(lndet) =  -13.76082122673546
               e(cond) =  36.93476257313668

macros:
            e(cmdline) : "pca Investment Profit Income Tax Repayment Leverage Interest Liquidity, components(2)"
                e(cmd) : "pca"
              e(title) : "Principal components"
       e(marginsnotok) : "_ALL"
          e(estat_cmd) : "pca_estat"
         e(rotate_cmd) : "pca_rotate"
            e(predict) : "pca_p"
              e(Ctype) : "correlation"
         e(properties) : "nob noV eigen"

matrices:
                e(sds) :  1 x 8
              e(means) :  1 x 8
                  e(C) :  8 x 8
                e(Psi) :  1 x 8
                 e(Ev) :  1 x 8
                  e(L) :  8 x 2

functions:
             e(sample)
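On the question of what estimates save stores: as far as I can tell, it writes the active e() results to disk, that is, the items listed above (eigenvalues in e(Ev), eigenvectors in e(L), plus e(means) and e(sds)). The scores are not part of e(); they only exist once predict creates them as variables. A minimal sketch for checking this yourself, assuming the file pcaest.ster from the question is in the working directory:

```stata
* Sketch: inspect what was written by -estimates save pcaest-
estimates describe using pcaest   // summary of the stored results
estimates use pcaest              // make them the active e() results again
ereturn list                      // same kinds of items as listed above
matrix list e(Ev)                 // eigenvalues
matrix list e(L)                  // eigenvectors (loadings)
```

If the listing shows only scalars, macros, and matrices like the ones above, then predict after estimates use is recomputing scores from the stored matrices rather than reloading saved scores.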
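One way to carry the returned matrix of eigenvectors over as variables for the following year is to make a copy of the matrix and then load the 2011 data: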
matrix A = e(L)
clear
input str7 Country double(Investment Profit Income Tax Repayment Leverage Interest Liquidity) int Year
"France" -.03831442432584342 .14722819896988698 .22035417794604084 .12183886462162773 28.44763045286005 12.727100288710087 1.405629911115614 .011186908059399987 2011
"UK" -.05002189329928202 .16833493262244398 .2288402623558823 .04977050186975224 27.640103129372747 11.17376089844228 1.1764542835994092 .008386726178729322 2011
"US" -.0871005985124144 .10270482619857023 .1523559355903486 .06775742210623094 26.840586700880362 10.783899184031576 1.454011947763254 .013501919089967212 2011
"Italy" -.1069324103590126 -.5877872620957578 -.47469302172710803 .2004436360021364 23.133243742952658 5.3936761686065875 4.532771849692548 .012586313916956204 2011
"Germany" -.05851794344524515 .09960345907923154 .136805115392161 .1373407846168154 32.6182637042919 14.109738344526052 1.5077699357228835 .013200993625042274 2011
"Spain " -.10650743527105216 -.015785638597076792 .1808727613216441 .05038848927405154 28.22206251292902 10.839614113486853 1.5021425852392374 .012076771099482617 2011
"Sweden" -.09678946710644694 .11801761803893955 .18569993056826523 .1481844716617448 27.439283362903794 5.771154420635893 5.493437819181101 .013820243145673811 2011
"Norway " -.04263379351591438 .09931719473864983 .14469611775596314 .0796835513869996 26.68561168581991 14.06385602832082 1.5200488174887825 .01029136242440406 2011
"Algeria" -.04871983526465598 .2139061303228528 .2728647845448156 .056537570099712456 22.50263575072073 16.919641035094685 .7539881754626142 .009734650338902404 2011
end
Then you can simply use the svmat command:
svmat A
list A* if _n < 9
     +-----------------------+
     |        A1          A2 |
     |-----------------------|
  1. |  .0003921    -.383703 |
  2. |  .3895898   -.3793983 |
  3. |  .4621098   -.1162487 |
  4. |  .4146066    .1235683 |
  5. | -.1828703    .4746658 |
     |-----------------------|
  6. | -.4685374   -.2596268 |
  7. |   .457974    .2624738 |
  8. | -.0081538    .5643047 |
     +-----------------------+
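Note that the loadings end up in the first eight rows of A1 and A2, one row per indicator, so they do not line up with the countries row by row. To turn them into 2011 scores, each indicator has to be multiplied by its own loading. Here is a rough sketch of that calculation. It assumes the e() results of the 2010 pca are still active (clear alone should not drop them) or were restored with estimates use, and that the 2011 data are in memory. Because this PCA is based on the correlation matrix, the sketch also standardizes each indicator with the 2010 means and standard deviations; I believe that is what predict does, but please verify. The names L2010, M2010, S2010, pc1_2011 and pc2_2011 are made up for illustration.

```stata
* Sketch only: 2011 scores built from the 2010 loadings, means and SDs
matrix L2010 = e(L)        // 8 x 2 loadings (eigenvectors)
matrix M2010 = e(means)    // 1 x 8 variable means
matrix S2010 = e(sds)      // 1 x 8 variable standard deviations

local xlist Investment Profit Income Tax Repayment Leverage Interest Liquidity
generate double pc1_2011 = 0
generate double pc2_2011 = 0
local k = 0
foreach v of local xlist {
    local ++k
    * standardize variable k with the 2010 moments, then apply its 2010 loading
    replace pc1_2011 = pc1_2011 + L2010[`k',1] * (`v' - M2010[1,`k']) / S2010[1,`k']
    replace pc2_2011 = pc2_2011 + L2010[`k',2] * (`v' - M2010[1,`k']) / S2010[1,`k']
}
list Country pc1_2011 pc2_2011
```

Comparing pc1_2011 and pc2_2011 with what predict returns for the same observations would be a reasonable check on whether this matches Stata's own calculation.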
EDIT:
Revised according to the comments:
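use X1, clear
local xlist Investment Profit Income Tax Repayment Leverage Interest Liquidity
forvalues i = 1 / 5 {
    * PCA on one year at a time (Year is capitalized in the example data)
    pca `xlist' if Year == 201`i', components(2)
    * e(L) is 8 x 2: row k holds the loadings of the k-th variable in xlist
    matrix A201`i' = e(L)
    * Matrix elements can be used directly in expressions, so svmat is not
    * needed here. Each variable is weighted by its own loading (row k).
    * Note: these are weighted sums of the raw, unstandardized variables.
    generate B201`i'1 = (A201`i'[1,1] * Investment) + (A201`i'[2,1] * Profit)    + ///
                        (A201`i'[3,1] * Income)     + (A201`i'[4,1] * Tax)       + ///
                        (A201`i'[5,1] * Repayment)  + (A201`i'[6,1] * Leverage)  + ///
                        (A201`i'[7,1] * Interest)   + (A201`i'[8,1] * Liquidity)
    generate B201`i'2 = (A201`i'[1,2] * Investment) + (A201`i'[2,2] * Profit)    + ///
                        (A201`i'[3,2] * Income)     + (A201`i'[4,2] * Tax)       + ///
                        (A201`i'[5,2] * Repayment)  + (A201`i'[6,2] * Leverage)  + ///
                        (A201`i'[7,2] * Interest)   + (A201`i'[8,2] * Liquidity)
}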
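The loop above re-estimates the PCA for every year. If the aim is instead to hold the 2010 weights fixed, as in the question, something along these lines may be closer to it. This is only a sketch: it assumes pcaest.ster was saved right after the 2010 pca, that X1 contains all years with a variable named Year, and pc1 and pc2 are hypothetical names.

```stata
* Sketch: apply the stored 2010 PCA results to every year in one pass
use X1, clear
estimates use pcaest                 // restore the 2010 e() results
predict pc1 pc2                      // scores for all observations in memory
tabstat pc1 pc2, by(Year) statistics(mean sd min max)
```

Because the same stored results are used for every observation, the scores should be comparable across years, which is not guaranteed when the components are re-estimated year by year.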