Principal Components s-K Class Estimator in the Linear Model with Correlated Errors

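2015-08-16 09:20 HE Daojiang

Journal of Jilin University (Science Edition), 2015, No. 3

Keywords: estimator; collinearity; mean square

ZHOU Ling, HE Daojiang

(School of Mathematics and Computer Science, Anhui Normal University, Wuhu 241003, Anhui, China)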

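To overcome simultaneously the autocorrelation of the errors and the multicollinearity among the regressors in the linear regression model, a new class of estimators, called the principal components s-K estimator, is proposed by combining the principal components regression estimator with the s-K estimator. Necessary and sufficient conditions are obtained, in the sense of the mean squared error matrix, for the new estimator to be superior to the generalized least squares estimator, the principal components estimator, the r-k estimator and the s-K estimator, respectively. A Monte Carlo simulation indicates that the new estimator is an effective way to deal with autocorrelation and multicollinearity at the same time.

autocorrelation; multicollinearity; principal components regression estimator; s-K estimator; mean squared error matrix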

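A common way to overcome multicollinearity in the linear model is to use biased estimators, such as the Stein estimator [1], the principal components regression (PCR) estimator [2], the ordinary ridge regression (ORR) estimator [3], the Liu estimator [4] and the s-K estimator [5]. In addition, combining two different estimators may retain the advantages of both. Baye et al. [6] combined the PCR estimator with the ORR estimator and proposed the r-k estimator; Chang et al. [7] combined the PCR estimator with the two-parameter estimator [8] and proposed the principal components two-parameter (PCTP) estimator. To overcome the effect of autocorrelation in the model, Aitken [9] introduced the generalized least squares (GLS) estimator by means of the OLS technique; Wu et al. [10] proposed a new class of s-K estimators based on information about the model parameters. Multicollinearity may nevertheless still be present in the model, in which case the GLS estimator has a very large variance and gives unreliable estimates. Many results are now available on handling autocorrelation and multicollinearity simultaneously [11-17]. To overcome autocorrelated errors and multicollinearity at the same time, this paper combines the PCR estimator with the s-K estimator to propose a new class of estimators, called the principal components s-K estimator, and then examines the superiority of the new estimator over these existing estimators.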

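1 Definition of the new estimator

Consider the following linear regression model:

$$Y=X\beta+\varepsilon,\quad E(\varepsilon)=0,\quad \operatorname{Cov}(\varepsilon)=\sigma^{2}V, \tag{1}$$

where $Y$ is an $n\times 1$ observable random vector, $X$ is an $n\times p$ matrix of full column rank, $\beta$ is a $p\times 1$ vector of unknown parameters, $\varepsilon$ is an $n\times 1$ error vector, and $V$ is a known $n\times n$ positive definite matrix. Hence there exists an $n\times n$ nonsingular matrix $P$ such that $P'P=V^{-1}$. Premultiplying (1) by $P$, model (1) can be written as

$$PY=PX\beta+P\varepsilon. \tag{2}$$

Let $Y_{*}=PY$, $X_{*}=PX$ and $\varepsilon_{*}=P\varepsilon$; then (2) can be expressed as

$$Y_{*}=X_{*}\beta+\varepsilon_{*}. \tag{3}$$

Model (3) is the transformed model [11].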
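The transformation to model (3) can be illustrated numerically. The sketch below is only one possible construction, not necessarily the authors': it takes P to be the inverse of the Cholesky factor of V, which satisfies P′P = V^{-1}, and computes the GLS estimate as ordinary least squares on the transformed data. The function names `whiten` and `gls`, the matrix V used in the demo, and all numerical values are assumptions made for illustration.

```python
import numpy as np

def whiten(X, Y, V):
    """Return (X*, Y*, P) with P'P = V^{-1}, using P = L^{-1} where V = L L'."""
    L = np.linalg.cholesky(V)            # lower-triangular factor, V = L L'
    P = np.linalg.inv(L)                 # then P'P = (L L')^{-1} = V^{-1}
    return P @ X, P @ Y, P

def gls(X, Y, V):
    """GLS estimate, computed as ordinary least squares on the transformed model."""
    Xs, Ys, _ = whiten(X, Y, V)
    beta, *_ = np.linalg.lstsq(Xs, Ys, rcond=None)
    return beta

# Illustrative data only (assumed, not taken from the paper).
rng = np.random.default_rng(0)
n, p, rho = 30, 3, 0.5
idx = np.arange(n)
V = rho ** np.abs(idx[:, None] - idx[None, :])   # an assumed positive definite V
X = rng.standard_normal((n, p))
Y = X @ np.ones(p) + np.linalg.cholesky(V) @ rng.standard_normal(n)

Xs, Ys, P = whiten(X, Y, V)
beta_gls = gls(X, Y, V)
```

With this choice of P, the transformed errors Pε have covariance σ²PVP′ = σ²I, which is why least squares on (X*, Y*) reproduces the GLS estimate of model (1).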

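$\Lambda_{r}=\operatorname{diag}(\lambda_{1},\lambda_{2},\dots,\lambda_{r})$, $\Lambda_{p-r}=\operatorname{diag}(\lambda_{r+1},\lambda_{r+2},\dots,\lambda_{p})$.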
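The partition into r leading and p-r trailing eigenvalues can be sketched as follows. This assumes, since the defining sentence is missing from the text, that the λ's are the eigenvalues of X*′X* = X′V^{-1}X in decreasing order with orthonormal eigenvectors collected in a matrix called T here; the names `eigen_partition`, `pcr`, `T_r` and the demo data are hypothetical. The PCR estimate is written in its usual textbook form T_r Λ_r^{-1} T_r′ X*′Y*, which may differ in notation from the paper's own expression.

```python
import numpy as np

def eigen_partition(Xs, r):
    """Split the eigen-decomposition of X*'X* into r leading and p-r trailing parts."""
    lam, T = np.linalg.eigh(Xs.T @ Xs)   # eigh returns eigenvalues in ascending order
    order = np.argsort(lam)[::-1]        # reorder so lambda_1 >= ... >= lambda_p
    lam, T = lam[order], T[:, order]
    return lam[:r], T[:, :r], lam[r:], T[:, r:]

def pcr(Xs, Ys, r):
    """Textbook PCR estimate T_r Lambda_r^{-1} T_r' X*'Y* (assumed form)."""
    lam_r, T_r, _, _ = eigen_partition(Xs, r)
    return T_r @ ((T_r.T @ (Xs.T @ Ys)) / lam_r)

# Illustrative transformed data (assumed, not taken from the paper).
rng = np.random.default_rng(1)
Xs = rng.standard_normal((40, 4))
Ys = Xs @ np.array([1.0, -0.5, 0.25, 2.0]) + 0.1 * rng.standard_normal(40)

lam_r, T_r, lam_rest, T_rest = eigen_partition(Xs, 2)
beta_pcr_full = pcr(Xs, Ys, 4)           # with r = p, PCR reduces to least squares
```

Choosing r = p keeps every component, so the PCR estimate then coincides with least squares on the transformed model, that is, with the GLS estimate.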

For the transformed model, it follows from [18] that the r-k estimator [6] can be written as

(4)

(5)

where k≥0 and 0

(6)

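Expressing X* and Y* in terms of X and Y, the s-K estimator of model (1) can be written as

(7)

where $s\ge 1$, $K=\operatorname{diag}(k_{1},k_{2},\dots,k_{p})$, and $k_{i}\ge 0$, $i=1,2,\dots,p$.

A new estimator of β, obtained by combining the PCR estimator with the s-K estimator, is now given; it has the following form: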

(8)

(9)

(10)

(11)

(12)

2 Superiority of the new estimator in the sense of the mean squared error matrix

(13)

(14)

(15)

(16)

(17)

(18)

(19)

(20)

(21)

Proof: From (14) and (15),

(22)

and C can be written as

(23)

Therefore,

(24)

(25)

which is equivalent to (21). This completes the proof.

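Setting r=p in (21) gives:

(26)

(27)

This is the result of [16].

(28)

This is the result of [11].

Here (U⋮v) is a unitary matrix (U may be absent), Δ is a positive definite diagonal matrix (present only when U exists), and λ is a positive number. Moreover, conditions 1)-3) are all independent of the choice of the generalized inverse $D^{-}\in G(D)$.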

(29)

(30)

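On the other hand,

Hence the sufficient condition becomes

Similarly, we obtain:

3 Numerical simulation

To examine further the mean squared error of the proposed class of estimators, a Monte Carlo simulation is carried out. The design matrix $X=(x_{ij})_{n\times p}$ is generated by: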

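$$x_{ij}=(1-\gamma^{2})^{1/2}\omega_{ij}+\gamma\,\omega_{i,p+1},\quad i=1,2,\dots,n;\ j=1,2,\dots,p, \tag{31}$$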

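where $\omega_{ij}$ ($i=1,2,\dots,n$; $j=1,2,\dots,p+1$) are independent standard normal pseudo-random numbers, γ is a given number, and $\gamma^{2}$ is the correlation coefficient between any two explanatory variables. The response variable is generated by: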

(32)

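Here $\varepsilon=(\varepsilon_{1},\varepsilon_{2},\dots,\varepsilon_{n})'$ is a normal random vector with mean 0 and covariance matrix $\sigma^{2}V$.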
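The design-matrix scheme described above can be sketched in a few lines. The code assumes the commonly used construction x_ij = (1-γ²)^{1/2} ω_ij + γ ω_{i,p+1}, which is consistent with the statement that γ² is the correlation between any two explanatory variables; the function name `make_design`, the seed, and the sample size are assumptions chosen for illustration.

```python
import numpy as np

def make_design(n, p, gamma, rng):
    """Generate X with x_ij = sqrt(1 - gamma^2) * w_ij + gamma * w_{i,p+1} (assumed form)."""
    w = rng.standard_normal((n, p + 1))  # independent standard normal draws
    # The shared last column w[:, p] induces the correlation between regressors.
    return np.sqrt(1.0 - gamma**2) * w[:, :p] + gamma * w[:, [p]]

rng = np.random.default_rng(2)
gamma = 0.9
X = make_design(200_000, 5, gamma, rng)  # large n only to make the check below sharp
corr = np.corrcoef(X, rowvar=False)      # empirical correlation between columns
```

Each column has unit variance and any two columns share only the common term γ ω_{i,p+1}, so their theoretical correlation is γ²; a value of γ close to 1 therefore produces strong multicollinearity.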

(33)

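The values ρ=0.5 and ρ=0.8 are used. As in [12,16], the true value of β is taken to be the normalized eigenvector corresponding to the largest eigenvalue of $X'V^{-1}X$. In addition, s=1.01 and s=1.001 are used. For convenience, $K=\operatorname{diag}(k_{1},k_{2},k_{3},k_{4},k_{5})$ is taken to be A1, A2, A3, B1, B2 and B3 in turn, where: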

A1=diag(0.1,0.1,0.1,0.1,0.1);A2=diag(0.1,0.1,1,1,1);A3=diag(0.1,1,1,1,1);

B1=diag(1.5,1.5,1.5,1.5,1.5);B2=diag(1.5,1.5,15,15,15);B3=diag(1.5,15,15,15,15).
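The choice of the true β described above (the normalized eigenvector of X′V^{-1}X for its largest eigenvalue) can be computed as in the following sketch. The matrix V used here, with entries ρ^{|i-j|}, is only an assumed illustrative choice because the paper's own definition of V is not reproduced in the text; the function name `true_beta` and the demo data are likewise hypothetical.

```python
import numpy as np

def true_beta(X, V):
    """Unit-norm eigenvector of X'V^{-1}X belonging to its largest eigenvalue."""
    M = X.T @ np.linalg.solve(V, X)      # X'V^{-1}X without forming V^{-1}
    M = (M + M.T) / 2                    # symmetrize against round-off before eigh
    lam, T = np.linalg.eigh(M)           # ascending eigenvalues, orthonormal columns
    return T[:, -1], lam[-1], M

# Illustrative inputs (assumed, not the paper's settings).
rng = np.random.default_rng(4)
n, p, rho = 50, 5, 0.8
idx = np.arange(n)
V = rho ** np.abs(idx[:, None] - idx[None, :])   # assumed form of V
X = rng.standard_normal((n, p))

beta, lam_max, M = true_beta(X, V)
```

The eigenvector is determined only up to sign, so β and -β are equally valid choices; the squared-error comparisons are unaffected by which one the routine returns.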

Table 1 Estimated MSE values with s=1.01, ρ=0.5, k=0.1

Table 2 Estimated MSE values with s=1.01, ρ=0.5, k=1.5

Table 3 Estimated MSE values with s=1.01, ρ=0.8, k=0.1

Table 4 Estimated MSE values with s=1.01, ρ=0.8, k=1.5

Table 5 Estimated MSE values with s=1.001, ρ=0.5, k=0.1

Table 6 Estimated MSE values with s=1.001, ρ=0.5, k=1.5

Table 7 Estimated MSE values with s=1.001, ρ=0.8, k=0.1

Table 8 Estimated MSE values with s=1.001, ρ=0.8, k=1.5
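An estimated MSE of the kind reported in the table captions can be approximated by averaging squared estimation errors over repeated draws of ε. The sketch below does this for the GLS estimator only, since the formulas of the other estimators are not reproduced in the text. The helper names, the number of replications, the assumed V with entries ρ^{|i-j|}, and all numerical settings are assumptions, not the paper's configuration.

```python
import numpy as np

def draw_errors(V, sigma, rng, size):
    """Draw `size` error vectors from N(0, sigma^2 V) via the Cholesky factor of V."""
    L = np.linalg.cholesky(V)
    z = rng.standard_normal((size, V.shape[0]))
    return sigma * z @ L.T               # each row has covariance sigma^2 L L' = sigma^2 V

def gls(X, Y, V):
    """GLS estimate (X'V^{-1}X)^{-1} X'V^{-1}Y."""
    Vi_X = np.linalg.solve(V, X)         # V^{-1}X; V is symmetric
    return np.linalg.solve(X.T @ Vi_X, Vi_X.T @ Y)

def estimated_mse(estimator, X, beta, V, sigma, rng, reps=2000):
    """Average squared distance between the estimate and the true beta over `reps` runs."""
    total = 0.0
    for e in draw_errors(V, sigma, rng, reps):
        total += np.sum((estimator(X, X @ beta + e, V) - beta) ** 2)
    return total / reps

# Illustrative settings (assumed, not the paper's).
rng = np.random.default_rng(3)
n, p, rho, sigma = 20, 3, 0.5, 1.0
idx = np.arange(n)
V = rho ** np.abs(idx[:, None] - idx[None, :])   # assumed form of V
X = rng.standard_normal((n, p))
beta = np.ones(p) / np.sqrt(p)

mse_gls = estimated_mse(gls, X, beta, V, sigma, rng)
mse_theory = sigma**2 * np.trace(np.linalg.inv(X.T @ np.linalg.solve(V, X)))
```

For the unbiased GLS estimator the simulated value should be close to σ² tr[(X′V^{-1}X)^{-1}], which gives a simple sanity check of the simulation loop before other estimators are plugged in through the `estimator` argument.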

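In summary, this paper proposes a new estimator that simultaneously overcomes autocorrelation and multicollinearity in the model. The new estimator is compared with the GLS, PCR, r-k and s-K estimators in the sense of the mean squared error matrix, and conditions under which it is superior to the other estimators are given. The numerical simulation indicates that the new estimator is an effective way to deal with autocorrelation and multicollinearity at the same time.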

[1] Stein C.Inadmissibility of the Usual Estimator for the Mean of Multivariate Normal Distribution [C]//Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability.[S.l.]:University of California Press,1956,1:197-206.

[2] Massy W F.Principal Components Regression in Exploratory Statistical Research [J].Journal of the American Statistical Association,1965,60(309):234-256.

[3] Hoerl A E,Kennard R W.Ridge Regression:Biased Estimation for Nonorthogonal Problems [J].Technometrics,1970,12(1):55-67.

[4] LIU Kejian.A New Class of Biased Estimate in Linear Regression [J].Communications in Statistics:Theory and Methods,1993,22(2):393-402.

[5] XU Ying,HE Daojiang.A New Class of Estimators for Coefficients in Mixed Effect Linear Model [J].Acta Mathematica Scientia,2013,33A(4):702-708. (in Chinese)

[6] Baye M R,Parker D F.Combining Ridge and Principal Component Regression:A Money Demand Illustration [J].Communications in Statistics:Theory and Methods,1984,13(2):197-205.

[7] CHANG Xinfeng,YANG Hu.Combining Two-Parameter and Principal Component Regression Estimators [J].Statistical Papers,2012,53(3):549-562.

[8] YANG Hu,CHANG Xinfeng.A New Two-Parameter Estimator in Linear Regression [J].Communications in Statistics:Theory and Methods,2010,39(6):923-934.

[9] Aitken A C.On Least Squares and Linear Combinations of Observations [J].Proceedings of the Royal Society of Edinburgh,1936,55:42-48.

[10] WU Yan,HE Daojiang.A New Class of s-K Estimators in the Linear Model [J].Journal of Jilin University:Science Edition,2014,52(1):45-50. (in Chinese)

[11] Trenkler G.On the Performance of Biased Estimators in the Linear Regression Model with Correlated or Heteroscedastic Errors [J].Journal of Econometrics,1984,25(1/2):179-190.

[12] Firinguetti L L.A Simulation Study of Ridge Regression Estimators with Autocorrelated Errors [J].Communications in Statistics:Simulation and Computation,1989,18(2):673-702.

[13] Bayhan G M,Bayhan M.Forecasting Using Autocorrelated Errors and Multicollinear Predictor Variables [J].Computers & Industrial Engineering,1998,34(2):413-421.

[14] Güler H,Kaciranlar S.A Comparison of Mixed and Ridge Estimators of Linear Models [J].Communications in Statistics:Simulation and Computation,2009,38(2):368-401.

[15] Özkale M R.A Stochastic Restricted Ridge Regression Estimator [J].Journal of Multivariate Analysis,2009,100(8):1706-1716.

[17] HUANG Jiewu,YANG Hu.On a Principal Component Two-Parameter Estimator in Linear Model with Autocorrelated Errors [J].Statistical Papers,2015,56(1):217-230.

[18] XU Jianwen,YANG Hu.On the Restricted r-k Class Estimator and the Restricted r-d Class Estimator in Linear Regression [J].Journal of Statistical Computation and Simulation,2011,81(6):679-691.

[19] Trenkler G,Trenkler D.A Note on Superiority Comparisons of Homogeneous Linear Estimators [J].Communications in Statistics:Theory and Methods,1983,12(7):799-808.

[20] Baksalary J K,Trenkler G.Nonnegative and Positive Definiteness of Matrices Modified by Two Matrices of Rank One [J].Linear Algebra and Its Applications,1991,151:169-184.

[21] Judge G G,Griffiths W E,Hill R C,et al.The Theory and Practice of Econometrics [M].2nd ed.New York:John Wiley and Sons,1985.

(Responsible editor: ZHAO Liqin)

Principal Components s-K Class Estimator in the Linear Model with Correlated Errors

ZHOU Ling, HE Daojiang

(School of Mathematics and Computer Science, Anhui Normal University, Wuhu 241003, Anhui Province, China)

To combat autocorrelation in errors and multicollinearity among the regressors in the linear regression model, we proposed a new estimator by combining the principal components regression (PCR) estimator and the s-K estimator. Then necessary and sufficient conditions for the superiority of the new estimator over the GLS, the PCR, the r-k and the s-K estimators were derived by the mean squared error matrix criterion. Finally, a Monte Carlo simulation study was carried out to investigate the performance of the proposed estimator.

autocorrelation; multicollinearity; principal components regression estimator; s-K estimator; mean squared error matrix

10.13413/j.cnki.jdxblxb.2015.03.17

2014-07-16.

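ZHOU Ling (1989-), female, Han nationality, master's student, working on mathematical statistics, E-mail: lingzhou1989@163.com. Corresponding author: HE Daojiang (1980-), male, Han nationality, Ph.D., professor, working on mathematical statistics, E-mail: djheahnu@163.com.

Natural Science Foundation of Anhui Province (Grant No. 1308085QA13).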

CLC number: O212.2

Document code: A

Article ID: 1671-5489(2015)03-0444-07