Determination of Sample Size for Mean Test of Dependent Functional Data
Abstract: Advances in science and technology have made it possible to collect and store functional data. High-frequency stock data from financial markets, meteorological temperature records, and airborne PM2.5 measurements are all naturally functional data. Such observations are typically dependent on one another and therefore violate the independent and identically distributed (i.i.d.) assumption; they are referred to as dependent functional data. Under dependence, the sample covariance function is no longer a consistent estimator of the population covariance function, so the functional principal components are computed inaccurately, which in turn affects subsequent statistical inference. This paper uses the long-run covariance function to obtain more accurate functional principal components, proves that the test statistic converges to a chi-square distribution, and proposes a measure of effect size from which the minimum sample size is calculated. Finally, the effectiveness of the method is demonstrated through simulation and through an application to the Air Quality Index (AQI) and concentration data for six major air pollutants: PM2.5, PM10, SO2, NO2, O3, and CO.
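The link between a chi-square limit, an effect size, and a minimum sample size can be illustrated with a small power calculation. The sketch below is not the paper's procedure. It assumes, hypothetically, that under the alternative the statistic is approximately noncentral chi-square with `d` degrees of freedom (the number of retained principal components) and noncentrality `n * effect`, where `effect` is a Cohen-style squared effect size. All function names are illustrative, and only the Python standard library is used.

```python
import math

def chi2_cdf(x, k):
    """CDF of a central chi-square with k degrees of freedom,
    via the series for the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, z = k / 2.0, x / 2.0
    term, total, n = 1.0, 1.0, 1
    while term > 1e-16 * total and n < 100000:
        term *= z / (a + n)
        total += term
        n += 1
    return min(1.0, math.exp(a * math.log(z) - z - math.lgamma(a + 1.0)) * total)

def chi2_quantile(p, k):
    """Quantile of the central chi-square, found by bisection on the CDF."""
    lo, hi = 0.0, k + 10.0 * math.sqrt(2.0 * k) + 20.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, k) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ncx2_sf(c, k, lam):
    """P(X > c) for a noncentral chi-square (k d.o.f., noncentrality lam),
    written as a Poisson(lam / 2) mixture of central chi-squares."""
    if lam <= 0:
        return 1.0 - chi2_cdf(c, k)
    h = lam / 2.0
    j_max = int(h + 10.0 * math.sqrt(h) + 30.0)
    total = 0.0
    for j in range(j_max + 1):
        weight = math.exp(-h + j * math.log(h) - math.lgamma(j + 1.0))
        total += weight * (1.0 - chi2_cdf(c, k + 2 * j))
    return total

def min_sample_size(effect, d, alpha=0.05, power=0.80, n_max=10**6):
    """Smallest n with power >= target, ASSUMING noncentrality = n * effect
    (an illustrative assumption, not a formula taken from the paper)."""
    if effect <= 0:
        raise ValueError("effect must be positive")
    crit = chi2_quantile(1.0 - alpha, d)
    reached = lambda n: ncx2_sf(crit, d, n * effect) >= power
    hi = 1
    while not reached(hi):      # double until the target power is reached
        hi *= 2
        if hi > n_max:
            raise ValueError("target power not reached within n_max")
    lo = hi // 2                # power at lo is below target (or lo == 0)
    while hi - lo > 1:          # bisect to the smallest adequate n
        mid = (lo + hi) // 2
        if reached(mid):
            hi = mid
        else:
            lo = mid
    return hi

if __name__ == "__main__":
    print(min_sample_size(effect=0.1, d=1))
```

Power is increasing in the noncentrality, so the doubling search followed by bisection returns the smallest adequate `n`. With dependent data the effect size would have to be standardized by the long-run covariance rather than the ordinary covariance, and this sketch simply takes `effect` as given.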
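Citation: Zhang, K. (张可) Determination of Sample Size for Mean Test of Dependent Functional Data [J]. Statistics and Application (统计学与应用), 2024, 13(5): 1796-1806. https://doi.org/10.12677/sa.2024.135176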
