Abstract:
The rapid growth of online network platforms generates large-scale network data, which poses great challenges for statistical analysis using the spatial autoregression (SAR) model. In this work, we develop a novel distributed estimation and statistical inference framework for the SAR model on a distributed system. We first propose a distributed network least squares approximation (DNLSA) method. This enables us to obtain a one-step estimator by taking a weighted average of local estimators on each worker. Afterwards, a refined two-step estimation is designed to further reduce the estimation bias. For statistical inference, we utilize a random projection method to reduce the expensive communication cost. The theoretical findings and computational advantages are validated by several numerical simulations implemented on the Spark system. Lastly, an experiment on the Yelp dataset further illustrates the usefulness of the proposed methodology.
Distributed estimation and inference for spatial autoregression model with large scale networks,
Journal of Econometrics, Volume 238, Issue 2.
http://doi.org/10.1016/j.jeconom.2023.105629.
Code available at http://github.com/feng-li/local-information-advantage/
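The weighted averaging of local estimators mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the repository above and does not implement the SAR model: as a stand-in, it uses ordinary linear regression on each worker, and it assumes the weights are the local Gram matrices `X_k.T @ X_k`. The function name `combine_local_estimators` and the toy data are hypothetical.

```python
import numpy as np


def combine_local_estimators(thetas, weights):
    """Matrix-weighted average: (sum_k W_k)^{-1} sum_k W_k theta_k."""
    total_weight = sum(weights)
    weighted_sum = sum(W @ theta for W, theta in zip(weights, thetas))
    # Solve the linear system instead of forming an explicit inverse.
    return np.linalg.solve(total_weight, weighted_sum)


# Toy data split across 4 hypothetical workers (plain linear model, not SAR).
rng = np.random.default_rng(0)
true_theta = np.array([0.5, -1.0])
Xs, ys, thetas, weights = [], [], [], []
for _ in range(4):
    X = rng.normal(size=(200, 2))
    y = X @ true_theta + rng.normal(scale=0.1, size=200)
    # Local estimator computed on this worker only.
    theta_k, *_ = np.linalg.lstsq(X, y, rcond=None)
    Xs.append(X)
    ys.append(y)
    thetas.append(theta_k)
    # Assumed weight: the local Gram matrix.
    weights.append(X.T @ X)

# Only the local estimators and weight matrices are sent to the master.
theta_hat = combine_local_estimators(thetas, weights)
print(theta_hat)
```

In this linear stand-in, the combined estimator coincides with the least squares fit on the pooled data, so only low-dimensional summaries need to be communicated. For the SAR model, the local estimators and weights would be defined differently, as described in the paper.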
Keywords:
Spatial autoregression; Large-scale network data; Distributed system; Least squares approximation; Random projection
About the author:
Hanson Wang is a Professor of Business Statistics in the Department of Business Statistics and Econometrics at the Guanghua School of Management, Peking University. He is a recipient of the Outstanding Young Scholar Grant from NSFC, a professor of the Changjiang Scholars Program by the Ministry of Education, and the founding president of the Chinese Statistical Association of Young Scholars. He is also a Fellow of the Institute of Mathematical Statistics (IMS), a Fellow of the American Statistical Association (ASA), and an Elected Member of the International Statistical Institute (ISI). Throughout his career, he has served as associate editor or editor for 9 international academic journals. He has published over 100 articles in various professional journals both domestically and internationally, co-authored one English monograph, and co-authored four Chinese textbooks. He has been recognized as a highly cited scholar by Elsevier in the fields of mathematics (2014-2019), applied economics (2020), and statistics (2021-2022).