Job Market Paper

1. Statistical Inference on Partially Linear Panel Model under Unobserved Linearity

Abstract: A new statistical procedure, based on a modified spline basis, is proposed to identify the linear components in the panel data model with fixed effects. Under mild assumptions, the proposed procedure is shown to consistently estimate the underlying regression function, correctly select the linear components, and support valid statistical inference. Compared with existing methods for detecting linearity in the panel model, our approach is both theoretically justified and practically convenient. We provide a computational algorithm that implements the proposed procedure, along with a path-based solution method for linearity detection that avoids the burden of selecting the tuning parameter for the penalty term. Monte Carlo simulations examine the finite-sample performance of the proposed procedure, and the findings confirm our theoretical results. Applications to aggregate production and environmental Kuznets curve data illustrate the need to detect linearity in the partially linear panel model.
Summary: We develop a data-driven procedure to determine which variables are linear in the partially linear panel model.
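
For orientation, one common way to write an additive panel model with fixed effects is sketched below. The notation (y_{it}, \mu_i, f_j, x_{itj}, \varepsilon_{it}, \beta_j) and the additive form are assumptions made here for illustration; they are not taken from the paper.

```latex
y_{it} = \mu_i + \sum_{j=1}^{p} f_j(x_{itj}) + \varepsilon_{it},
\qquad i = 1, \dots, N, \quad t = 1, \dots, T.
```

In this notation, detecting linearity amounts to deciding, for each j, whether f_j(x) = \beta_j x for some constant \beta_j or whether f_j is genuinely nonlinear.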

2. Deep Instrument Variables Estimator

Abstract: Endogeneity is a fundamental issue in econometrics and statistics. Many empirical applications suffer from omitted explanatory variables, measurement error, and simultaneous causality. We propose a two-stage estimator based on deep neural networks (the Deep Instrument Variables Estimator) to overcome endogeneity in the linear instrumental variables model. A critical drawback of existing methods is that, when the number of instruments is large, one must either sacrifice statistical efficiency to avoid the curse of dimensionality, or impose structural assumptions and rely explicitly on the specified structures to obtain an efficient estimator. We impose a latent structural assumption on the reduced form equation, which is more general and covers most popular statistical and econometric models. Using deep neural networks, we prove that our estimator can effectively capture the intrinsic structures of the reduced form equation without prior knowledge of those structures. Moreover, we show that the proposed estimator is root-n consistent and semiparametrically efficient. Simulation studies on synthetic data confirm the validity of our theoretical results. We also apply the proposed method to a real-world dataset to study the relationship between vehicle sales and price.
Summary: We use deep neural networks to study the linear instrumental variables model and discuss when and how inference based on deep neural networks improves on inference based on classical methods.
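
The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's estimator: the first stage below uses a polynomial least-squares fit as a simple stand-in for the deep neural network, and the function name `two_stage_iv`, the simulated data, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def two_stage_iv(y, x, Z, degree=2):
    """Two-stage IV sketch for the model y = beta * x + u with instruments Z.

    Stage 1 approximates the reduced form E[x | Z]. A polynomial basis is
    used here as a stand-in; the abstract's method uses a deep neural network.
    Stage 2 uses the fitted values as the instrument for x.
    """
    n = len(y)
    # Polynomial basis: intercept plus powers 1..degree of each instrument.
    basis = [np.ones(n)] + [Z ** d for d in range(1, degree + 1)]
    B = np.column_stack(basis)
    coef, *_ = np.linalg.lstsq(B, x, rcond=None)
    x_hat = B @ coef
    # IV estimate with a single endogenous regressor and no intercept.
    return (x_hat @ y) / (x_hat @ x)


# Simulated data (hypothetical): nonlinear reduced form, endogenous regressor.
rng = np.random.default_rng(0)
n = 5000
Z = rng.normal(size=(n, 3))
v = rng.normal(size=n)
u = 0.8 * v + rng.normal(size=n)   # u is correlated with v, so x is endogenous
x = Z[:, 0] + Z[:, 1] ** 2 + v     # reduced form is nonlinear in Z
beta_true = 2.0
y = beta_true * x + u

beta_ols = (x @ y) / (x @ x)       # ignores endogeneity
beta_iv = two_stage_iv(y, x, Z)    # two-stage estimate

print(f"OLS: {beta_ols:.3f}  two-stage IV: {beta_iv:.3f}  truth: {beta_true}")
```

In this simulated design the OLS estimate is biased away from the true coefficient, while the two-stage estimate is not. How well the first stage approximates the reduced form affects the efficiency of the second stage, which is the role the abstract assigns to the neural network.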

3. Optimal Nonparametric Inference via Deep Neural Network

Abstract: Deep neural networks are a state-of-the-art method in modern science and technology. Much of the statistical literature has been devoted to understanding their performance in nonparametric estimation, but the existing results are suboptimal because of a redundant logarithmic factor. In this paper, we show that such log-factors are not necessary. We derive upper bounds for the L^2 minimax risk in nonparametric estimation. Sufficient conditions on network architectures are provided under which the upper bounds become optimal (without the logarithmic factor). Our proof relies on an explicitly constructed network estimator based on tensor product B-splines. We also derive asymptotic distributions for the constructed network and a related hypothesis testing procedure. The testing procedure is further proven to be minimax optimal under suitable network architectures.
Summary: We show theoretically that deep neural networks can achieve statistical optimality in estimation.