Publications & Working Papers
2026
- Integrating Heterogeneous Information in Randomized Experiments: A Unified Calibration Framework. Wei Ma, Zeqi Wu*, and Zheng Zhang. Working Paper, Mar 2026
In modern randomized experiments, large-scale data collection increasingly yields rich baseline covariates and auxiliary information from multiple sources. Such information offers opportunities for more precise treatment effect estimation, but it also raises the challenge of integrating heterogeneous information coherently without compromising validity. Covariate-adaptive randomization (CAR) is widely used to improve covariate balance at the design stage, but it typically balances only a small set of covariates used to form strata, making covariate adjustment at the analysis stage essential for more efficient estimation of treatment effects. Beyond standard covariate adjustment, it is often desirable to incorporate auxiliary information, including cross-stratum information, predictions from various machine learning models, and external data from historical trials or real-world sources. While this auxiliary information is widely available, existing covariate adjustment methods under CAR primarily exploit within-stratum covariates and do not provide a coherent mechanism for integrating it. We propose a unified calibration framework that integrates such information through an information proxy vector and calibration weights defined by a convex optimization problem. The resulting estimator recovers many recent covariate adjustment procedures as special cases while providing a systematic mechanism for both internal and external information borrowing within a single framework. We establish large-sample validity and a no-harm efficiency guarantee, showing that incorporating additional information sources cannot increase asymptotic variance, and we extend the theory to settings in which both the number of strata and the number of information sources grow with the sample size. Simulation studies and an empirical analysis of a field experiment on savings behavior in Uganda and Malawi demonstrate the strong finite-sample performance and practical utility of our method.
2025
- Limit Theorems for Network Data without Metric Structure. Wen Jiang, Yachen Wang, Zeqi Wu*, and Xingbai Xu*. Submitted, Nov 2025
This paper develops limit theorems for random variables with network dependence, without requiring individuals in the network to be located in a Euclidean or metric space. This distinguishes our approach from most existing limit theorems in network econometrics, which are based on weak dependence concepts such as strong mixing, near-epoch dependence, and $\psi$-dependence. By relaxing the assumption of an underlying metric space, our theorems can be applied to a broader range of network data, including financial and social networks. To derive the limit theorems, we generalize the concept of functional dependence (also known as physical dependence) from time series to random variables with network dependence. Using this framework, we establish several inequalities, a law of large numbers, and central limit theorems. Furthermore, we verify the conditions for these limit theorems based on primitive assumptions for spatial autoregressive models, which are widely used in network data analysis.
- A New and Efficient Debiased Estimation of General Treatment Models by Balanced Neural Networks Weighting. Zeqi Wu, Meilin Wang, Wei Huang, and Zheng Zhang. Submitted, Jun 2025
Estimation and inference of treatment effects under unconfounded treatment assignments often suffer from bias and the "curse of dimensionality" due to the nonparametric estimation of nuisance parameters for high-dimensional confounders. Although state-of-the-art debiased methods have been proposed for binary treatments under particular treatment models, they can be unstable for small sample sizes. Moreover, directly extending them to general treatment models can be computationally burdensome. We propose a balanced neural networks weighting method for general treatment models, which leverages deep neural networks to alleviate the curse of dimensionality while retaining optimal covariate balance through calibration, thereby achieving debiased and robust estimation. Our method accommodates a wide range of treatment models, including average, quantile, distributional, and asymmetric least squares treatment effects, for discrete, continuous, and mixed treatments. Under regularity conditions, we show that our estimator achieves rate double robustness and $\sqrt{N}$-asymptotic normality, and its asymptotic variance achieves the semiparametric efficiency bound. We further develop a statistical inference procedure based on the weighted bootstrap, which avoids estimating the efficient influence/score functions. Simulation results reveal that the proposed method consistently outperforms existing alternatives, especially when the sample size is small. Applications to the 401(k) dataset and the Mother's Significant Features dataset further illustrate the practical value of the method for estimating both average and quantile treatment effects under binary and continuous treatments, respectively.
- Applications of Functional Dependence to Spatial Econometrics. Zeqi Wu, Wen Jiang, and Xingbai Xu. Econometric Theory, Oct 2025
In this paper, we generalize the concept of functional dependence (FD) from time series (see Wu [2005, Proceedings of the National Academy of Sciences 102, 14150–14154]) and stationary random fields (see El Machkouri, Volný, and Wu [2013, Stochastic Processes and Their Applications 123, 1–14]) to nonstationary spatial processes. Within conventional settings in spatial econometrics, we define the concept of spatial FD measure and establish a moment inequality, an exponential inequality, a Nagaev-type inequality, a law of large numbers, and a central limit theorem. We show that the dependent variables generated by some common spatial econometric models, including spatial autoregressive (SAR) models, threshold SAR models, and spatial panel data models, are functionally dependent under regular conditions. Furthermore, we investigate the properties of FD measures under various transformations, which are useful in applications. Moreover, we compare spatial FD with the spatial mixing and spatial near-epoch dependence proposed in Jenish and Prucha ([2009, Journal of Econometrics 150, 86–98], [2012, Journal of Econometrics 170, 178–190]), and we illustrate its advantages.