Preprints
Zhang, H. and Wang, H. (2024). Refitted cross-validation estimation for high-dimensional subsamples from low-dimension full data. arXiv:2409.14032.
Zhang, H., Zheng, Y., Hou, L. and Liu, L. (2024). HIMA: An R package for high-dimensional mediation analysis. submitted. [R package “HIMA”].
Publications (# indicates corresponding author)
Liu, L., Zhang, H., Zhang, K., Zheng, Y., Gao, T., Zheng, C., Hou, L. and Liu, L. (2025). High-dimensional mediation analysis for longitudinal mediators and survival outcomes. Briefings in Bioinformatics, 26, 1-11.
Bai, X. and Zhang, H.# (2025). An online updating approach for estimating and testing mediation effects with big data streams. Statistics and Computing, 35, 1-17. (The first author is a Master's student under my supervision).
Zhang, H., Li, Y. and Wang, H. (2025). DsubCox: A fast subsampling algorithm for Cox model with distributed and massive survival data. International Journal of Biostatistics. DOI: 10.1515/ijb-2024-0042
Zhang, H. (2025). Efficient adjusted joint significance test and Sobel-type confidence interval for mediation effect. Structural Equation Modeling: A Multidisciplinary Journal, 32, 93-104. [R package “AdjMed”].
Bai, X., Zheng, Y., Hou, L., Zheng, C., Liu, L. and Zhang, H.# (2024). An efficient testing procedure for high-dimensional mediators with FDR control. Statistics in Biosciences, DOI: 10.1007/s12561-024-09447-4. (The first author is a Master's student under my supervision).
Zhang, H., Hong, X., Zheng, Y., Hou, L., Zheng, C., Wang, X. and Liu, L. (2024). High-dimensional quantile mediation analysis with application to a birth cohort study of mother-newborn pairs. Bioinformatics, DOI: 10.1093/bioinformatics/btae055
Zhang, H., Zuo, L., Wang, H. and Sun, L. (2024). Approximating partial likelihood estimators via optimal subsampling. Journal of Computational and Graphical Statistics, 33, 276-288.
An, M. and Zhang, H.# (2023). High-dimensional mediation analysis for time-to-event outcomes with additive hazards model. Mathematics, 11, 4891; DOI: 10.3390/math11244891. (The first author is a Master's student under my supervision).
Shi, Y., Li, H., Wang, C., Chen, J., Jiang, H., Shih, T., Zhang, H., Song, Y., Feng, Y. and Liu, L. (2023). A flexible quasi-likelihood model for microbiome abundance count data. Statistics in Medicine, 42, 4632-4643.
Hou, L., Zhang, H., Hou, Q., Guo, A., Wu, O., Zhang, J. and Yu, T. (2023). SARW: Similarity-Aware random walk for GCN. Intelligent Data Analysis, 27, 1615-1636.
Zhang, H. and Li, X. (2023). A framework for mediation analysis with massive data. Statistics and Computing, DOI: 10.1007/s11222-023-10255-x
Wang, T., Zhang, H.# and Sun, L. (2023). Renewable learning for multiplicative regression model with streaming datasets. Computational Statistics. DOI: 10.1007/s00180-023-01360-6. (The first author is a Master's student under my supervision).
Perera, C., Zhang, H., Zheng, Y., Hou, L., Qu, A., Zheng, C., Xie, K. and Liu, L. (2022). HIMA2: high-dimensional mediation analysis and its application in epigenome-wide DNA methylation data. BMC Bioinformatics, 23:296.
Zhang, H., Hou, L. and Liu, L. (2022). A review of high-dimensional mediation analyses in DNA methylation studies. In Guan, Weihua (Ed.), Epigenome-Wide Association Studies: Methods and Protocols, 2432, 123-135.
Wang, T. and Zhang, H.# (2022). Optimal subsampling for multiplicative regression with massive data. Statistica Neerlandica, 76, 418-449. (The first author is a Master's student under my supervision).
Liu, J. and Zhang, H.# (2022). First-order random coefficient INAR process with dependent counting series. Communications in Statistics: Simulation and Computation, 51, 3341-3354. (The first author is a Master's student under my supervision).
Zhang, H., Huang, J. and Sun, L. (2022). Projection-based and cross-validated estimation in high-dimensional Cox model. Scandinavian Journal of Statistics, 49, 353-372.
Li, C., Zhang, H.# and Wang, D. (2022). Modelling and monitoring of INAR(1) process with geometrically inflated Poisson innovations. Journal of Applied Statistics, 49, 1821-1847.
Zhang, H. and Wang, H. (2021). Distributed subdata selection for big data via sampling-based approach. Computational Statistics and Data Analysis. DOI: 10.1016/j.csda.2020.107072
Zuo, L., Zhang, H.#, Wang, H. and Liu, L. (2021). Sampling-based estimation for massive survival data with additive hazards model. Statistics in Medicine, 40, 441-450. (The first author is a Master's student under my supervision).
Zhang, H., Zheng, Y., Hou, L., Zheng, C. and Liu, L. (2021). Mediation analysis for survival data with high-dimensional mediators. Bioinformatics, 37, 3815-3821.
Zuo, L., Zhang, H.#, Wang, H. and Sun, L. (2021). Optimal subsample selection for massive logistic regression with distributed data. Computational Statistics, 36, 2535-2562. (The first author is a Master's student under my supervision).
Zhang, H., Chen, J., Feng, Y., Wang, C., Li, H. and Liu, L. (2021). Mediation effect selection in high-dimensional and compositional microbiome data. Statistics in Medicine, 40, 885-896.
Wang, Y. and Zhang, H.# (2021). Some estimation and forecasting procedures in Poisson-Lindley INAR(1) process. Communications in Statistics: Simulation and Computation, 50, 49-62. (The first author is a Master's student under my supervision).
Zhang, H., Chen, J., Li, Z. and Liu, L. (2021). Testing for mediation effect with application to human microbiome data. Statistics in Biosciences, 13, 313-328.
Zhang, H., Huang, J. and Sun, L. (2020). A rank-based approach to estimating monotone individualized two treatment regimes. Computational Statistics and Data Analysis. DOI: 10.1016/j.csda.2020.107015
Wang, X., Wang, D. and Zhang, H. (2020). Poisson autoregressive process modeling via the penalized conditional maximum likelihood procedure. Statistical Papers, 61, 245-260.
Zhang, H.#, Wang, D. and Sun, L. (2017). Regularized estimation in GINAR(p) process. Journal of the Korean Statistical Society, 46, 502-517.
Zhang, H., Sun, L., Zhou, Y. and Huang, J. (2017). Oracle inequalities and selection consistency for weighted lasso in high-dimensional additive hazards model. Statistica Sinica, 27, 1903-1920.
Zhou, J., Zhang, H.#, Sun, L. and Sun, J. (2017). Joint analysis of panel count data with informative observation process and a dependent terminal event. Lifetime Data Analysis, 23, 560-584.
Zhang, H., Zheng, Y., Yoon, G., Zhang, Z., Gao, T., Joyce, B., Zhang, W., Schwartz, J., Vokonas, P., Colicino, E., Baccarelli, A., Hou, L. and Liu, L. (2017). Regularized estimation in sparse high-dimensional multivariate regression, with application to a DNA methylation study. Statistical Applications in Genetics and Molecular Biology, 16, 159-171.
Fang, S., Zhang, H.#, Sun, L. and Wang, D. (2017). Analysis of panel count data with time-dependent covariates and informative observation process. Acta Mathematicae Applicatae Sinica, English Series, 33, 147-156.
Fang, S., Zhang, H. and Sun, L. (2016). Joint analysis of longitudinal data with additive mixed effect model for informative observation times. Journal of Statistical Planning and Inference, 169, 43-55.
Zhang, H., Zheng, Y., Zhang, Z., Gao, T., Joyce, B., Yoon, G., Zhang, W., Schwartz, J., Just, A., Colicino, E., Vokonas, P., Zhao, L., Lv, J., Baccarelli, A., Hou, L. and Liu, L. (2016). Estimating and testing high-dimensional mediation effects in epigenetic studies. Bioinformatics, 32, 3150-3154.
Liu, Y., Wang, D., Zhang, H. and Shi, N. (2016). Bivariate zero truncated Poisson INAR(1) process. Journal of the Korean Statistical Society, 45, 260-275.
Zhang, H. and Wang, D. (2015). Inference for random coefficient INAR(1) process based on frequency domain analysis. Communications in Statistics: Simulation and Computation, 44, 1078-1100.
Li, C., Wang, D. and Zhang, H. (2015). First-order mixed integer-valued autoregressive processes with zero-inflated generalized power series innovations. Journal of the Korean Statistical Society, 44, 232-246.
Jia, B., Wang, D. and Zhang, H. (2014). A study for missing values in PINAR(1) processes. Communications in Statistics: Theory and Methods, 43, 4780-4789.
Zhang, H., Zhao, H., Sun, J., Wang, D. and Kim, K. (2013). Regression analysis of multivariate panel count data with an informative observation process. Journal of Multivariate Analysis, 119, 71-80.
Zhang, H., Sun, J. and Wang, D. (2013). Variable selection and estimation for multivariate panel count data via the seamless L0 penalty. The Canadian Journal of Statistics, 41, 368-385.
Zhang, H., Wang, D. and Zhu, F. (2012). Generalized RCINAR(1) process with signed thinning operator. Communications in Statistics: Theory and Methods, 41, 1750-1770.
Zhang, H., Wang, D. and Zhu, F. (2011). Empirical likelihood inference for random coefficient INAR(p) process. Journal of Time Series Analysis, 32, 195-203.
Zhang, H., Wang, D. and Zhu, F. (2011). The empirical likelihood for first-order random coefficient integer-valued autoregressive processes. Communications in Statistics: Theory and Methods, 40, 492-509.
Wang, D. and Zhang, H. (2011). Generalized RCINAR(p) process with signed thinning operator. Communications in Statistics: Simulation and Computation, 40, 13-44.
Zhang, H., Wang, D. and Zhu, F. (2010). Inference for INAR(p) processes with signed generalized power series thinning operator. Journal of Statistical Planning and Inference, 140, 667-683.