Comparing Linear and non-linear Mixed Effects Models with Autoregressive Error under Small Area Estimation

Document Type : Original Article

Authors
1 Department of Computer Sciences and Statistics, Faculty of Mathematics, K. N. Toosi University of Technology, Tehran, Iran
2 Department of Statistics, Imam Khomeini International University, Qazvin, Iran
3 Department of Biostatistics, Manitoba University, Winnipeg, Canada
Abstract
Small area estimation is concerned with producing reliable estimates of characteristics of interest, such as means, counts and quantiles. It is usually assumed that the observed values and the auxiliary values follow a linear regression model and that the sampling errors are dependent and follow an autoregressive model. In practice, however, there are many situations in which the observed values and the auxiliary values follow a non-linear regression model. We assume that the true model is unknown, consider several non-nested linear or non-linear regression models as rival models, and select an optimal model using extensions of model selection tests such as Vuong's test. This paper considers non-linear regression models to improve estimation and model selection based on latent variables, and proposes a global model selection test for small area estimation. A numerical example and a real data analysis illustrate the procedures derived theoretically.

Volume 24, Issue 1
June 2025
Pages 67-96

  • Receive Date 24 November 2024
  • Revise Date 10 July 2025
  • Accept Date 31 August 2025