Akaike, H. (1973), Information Theory and an Extension of the Maximum Likelihood Principle. In Second International Symposium on Information Theory, eds. B.N. Petrov and F. Csaki, Akademiai Kiado, Budapest, Hungary, 267-281.
Commenges, D., Sayyareh, A., Letenneur, L., Guedj, J. and Bar-Hen, A. (2008), Estimating a difference of Kullback-Leibler risks using a normalized difference of AIC. The Annals of Applied Statistics, 2(3), 1123-1142.
Goldenshluger, A., and Lepski, O. (2011), Bandwidth selection in kernel density estimation: oracle inequalities and adaptive minimax optimality. The Annals of Statistics, 39(3), 1608-1632.
Jaakkola, T., Diekhans, M., and Haussler, D. (1999), Using the Fisher kernel method to detect remote protein homologies. In Proc. International Conference on Intelligent Systems for Molecular Biology, 149-158.
Hastie, T., Tibshirani, R. and Friedman, J. (2001), The Elements of Statistical Learning. Springer-Verlag (2nd Edition), New York, NY.
Habbema, J. D. F., Hermans, J. and van den Broek, K. (1974), A stepwise discrimination program using density estimation. In Bruckman, G. (ed.), Compstat. Vienna: Physica Verlag, 100-110.
Hjort, N. L., and Jones, M. C. (1996), Locally parametric nonparametric density estimation. The Annals of Statistics, 24(4), 1619-1647.
Liu, J., Chen, J., Chen, S. and Ye, J. (2009), Learning the optimal neighborhood kernel for classification. In International Joint Conference on Artificial Intelligence, Pasadena, California.
Lacour, C., Massart, P., and Rivoirard, V. (2017), Estimator selection: a new method with applications to kernel density estimation. Sankhya A, 79, 298-335.
Loader, C. R. (1999), Bandwidth selection: classical or plug-in? The Annals of Statistics, 27(2), 415-438.
Moreno, P. J., Ho, P. P. and Vasconcelos, N. (2003), A Kullback-Leibler divergence based kernel for SVM classification in multimedia applications. In Advances in Neural Information Processing Systems.
Marron, J. S. (1985), An asymptotically efficient solution to the bandwidth problem of kernel density estimation. The Annals of Statistics, 13(3), 1011-1023.
Panahi, H., and Sayyareh, A. (2014), Tracking interval for type II hybrid censoring scheme. Journal of The Iranian Statistical Society, 13(2), 187-208.
Sayyareh, A. (2012), Inference after separated hypotheses testing: an empirical investigation for linear models. Journal of Statistical Computation and Simulation, 82(9), 1275-1286.
Scott, D. W. (2015), Multivariate density estimation: theory, practice, and visualization. John Wiley and Sons.
Silverman, B. W. (1986), Density estimation for statistics and data analysis (Vol. 26). CRC Press.
Stone, C. J. (1984), An asymptotically optimal window selection rule for kernel density estimates. The Annals of Statistics, 12(4), 1285-1297.
Stone, M. (1974), Cross-validatory choice and assessment of statistical predictions (with discussion). Journal of the Royal Statistical Society, Series B, 36, 111-147.
Vuong, Q. H. (1989), Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica, 57(2), 307-333.
Wang, L., Chan, K.L., Xue, P. and Zhou, L.P. (2008), A kernel-induced space selection approach to model selection in KLDA. IEEE Transactions on Neural Networks, 19, 2116-2131.
Yeung, D., Chang, H. and Dai, G. (2007), Learning the kernel matrix by maximizing a KFD-based class separability criterion. Pattern Recognition, 40, 2021-2028.
Xiong, H., Swamy, M.N.S., and Ahmad, M.O. (2005), Optimizing the kernel in the empirical feature space. IEEE Transactions on Neural Networks, 16(2), 460-474.