1
1726-4057
Iranian Statistical Society
158
60: Probability theory and stochastic processes
Exponential Models: Approximations for Probabilities
Fraser
D. A. S.
Naderi
A.
Ji
Kexin
Lin
Wei
Su
Jie
1
11
2011
10
2
95
107
07
11
2011
12
09
2015
Welch & Peers (1963) used a root-information prior to obtain
posterior probabilities for a scalar-parameter exponential model
and showed that these Bayes probabilities had the confidence property
to second order asymptotically. An important undercurrent of this result is
that the constant-information reparameterization provides location
model structure, for which the confidence property was and is well
known. This paper examines the role of the scalar-parameter exponential
model for obtaining approximate probabilities and approximate confidence
levels, and then addresses the extension for the vector-parameter
exponential model.
159
60: Probability theory and stochastic processes
Marginal Analysis of A Population-Based Genetic Association Study of Quantitative Traits with Incomplete Longitudinal Data
Chen
Baojiang
Chen
Zhijian
Wu
Longyang
Wang
Lihua
Y. Yi
Grace
1
11
2011
10
2
109
123
07
11
2011
12
09
2015
Studies investigating gene-environment interaction are commonly
designed to be longitudinal and population-based. Data arising
from longitudinal association studies often contain missing responses.
Naive analysis without taking missingness into account may produce
invalid inference, especially when the missing data mechanism depends
on the response process. To address this issue in the analysis of
gene-environment interaction effects, we adopt an inverse
probability weighted generalized estimating equations (IPWGEE)
approach to conduct statistical inference. This approach is attractive
because it does not require full model specification yet it can provide
consistent estimates under the missing at random (MAR) mechanism.
We utilize this method to analyze data arising from a cardiovascular
disease study.
160
60: Probability theory and stochastic processes
Penalized Bregman Divergence Estimation via Coordinate Descent
Zhang
Chunming
Zhang
Zhengjun
Chai
Yi
1
11
2011
10
2
125
140
07
11
2011
12
09
2015
Variable selection via penalized estimation is appealing for
dimension reduction. For penalized linear regression, Efron et al. (2004)
introduced the LARS algorithm. Recently, the coordinate descent (CD)
algorithm was developed by Friedman et al. (2007) for penalized linear
regression and penalized logistic regression and was shown to be computationally
superior. This paper explores applying the CD algorithm to penalized
Bregman divergence (BD) estimation for a broader class of models,
including not only the generalized linear model, which has been well
studied in the literature on penalization, but also the quasi-likelihood
model, which has been less developed. A simulation study and a real-data
application illustrate the performance of the CD and LARS algorithms
in regression estimation, variable selection and classification procedures
when the number of explanatory variables is large in comparison to the
sample size.
161
60: Probability theory and stochastic processes
Regularized Autoregressive Multiple Frequency Estimation
Chen
Bei
Gel
Yulia R.
1
11
2011
10
2
141
166
07
11
2011
12
09
2015
The paper addresses the problem of tracking multiple
frequencies using Regularized Autoregressive (RAR) approximation.
The RAR procedure reduces approximation bias compared
to other AR-based frequency detection methods, while still providing
competitive variance of the sample estimates. We show that the RAR estimates
of multiple periodicities are consistent in probability and illustrate
the dynamics of RAR with respect to sample size and signal-to-noise ratio by
simulations.
162
60: Probability theory and stochastic processes
Pseudo-Likelihood Inference Underestimates Model Uncertainty: Evidence from Bayesian Nearest Neighbours
Su
Wanhua
Chipman
Hugh
Zhu
Mu
1
11
2011
10
2
167
180
07
11
2011
12
09
2015
When using the K-nearest neighbours (KNN) method, one
often ignores the uncertainty in the choice of K. To account for such
uncertainty, Bayesian KNN (BKNN) has been proposed and studied
(Holmes and Adams 2002; Cucala et al. 2009). We present some evidence
to show that the pseudo-likelihood approach for BKNN, even after
being corrected by Cucala et al. (2009), still significantly underestimates
model uncertainty.
163
60: Probability theory and stochastic processes
On Model-Based Clustering, Classification, and Discriminant Analysis
McNicholas
Paul D.
1
11
2011
10
2
181
190
07
11
2011
12
09
2015
The use of mixture models for clustering and classification
has burgeoned into an important subfield of multivariate analysis. These
approaches have been around for a half-century or so, with significant
activity in the area over the past decade. The primary focus of this
paper is to review work in model-based clustering, classification, and
discriminant analysis, with particular attention being paid to two techniques
that can be implemented using respective R packages. Parameter
estimation and model selection are also discussed. The paper concludes
with a summary, discussion, and some thoughts on future work.
164
60: Probability theory and stochastic processes
An Overview of the New Feature Selection Methods in Finite Mixture of Regression Models
Khalili
Abbas
1
11
2011
10
2
201
235
07
11
2011
12
09
2015
Variable (feature) selection has attracted much attention in
contemporary statistical learning and recent scientific research. This is
mainly due to the rapid advancement in modern technology that allows
scientists to collect data of unprecedented size and complexity. One type
of statistical problem in such applications is concerned with modeling
an output variable as a function of a small subset of a large number of
features. In certain applications, the data samples may even be coming
from multiple subpopulations. In these cases, selecting the correct
predictive features (variables) for each subpopulation is crucial. The
classical best subset selection methods are computationally too expensive
for many modern statistical applications. New variable selection
methods have been successfully developed over the last decade to deal
with large numbers of variables. They have been designed for simultaneously
selecting important variables and estimating their effects in a
statistical model. In this article, we present an overview of the recent
developments in theory, methods, and implementations for the variable
selection problem in finite mixture of regression models.
165
60: Probability theory and stochastic processes
On Mathematical Characteristics of some Improved Estimators of the Mean and Variance Components in Elliptically Contoured Models
Arashi
M.
Ehsanes Saleh
A. K. Md
Tabatabaey
S. M. M.
1
11
2011
10
2
237
266
07
11
2011
12
09
2015
In this paper we treat a general form of location model. It is
typically assumed that the error term is distributed according to a law
belonging to the class of elliptically contoured distributions. Several
shrinkage estimators of location and scale parameters are proposed
and their exact bias and MSE expressions are derived. The performance
of the estimators under study is completely analyzed and the conditions
for superiority of each estimator are studied in detail.
166
60: Probability theory and stochastic processes
Positive-Shrinkage and Pretest Estimation in Multiple Regression: A Monte Carlo Study with Applications
Enayetur Raheem
SM
Ejaz Ahmed
S.
1
11
2011
10
2
267
289
07
11
2011
12
09
2015
Consider the problem of predicting a response variable using
a set of covariates in a linear regression model. If it is a priori known
or suspected that a subset of the covariates does not significantly contribute
to the overall fit of the model, a restricted model that excludes
these covariates may be sufficient. If, on the other hand, the subset
provides useful information, a shrinkage method combines restricted
and unrestricted estimators to obtain the parameter estimates. Such
an estimator outperforms the classical maximum likelihood estimator.
Any prior information may be validated through a preliminary test (or
pretest), and depending on the validity, may be incorporated in the
model as a parametric restriction. Thus, the pretest estimator chooses between
the restricted and unrestricted estimators depending on the outcome
of the preliminary test. Examples using three real-life data sets are
provided to illustrate the application of shrinkage and pretest estimation.
The performance of positive-shrinkage and pretest estimators is compared
with that of the unrestricted estimator under varying degrees of uncertainty in the
prior information. A Monte Carlo study reconfirms the asymptotic properties
of the estimators available in the literature.