Pseudo-likelihood-based Bayesian information criterion for variable selection in survey data

Chen Xu, Jiahua Chen and Harold Mantel¹

Abstract

Regression models are routinely used in the analysis of survey data, where a common goal is to identify influential factors associated with certain behavioral, social, or economic indices within a target population. When data are collected through complex surveys, the properties of classical variable selection approaches developed for i.i.d. non-survey settings need to be re-examined. In this paper, we derive a pseudo-likelihood-based Bayesian information criterion (BIC) for variable selection in the analysis of survey data and suggest a sample-based penalized likelihood approach for its implementation. The sampling weights are appropriately assigned to correct the bias in selection results caused by the distortion between the sample and the target population. Under a joint randomization framework, we establish the consistency of the proposed selection procedure. The finite-sample performance of the approach is assessed through analysis and computer simulations based on data from the hypertension component of the 2009 Survey on Living with Chronic Diseases in Canada.

Key Words

Variable selection; Sampling weights; Model-design-based inference; BIC; Penalized likelihood; Selection consistency.

Table of contents

1 Introduction

2 Joint inference and super-population

3 Pseudo-likelihood-based selection with BIC

4 Consistency of PPL-BIC

5 Numerical studies

6 Concluding remarks



¹ Chen Xu and Jiahua Chen, Department of Statistics, University of British Columbia, Vancouver, BC, Canada, V6T 1Z4. E-mail: chen.xu@stat.ubc.ca and jhchen@stat.ubc.ca; Harold Mantel, Statistical Research and Innovation Division, Statistics Canada, Ottawa, ON, Canada, K1A 0T6. E-mail: Harold.Mantel@statcan.gc.ca.
