Information identified as archived on the Web is for reference, research or recordkeeping purposes. It has not been altered or updated after the date of archiving. Web pages that are archived on the Web are not subject to the Government of Canada Web Standards. As per the Communications Policy of the Government of Canada, you can request alternate formats on the "Contact Us" page.
Longitudinal Survey of Immigrants to Canada: A Portrait of Early Settlement Experiences
Methodology and data quality

The Longitudinal Survey of Immigrants to Canada (LSIC) was established in response to the growing need for information on immigrants to Canada. Particular emphasis is given to the settlement process, the factors that influence immigrants' ability to integrate and adapt to Canadian society, and the services immigrants use to facilitate the transition.

The completed survey will consist of three interviews (waves): the first was conducted six months after the immigrant's arrival in Canada, with subsequent interviews occurring two and four years after arrival. Only immigrants who respond to the wave one interview will be traced for the wave two interview; only those who respond to the wave two interview will be traced and interviewed for wave three.

The following sections describe the survey methodology and outline some of the limitations of the data. A more detailed discussion of the methodology and data quality can be found in the Microdata User Guide – Longitudinal Survey of Immigrants to Canada – Wave 1.¹

Survey populations

The target population for the survey consists of immigrants who arrived in Canada from abroad between October 1, 2000 and September 30, 2001, and were aged 15 or older at the time of landing. Individuals who applied and landed from within Canada are excluded from the survey. These people may have been in Canada for a considerable length of time before officially "landing" and would therefore likely demonstrate quite different integration characteristics from those who recently arrived in Canada. Refugees claiming asylum from within Canada are also excluded from the scope of the survey.

The target population accounts for approximately 169,400 of the 250,000 persons admitted to Canada during this period. Coverage of the survey included all Census Metropolitan Areas and non-remote Census Agglomerations.
The population of interest consists of those immigrants in the target population who still reside in Canada at the time of a given wave. For example, during the six months between arrival and the wave one interview, some immigrants left Canada to return to their country of origin or to move to another country, and are thus excluded from the population of interest.

Survey design

The frame for the LSIC is an administrative database of all landed immigrants to Canada, maintained by Citizenship and Immigration Canada. The database, known as FOSS (Field Operation Support System), includes various characteristics of each immigrant that can be used for survey design purposes, such as: name; age; sex; mother tongue; country of origin; knowledge of English and/or French; category of immigrant; date of landing; and intended province of destination in Canada.

The survey was designed based on probability sampling theory, using a two-stage stratified sampling method. The first stage involved the selection of the immigrating unit (IU) using a probability proportional to size (PPS) method, where the size was defined as the number of immigrants in the IU. The second stage involved the random selection of one member within each selected IU. The selected member of the IU is called the longitudinal respondent (LR). Only the LR will be followed throughout the survey; no interviews will be conducted with other members of the IU or of the LR's household.

To ensure reliable estimates and to satisfy various requirements of federal and provincial government departments, the sample was stratified by month of landing, province of destination and class of immigrant, and the following subgroups were over-sampled:
As a result of sampling, the sample of immigrants becomes representative of the target population only through the use of the survey weight. The survey weight can be thought of as the number of immigrants in the population represented by a sampled immigrant. The estimates presented earlier in this document are weighted estimates.

To ensure reliable estimates at wave three, a minimum of 5,755 respondents is required. The initial sample size was determined by applying several sample attrition hypotheses to this wave three minimum. As a result, 20,322 immigrants were selected for the wave one interview.

Data collection

The questionnaire is administered by computer-assisted interview in French and English. The use of computer-assisted interviews facilitates the collection of data that would be difficult to capture using paper and pencil. Translated paper versions are available in thirteen additional languages. Together, the fifteen languages cover 93% of the immigrant population.

Collection of wave 1 data took place between April 2001 and May 2002. Most (68%) of the interviews were conducted face-to-face. The remaining interviews were conducted over the telephone for various reasons (remote location, language requirements, etc.). Interviews lasted approximately 90 minutes.

Of the 20,322 immigrants selected in the initial sample, 12,040 participated in the wave 1 interview (respondents); 2,120 chose not to participate (non-respondents); and 411 were found to be no longer in the population of interest (out of scope). Additionally, 5,751 of the selected immigrants could not be located, and thus their status was unresolved.

Data quality and limitations

There are two main types of errors: sampling errors and non-sampling errors.
A sampling error is the difference between an estimate derived from a sample and the one that would have been obtained from a census that used the same procedures to collect data from every person in the population. All other types of errors, such as frame coverage, response, processing and non-response errors, are non-sampling errors. Many non-sampling errors are difficult to identify and quantify.

Statistics Canada's Standards and Guidelines on the Documentation of Data Quality and Methodology² states that external users must be given an indication of the magnitude of the sampling error. The basis for measuring sampling error is the standard error of the estimates derived from survey results. However, because of the large variety of estimates that can be produced from a survey, the standard error of an estimate is usually expressed relative to the estimate to which it pertains. This measure, known as the coefficient of variation (CV) of an estimate, is obtained by expressing the standard error of the estimate as a percentage of the estimate.

An indication of the magnitude of sampling error has been provided for the estimates appearing in this report. A CV greater than 33.3% indicates that an estimate is too unreliable to publish. In this report, such values have been suppressed and replaced with the letter code F. While publishable, estimates with a CV between 16.6% and 33.3% are considered marginally acceptable and should be interpreted with caution. Such estimates are accompanied by the letter code E in this document.

The weights of resolved units (respondents and out-of-scope) are inflated to account for the non-responding and unresolved immigrants. Every effort is made to minimize non-response bias; however, given the size of the latter two groups, the potential for bias is considerable. In the absence of a reliable, independent source of information on these immigrants, these biases cannot be quantified. The reader should therefore be aware of this potential.
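The coefficient of variation and the letter codes described in this section can be illustrated with a minimal Python sketch. The function names, the example figures and the handling of a CV falling exactly on a threshold are assumptions made for illustration; they are not taken from the survey documentation.

```python
def cv_percent(estimate: float, standard_error: float) -> float:
    """Coefficient of variation: the standard error as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("CV is undefined for an estimate of zero")
    return 100.0 * standard_error / abs(estimate)


def release_code(cv: float) -> str:
    """Map a CV (in percent) to the letter codes used in the report.

    "F": CV above 33.3%, too unreliable to publish (value suppressed).
    "E": CV from 16.6% to 33.3%, publishable but to be interpreted with caution.
    "":  CV below 16.6%, no code attached.
    Treating 16.6 and 33.3 themselves as "E" is an assumption of this sketch.
    """
    if cv > 33.3:
        return "F"
    if cv >= 16.6:
        return "E"
    return ""


# Hypothetical weighted estimates and standard errors, for illustration only.
examples = [
    ("estimate A", 50_000, 2_500),
    ("estimate B", 4_000, 1_000),
    ("estimate C", 900, 400),
]
for label, estimate, se in examples:
    cv = cv_percent(estimate, se)
    print(f"{label}: CV = {cv:.1f}%, code = {release_code(cv)!r}")
```

With these made-up inputs, the first estimate carries no code, the second would be flagged E and the third would be suppressed with an F.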
Among some responding units, incomplete data are obtained: a respondent may fail to provide data for a specific set (module) of questions (partial non-response), or may not provide a response to an individual question (item non-response). Partial and item non-response are corrected by imputation. Imputation consists of replacing a missing or inconsistent value with a plausible value. When carried out properly, imputation improves data quality by reducing non-response bias.

As in many surveys, the questions on income were the most under-reported, with non-response to this module in roughly 3.7% of cases (i.e. 96.3% provided a complete response to the questions on income). Partial non-response was dealt with using a process called massive imputation, whereby the entire incomplete module is replaced for a partial respondent using data from a donor (a respondent for whom all modules are complete).

Item non-response was highest among the family income amount questions. Imputation rates for the income amount questions are provided in Table 12.1. Income values are imputed on a question-by-question basis (field imputation), again using data from a respondent. Because the imputation rates are quite high for many of these questions, the potential for bias is greater.

Notes