Appendix 3
Survey methodology


Target population
Sample frame
Sample design
Response rates
Estimation and weighting
Data accuracy

The National Apprenticeship Survey (NAS) of 2007 is a cross-sectional survey designed to collect data directly from Canadian apprentices. These apprentices were contacted by Statistics Canada between January and May 2007 and responded to a telephone survey conducted on a voluntary basis. It should be noted that the sample represents three specific types of apprentices, not the entire apprentice population, and that the survey results provide a cross section: a snapshot of all the groups at one point in time.

Target population

For the NAS, a selected person was considered in scope for the survey if he or she had engaged in some apprentice activities between 2000 and 2004. The NAS targeted registered apprentices in the ten provinces and the three territories based on their apprenticeship status; the results are therefore not representative of all apprentices. The three groups of apprentices targeted were:

Completers: identified as such by the 12 jurisdictions1, these are apprentices who had completed their apprenticeship program in one of the reference years 2002, 2003 or 2004 and were not registered in any apprenticeship training as of December 31, 2004.

Discontinuers: identified as such by the 12 jurisdictions1, these are apprentices who had stopped their apprenticeship program in one of the reference years 2002, 2003 or 2004 and were not registered in any apprenticeship training as of December 31, 2004.

Long-term continuers: defined as apprentices who were still active as of December 31, 2004, who had registered as apprentices before the year 2000, and who, as of 2004, had been registered in the same trade for more than one and a half times the prescribed duration of their apprenticeship program. Approximately 19% of the 2004 continuers were long-term continuers.

Excluded from the target population are apprentices who were registered in any apprenticeship training as of December 31, 2004 and whose time in the program was within the normal bounds of the prescribed duration for their training. This group represents 81% of all continuers as of 2004.

The target population was first determined at the stage of frame creation using the definitions above. During data collection, individuals were asked to confirm their apprentice activities between 2000 and 2004. If their confirmed apprenticeship status did not fall within one of the three target population groups, they were considered out of scope for the survey.
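The classification rules above can be sketched as follows; a minimal illustration in Python, where the field names and the `classify` helper are hypothetical conveniences rather than part of the NAS processing system:

```python
REFERENCE_YEARS = {2002, 2003, 2004}

def classify(completed_year, discontinued_year, registered_dec31_2004,
             first_registration_year, months_registered, prescribed_months):
    """Assign an apprentice to one of the three NAS target groups, or
    return None when the apprentice falls outside the target population."""
    if completed_year in REFERENCE_YEARS and not registered_dec31_2004:
        return "completer"
    if discontinued_year in REFERENCE_YEARS and not registered_dec31_2004:
        return "discontinuer"
    if (registered_dec31_2004
            and first_registration_year < 2000
            and months_registered > 1.5 * prescribed_months):
        return "long-term continuer"
    return None  # e.g. continuers still within the prescribed duration
```

For example, an apprentice who completed in 2003 and was no longer registered at the end of 2004 would be classified as a completer, while an active apprentice first registered in 2002 would be out of scope.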

Sample frame

The survey sampling frame was based on lists of registered apprentices provided by the provincial and territorial jurisdictions for the targeted reference years (2002, 2003 and 2004). These lists contained the information needed for the stratification and selection of the sample, such as the status of the apprentice, registration year, trade or training program, and the apprentice's age and gender. Contact information, such as the apprentice's address and phone number, was also provided, and a second source of contact information was available for some jurisdictions.

An assessment of the sampling frame was conducted to evaluate its coverage and the quality and uniformity of the information for the 12 jurisdictions that provided data. Linking of the apprentices from the three reference years was necessary in order to classify each apprentice in the right status group (long-term continuers, completers or discontinuers) and also to eliminate duplicates within and across jurisdictions.

Table A.3.1
Number of apprentices on frame by jurisdiction and frame status

Sample design

Three variables were used for the stratification of the survey sample: jurisdiction, apprentice status and main trade group. There were 12 jurisdictions, three apprentice statuses and seven main trade groups. These variables produced a total of 231 strata.
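The cross-classification of these three variables can be sketched as follows; a minimal illustration with invented labels and hypothetical field names, in which the three variables yield 12 x 3 x 7 = 252 potential combinations, and a stratum count below 252 (231 in the NAS) arises when only combinations that actually contain apprentices are used:

```python
from itertools import product

jurisdictions = [f"J{i}" for i in range(1, 13)]   # 12 reporting jurisdictions
statuses = ["completer", "discontinuer", "long-term continuer"]
trade_groups = [f"T{i}" for i in range(1, 8)]     # 7 main trade groups

# Every potential stratum is a (jurisdiction, status, trade group) triple.
potential_strata = list(product(jurisdictions, statuses, trade_groups))

def stratum_of(record):
    """Map a frame record (a dict with hypothetical field names) to its
    stratum key."""
    return (record["jurisdiction"], record["status"], record["trade_group"])

# Only strata that actually occur on the frame are retained.
frame = [{"jurisdiction": "J1", "status": "completer", "trade_group": "T3"}]
used_strata = {stratum_of(r) for r in frame}
```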

A national sample size of at least 30,000 respondents was necessary to provide reliable estimates for each stratum. A minimum sample was allocated to each stratum, and the remaining sample was allocated proportionally to the number of apprentices in each stratum. In several strata, a census of apprentices was selected; in the smaller provinces and in the territories, the allocation resulted in a census of apprentices for the entire jurisdiction.

Within each stratum, a random sample of apprentices was selected. The sample was allocated in seven steps:

  1. allocation by final status (expected status at time of collection);
  2. allocation by frame status;
  3. allocation of a minimum number of cases to each stratum;
  4. determination of take-all strata;
  5. proportional allocation of the remaining cases;
  6. adjustment for tracing and response rates;
  7. augmentation for cases with no useful contact information.
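The minimum-plus-proportional part of this allocation (with take-all strata handled by capping at the stratum size) can be sketched as follows; the stratum sizes, the minimum of 50 and the `allocate` helper are illustrative assumptions, and the tracing and augmentation adjustments are omitted:

```python
def allocate(stratum_sizes, total_sample, minimum=50):
    """Allocate a total sample across strata: guarantee a minimum per
    stratum, let small strata become a census (take-all), then spread
    the remaining cases proportionally to stratum size."""
    alloc = {}
    # Minimum allocation, capped at the stratum size (take-all strata).
    for s, n in stratum_sizes.items():
        alloc[s] = min(minimum, n)
    remaining = total_sample - sum(alloc.values())
    # Proportional allocation of the remaining cases among strata that
    # are not already a census.
    open_strata = {s: n for s, n in stratum_sizes.items() if alloc[s] < n}
    pool = sum(open_strata.values())
    for s, n in open_strata.items():
        extra = round(remaining * n / pool)
        alloc[s] = min(alloc[s] + extra, n)   # never exceed stratum size
    return alloc
```

With invented stratum sizes of 40, 500 and 2,000 apprentices and a total sample of 600, the 40-apprentice stratum becomes a census and the remainder is shared proportionally between the other two.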

The table below shows the total number of cases, by jurisdiction and frame status, sent to the regional offices of Statistics Canada for collection. It is from this collection sample that the targeted sample of 30,000 respondents was obtained, in order to reach a minimum precision for all domains of interest (a target coefficient of variation (CV) of 33.3% for an estimated proportion of 10% in as many strata as possible, corresponding approximately to a CV of 16.6% for an estimated proportion of 25%).

Table A.3.2
Collection sample size by jurisdiction and frame status

A much higher than expected out-of-scope rate was observed in some strata during the first half of collection. Consequently, it was decided to add sample to make up for the expected loss of respondents relative to the number expected before collection.

Table A.3.3
Allocation of the raw sample by jurisdiction after additional sample (based on the frame status)

Response rates

Survey response rates help to measure how effectively the target population was sampled and how well the collection process worked, and they are good indicators of the quality of the estimates produced. The table below shows the NAS response rate at collection, at the national level as well as at the jurisdictional level.

Table A.3.4
Response rates by province and territory and frame status for National Apprenticeship Survey, 2007

Estimation and weighting

The principle behind estimation in a probability sample such as the NAS is that each person in the sample "represents", besides himself or herself, several other persons not in the sample. For estimates produced from survey data to be representative of the target population, a weight is given to each person who responded to the survey. This weight corresponds to the number of persons in the target population represented by the respondent, and it is calculated for each record during the weighting phase. The weight appears on the microdata file and must be used to derive meaningful estimates from the survey.

For weighting purposes, this survey can be seen as a two-phase survey: the first phase corresponds to the selection of the sample, and the responding units correspond to the second-phase sample. The first-phase weight is the inverse of the probability of selection of the apprentice. This first-phase weight is then multiplied by a second-phase adjustment factor. For the purpose of this adjustment, response homogeneous groups (RHGs) are created based on the characteristics of the respondents and the non-respondents. The adjustment factor is simply the inverse of the observed weighted response rate in each RHG.
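This two-phase weight computation can be sketched as follows; a minimal illustration with invented numbers, where the function names are assumptions and the production system operates on full RHG membership lists rather than these toy inputs:

```python
def design_weight(selection_prob):
    """First-phase weight: the inverse of the probability of selection."""
    return 1.0 / selection_prob

def rhg_adjustment(respondent_weights, nonrespondent_weights):
    """Second-phase adjustment for one response homogeneous group (RHG):
    the inverse of the weighted response rate within the group."""
    responded = sum(respondent_weights)
    total = responded + sum(nonrespondent_weights)
    return total / responded

def final_weight(selection_prob, respondent_weights, nonrespondent_weights):
    """First-phase weight multiplied by the RHG adjustment factor."""
    return design_weight(selection_prob) * rhg_adjustment(
        respondent_weights, nonrespondent_weights)
```

For instance, an apprentice selected with probability 0.25 has a design weight of 4; if the respondents in his or her RHG carry 800 of the group's 1,000 units of weight, the adjustment factor is 1.25 and the final weight is 5.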

For variance estimation, the two-phase approach of the Generalized Estimation System (GES) was used.

Data accuracy

While considerable effort is made to ensure high standards throughout the collection and processing of data, the resulting estimates are inevitably subject to a certain degree of error. There are two major types of error: non-sampling and sampling.

Non-sampling errors may result from frame imperfections and non-response. Owing to frame imperfections, a large proportion of apprentices in the sample (25.9%) were found to be out of scope (no apprentice activities during the target reference period): they reported that they had never been an apprentice, or that they had been one but not within the targeted reference years. Provincial/territorial out-of-scope rates ranged from 10% to 40%. The out-of-scope rate was 7.8% for completers, 35% for long-term continuers and 39.3% for discontinuers.

Table A.3.5
Out-of-scope rates by jurisdiction and frame status (calculated from resolved units only)

There is an important coverage difference between Quebec and the other provinces. In Quebec, essentially only the construction trades are represented on the NAS frame. The list of apprentices for the construction trades was provided by the Commission de la construction du Québec (CCQ). Emploi-Québec (EQ) provided a list for four non-construction trades, but this list was incomplete (no completers for three of the four trades), so only one trade (industrial electrician) from the EQ list was kept on the NAS frame. Comparisons of estimates between Quebec and other provinces should therefore be avoided unless the comparison is made with similar trades.

A major source of non-sampling error in surveys is the effect of non-response on the survey results. The extent of non-response varies from partial non-response (failure to answer one or some questions) to total non-response. Total non-response occurred when the interviewer was unable to contact the respondent, no member of the household was able to provide the information, or the respondent refused to participate in the survey. Total non-response was handled by adjusting the weight of individuals who responded to the survey to compensate for those who did not respond.

In most cases, partial non-response occurred when the respondent did not understand or misinterpreted a question, refused to answer a question, or could not recall the requested information. In partial and item non-response cases, donor imputation was performed for certain variables: the wage- and salary-related variables of the Labour Force (LF) and Most Recent Job (MR) modules.
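Donor imputation of this kind can be sketched as follows; a minimal nearest-donor (hot-deck) illustration, where the matching variables, field names and scoring rule are illustrative assumptions rather than the NAS production rules:

```python
def impute_wages(records, match_keys=("trade_group", "region")):
    """Fill missing 'wage' values from the most similar complete record
    (the donor); similarity here is the number of matching keys."""
    donors = [r for r in records if r.get("wage") is not None]
    if not donors:
        return records  # nothing to impute from
    for r in records:
        if r.get("wage") is None:
            best = max(donors,
                       key=lambda d: sum(d[k] == r[k] for k in match_keys))
            r["wage"] = best["wage"]
    return records
```

A record missing its wage inherits the wage of the complete record that matches it on the most keys, so an electrician in the west with a missing wage would receive the wage of another western electrician rather than that of an eastern carpenter.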

The basis for measuring the potential size of sampling errors is the standard error of the estimates derived from survey results. Because of the large variety of estimates that can be produced from a survey, the standard error of an estimate is usually expressed relative to the estimate to which it pertains. This resulting measure, known as the coefficient of variation (CV) of an estimate, is obtained by dividing the standard error of the estimate by the estimate itself and is expressed as a percentage of the estimate.
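As a worked sketch, the CV computation looks like the following; the second helper adds the textbook CV of an estimated proportion under a simple-random-sampling assumption, which ignores the design effects and finite population correction present in a stratified survey like the NAS:

```python
import math

def coefficient_of_variation(estimate, standard_error):
    """CV of an estimate: its standard error divided by the estimate,
    expressed as a percentage of the estimate."""
    return 100.0 * standard_error / estimate

def cv_of_proportion(p, n):
    """Approximate CV (%) of an estimated proportion p from a simple
    random sample of size n, ignoring design effects and the finite
    population correction."""
    return 100.0 * math.sqrt(p * (1 - p) / n) / p
```

For example, an estimate of 10,000 with a standard error of 1,660 has a CV of 16.6%.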


Note

  1. Nunavut data was unavailable for the survey.