3.5 Estimation
3.5.1 Weighting


The principle behind estimation in a probability survey is that each sample unit represents not only itself, but also several units of the survey population. The design weight of a unit usually refers to the average number of units in the population that each sampled unit represents. This weight is determined by the sampling method and is an important part of the estimation process.

While the design weights can be used for estimation, most surveys produce a set of estimation weights by adjusting the design weights to improve the precision of the final estimates. The two most common reasons for making adjustments are to account for nonresponse and to make use of pertinent data available from other sources. Once the final estimation weights have been calculated, they are applied to the sample data in order to compute estimates.

Design weight

The first step in estimation is assigning a weight to each sampled unit. The design weight ($w_d$), which is the average number of units in the population that each sampled unit represents, is the inverse of its inclusion probability ($\pi$) in the sample.

$w_d = 1 / \pi$

If the inclusion probability is 1/50, then each selected unit represents on average 50 units in the population and the design weight is $w_d = 50$.

Some sample designs assign the same design weights for all units in the sample, while others give different design weights to sampled units for various reasons, such as improving precision or reducing cost.

Example 1: Simple Random Sample

Suppose there are $N = 100$ Grade 12 (or Secondary 5) students in a high school. A simple random sample of $n = 25$ students is selected, and the selected students are invited to complete a questionnaire about their career plans.

  • The inclusion probability is:
    $\pi = n / N = 25 / 100 = 1 / 4$.
  • The design weight is:
    $w_d = \frac{1}{\pi} = 1 / \frac{1}{4} = 4$.

Each student selected in the sample represents four students of the school.
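The calculation in Example 1 can be sketched in a few lines of Python. This is only an illustration: the function name `design_weight` is an assumption made here, and exact fractions are used so that the result is not affected by rounding.

```python
from fractions import Fraction


def design_weight(inclusion_probability):
    """Return the design weight, the inverse of the inclusion probability."""
    if not 0 < inclusion_probability <= 1:
        raise ValueError("inclusion probability must be in (0, 1]")
    return 1 / inclusion_probability


# Example 1: simple random sample of n = 25 students out of N = 100.
N = 100
n = 25
pi = Fraction(n, N)      # inclusion probability, 1/4
w_d = design_weight(pi)  # design weight

print(pi)   # 1/4
print(w_d)  # 4
```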

Production of simple estimates

Once the weights have been calculated, estimates can be produced. Only simple estimates, such as totals, averages and proportions, are covered here.

Estimating a population total

The estimate of a population total ($\hat{Y}$) is calculated by multiplying the weight by the value of interest for each selected unit and then summing these products over all units in the sample. For categorical variables, the estimate of the number of units in a category is calculated by adding together the weights of the responding units in that category.

Example 2: Simple Random Sample (Continued)

Suppose that, of the 25 students selected in the sample, 10 applied to science programs. Then the estimated total number of students in the school who applied to science programs is:

$\hat{Y} = 4 \times 10 = 40$
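As a rough sketch, the weighted total in Example 2 can be computed by coding each sampled student as 1 (applied to a science program) or 0 (did not) and summing weight times value. The function name `estimate_total` and the 0/1 coding are assumptions made for this illustration.

```python
def estimate_total(weights, values):
    """Estimate a population total as the sum of weight * value over the sample."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    return sum(w * y for w, y in zip(weights, values))


# Example 2: 25 sampled students, each with design weight 4.
weights = [4] * 25
# 1 if the student applied to a science program, 0 otherwise (10 applied).
applied_to_science = [1] * 10 + [0] * 15

total_science = estimate_total(weights, applied_to_science)
print(total_science)  # 40
```

With a 0/1 variable, this is the same as adding together the weights of the units that have the characteristic.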

Estimating a population average

The estimate of the average ($\hat{\bar{Y}}$) in the population is the estimate of the total value for the variable of interest ($\hat{Y}$) divided by the estimate of the total number of units ($\hat{N}$) in the population.

$\hat{\bar{Y}} = \frac{\hat{Y}}{\hat{N}}$

Example 3: Simple Random Sample (Continued)

Students usually apply to more than one program when applying for university study. Suppose that, of the 25 students selected in the sample, 5 apply to only 1 program, 10 apply to 2 programs and 10 apply to 3 programs. Then the average number of applications per student is calculated as follows:

  • Total number of applications is given by:
    $\hat{Y} = (4 \times 5 \times 1) + (4 \times 10 \times 2) + (4 \times 10 \times 3) = 220$
  • Total number of students is given by:
    $\hat{N} = 4 \times 25 = 100$
  • Average number of applications per student is given by:
    $\hat{\bar{Y}} = \frac{\hat{Y}}{\hat{N}} = \frac{220}{100} = 2.2$
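The three steps of Example 3 can be reproduced with a short Python sketch. The function name `estimate_mean` is an assumption made here; it divides the estimated total by the estimated population size, which is the sum of the weights.

```python
def estimate_mean(weights, values):
    """Estimate a population average as (estimated total) / (estimated size)."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    y_hat = sum(w * y for w, y in zip(weights, values))  # estimated total
    n_hat = sum(weights)                                 # estimated population size
    return y_hat / n_hat


# Example 3: number of programs each of the 25 sampled students applied to.
weights = [4] * 25
applications = [1] * 5 + [2] * 10 + [3] * 10

y_hat = sum(w * y for w, y in zip(weights, applications))
n_hat = sum(weights)
mean_applications = estimate_mean(weights, applications)

print(y_hat, n_hat, mean_applications)  # 220 100 2.2
```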

Estimating a population proportion

The estimate of the proportion of the survey population having a given characteristic is calculated much like the estimate of a population average: it is a quotient of two estimated totals. The difference lies in the numerator. When estimating a proportion ($\hat{P}$), the numerator is the estimate of the total number of units possessing the given characteristic ($C$); when estimating an average, it is the estimate of the total value of a quantitative variable.

$\hat{P} = \frac{\hat{N}_C}{\hat{N}}$

Example 4: Simple Random Sample (Continued)

Suppose that, of the 25 students selected in the sample, 10 are female and 15 are male. Overall, 10 students apply to science programs: 5 females and 5 males. The proportion of students applying to science programs, by gender, is calculated as follows:

  1. Total number of students applying to science programs, by gender, is given by:
    $\hat{N}_{male,\ science} = 5 \times 4 = 20$
    $\hat{N}_{female,\ science} = 5 \times 4 = 20$
  2. Total number of students, by gender, is given by:
    $\hat{N}_{male} = 15 \times 4 = 60$
    $\hat{N}_{female} = 10 \times 4 = 40$
  3. Proportion of students applying to science programs, by gender, is given by:
    $\hat{P}_{male,\ science} = \frac{\hat{N}_{male,\ science}}{\hat{N}_{male}} = \frac{20}{60} = 1/3$
    $\hat{P}_{female,\ science} = \frac{\hat{N}_{female,\ science}}{\hat{N}_{female}} = \frac{20}{40} = 1/2$
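The proportions by gender in Example 4 can be sketched as follows. The record layout (a gender label and a flag for applying to science) and the function name `estimate_proportion` are assumptions made for this illustration.

```python
def estimate_proportion(records, weight, gender):
    """Estimate the proportion of one gender applying to science programs.

    records: list of (gender, applied_to_science) pairs for the sampled units.
    weight: the common design weight of every sampled unit.
    """
    # Estimated number of students of this gender applying to science.
    n_hat_science = sum(weight for g, applied in records if g == gender and applied)
    # Estimated number of students of this gender.
    n_hat_gender = sum(weight for g, _ in records if g == gender)
    return n_hat_science / n_hat_gender


# Example 4: 10 females (5 in science) and 15 males (5 in science).
sample = (
    [("female", True)] * 5 + [("female", False)] * 5
    + [("male", True)] * 5 + [("male", False)] * 10
)

p_male = estimate_proportion(sample, 4, "male")
p_female = estimate_proportion(sample, 4, "female")
print(p_female)  # 0.5
```

Here `p_male` equals 20/60, that is, one third. Because every unit has the same weight, the weight cancels in the ratio; with unequal weights it would not.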

Other estimation methods

The estimation method described above for simple random sampling is the simplest one; more advanced methods are available and are widely applied in surveys. The most appropriate estimation method depends on several factors, such as the characteristics to be estimated, the types of data, reliability, cost and timeliness. At Statistics Canada, specialized estimation systems are used to produce estimates involving complicated procedures in a timely manner.

Adjusting the weights

Quite often design weights have to be adjusted prior to estimation, and there are two main types of adjustment: nonresponse adjustment and adjustment for external information.

Adjusting for nonresponse

Almost all surveys suffer from nonresponse, which occurs when all or some of the key information requested from sampled units is unavailable. This can happen, for example, because the sampled unit refuses to participate, no contact is made, the unit cannot be located or the information obtained is unusable. The easiest way to deal with such nonresponse is to ignore it, but this leads to inaccurate estimates.

Two common ways of dealing with this kind of nonresponse are to impute the missing answers or to adjust the design weights based on the assumption that the responding units represent both responding and nonresponding units. In the latter case, the design weights of the nonrespondents are redistributed among the respondents.
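A minimal sketch of the weight adjustment is given below. It assumes the simplest possible case, in which all sampled units form a single adjustment class and every respondent's weight is multiplied by the same factor (total weight of the sample divided by total weight of the respondents). The function name and the figure of 20 respondents out of 25 are hypothetical; real surveys often form several adjustment classes.

```python
def adjust_for_nonresponse(design_weights, responded):
    """Redistribute the weights of nonrespondents among respondents.

    Assumes a single adjustment class: every respondent is scaled by the
    same factor and every nonrespondent receives a weight of zero.
    """
    total_all = sum(design_weights)
    total_respondents = sum(w for w, r in zip(design_weights, responded) if r)
    if total_respondents == 0:
        raise ValueError("no respondents to carry the weights")
    factor = total_all / total_respondents
    return [w * factor if r else 0.0 for w, r in zip(design_weights, responded)]


# Hypothetical continuation of Example 1: 20 of the 25 sampled students respond.
design_weights = [4.0] * 25
responded = [True] * 20 + [False] * 5

adjusted = adjust_for_nonresponse(design_weights, responded)
print(adjusted[0])    # 5.0
print(sum(adjusted))  # 100.0
```

Each respondent now represents five students instead of four, and the adjusted weights still add up to the 100 students in the school.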

Adjusting for external information

Sometimes information about the survey population is available from other sources, for example from a census or an administrative file. This information can also be incorporated in the weighting process.

There are two main reasons for using external (auxiliary) data at estimation. The first reason is that it is often important for the survey estimates to match known population totals or estimates from another, more reliable, survey. For example, many social surveys adjust their survey estimates in order to be consistent with estimates (age, sex distributions, etc.) of the most recent census of the population. External information may also be obtained from administrative data or from another survey that is considered to be more reliable because of its larger sample size or because the published estimates must be respected.

The second reason is to improve the precision of the estimates. This is possible as long as the values of the auxiliary variables are collected for the surveyed units and population totals or estimates for these variables are available from another reliable source.
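One simple way of making survey estimates match known population totals is post-stratification, sketched below. This is only one of several possible techniques, chosen here for illustration. The function name `poststratify` and the known totals (45 females and 55 males) are hypothetical and are not taken from the examples above.

```python
def poststratify(weights, groups, known_totals):
    """Scale weights so that each group's weights sum to its known total."""
    # Estimated group sizes from the current weights.
    estimated = {}
    for w, g in zip(weights, groups):
        estimated[g] = estimated.get(g, 0.0) + w
    # Multiply each weight by (known total) / (estimated total) for its group.
    return [w * known_totals[g] / estimated[g] for w, g in zip(weights, groups)]


# Sample of Example 4: 10 females and 15 males, each with design weight 4.
weights = [4.0] * 25
groups = ["female"] * 10 + ["male"] * 15
# Hypothetical totals from an external source such as school records.
known_totals = {"female": 45, "male": 55}

adjusted = poststratify(weights, groups, known_totals)
female_total = sum(w for w, g in zip(adjusted, groups) if g == "female")
male_total = sum(w for w, g in zip(adjusted, groups) if g == "male")
print(adjusted[0])  # 4.5
```

After the adjustment, the weights of the sampled females sum to 45 and those of the sampled males sum to 55 (up to floating-point rounding), so estimates by gender agree with the external totals.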

