2 Formalization of the optimization problem

Marco Ballin and Giulio Barcaroli


Universe of alternative stratifications

We define as sampling frame $F$ a set of $N$ records containing information (organised in variables) related to $N$ individuals of the reference population. Some variables are useful for identifying units, while others can be used to define the sampling strategy. The values of the latter (from now on: auxiliary variables) can be observed by means of a census, or obtained from other sources such as administrative registers.

We assume that a set of $M$ auxiliary variables $X_m \ (m = 1, \ldots, M)$ is available in the frame. This set may contain different types of variables (nominal, ordinal, or continuous). We also assume that continuous auxiliary variables are split into classes by applying suitable transformation algorithms.

All such variables can potentially be used to stratify the units in the frame.

Under these assumptions, we can associate with each auxiliary variable $X_m$ a vector $d_m = \{x_1, \ldots, x_{k_m}\}$ of contiguous integer values, each of them representing an original value in the domain set.
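As an illustration, the coding of an auxiliary variable into contiguous integers can be sketched as follows. The function names, the example values and the cut points are hypothetical; in particular, the fixed cut points only stand in for whatever transformation algorithm is actually used to split continuous variables into classes.

```python
from bisect import bisect_right


def encode_contiguous(values):
    """Map each distinct original value to an integer code 1..k_m."""
    # Sort the distinct values so that the coding is reproducible.
    domain = sorted(set(values))
    code_of = {value: code for code, value in enumerate(domain, start=1)}
    return [code_of[v] for v in values], code_of


def to_classes(values, cut_points):
    """Assign each continuous value to a class 1..len(cut_points)+1.

    A value equal to a cut point falls in the class above it.
    """
    return [bisect_right(cut_points, v) + 1 for v in values]


# Hypothetical nominal auxiliary variable observed on six frame units.
region = ["north", "south", "north", "centre", "south", "south"]
codes, code_of = encode_contiguous(region)

# Hypothetical continuous variable, split with assumed cut points 10 and 50.
classes = to_classes([5, 10, 75], cut_points=[10, 50])
```

Here `codes` plays the role of the coded variable, and `len(code_of)` is the number $k_m$ of distinct values.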

Then, the most detailed stratification of $F$ can be considered as the result of the Cartesian product $CP = X_1 \times X_2 \times \cdots \times X_M$.

The maximum number of strata will be $K = \prod_{m=1}^{M} k_m - I^{*}$, where $I^{*}$ is the number of impossible or absent combinations of values in the frame. So, the most detailed stratification of the frame contains $K$ strata, corresponding to all possible combinations of values of the $M$ auxiliary variables. We call atomic strata the strata belonging to this particular stratification.
Each atomic stratum is characterised by a unique combination of values of the $M$ auxiliary variables. We can assign a label $l_k \ (k = 1, \ldots, K)$ to each atomic stratum.
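A minimal sketch of how atomic strata could be derived from a coded frame is given below. The frame content and the label format are invented for illustration; the point is only that $K$ counts the combinations actually present, so that $I^{*}$ is the difference between the full Cartesian product and $K$.

```python
from collections import Counter
from math import prod


def atomic_strata(frame):
    """Return {combination of auxiliary values: number of units in it}."""
    return Counter(tuple(record) for record in frame)


# Hypothetical frame: each record holds the integer codes of M = 2 variables.
frame = [(1, 1), (1, 2), (2, 1), (1, 1), (3, 2), (2, 1), (3, 2)]
M = 2
strata = atomic_strata(frame)

# k[m] = number of distinct codes observed for variable m.
k = [len({record[m] for record in frame}) for m in range(M)]
K = len(strata)            # atomic strata actually present in the frame
I_star = prod(k) - K       # impossible or absent combinations

# Assign a label l1, ..., lK to each atomic stratum (assumed naming scheme).
labels = {combo: f"l{i}" for i, combo in enumerate(sorted(strata), start=1)}
```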

If we consider the labelled set of atomic strata $L = \{l_1, l_2, \ldots, l_K\}$, we can define the set of all its possible partitions $P_1, P_2, \ldots, P_B$, where $B$ can be calculated by using the Bell formula:

$$B_K = \sum_{i=0}^{K-1} \binom{K-1}{i} \cdot B_i \qquad (B_0 = 1)$$

We define the set $\{P_1, P_2, \ldots, P_B\}$ of partitions of $L$ as the universe (or space) of stratifications.
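The size of this universe can be computed directly from the Bell recurrence. The short sketch below (function name chosen here for illustration) builds the sequence $B_0, \ldots, B_K$ with exact integer arithmetic.

```python
from math import comb


def bell_numbers(K):
    """Return [B_0, ..., B_K] using B_n = sum_{i=0}^{n-1} C(n-1, i) * B_i."""
    B = [1]  # B_0 = 1
    for n in range(1, K + 1):
        B.append(sum(comb(n - 1, i) * B[i] for i in range(n)))
    return B


B = bell_numbers(10)
print(B[4], B[10])  # 15 115975
```

Even for ten atomic strata there are already more than one hundred thousand candidate stratifications.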

Assessment of a given stratification

Given a partition $P_i$ of $L$, characterized by $H$ strata, let $N_h$ and $S_{h,g}^2$ $(h = 1, \ldots, H;\ g = 1, \ldots, G)$ be respectively the number of units and the variances in stratum $h$ of the $G$ different survey target variables $Y_1, \ldots, Y_G$. Assuming simple random sampling of $n_h$ units without replacement in each stratum, the variance of the Horvitz-Thompson estimator of the total of the $g^{\text{th}}$ target variable $(\hat{T}_g)$ is

$$\mathrm{Var}(\hat{T}_g) = \sum_{h=1}^{H} N_h^2 \left(1 - \frac{n_h}{N_h}\right) \frac{S_{h,g}^2}{n_h}, \qquad g = 1, \ldots, G \qquad (2.1)$$
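For a single target variable, formula (2.1) translates almost literally into code. The sketch below uses made-up stratum sizes and variances; the function and argument names are illustrative only.

```python
def ht_total_variance(N, S2, n):
    """Variance (2.1) of the estimated total for one target variable.

    N[h]  : number of frame units in stratum h
    S2[h] : variance of the target variable in stratum h
    n[h]  : sample size in stratum h (1 <= n[h] <= N[h])
    """
    return sum(Nh**2 * (1 - nh / Nh) * S2h / nh
               for Nh, S2h, nh in zip(N, S2, n))


# Hypothetical design with H = 2 strata.
N = [100, 50]
S2 = [4.0, 9.0]
n = [10, 5]
var_total = ht_total_variance(N, S2, n)   # 3600 + 4050, up to rounding error
```

The factor $1 - n_h/N_h$ is the finite population correction: with $n_h = N_h$ in every stratum (a census) the variance is zero.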

Consider the following cost function

$$C(n_1, \ldots, n_H) = C_0 + \sum_{h=1}^{H} C_h n_h \qquad (2.2)$$

where $C_0$ indicates a fixed cost (not dependent on the sample size) and $C_h$ represents the average cost of observing a unit in stratum $h$.

Given $V_g \ (g = 1, \ldots, G)$, the upper bounds for the expected sampling variance of $\hat{T}_1, \ldots, \hat{T}_G$, the classical optimal multivariate allocation problem (Bethel 1985) can be defined as the search for the minimum (with respect to $n_h$) of the linear function $C$ under the convex constraints $\mathrm{Var}(\hat{T}_g) \le V_g \ (g = 1, \ldots, G)$:

$$\left\{ \begin{array}{l} \min C(n_1, \ldots, n_H) = C_0 + \sum_{h=1}^{H} C_h n_h \\ \mathrm{Var}(\hat{T}_g) = \sum_{h=1}^{H} N_h^2 \left(1 - \dfrac{n_h}{N_h}\right) \dfrac{S_{h,g}^2}{n_h} \le V_g \qquad g = 1, \ldots, G \end{array} \right. \qquad (2.3)$$

Bethel (1989) suggested that the problem can be more easily solved by considering the following function of $n_h$:

$$x_h = \begin{cases} 1/n_h & \text{if } n_h \ge 1 \\ \infty & \text{otherwise} \end{cases} \qquad (2.4)$$

Using $x_h$, the cost function can be written as

$$C(x_1, \ldots, x_H) = C_0 + \sum_{h=1}^{H} \frac{C_h}{x_h} \qquad (2.5)$$

and the variances as

$$\mathrm{Var}(\hat{T}_g) = \sum_{h=1}^{H} N_h^2 \left(1 - \frac{1}{x_h N_h}\right) S_{h,g}^2 \, x_h = \sum_{h=1}^{H} \left( N_h^2 S_{h,g}^2 x_h - N_h S_{h,g}^2 \right) \qquad g = 1, \ldots, G \qquad (2.6)$$
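For reference, the step from (2.1) to (2.6) can be written out term by term. Substituting $n_h = 1/x_h$ in the contribution of stratum $h$ gives

$$
\begin{aligned}
N_h^2\left(1-\frac{n_h}{N_h}\right)\frac{S_{h,g}^2}{n_h}
&= N_h^2\left(1-\frac{1}{x_h N_h}\right) S_{h,g}^2 \, x_h \\
&= N_h^2 S_{h,g}^2 \, x_h - \frac{N_h^2 S_{h,g}^2 \, x_h}{x_h N_h} \\
&= N_h^2 S_{h,g}^2 \, x_h - N_h S_{h,g}^2 .
\end{aligned}
$$

The second term does not depend on $x_h$, so each variance is a linear function of $x_1, \ldots, x_H$ plus a constant.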

Consequently, the multivariate allocation problem can be defined as the search for the minimum (with respect to $x_h$) of the convex function (2.5) under a set of linear constraints

$$\sum_{h=1}^{H} \left( N_h^2 S_{h,g}^2 x_h - N_h S_{h,g}^2 \right) \le V_g \qquad g = 1, \ldots, G \qquad (2.7)$$
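Because the constraints are linear in $x_h$, they can be stored as a coefficient matrix and a right-hand side, in the form $\sum_h A_{g,h} x_h \le b_g$ with $b_g = V_g + \sum_h N_h S_{h,g}^2$. The following sketch builds these quantities and checks a candidate allocation; the names `A`, `b` and all numeric values are assumptions made for the example, and this is a feasibility check only, not the Bethel algorithm.

```python
def linear_constraints(N, S2, V):
    """Coefficients of (2.7) written as sum_h A[g][h] * x[h] <= b[g].

    N[h]     : stratum population sizes
    S2[g][h] : variance of target variable g in stratum h
    V[g]     : upper bound on the variance of the g-th estimated total
    """
    A = [[Nh**2 * s2 for Nh, s2 in zip(N, row)] for row in S2]
    b = [Vg + sum(Nh * s2 for Nh, s2 in zip(N, row))
         for Vg, row in zip(V, S2)]
    return A, b


def is_feasible(x, A, b):
    """True if the allocation x = (1/n_1, ..., 1/n_H) meets every bound."""
    return all(sum(a * xh for a, xh in zip(row, x)) <= bg
               for row, bg in zip(A, b))


# Hypothetical design: H = 2 strata, G = 2 target variables.
N = [100, 50]
S2 = [[4.0, 9.0], [1.0, 16.0]]
V = [8000.0, 9000.0]
A, b = linear_constraints(N, S2, V)

ok = is_feasible([1 / 10, 1 / 5], A, b)        # n = (10, 5)
too_small = is_feasible([1 / 5, 1 / 5], A, b)  # n = (5, 5)
```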

Bethel provided an algorithm, proved to converge to the solution (if it exists), by applying the Lagrangian multipliers method to this problem. An easier algorithm had previously been proposed by Chromy (1987); as Bethel pointed out, the Chromy algorithm works in most practical cases, but there is no proof that it converges when a solution exists.

The optimization approach illustrated here yields a continuous solution, which must be rounded to provide integer stratum sample sizes. Our implementation of the Bethel algorithm provides the $n_h$ values as the values $1/x_h$ rounded up to the next integer.
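The rounding step amounts to taking the ceiling of $1/x_h$. A small sketch, with illustrative values of $x_h$ and a hypothetical function name:

```python
from math import ceil


def sample_sizes(x):
    """Integer stratum sample sizes n_h = ceil(1 / x_h)."""
    return [ceil(1 / xh) for xh in x]


# Hypothetical continuous solution: 1/x = (12.5, 3.33...).
n = sample_sizes([0.08, 0.3])
```

Rounding up rather than to the nearest integer can only decrease each variance in (2.1), so the precision constraints remain satisfied after rounding, at the price of a slightly higher cost.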

It should be noted that the same approach can be used to deal with the multidomain problem. Let us consider the usual transformation for the domain estimation problem:

$$Y_i^d = \begin{cases} Y_i & \text{if unit } i \text{ belongs to domain } d \\ 0 & \text{otherwise} \end{cases}$$

If the quantities previously defined to describe the Bethel approach are computed using the variables $Y^d \ (d = 1, \ldots, D)$, then the multivariate allocation solution is the solution for the multidomain case.
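The domain transformation is simple to express in code: one new variable is created per domain, equal to the original value inside the domain and zero elsewhere. The data and names below are hypothetical.

```python
def domain_variables(y, domain):
    """Return {d: [Y_i^d for each unit i]}, with Y_i^d = Y_i if unit i
    belongs to domain d and 0 otherwise."""
    return {d: [yi if di == d else 0 for yi, di in zip(y, domain)]
            for d in sorted(set(domain))}


# Hypothetical target variable and domain membership for four units.
y = [5.0, 3.0, 8.0, 2.0]
domain = ["A", "B", "A", "B"]
yd = domain_variables(y, domain)
```

Each list in `yd` can then be treated as a separate target variable when computing the stratum variances $S_{h,g}^2$.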

Selection of the best stratification on the basis of a complete enumeration

In order to choose the best stratification of a given frame, i.e., the one that ensures the minimum cost $C(n_1, \ldots, n_H)$ associated with a sample whose total size and allocation comply with the precision constraints, it is possible to proceed as follows:

  • generate the most detailed stratification associated with $F$, that is, the set $L$ of atomic strata;
  • enumerate all partitions $P_i$ of $L$;
  • for each partition $P_i$, solve the corresponding allocation problem, which is equivalent to determining the vector $(n_1, \ldots, n_H)$, and calculate the value $C_i(n_1, \ldots, n_H)$ associated with $P_i$;
  • choose the partition $P_i$ for which $C_i(n_1, \ldots, n_H)$ is minimized.

In this way, the optimal solution is found by considering the whole universe of stratifications.
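The exhaustive procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_cost` and the `VALUES` table are hypothetical stand-ins for the cost $C_i(n_1, \ldots, n_H)$, which in the actual problem comes from solving the allocation problem for each partition.

```python
from typing import Callable, Iterator, List, Sequence

Partition = List[List[str]]


def set_partitions(items: Sequence[str]) -> Iterator[Partition]:
    """Yield every partition of `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for k in range(len(smaller)):
            yield smaller[:k] + [[first] + smaller[k]] + smaller[k + 1:]
        # ... or into a new block of its own.
        yield [[first]] + smaller


def best_partition(items: Sequence[str],
                   cost: Callable[[Partition], float]) -> Partition:
    """Evaluate the cost of every partition and keep the cheapest one."""
    return min(set_partitions(items), key=cost)


# Hypothetical data: one numeric value per element of L (assumption).
VALUES = {"a": 1.0, "b": 1.2, "c": 5.0, "d": 5.1}


def toy_cost(partition: Partition) -> float:
    """Toy cost (assumption): a fixed charge per stratum plus the spread
    of the values inside each stratum. It is NOT the allocation cost."""
    spread = sum(max(VALUES[x] for x in block) - min(VALUES[x] for x in block)
                 for block in partition)
    return 0.5 * len(partition) + spread


if __name__ == "__main__":
    print(best_partition(list(VALUES), toy_cost))
```

With four elements the search visits 15 partitions; the same code becomes unusable long before the set reaches a few dozen elements.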

Unfortunately, this procedure is applicable only in situations where the dimension $K$ of $L$ is low: in fact, the number of partitions (given by the Bell formula) grows very rapidly (for example, $B_4 = 15$, $B_{10} = 115{,}975$ and $B_{100} \approx 4.76 \times 10^{115}$).
Therefore, in most cases, complete enumeration of the solution space is not feasible. The present proposal, based on the genetic algorithm, makes it possible to explore the universe of stratifications and to identify a solution that is expected to be not far from the optimal one.
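The growth of the number of partitions can be checked numerically. The sketch below computes Bell numbers with the Bell triangle, a standard recurrence; the function name `bell_number` is chosen here for illustration only.

```python
def bell_number(k: int) -> int:
    """Return the Bell number B_k, the number of partitions of a k-element set."""
    if k < 0:
        raise ValueError("k must be non-negative")
    row = [1]  # first row of the Bell triangle; its first entry is B_0
    for _ in range(k):
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the entry above-left to the one before it.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


if __name__ == "__main__":
    for k in (4, 10, 100):
        print(k, bell_number(k))
```

Python integers have arbitrary precision, so the value for $K = 100$ is computed exactly rather than as a floating-point approximation.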

The genetic algorithm

A genetic algorithm (GA) is a search technique used in computing to find exact or approximate solutions to optimization and search problems. Genetic algorithms are a particular class of evolutionary algorithms that make use of techniques inspired by evolutionary biology, such as inheritance, mutation, selection and crossover (also called recombination) (Vose 1999; Schmitt 2001, 2004).

A GA is implemented as an iterative computer simulation in which an initial set of individuals, each one a potential solution to the current problem (represented by a vector called the genome), evolves by inheritance, mutation, selection and crossover, increasing the average fitness of successive generations. Here, the fitness corresponds to the objective function defined in the optimization problem, so that the evolution results in the maximization (or minimization) of the objective function.

The set of individuals treated in each iteration of the GA is called a generation. The evolution is the set of changes that occur in producing consecutive generations by iterating the process.

At each iteration of the GA, after the fitness of every individual in the generation has been evaluated, a set of individuals is stochastically selected (privileging those with higher fitness) and modified (recombined and sometimes randomly mutated) to form a new generation. This new generation is then evaluated in the next iteration of the algorithm. As individuals with the best fitness are more likely to be selected for generating individuals for the next generation, the GA produces an increase in average fitness in the course of the evolution.
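One iteration of this scheme can be sketched as follows. The concrete choices are assumptions made for illustration and are not taken from the text: genomes are lists of integer labels, selection is rank-weighted, crossover is one-point, and `toy_fitness` is a hypothetical fitness function.

```python
import random
from typing import Callable, List

Genome = List[int]


def next_generation(population: List[Genome],
                    fitness: Callable[[Genome], float],
                    n_labels: int,
                    mutation_rate: float,
                    rng: random.Random) -> List[Genome]:
    """Build a new generation of the same size (genomes need length >= 2)."""
    # Rank-based selection weights (assumption): worst gets 1, best gets len(population).
    ranked = sorted(population, key=fitness)
    weights = list(range(1, len(ranked) + 1))
    children: List[Genome] = []
    while len(children) < len(population):
        # Stochastic selection that privileges individuals with higher fitness.
        mother, father = rng.choices(ranked, weights=weights, k=2)
        # One-point crossover (recombination) of the two parents.
        cut = rng.randrange(1, len(mother))
        child = mother[:cut] + father[cut:]
        # Occasional random mutation of single genome elements.
        child = [rng.randrange(n_labels) if rng.random() < mutation_rate else gene
                 for gene in child]
        children.append(child)
    return children


def toy_fitness(genome: Genome) -> float:
    """Hypothetical fitness: genomes using fewer distinct labels score higher."""
    return -float(len(set(genome)))


if __name__ == "__main__":
    rng = random.Random(0)
    population = [[rng.randrange(3) for _ in range(6)] for _ in range(10)]
    for _ in range(20):
        population = next_generation(population, toy_fitness, 3, 0.05, rng)
    print(max(toy_fitness(g) for g in population))
```

For a minimization problem, the fitness can simply be taken as the negative of the cost, as `toy_fitness` does.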

The mutation rate parameter is expressed as the proportion of chromosomes (the genome elements) that can be mutated in each individual when the children for the next generation are produced. A high value leads to large differences between successive generations. It should be noted that a high mutation rate makes the GA more likely to avoid stagnating at local optima, at the price of slower convergence to the optimal solution, whilst a low value speeds up convergence but increases the risk of remaining trapped in a local optimum.
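The role of the mutation rate can be illustrated with a minimal mutation operator. This is a sketch under assumptions: genomes are lists of integer labels in `range(n_labels)`, and a mutated element is always replaced by a different label. Neither detail is specified in the text.

```python
import random
from typing import List


def mutate(genome: List[int], mutation_rate: float, n_labels: int,
           rng: random.Random) -> List[int]:
    """Mutate each genome element independently with probability `mutation_rate`.

    A mutated element receives a label different from its current one,
    drawn uniformly from the other labels (requires n_labels >= 2).
    """
    child = []
    for gene in genome:
        if rng.random() < mutation_rate:
            new_gene = rng.randrange(n_labels - 1)
            if new_gene >= gene:
                new_gene += 1  # skip the current label
            child.append(new_gene)
        else:
            child.append(gene)
    return child


def hamming(a: List[int], b: List[int]) -> int:
    """Number of positions at which two genomes differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    rng = random.Random(1)
    parent = [0, 1, 2, 0, 1, 2, 0, 1]
    for rate in (0.0, 0.1, 0.5, 1.0):
        print(rate, hamming(parent, mutate(parent, rate, 3, rng)))
```

Under these assumptions the expected number of changed elements is the mutation rate times the genome length, which is why a high rate produces children that differ strongly from their parents.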

Usually, the algorithm terminates when either a maximum number of iterations has been reached or further iterations no longer improve the current solution. In both cases, the optimal solution may or may not have been reached.
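The two stopping rules can be sketched as a generic loop. The names are hypothetical: `step` stands for one GA iteration that returns the best cost found so far, and `patience` (the number of consecutive non-improving iterations tolerated) is an assumption, since the text does not say how lack of improvement is detected.

```python
from typing import Callable, Tuple


def iterate_with_stopping(step: Callable[[float], float],
                          start: float,
                          max_iterations: int,
                          patience: int) -> Tuple[float, int, str]:
    """Minimize a cost until it stalls or the iteration budget is exhausted.

    Returns the best cost, the number of iterations performed and the
    reason for stopping.
    """
    best = start
    stalled = 0
    for iteration in range(1, max_iterations + 1):
        candidate = step(best)
        if candidate < best:
            best, stalled = candidate, 0  # improvement: reset the counter
        else:
            stalled += 1
        if stalled >= patience:
            return best, iteration, "no improvement"
    return best, max_iterations, "max iterations"


if __name__ == "__main__":
    # Toy step (assumption): the cost drops by 1 per iteration down to a floor of 3.
    print(iterate_with_stopping(lambda c: max(c - 1.0, 3.0), 10.0, 100, 5))
```

Neither stopping reason certifies optimality: the loop only reports that the search has stopped making progress or has run out of iterations.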
