# Sampling weights

• I
Hi,
I am trying to understand the weights, or multipliers, allocated to the sample in my data (for my research work). They have given a formula for how the weight is calculated and how aggregates are to be obtained. What I am trying to understand is how they obtained the formula for the multiplier, which they have not explained. Is there any online reference where I can understand how they derived the formula, I mean based on a first-stage sample drawn by stratified sampling, a second stage drawn by simple random sampling, and so on? I need a basic text.

## Answers and Replies

BvU
Homework Helper
What do you know about statistics, errors, etc.? You posted at intermediate level, so I expect you know at least something.
Furthermore, your question hangs in the air without a reference and/or examples:
"weights or multipliers allocated to sample in my data"
Allocated by whom?

I am sorry if it seems so. I have a large data file and I was confused to see that. I will try to reframe.

FactChecker
Gold Member
It may be important for research work and you specify an intermediate level question. So I will refer you to Sampling Techniques by Cochran. It is a well-known classic.

Thank you. The data is collected by an Indian government agency. I already knew the data was not very reliable, but when I came to the sampling design there seemed to be no logic to it, nor have they given any. I am working to understand it; it will take time. :)

What formula have "they" given?

I am attaching the estimation procedure file here. Schedule 2.34 is the questionnaire used for the survey. If things are not clear, I will try to explain. For aggregating the data they have given a formula, but I cannot understand how they derived the multiplier.

FactChecker
Gold Member
I am attaching the estimation procedure file here. Schedule 2.34 is the questionnaire used for the survey. If things are not clear, I will try to explain. For aggregating the data they have given a formula, but I cannot understand how they derived the multiplier.
It looks like section 4, Estimation Procedure, describes it in great detail. There may not be a simple explanation of their exact formula. For an expert description of stratified sampling, the reference I gave in post #4 is a good one. For a general idea of the technique, see https://en.wikipedia.org/wiki/Stratified_sampling. Translating the general technique into the formulas for your particular application looks complicated.
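
For a rough feel of where such multipliers come from, here is a minimal sketch of the generic textbook case, not the agency's actual procedure. It assumes simple random sampling without replacement at both stages: `n_h` of the `N_h` first-stage units (FSUs) in a stratum are sampled, and then `e` of the `E` listed enterprises in each sampled FSU are surveyed. Under that assumption each surveyed enterprise stands in for `(N_h / n_h) * (E / e)` enterprises. The function names and all numbers are hypothetical.

```python
def two_stage_weight(N_h, n_h, E, e):
    """Design weight of one surveyed unit, assuming SRS at both stages."""
    first_stage = N_h / n_h   # inverse of the FSU selection probability n_h / N_h
    second_stage = E / e      # inverse of the within-FSU selection probability e / E
    return first_stage * second_stage


def estimate_total(records):
    """Weighted sum of the observed values y (an estimate of the stratum total)."""
    return sum(two_stage_weight(r["N_h"], r["n_h"], r["E"], r["e"]) * r["y"]
               for r in records)


# Hypothetical stratum: 10 FSUs, 2 sampled.
# FSU A lists 20 enterprises and surveys 2; FSU B lists 30 and surveys 3.
records = (
    [{"N_h": 10, "n_h": 2, "E": 20, "e": 2, "y": y} for y in (5.0, 7.0)]
    + [{"N_h": 10, "n_h": 2, "E": 30, "e": 3, "y": y} for y in (4.0, 6.0, 8.0)]
)

total = estimate_total(records)
n_units = sum(two_stage_weight(r["N_h"], r["n_h"], r["E"], r["e"]) for r in records)
print(total, n_units)  # prints 1500.0 250.0
```

The sum of the weights (250) is the estimated number of enterprises in the stratum, which matches 10 FSUs times the average of 25 listed enterprises per sampled FSU. A real design with sub-strata, sub-samples, or unequal selection probabilities would change the first-stage factor.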

What exactly confuses me is that they have used two questionnaires (Schedule 0.0 and Schedule 2.34), both with the same sampling, yet the formula for the multiplier is different for each.

FactChecker
Gold Member
What exactly confuses me is that they have used two questionnaires (Schedule 0.0 and Schedule 2.34), both with the same sampling, yet the formula for the multiplier is different for each.
It would make things easier if you would tell us where you are seeing these things. What page and section are these numbers on? If these questionnaires are optional, there might be a difference in the number of responses received from each.

Other than that, I am not able to read through the entire thing to figure it out. If you have a question about a specific detail of the document, someone might be able to help.

I am attaching the estimation procedure file here. Schedule 2.34 is the questionnaire used for the survey. If things are not clear, I will try to explain. For aggregating the data they have given a formula, but I cannot understand how they derived the multiplier.
There's a clue in Section 4.7 in the first pdf:
(ii) Multipliers have to be computed on the basis of information available in the listing schedule irrespective of any misclassification observed between the listing schedule and detailed enquiry schedule.

Perhaps referring to the specified schedules will shed more light. I agree with what FactChecker suggests: a more detailed question might enable a more specific response, since there are many fractional sum formulae in the table of multipliers.

The question that comes directly to my mind is how they derived the multipliers. It is not a simple question, and they have not given a derivation. To add to the confusion, the multipliers for two schedules with the same sampling are different. Maybe I should first go through Sampling Techniques by Cochran, but there seems to be a lack of logic in what they have written (Section 4.7, page 31).

FactChecker
Gold Member
So the difference between the Schedule 0.0 and Schedule 2.34 factors looks like a multiplier: $$\text{Schedule 2.34 factor}_{jd} = \frac{n_{stm}}{n_{stmj}} \cdot \frac{E_{stmidj}}{e_{stmidj}} \times \text{Schedule 0.0 factor},$$
for the j-th second-stage stratum (j = 1, 2, 3, ..., 19) of the d-th segment (d = 1, 2, 9) of the i-th FSU belonging to the m-th sub-sample for the t-th sub-stratum of the s-th stratum.

n = number of sample FSUs surveyed including ‘zero cases’ but excluding casualty for a particular sub-sample and sub-stratum.
E = total number of enterprises listed in a second-stage stratum of an FSU / segment of sample FSU
e = number of enterprises surveyed in a second-stage stratum of an FSU / segment of sample FSU
Beyond that, I am having a hard time translating the definitions of n, E, and e into logical weighting factors.
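
As a small illustration, the relation above can be written as a function. This is only a sketch of the formula as read from the document in this thread, not a verified implementation of the official procedure; the function name and the numbers are hypothetical. Under simple random sampling within a second-stage stratum, the ratio `E / e` is the inverse of the sampling fraction, so each surveyed enterprise represents `E / e` listed ones.

```python
def schedule_234_factor(factor_00, n_stm, n_stmj, E, e):
    """Schedule 2.34 factor from the Schedule 0.0 factor, per the relation above.

    n_stm  : sample FSUs surveyed in the sub-sample and sub-stratum
    n_stmj : the corresponding count for second-stage stratum j
    E, e   : enterprises listed / surveyed in that second-stage stratum
    """
    if n_stmj <= 0 or e <= 0:
        raise ValueError("n_stmj and e must be positive")
    return (n_stm / n_stmj) * (E / e) * factor_00


# Hypothetical numbers, chosen only to show the arithmetic:
# (8 / 4) * (30 / 6) * 12.5 = 2 * 5 * 12.5
factor = schedule_234_factor(factor_00=12.5, n_stm=8, n_stmj=4, E=30, e=6)
print(factor)  # prints 125.0
```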

They should have actually given a proof of how they derived it. I am trying to understand it.