jackman.stanford.edu/blog

under-coverage bias and Kish on field enumeration for area-based samples

Saturday May 19, 2012

Filed under: ANES,statistics — jackman @ 5:51 pm

As one of the PIs of the 2012 ANES, I gained some exposure to the nitty-gritty of how area probability samples work in practice. We’re using an ABS-frame (the USPS Delivery Sequence File), which we will supplement with some field enumeration in Census tracts where the DSF is thought to be subject to a reasonable amount of under-coverage.

What I’ve learned thus far:

(1) Kish’s Survey Sampling remains something of a bible for practitioners.

(2) This book really is for practitioners, with large sections devoted to what actually occurs in the field: how to walk around a block, listing addresses, etc. It’s odd to read this stuff. I mean, the rubber does have to hit the road at some point. But so much of it seems a little, well, folksy and even ad hoc, unless I’m missing other parts of the book where the underlying rationales are more rigorously explicated. I guess it has to be that way when you are trying to keep things simple for the non-statistician field workers.

(3) Take this, the case of how to augment a listing of dwellings when the field worker encounters dwellings not on the list (in our case this would be finding dwellings not on the DSF adjacent to a dwelling sampled from the DSF). Pages 341-2 of Survey Sampling give something of a “how-to” guide for the Half Open Interval procedure.

Take all unlisted dwellings if there are fewer than 5 of them? What is special about 5? If 5 or more, “write the office quickly” (presumably today, you’d call) and “wait for instructions”. And what, exactly, will those instructions be?

I’m sure there is some well-worked out basis for these recommendations somewhere, perhaps elsewhere in the book. At p56 Kish says that the “missed [but discovered] elements receive the same probability of selection as the pre-specified unique listings”.

Ok, but might you get too many unlisted dwellings this way? Interviewer workload becomes an issue then, which is where I guess “no more than 5” might come from.

But could you exploit whatever prior information you have about DSF under-coverage specific to the locality you’re working in? At that point I guess you might be stratifying dwellings in a given geographic unit into listed and unlisted, and heading towards a dual frame design etc.

Sub-sampling seems another idea: e.g., the design calls for r attempted interviews in a given locale. We sample r listed dwellings in the locale from, say, the DSF; field enumeration adds k unlisted dwellings to the frame around the listed r; we then attempt interviews at r dwellings drawn by simple random sampling without replacement (SWOR) from the r+k? This keeps the interviewer (IWR) workload down to r attempted interviews and the selection probabilities are “known”.
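
To make the sub-sampling idea concrete, here is a rough Python sketch. It is only an illustration of the arithmetic, not anything ANES or Kish prescribes. The function names, the dwelling labels and the first-stage probability of 0.01 are all made up for the example. It also assumes that an unlisted dwelling inherits the first-stage selection probability of the listed dwelling it was found next to, and it treats k as fixed once the enumeration is done, which is part of why “known” is in quotes.

```python
import random


def subsample_after_enumeration(listed, unlisted, r, seed=None):
    """Draw r dwellings SWOR from the listed dwellings plus the k found in the field.

    Returns the chosen dwellings and the retention probability r / (r + k),
    which is conditional on the k unlisted dwellings actually enumerated.
    """
    rng = random.Random(seed)
    pool = list(listed) + list(unlisted)  # augmented local frame, size r + k
    if r > len(pool):
        raise ValueError("cannot draw more dwellings than the frame contains")
    chosen = rng.sample(pool, r)  # simple random sample without replacement
    retention_prob = r / len(pool)
    return chosen, retention_prob


def design_weight(first_stage_prob, retention_prob):
    """Inverse of the overall selection probability (hypothetical helper).

    Assumes an unlisted dwelling carries the first-stage probability of the
    listed dwelling it is attached to.
    """
    return 1.0 / (first_stage_prob * retention_prob)


# Illustrative values only: r = 4 listed dwellings, k = 2 found in the field.
listed = ["L1", "L2", "L3", "L4"]
unlisted = ["U1", "U2"]
chosen, p_keep = subsample_after_enumeration(listed, unlisted, r=4, seed=2012)
weight = design_weight(first_stage_prob=0.01, retention_prob=p_keep)
```

With r = 4 and k = 2 the retention probability is 4/6, so every dwelling in the augmented frame, listed or not, is kept with the same conditional probability and the interviewer still only has 4 doors to knock on. The cost is that the weights now vary across locales with k.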

The literature on snowball or “respondent-driven” sampling in social network land must have some relevant ideas here too. Hitting r listed dwellings and then looking around for unlisted dwellings seems a lot like what goes on with sampling on networks for “hidden” populations etc.

Finally – I have to note that this stuff really is probably 2nd order at best. We’re doing our best on the design for the in-person components of ANES 2012, I think. But there is this big scary monster out there, waiting for us in the Fall when we go into the field, and its name is non-response. As a source of bias this has to be 10x what we’re looking at from DSF under-coverage.
