beating up on opt-in Internet samples (again)

Thursday September 10, 2009

Filed under: statistics — jackman @ 10:19 pm

Gary Langer does it again, this time with supporting references to a paper by Jon Krosnick and six co-authors; Doug Rivers (finally!) replies, and at length. Two of Krosnick’s co-authors are former students of mine; current students are thanked in the acknowledgements; and, like Krosnick, Rivers is my colleague. Stanford really is ground zero in this debate.

Langer says:

I welcome any coherent theoretical defense of the use of convenience samples in estimating population values; it’s a debate we need to have.

And in his earlier post he said:

I have yet to hear any reasonable theoretical justification for the calculation of sampling error with a convenience sample.

Got one? Hit me.

Try this: model-based inference is an idea that has been around for a long time, and it contrasts quite markedly with design-based inference for data generated by surveys. In design-based inference the randomness comes from the sampling design (who gets selected); in model-based inference it comes from a model for the outcome itself, so uncertainty statements rest on the model holding rather than on random selection. There is plenty written on this, but I’d suggest starting with a reasonably accessible book on sampling, like Sharon Lohr’s Sampling: Design and Analysis. Model-based inference for survey data is discussed in various places there, typically in a “starred section” in each chapter (e.g., how to handle design and inference for cluster sampling from the model-based perspective). The references provided by Lohr include important works by Basu, Royall and others. See also the delightful book called Combined Survey Sampling Inference by Ken Brewer, if you can get your hands on it. Doug Rivers pointed me to this book a year or two ago and it is a treat (as these things go).

As I’ve said before, as soon as non-response enters the picture we’re relying on models (e.g., what variables to use when weighting for non-response), and the “purity” of randomization in the sampling design starts to fall by the wayside.
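
To make the weighting point concrete, here is a minimal sketch of post-stratification, one of the simplest model-based adjustments. Everything in it is hypothetical: the function name `poststratified_mean`, the two age cells, the toy responses and the 50/50 population shares are all made up for illustration, and this is not a description of how Rivers or anyone else actually weights. The model assumption doing the work is that, within a cell, the people who ended up in the sample look like the people who didn't.

```python
from collections import defaultdict


def poststratified_mean(sample, population_shares):
    """Estimate a population mean by reweighting cell means.

    sample: list of (cell, y) pairs from any sample, random or not.
    population_shares: dict mapping cell -> known share of the population.
    Assumes (the "model") that within a cell, sampled and unsampled
    units have the same expected y.
    """
    if abs(sum(population_shares.values()) - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")

    totals = defaultdict(float)
    counts = defaultdict(int)
    for cell, y in sample:
        totals[cell] += y
        counts[cell] += 1

    # Every population cell needs at least one respondent.
    missing = set(population_shares) - set(counts)
    if missing:
        raise ValueError(f"no respondents in cells: {sorted(missing)}")

    # Weight each cell's sample mean by its population share.
    return sum(
        share * totals[cell] / counts[cell]
        for cell, share in population_shares.items()
    )


# Hypothetical toy data: the sample over-represents the young.
sample = (
    [("young", 1)] * 3 + [("young", 0)] * 3  # 6 young, half say yes
    + [("old", 1)] * 2                       # 2 old, both say yes
)
shares = {"young": 0.5, "old": 0.5}          # assumed known, e.g. from a census

raw = sum(y for _, y in sample) / len(sample)
adjusted = poststratified_mean(sample, shares)
print(f"raw mean: {raw}, post-stratified mean: {adjusted}")
```

Nothing here depends on how the respondents were recruited; the estimate is only as good as the within-cell assumption, which is exactly the sense in which weighting for non-response in a probability sample is already model-based.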

Social scientists, pollsters and the like would seem to have a reasonable bead on design-based inference, if the current stridency about “probability samples” is anything to go by. Collectively, we’re ignorant about other approaches, although we’ve been making use of model-based ideas for decades (e.g., weighting to correct for non-response). Doug Rivers is going to be teaching all this stuff and more in his Winter quarter sampling class.
