Thursday September 10, 2009

Gary Langer does it again, this time with supporting references to a paper by Jon Krosnick and six co-authors; Doug Rivers (finally!) replies, and at length. Two of Krosnick’s co-authors are former students of mine; current students are thanked in the acknowledgements; like Krosnick, Rivers is my colleague. Stanford really is ground zero in this debate.

Langer says:

I welcome any coherent theoretical defense of the use of convenience samples in estimating population values; it’s a debate we need to have.

And in his earlier post he said:

I have yet to hear any reasonable theoretical justification for the calculation of sampling error with a convenience sample.

Got one? Hit me.

Try this: model-based inference is an idea that has been around for a long time, and it contrasts quite markedly with design-based inference for data generated by surveys. In design-based inference the randomness comes from the sampling design itself; in model-based inference it comes from an assumed model for the outcomes. There is plenty written on this, but I’d suggest starting with a reasonably accessible book on sampling, like Sharon Lohr’s Sampling: Design and Analysis. Model-based inference for survey data is discussed in various places, typically in a “starred section” in each chapter (e.g., here’s how to handle design of, and inference for, cluster sampling from the model-based perspective). The references provided by Lohr include important works by Basu, Royall and others. See also Ken Brewer’s delightful book Combined Survey Sampling Inference, if you can get your hands on it. Doug Rivers pointed me to this book a year or two ago and it is a treat (as these things go).

As I’ve said before, as soon as non-response enters the picture we’re relying on models (e.g., what variables to use when weighting for non-response), and the “purity” of randomization in the sampling design starts to fall by the wayside.
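
To make the "we're relying on models" point concrete, here is a minimal sketch of post-stratification, one of the simplest model-based adjustments. Everything in it is made up for illustration: the cell labels, the population shares, the responses, and the function name `poststratified_mean` are all hypothetical. The model assumption doing the work is that, within a cell, the people we happened to get look like the people we didn't.

```python
from collections import defaultdict


def poststratified_mean(sample, population_shares):
    """Weight each cell's sample mean by that cell's known population share.

    sample: list of (cell, y) pairs from whoever responded
    population_shares: dict mapping cell -> share of the population (sums to 1)

    Assumption (the "model"): within a cell, respondents and
    non-respondents have the same expected value of y.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cell, y in sample:
        totals[cell] += y
        counts[cell] += 1

    # A cell with no respondents cannot be estimated without a richer model.
    missing = set(population_shares) - set(counts)
    if missing:
        raise ValueError(f"no respondents in cells: {sorted(missing)}")

    return sum(
        share * totals[cell] / counts[cell]
        for cell, share in population_shares.items()
    )


# Hypothetical data: the sample over-represents the "young" cell.
sample = [
    ("young", 1), ("young", 1), ("young", 0), ("young", 1),
    ("old", 0), ("old", 1),
]
shares = {"young": 0.4, "old": 0.6}  # assumed known, e.g. from a census

raw_mean = sum(y for _, y in sample) / len(sample)
adjusted = poststratified_mean(sample, shares)

print(f"unweighted mean:      {raw_mean:.3f}")
print(f"post-stratified mean: {adjusted:.3f}")
```

Nothing in the adjustment cares whether the respondents came from a probability sample with non-response or from a convenience sample; what matters is whether the within-cell assumption is credible for the cells chosen, and that is exactly a modeling question.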

Social scientists, pollsters and the like would seem to have a reasonable bead on design-based inference, if the current stridency about “probability samples” is anything to go by. Collectively, we’re ignorant about other approaches, even though we’ve been making use of model-based ideas for decades (e.g., weighting to correct for non-response). Doug Rivers will be teaching all this stuff and more in his Winter quarter sampling class.

