### data wars (can you trust Internet samples?)

## Wednesday April 22, 2009

Gary Langer, the Director of Polling at ABC News, discusses the properties of “opt-in” Internet samples. His chief gripe: “you need a probability sample to compute sampling error”, and so any opt-in Internet poll that reports a standard error is lying.

This is a really important issue, since Internet polling is not going away: it’s too fast, too cheap, and can generate big samples in a hurry; there is a lot to be said for self-completion and presenting multimedia content to respondents; and hence Internet is very attractive relative to other modes. So look for a response from proponents of opt-in sampling in the near future.

Observation: all survey respondents “opt-in”. Would-be respondents (selected via random sampling or not) decide whether to respond or not, or can’t be reached at all. We then weight the data we get to try to deal with any resulting biases. The resulting standard errors should be computed taking the weighting into account (in almost all media polling I see, they are not, and the standard error is computed à la Stats 101 with the number of completed interviews in the denominator), but in any event, even the correct standard errors are conditional on the way the weights were computed. The Stats 101 “textbook purity” of “simple random sampling” has long been left behind…particularly given some of the horror stories you hear about response rates for RDD (random digit dialing).
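To make the weighting point concrete, here is a minimal Python sketch, assuming Kish's approximation for the effective sample size, n_eff = (Σw)² / Σw². This is one common approximation rather than the only correct way to get a weighted standard error, and the function names are hypothetical. It also treats the weights as fixed, so the result is still conditional on how the weights were computed.

```python
import math


def naive_se(p_hat, n):
    """Stats 101 standard error: completed interviews in the denominator."""
    return math.sqrt(p_hat * (1 - p_hat) / n)


def kish_effective_n(weights):
    """Kish's approximate effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq


def weighted_proportion(y, weights):
    """Weighted share of respondents with y == 1 (y is a list of 0/1)."""
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)


def weighted_se(y, weights):
    """Approximate SE that shrinks n to the Kish effective sample size.

    Assumption: weights are treated as known constants, so this ignores
    any extra uncertainty from estimating the weights themselves.
    """
    p_hat = weighted_proportion(y, weights)
    return math.sqrt(p_hat * (1 - p_hat) / kish_effective_n(weights))
```

With equal weights the effective sample size equals the number of completes and the two standard errors agree; the more unequal the weights, the further the Stats 101 number understates the uncertainty.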

So I tend to think the “you can’t trust opt-in Internet polls” line is something of a beat-up. Sure, there is work to be done in understanding the properties of data generated this way, and how to compute a standard error with these data. I don’t see this as an impossible hill to climb. It is critical that this work get done, because if/when we can get comfortable with the bias issues (and we know what the issues are), then I think it’s game over.

If/when the bias issue is more or less neutralized, Internet will most likely kill RDD in terms of sampling variability due to the huge effective sample sizes; exactly how much will turn on how much of a hit Internet takes in sampling variability when making bias adjustments, but that is going to have to be a huge hit in order for RDD to wind up dominating Internet (given relative costs, and the fact that RDD is taking a variance hit too in making bias adjustments).

If you look at a poll that runs often (the AGE poll, for example), you get an additional piece of information: the number of people who opted in gives you a feel for the number of people who actually care.

Simon,

Are you really saying you can weight and make sensible a survey of the type that newspapers have on their online sites – the ones that people can vote as many times as they like on? If so, you’d have to weight the thing to within an inch of its life surely, and with a massive sample (the massive sample part isn’t difficult), and make sure you get all sorts of data from respondents, and be sure it’s correct.

Or do I misunderestimate you?

Peter:

OK, some more detail:

The job gets *much* harder for those “click here to vote” Internet polls that you see on the newspapers’ web sites, and it is probably impossible to do anything serious with most of them; i.e., in addition to the multiple response problem you mention, those polls are not collecting anywhere near the kinds of demographics you’d want to do any decent weighting etc.

The “better” way to do Internet polling is to have something that isn’t that far away from random sampling in the first instance. For example, you use traditional phone or other methods to recruit respondents to an Internet panel; you can then randomly sample from that panel (or allocate to quotas) on a project-by-project basis. Up to some firm-specific variation, this is what is done at the more reputable Internet shops: e.g., Knowledge Networks, YouGov/Polimetrix, Harris Interactive. You can use quotas and incentives to manage the panel composition from the get-go, as well as the way specific projects generate completes over the field period.

YouGov/Polimetrix uses a “matching” procedure: say they get N respondents for a specific project (through the methods described above). They then draw a sample of size M << N from a “target frame” (e.g., a “high quality” sample of the US population such as the CPS or the ACS) and then find the M nearest matches among the N respondents on a series of demographic characteristics (and/or any other items you might have in common in the two surveys). Note that you wind up junking some (if not many) of the N respondents (but again, you can manage that through quotas, incentives, who you are inviting to your panel in the first instance, etc).
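The matching step can be sketched roughly as follows. This is a toy illustration, not YouGov/Polimetrix's actual algorithm: it assumes the demographics are already coded as numbers on comparable scales, uses squared Euclidean distance, and matches greedily without replacement. The function name and arguments are hypothetical.

```python
def nearest_matches(target_rows, panel_rows):
    """Greedy nearest-neighbor matching without replacement.

    target_rows: M tuples of numeric covariates drawn from the target frame.
    panel_rows:  N tuples of the same covariates for panel respondents.
    Returns a list of M indices into panel_rows, one per target row.

    Assumptions (not from the original post): squared Euclidean distance,
    greedy order, and ties broken in favor of the lower panel index.
    """
    if len(target_rows) > len(panel_rows):
        raise ValueError("need at least as many panel respondents as targets")

    available = list(range(len(panel_rows)))
    matched = []
    for target in target_rows:
        # Closest still-unmatched panel respondent to this target record.
        best = min(
            available,
            key=lambda i: sum((a - b) ** 2 for a, b in zip(target, panel_rows[i])),
        )
        matched.append(best)
        available.remove(best)  # each respondent is used at most once
    return matched
```

Whatever is left in the panel after the M matches is the part of N that gets junked, which is why quotas and targeted invitations matter for keeping that waste down.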

If non-response is ignorable given the characteristics you’ve matched on, then you’re unbiased. If not, then how badly have you missed? In either case, how should we compute margins of error that result from this procedure?

Under the best case — when the assumption of ignorability given observable characteristics holds and you have perfect one-to-one matching — you might be able to get a classical result such as V(\hat{p}) \approx \hat{p}(1-\hat{p})/N. There are really interesting questions as to what the bias variance tradeoff looks like for such a procedure, and indeed, how big the biases are in the first place (i.e., is non-response ignorable given observables, or is there some magic “X factor” that determines survey participation over and above observables).

I’m not claiming that this is perfect, far from it. But if you were to compute MSE/$ of various sampling methodologies, then under what scenario does Internet not win? N.B., for the uninitiated: MSE is mean squared error = bias squared + variance for a given estimand, a conventional summary measure of the performance of a statistic. The difficulty for Internet at the moment is that we’re not exactly sure how to compute the variance piece.
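One way to read the MSE/$ comparison is to fix a budget and ask what MSE each mode can buy with it. The sketch below does that under loudly stated assumptions: the variance takes the simple form design_effect × p(1−p)/n, the bias does not shrink with sample size, and the cost per complete is constant. All names and parameters are hypothetical, and any inputs you feed it are your own guesses rather than measured values.

```python
def mse(bias, variance):
    """Mean squared error = bias squared + variance."""
    return bias ** 2 + variance


def mse_at_budget(budget, cost_per_complete, bias, p=0.5, design_effect=1.0):
    """MSE of an estimated proportion for a given spend.

    Assumptions (illustrative only):
      - n = budget / cost_per_complete completed interviews
      - variance = design_effect * p * (1 - p) / n, where design_effect
        stands in for the variance hit taken when adjusting for bias
      - bias is a fixed number that more interviews cannot reduce
    """
    n = budget / cost_per_complete
    variance = design_effect * p * (1 - p) / n
    return mse(bias, variance)
```

Under these assumptions a cheaper mode buys more completes and so less variance, but its MSE can never fall below its squared bias, which is why the size of the bias (and how to compute the variance piece) is the whole argument.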

Moreover, there is “Internet” and there is “Internet” (garbage insta-polls of the sort Peter alludes to vs the much more careful work I’ve been sketching in this comment; only for the latter do we have any hope of making progress).