Wild swings in polling results have been one of the big ongoing stories of this election cycle. The LAT, as Dave Pierre pointed out a week ago, saw a huge shift in its polling away from its man, Barack Obama, and was left scrambling to come up with a solution. But the LAT is not alone. Last month, P.J. Gladnick highlighted a similarly drastic shift in the Newsweek poll.
What, then, are we to conclude from this polling data? Are Newsweek and the LAT biased in favor of Barack Obama and other Democrats?
Pollster.com sought to answer that question by running regression analysis on every last bit of polling data they could get their hands on. They found that individual pollsters' results clearly diverge from the site's trend line. Not surprisingly, the divergence runs in both directions: some polls are slightly skewed toward one candidate, some toward the other. Pollster.com's Charles Franklin chooses to refer to these differences as "house effects" rather than bias.
To wit:
Who does the poll affects the results. Some. These are called "house effects" because they are systematic effects due to survey "house" or polling organization. It is perhaps easy to think of these effects as "bias" but that is misleading. The differences are due to a variety of factors that represent reasonable differences in practice from one organization to another.
It is certainly true that there can be reasonable differences in polling practices (e.g., question construction). In many cases, however, these differences are conscious choices made by the various polling institutions. Once again, I'll let Franklin explain what he means:
Organizations differ in whether they typically interview adults, registered voters or likely voters. The differences across those three groups produce differences in results. Which is right? It depends on what you are trying to estimate-- opinion of the population, of people who can easily vote if the choose to do so or of the probable electorate.
This is an interesting question I had not considered. When looking at a poll, I automatically assumed that it was the pollster's best guess at who would win the election were it held that day. Like many of you, I could not care less which candidate is more popular generally or even favored by those who cannot or likely will not vote in the November election.
It is here that the construction of the question and the compilation of the data, the "house effects," play a role. If a left-leaning, Obama-championing pollster wanted to concoct a survey that would favor its man, it would include more responses from younger voters than the outcome of past elections calls for. This is something we're all familiar with. Democrats expect significantly larger turnout among Obama's twentysomething supporters, and as a result, pollsters that favor him will adjust their samples to include more respondents who fit this criterion. One could argue, based on the historical turnout of this demographic, that including this data is the result of selection bias.
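To make the weighting point concrete, here is a rough sketch of how the assumed share of young voters moves a poll's topline number. Every figure in it is hypothetical (the support levels, the age buckets, and both turnout scenarios are invented for illustration), and the function name topline is my own, not anything Pollster.com or the pollsters use.

```python
def topline(support_by_group, share_by_group):
    """Weighted average of candidate support across demographic groups.

    support_by_group: percent supporting the candidate within each group
    share_by_group:   assumed share of the electorate for each group
    """
    total_share = sum(share_by_group.values())
    weighted = sum(support_by_group[g] * share_by_group[g] for g in support_by_group)
    return weighted / total_share


# Hypothetical support for one candidate, by age group (percent).
support = {"18-29": 60.0, "30+": 48.0}

# Two hypothetical turnout assumptions for the same set of responses.
historical_turnout = {"18-29": 0.15, "30+": 0.85}  # in line with past elections (assumed)
boosted_turnout = {"18-29": 0.25, "30+": 0.75}     # expects a surge of young voters (assumed)

historical_result = topline(support, historical_turnout)
boosted_result = topline(support, boosted_turnout)
shift = boosted_result - historical_result

print(f"Historical weighting: {historical_result:.1f}%")
print(f"Boosted weighting:    {boosted_result:.1f}%")
print(f"Shift from the turnout assumption alone: {shift:.1f} points")
```

The responses are identical in both runs. Only the assumption about who shows up changes, and that alone is enough to move the headline number by about a point.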
But this isn't the only way to manipulate a poll. An opinion poll including those unable and unlikely to vote would similarly skew the survey in favor of one candidate or the other (think non-residents, etc.).
Let's bring this back to where we started. Pollster.com's analysis of the numerous polls run during this election cycle found that Newsweek and the LAT both had "house effects" that averaged about 2 points in favor of Obama. That is to say, they were 1 point high on Obama (measured against the Pollster.com trend line) and 1 point low on McCain. Of course, and this is significant in light of the posts written by Pierre and Gladnick, each of these polls was brought back within what might be considered a respectable range of the trend line by the sharp, recent correction to its polling data. If you look closely at the LAT and Newsweek results compared to the trend line, both had huge outliers: the biased polls Pierre and Gladnick brought to our attention.
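For readers who want to see the arithmetic behind a figure like "2 points in favor of Obama," here is a simplified sketch. It treats a house effect as a pollster's average deviation from the trend line, which is a stand-in for, not a reproduction of, the regression Pollster.com actually ran. The pollster names and all of the numbers are invented.

```python
from statistics import mean

# Hypothetical readings:
# (pollster, Obama %, McCain %, trend-line Obama %, trend-line McCain %)
polls = [
    ("House A", 48.0, 42.0, 47.0, 43.0),
    ("House A", 49.0, 43.0, 48.0, 44.0),
    ("House B", 46.0, 45.0, 47.0, 44.0),
]


def house_effects(rows):
    """Average each pollster's deviation from the trend line, per candidate."""
    deviations = {}
    for house, obama, mccain, trend_obama, trend_mccain in rows:
        deviations.setdefault(house, []).append(
            (obama - trend_obama, mccain - trend_mccain)
        )

    effects = {}
    for house, diffs in deviations.items():
        obama_dev = mean(d[0] for d in diffs)
        mccain_dev = mean(d[1] for d in diffs)
        effects[house] = {
            "obama": obama_dev,
            "mccain": mccain_dev,
            # Effect on the margin: high on one candidate plus low on the other.
            "margin": obama_dev - mccain_dev,
        }
    return effects


effects = house_effects(polls)
for house, e in effects.items():
    print(f"{house}: Obama {e['obama']:+.1f}, McCain {e['mccain']:+.1f}, "
          f"margin {e['margin']:+.1f}")
```

In this made-up data, House A runs 1 point high on Obama and 1 point low on McCain, which adds up to a 2-point effect on the margin, the same shape as the Newsweek and LAT numbers described above.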
All of this analysis raises the question: What about the trend line? How can we know that it is an accurate, or at least a more accurate, reading of the state of the race? Again, I defer to Franklin:
Estimating the house effect is not hard. But knowing where "zero" should be is very hard. A house effect of zero is saying the pollster perfectly matches some standard. The ideal standard, of course, is the actual election outcome. But we don't know that now, only after the fact in November.
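Franklin's point about "zero" can be illustrated with a small sketch. Suppose, purely hypothetically, that the trend line itself turns out to be off from the eventual result by some amount. Every house effect then shifts by that same amount, while the gaps between pollsters stay exactly where they were. The pollster names, the effects, and the 1.5-point trend error below are all invented.

```python
# Hypothetical house effects on the margin, measured against the trend line.
effects_vs_trend = {"House A": 2.0, "House B": -2.0, "House C": 0.5}


def recenter(effects, trend_error):
    """Re-express house effects against a different zero point.

    trend_error: how far the trend line's margin sits from the chosen
    standard (for example, the actual result, unknown until November).
    """
    return {house: effect + trend_error for house, effect in effects.items()}


# Assume, for illustration only, that the trend line ran 1.5 points too
# favorable to Obama compared with the eventual outcome.
effects_vs_outcome = recenter(effects_vs_trend, 1.5)

gap_before = effects_vs_trend["House A"] - effects_vs_trend["House B"]
gap_after = effects_vs_outcome["House A"] - effects_vs_outcome["House B"]

print(effects_vs_outcome)
print(f"Gap between House A and House B: {gap_before} before, {gap_after} after")
```

In other words, the data can tell us how far apart two pollsters are, but not which of them is closer to the truth. That depends entirely on where zero is placed.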
As the campaign goes on and polling heats up, comparing the data to the Pollster.com trend line will serve as an effective tool to ferret out which wide shifts in polling data reflect a bias, nay "house effect," in the institution and which reflect a true change in the mood of the people.