Research in a Minute: Probability vs. Non-Probability Sampling
by Marc Ryan January 28, 2011
One of my personal gripes about the field of market research is that a very small percentage of the population knows the essential lessons of how to conduct good research. Yet, I consistently find that most people feel they know everything they need to know about how research works.
Then again, how can I blame them?
It’s much too easy to run your own Twitter poll of your friends, or even to create your own survey online and collect responses. How many of us have tried to home in on an opinion from a friend, only to turn to a 10-point scale? “Come on, tell the truth: on a scale from 1 to 10, do these pants make my butt look big?”
After all, isn’t market research just questions and answers?
Obviously the answer is no, or at least I hope so; if the answer were yes, I’d find myself quickly becoming obsolete. The reality is that there are a lot of things that need to be considered in any research project. As a supplier of research services, our team often finds itself in the role of educator as we work to teach our clients the nuances of setting up a study to achieve the goals they’ve set for themselves. Granted, we’re often working on highly complex solutions that require some advanced statistics to complete, but the truth is that everyone could use an understanding of the basics of research. Voilà! Research in a Minute. While this series of posts may seem remedial to some, I hope that someone finds value in them.
One important concept that I find very few people understand is the concept of probability vs. non-probability sampling. These are pretty basic, easily digestible concepts but they can become complicated in how they are implemented. But before we go too far, let’s just put out some basic definitions of these sampling techniques.
Probability sampling means that everyone in a given population has a known, nonzero chance of being surveyed for a particular piece of research. In its simplest form, simple random sampling, which is the form I’ll use throughout this post, that chance is equal for everyone. Let’s say we want to know how many people would choose blue as their favorite color. If we wanted to answer that question in the context of the average American, that would mean that everyone in the United States would have an equal chance of being sampled for the study. The same holds true for sub-segments of the population.
For example, if you wanted the opinions of pregnant moms, a probability sample would mean that every pregnant mom would have an equal chance of participating in the research. Another more relevant example may be that you want to know the color preferences of visitors to Hulu.com. A probability sample would ensure that everyone who goes to Hulu.com has an equal chance of being surveyed. Essentially, probability sampling means that respondents are chosen at random and everyone has an equal opportunity to participate in the research.
Non-probability sampling comes in various shapes and sizes, but the essence of it is that a bias exists in the group of people you are surveying. Let’s think about it in the context of our fictional color preference survey. If I asked the question of all of my friends, the results would not be representative of anything other than the opinion of my friends and, specifically, those friends to whom I decided to send the survey. Another example of non-probability sampling would occur if I were to send you the survey and then ask you to pass the survey on to a friend. This technique, called snowball sampling, creates a biased sample wherein not everyone has an equal chance of being sampled.
In reality, most surveys in the market that we see on a daily basis use non-probability samples. That poll you read on FoxNews.com? Non-probability sample. Why? Well, first, it’s only representative of the people who go to Fox News. Second, it’s more representative of people who are frequent visitors to the site. Huh? Well, read on…
Think about it like this: there is a group of 100 people who live in an apartment building, one per apartment. The apartments are numbered from 1 to 100. If I wanted a probability sample of 20 people living in the apartment building, I could put 100 pieces of paper, each labeled with an apartment number, in a hat and, closing my eyes, pick out 20 apartment numbers at random. Perfect probability sample, as each apartment had an equal chance to be chosen.
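For the programmers in the audience, the hat trick can be sketched in a few lines of Python. This is only a minimal illustration: the function name `draw_from_hat` and the `seed` parameter are my own inventions for the example, and it assumes one resident per apartment.

```python
import random

def draw_from_hat(num_apartments=100, sample_size=20, seed=None):
    """Draw apartment numbers without replacement, like slips from a hat."""
    rng = random.Random(seed)  # seed is optional; it only makes the draw repeatable
    hat = list(range(1, num_apartments + 1))  # one slip of paper per apartment
    # random.sample picks without replacement, so no apartment is drawn twice
    # and every apartment has the same chance of being picked.
    return rng.sample(hat, sample_size)

sample = draw_from_hat(seed=1)
print(sorted(sample))
```

Each apartment ends up in the sample with probability 20/100, which is exactly what makes this a probability sample.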
On the other hand, if I wanted to be lazy, I could take advantage of the fact that there’s a telephone in the lobby of the apartment building. I could call that phone 20 times, and whoever picked up the phone each time would be included in my sample. But this approach creates a problem. It’s likely that some residents of the building pass by the phone more frequently than others. Perhaps the person living in apt. 25 is housebound and never leaves the apartment. If that were the case, that person’s likelihood of being sampled would be zero. It could also be that a bunch of friends like to hang out in the lobby and play dice; their likelihood of being sampled is significantly higher than anyone else’s. In fact, there is even a chance that the same person could pick up the phone every time I call. Definitely a non-probability sample.
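The lobby phone can be sketched the same way. The weights below are made-up numbers purely for illustration: I assume the resident of apartment 25 never passes the phone (weight 0), that the dice players live in apartments 1 through 3 and are 30 times more likely than anyone else to pick up, and that every other resident is equally likely to be walking by.

```python
import random
from collections import Counter

def lobby_phone_sample(calls=20, seed=None):
    """Simulate calling the lobby phone; whoever is nearby answers."""
    rng = random.Random(seed)
    apartments = list(range(1, 101))

    # Hypothetical chances of being near the phone (illustrative only).
    weights = [1.0] * 100
    weights[25 - 1] = 0.0          # apt. 25 never leaves: cannot be sampled
    for apt in (1, 2, 3):          # assumed apartments of the dice players
        weights[apt - 1] = 30.0

    # random.choices draws WITH replacement, so the same person
    # can answer more than once, just like in the lobby.
    return rng.choices(apartments, weights=weights, k=calls)

answers = lobby_phone_sample(seed=1)
counts = Counter(answers)
print(counts.most_common(3))
```

Run it a few times with different seeds and the dice players tend to dominate the sample, apartment 25 never shows up, and repeat respondents are common.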
It’s the same challenge with the Fox News poll. First, the poll is only representative of visitors to FoxNews.com, but it’s also biased toward heavier users of the website. If I don’t go to the website on the day the poll is running, my likelihood of being sampled is zero, but a colleague of mine who visits the site every day has a much higher chance of being included in the poll. What that means is that even though they may have collected 10,000 responses, the poll represents only the opinions of Fox News visitors, and not just any Fox News visitors: mostly the heavier users of FoxNews.com.
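A little arithmetic shows how strong this tilt can be. The audience figures below are invented for the example, not real Fox News data: suppose 1,000 visitors come every day and 9,000 come on roughly one day in ten, the poll runs for a single day, and everyone who visits that day is equally likely to answer.

```python
def poll_share_of_heavy_users(n_heavy, n_light, p_heavy, p_light):
    """Expected share of poll respondents who are heavy users.

    p_heavy and p_light are the chances that a heavy or light user
    visits the site on the day the poll runs.
    """
    expected_heavy = n_heavy * p_heavy
    expected_light = n_light * p_light
    return expected_heavy / (expected_heavy + expected_light)

# Hypothetical audience: 1,000 daily visitors, 9,000 occasional visitors.
audience_share = 1000 / (1000 + 9000)
poll_share = poll_share_of_heavy_users(1000, 9000, p_heavy=1.0, p_light=0.1)

print(f"Heavy users in the audience: {audience_share:.0%}")
print(f"Heavy users in the poll:     {poll_share:.0%}")
```

Under these assumptions, heavy users are one in ten of the audience but more than half of the expected poll respondents, and collecting more responses does nothing to change that ratio.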
Of course, I’m not writing this piece to push everyone away from non-probability sampling; it has a role in research, and oftentimes it’s your only option. There are some great pieces of research in the market today being conducted using non-probability samples that are corrected to address the biases that the non-probability approach brings with it. The important takeaway here is to know what kind of sampling you’re using so you understand the biases that exist in your research, which then tells you how they can be corrected. But that’s a conversation for another day.