7 Ways to Spot Bad Data

In response to last week’s newsletter, “Is Your Research Good Enough for The New York Times?,” which discussed the hurdles of getting online survey research reported by some news organizations, a customer reminded us that online surveys can be difficult to sell internally as well.  Too many people have been burned by junk data from online surveys.

One problem with online panels is that some respondents (a small minority) participate only to get paid in cash or redeemable credits.  If these respondents are not providing thoughtful answers, the data are suspect.  All panels have the problem, though some are worse than others; reputable sample providers work hard to identify and remove fraudulent respondents from their panels.

But we should not rely on panel providers alone to ensure valid data.  Buyers of panel surveys should always look at the data case by case to identify and remove suspicious cases.  Here are typical indicators of potentially bad data:

1.  Speeding. Though people can legitimately whiz through surveys at varying speeds, we typically flag the fastest five percent for further investigation.

2.  Nonsense open ends.  People who have nothing to say will usually say so, so we flag respondents who type random letters, offer nonsense or vacuous answers, or skip the question entirely.

3.  Choosing all options on a screening question.  Often this means the respondent was gaming the survey to get in, especially if some options logically exclude others.

4.  Failing quality check questions.  Usually we include a couple of questions that have only one correct response to flag respondents who are not paying attention.

5.  Inconsistent numeric values.  How long a person has worked in a profession or at a particular job, for example, must be consistent with that person’s age.

6.  Straight-lining and patterning.  If questions are laid out in grids, respondents who answer identically for every question, or who move in a diagonal pattern down the grid, should be flagged for investigation.

7.  Logically inconsistent answers.  If attitude and behavior questions are logically related to each other (for example, multiple questions about concern for the environment), inconsistent responses may indicate bad data.
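For readers who clean their own data files, two of the checks above (speeding and straight-lining) are easy to automate.  Here is a minimal sketch in Python; the record layout (`id`, `seconds`, `grid`) and the function name are hypothetical, and the 5% speed cutoff mirrors the rule of thumb in item 1.

```python
def flag_suspect_respondents(records, speed_quantile=0.05):
    """Flag respondents for further investigation.

    `records` is a list of dicts with hypothetical keys:
      'id'      -- respondent identifier
      'seconds' -- total survey completion time
      'grid'    -- list of answers to one grid of questions

    Flags the fastest `speed_quantile` share of respondents
    ('speeding') and anyone who gave the identical answer to
    every grid question ('straight-lining').
    """
    times = sorted(r['seconds'] for r in records)
    # Completion time at the chosen quantile; anything at or
    # below this cutoff counts as speeding.
    cutoff_index = max(0, int(len(times) * speed_quantile) - 1)
    cutoff = times[cutoff_index]

    flagged = {}
    for r in records:
        reasons = []
        if r['seconds'] <= cutoff:
            reasons.append('speeding')
        if len(set(r['grid'])) == 1:
            reasons.append('straight-lining')
        if reasons:
            flagged[r['id']] = reasons
    return flagged
```

A real cleaning pass would layer on the other checks (quality-control questions, numeric consistency, screening patterns), but the structure is the same: compute a rule per respondent, record the reason, and review flagged cases by hand rather than deleting them automatically.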

The customer who reminded us that online surveys face multiple hurdles had just gotten results from a survey that, she discovered, included a respondent who took the survey 250 times.  Nobody from the research firm bothered to look at the data beyond feeding it into the data-tabulator-chart-maker-here-are-your-actionable-insights machine.

At Versta Research, our approach is the opposite.  Smart people look at your data at each step because there is no other way to turn data into a story that you can trust and then share with your management team with confidence.

Joe Hopper, Ph.D.
