Statistical Significance in UX

July 15, 2019 By Kailee Schamberger



In quantitative research, we’re always comparing things. That’s what quantitative (numerical) data is really good for, and we need meaningful comparisons in order to make sense of it. Often, we use numerical data to compare two or more designs.

Let’s say we’re comparing our product’s user experience to our competitor’s. When we compare the two sets of quantitative data for these products, we want to know whether the result is reliable. That’s what statistical significance can tell us. If the difference between two numbers is statistically significant, it’s reliable: the result we’re seeing probably isn’t due to random chance. Another way to think about this: if we ran the study again in the same way, we’d expect to see a similar result.

So, if we run a quantitative user research study and the data suggests that we have a better UX than our competitor, we want to know if that’s a reliable finding. We can use statistical formulas to calculate statistical significance for the difference between these two data sets. If the formula says that we have statistical significance, the finding is reliable, from a statistics standpoint at least. We can probably trust that if we ran the study again in the same way, we’d again find that we’re better than our competitor. But if we don’t find statistical significance, there’s some risk that the difference isn’t really there. Maybe if we ran the study again, we’d find that our competitor’s UX actually looks better than ours.

Here’s one thing that sometimes confuses people: statistical significance only refers to comparisons. You wouldn’t determine statistical significance for a single set of data. Let’s imagine that I run a quantitative usability test and find that my users took 3 minutes on average to complete a task. I wouldn’t be able to say that those 3 minutes are statistically significant. But I could compare the average time on task for my product to the average time on task for my competitor’s product, and I could determine statistical significance for that difference between the two products.

Quantitative data is often used by companies to make decisions. Should we approve funding for a new redesign project? Where should we focus our design efforts? Which design option works best? Whenever you’re trying to interpret numbers, make sure you ask: do we have statistical significance for this finding? Is this comparison reliable?
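
As a rough illustration of the time-on-task comparison described above, here is a minimal Python sketch. The task times are made up, the function name `permutation_p_value` is hypothetical, and the 0.05 cutoff is a common convention rather than something this article specifies. The sketch uses an exact permutation test, which is only one of several ways to check significance: it asks how often a difference in averages at least as large as the observed one would appear if the product labels were assigned to the same measurements at random.

```python
from itertools import combinations
from statistics import mean

# Hypothetical time-on-task measurements in seconds (invented for illustration).
ours = [150, 165, 172, 180, 185, 190, 196, 202]
competitor = [190, 205, 212, 220, 228, 235, 240, 250]


def permutation_p_value(a, b):
    """Exact two-sided permutation test on the difference in means.

    Tries every way of splitting the pooled measurements into groups of
    the original sizes and returns the share of splits whose difference
    in means is at least as large as the observed one.
    """
    pooled = a + b
    total = sum(pooled)
    n_a, n_b = len(a), len(b)
    observed = abs(mean(a) - mean(b))

    extreme = 0
    splits = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-9:
            extreme += 1
        splits += 1
    return extreme / splits


p = permutation_p_value(ours, competitor)
alpha = 0.05  # assumed significance threshold (a convention, not a rule)

print(f"Our average: {mean(ours):.1f}s, competitor average: {mean(competitor):.1f}s")
print(f"p-value: {p:.4f}")
print("Statistically significant" if p < alpha else "Not statistically significant")
```

A small p-value means a gap this large would rarely arise from chance alone. With different data, the same code could just as easily report no significance, which is the case where rerunning the study might show the competitor ahead. Note that the function needs two groups: there is nothing to test with a single set of times.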