Hi, I’m Al, Referly’s Chief Design Officer! I’ve been running usability tests for over a decade, longer than I’ve been a designer or worked at startups. It’s been my secret sauce across my entire career, and yet I’ve never written anything about it until now. So today I’m going to write up almost everything I know and help you get started. (Stay tuned for another blog post with advanced tips specifically for testing B2B apps.)
Beyond the soft skills of gaining product insights, there are cold, hard financial reasons why you should expand your toolbox – redesigns informed by user testing increase desired metrics by 135% on average. If you’re not running user tests, you’re crippling your chances for success.
So, buckle up.
Here’s where we’re headed:
- First we’ll cruise through a bit of history and how user testing came to be.
- Then go over the different techniques that are available today.
- Finally, we’ll end with some tips on how to run them yourself.
Invented By Buzz Lightyear
User testing has its roots in academia and later the ergonomic and functional testing of cockpits for jet pilots. Fun fact for space nerds like me: it’s not a coincidence that all of the early NASA astronauts were test pilots and that many of them still are to this day.
Contrary to the movie The Right Stuff, test pilots weren’t selected to be astronauts because of their bravado. They were selected because they were already trained in using and evaluating unfinished equipment in high-pressure situations, as well as in the kind of Method Acting needed to imagine different scenarios and give feedback about how to improve the controls for each one. Typical pilots, by contrast, learn the ins and outs of the unalterable machine that they’re given.
These guys definitely weren’t rocket-riding cowboys or dumb monkeys in a can – they had engineering degrees and keen analytical skills.
In the decades following the birth of flight and the space program, the practice of studying the interaction between people and complicated machinery spread to testing computers, then beyond hardware to evaluating the usability of software. Academia eventually created departments titled Cognitive Science and Computer Human Interaction, exploring and refining the field even more.
Rise of the Web
Fast-forward a few decades to the late ’90s. More people were using software than ever before due to the commercialization of the Internet and its shift from an academic tool to a mainstream communication medium for everyday consumers. As a result, the early 2000s witnessed the rise of user-centric design and the birth of UX as a genre, with advocates like Steve Krug and Jared Spool leading the way.
But, there was a problem. User testing was still trapped in its roots as an expensive academic research technique. Labs with two-way mirrors and white-coated moderators cost $10,000 per day, so only governments and large established companies could afford the competitive advantages that usability tests brought.
Then along came Jakob Nielsen, who popularized inexpensive guerilla usability testing through his influential website useit.com and a series of books. (Designing Web Usability was the first book I bought with my first paycheck and it was totally dog-eared by the time I was done with it.)
It’s hard to overstate Nielsen’s influence. He legitimized guerilla methods and brought the technique to the masses by showing that quick informal testing is measurably better than more expensive methods because it can be done more often and at much earlier stages of product development.
The new user testing techniques revolutionized the field. They created superstars like Adaptive Path and Nielsen’s own Nielsen Norman Group, which guide teams through running their own tests and creating user-centric designs, rather than hiring expensive labs to provide research and insights.
In-house teams at countless companies embraced the new techniques. TiVo, for example, did a major redesign in which they ran weekly informal usability tests for 12 weeks and massively improved their product. Meetup took it to another level by running user tests 2-3 times a week and broadcasting them live throughout the office for anyone to watch.
Blueprint For Early Stage Startups And Customer Development
Even better, the democratization of user testing meant that the people making products could get direct feedback on their creations instead of having it filtered through a moderator. It’s an addictive cycle – make something, put it in front of people, gauge their reactions, then make changes based on their feedback. No gatekeepers.
At the same time, software development was going through its own revolution. Methodologies like Agile and XP had similar epiphanies about short iterative cycles instead of long waterfall death marches. The combination of quick user testing to reveal problem areas and rapid software development to create solutions forms the blueprint for today’s product development, particularly at early stage startups and fast-paced internal teams.
A decade later, things have changed since those early days of guerilla user testing. Now there are a few different ways to highlight usability issues, and some of them are even easier, quicker, and cheaper to implement than the in-person tests that Nielsen popularized.
Here are 4 test types that I’ve used:
- In person, moderated and lightly scripted: This is the traditional format of a usability test and, of the in-person techniques, it’s the easiest one to get started with. A quick definition of terms: in person means that you’re sitting down with the user, moderated means that there’s someone talking to the user and acting as a sort of referee to keep things on track, and scripted means that there’s a fixed set of tasks, written down in advance, that the moderator wants the user to accomplish.
- In person, moderated and unscripted: This is the Listening Lab technique popularized by Mark Hurst. It requires a bit more experience and finesse to pull off in order to gather valid usability insights without turning into a completely unstructured focus group. In it, the moderator asks the user what tasks he or she wants to accomplish before diving into the app, based on their demographic, a loose conversation about their interests, or their impressions of external marketing pages.
- Remote and moderated (scripted or unscripted): This can be roughly the same as the two techniques above, except the moderator and user aren’t physically present in the same room and use a video app like Skype to connect instead. This is especially helpful for early stage projects, when the first users can come from any part of the world.
- Remote, asynchronous and heavily scripted: This is the absolute easiest and most newbie-friendly way to try user testing. It’s enabled by relatively new tools like Usertesting.com, where the moderator and user don’t chat in real time. Instead, the user is left on their own to follow step-by-step instructions on how to use the site or app in question, while their mouse movements and audio feedback are recorded.
How To Run A Session
Running each of the different types is pretty straightforward. The fourth type (remote, asynchronous and heavily scripted) is the easiest because you simply write the script in advance and then let the software do the job of moderating the users. It’s also super easy to run sessions back-to-back: you can either tweak the script and run another test, or you can tweak your app and run the exact same script again to see if you solved the problems.
The in-person techniques take some practice to master but can be more rewarding. You can also test them out on friends to build your comfort level before you start recruiting strangers.
The basic framework of an in-person test is:
- Think about what goals you’d like the user to accomplish (yes, even for the unscripted tests because you still want to see if they can complete certain tasks, although in a looser, unguided format).
- Record everything – at least the audio and screen, if not also their face and body, so you can review later as well as share with the team. It’s a newbie mistake to try to moderate a test at the same time that you’re analyzing it for lessons, and even newbier to ask stakeholders to just listen to a summary when audio/visual feedback can have such a punch.
- Use about 5 or 6 people – any fewer gives you too much noise, while any more brings diminishing returns. Note that a qualitative research method like this is different from a quantitative one like A/B testing, where the cutoff is much higher and more users are always better.
- Let people explore and make mistakes on their own while also keeping them on task. Don’t talk so much that you’re explaining everything on the screen like a car salesman, but also don’t morosely sit back and let them feel judged. Try to strike a balance between friendly and observant.
The last rule is the most important. If you screw that one up, then there’s almost nothing you can do to rescue the session. Talking too much or too little (especially too much) ruins the data collection phase, so the analysis is a moot point.
I have a personal rule of thumb that’s helped me find the right tone – let the user flop around for about 30 seconds whenever they’re stuck before jumping in. Any later than that and they’ll feel stupid and possibly get so frustrated that they shut down; any earlier than that and you’re giving more of a guided tour than collecting genuine reactions.
If you want even more details on how to run an in-person test, my friend Josh wrote up a great introduction on the mechanics of setting up and moderating them.
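If you’re wondering where the “5 or 6 people” rule of thumb above comes from, here’s a rough sketch of the model Nielsen published with Tom Landauer: the share of usability problems you’ve seen after n participants is about 1 - (1 - L)^n, where L is the chance that a single participant trips over any given problem. The 0.31 below is the average Nielsen reported across his projects, so treat it as an assumption rather than a measurement of your own product, and the function names are just mine for illustration.

```python
# Sketch of the Nielsen-Landauer problem-discovery model.
# DISCOVERY_RATE is an assumed average (0.31), not a measurement of your product.
DISCOVERY_RATE = 0.31


def share_of_problems_found(participants: int, rate: float = DISCOVERY_RATE) -> float:
    """Expected fraction of usability problems seen at least once."""
    if participants < 0:
        raise ValueError("participants must be zero or more")
    return 1 - (1 - rate) ** participants


def marginal_gain(participants: int, rate: float = DISCOVERY_RATE) -> float:
    """Extra fraction of problems uncovered by the last participant added."""
    if participants < 1:
        raise ValueError("participants must be at least 1")
    return share_of_problems_found(participants, rate) - share_of_problems_found(
        participants - 1, rate
    )


if __name__ == "__main__":
    # Print the cumulative share and the gain from each extra participant.
    for n in range(1, 11):
        found = share_of_problems_found(n)
        gain = marginal_gain(n)
        print(f"{n:2d} participants: {found:5.1%} found (+{gain:.1%})")
```

With that assumed rate, five participants land at roughly 84% of the problems and the sixth adds less than five percentage points, which is the diminishing-returns curve in a nutshell. If your app’s problems are harder to stumble on (a lower L), the curve flattens later and you may want a few more people per round.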
Common Mistakes While Moderating Tests
One problem with usability testing is that it’s easy to be led astray by a few hidden pitfalls. Things like:
- asking leading or hypothetical questions.
- letting the conversation swerve towards subjective aesthetics rather than objective functionality.
- indulging participants when they talk about other hypothetical situations or types of users.
- filling in blank silences with chatter.
- letting silences go on too long until test subjects start to blush and feel stupid.
The most dangerous misstep is intentional confirmation bias – seeking out or only tuning into answers that affirm what you already think is great or problematic about your darling creation. If your heart is already set on blowing away some login form or implementing a shiny new feature, you’ve got too much wax in your ears to really listen well.
Don’t let this stop you from trying your hand at running a usability test. It’s still more art than science – whether learning to moderate a session or decoding the feedback – which means that you’ll only get better if you keep trying. It just takes practice.
So When To Use Which Method?
They each have their tradeoffs. The first three techniques all yield a gold mine of insights, while the last one is crazy quick to set up.
Typically I run in-person tests while creating early stage products or even static prototypes, and use remote or asynchronous user testing on more mature or complete products.
If you’ve got a working app already, the most newbie-friendly method is to pull up Usertesting.com and get started with a quick batch of about 4 or 5 users. I’ve become a huge fan of Usertesting.com in the last 2 years because they have their own army of test subjects on standby, ready to go at a moment’s notice. It’s also much easier to share the raw video and your annotated insights with the rest of your team, and it’s super quick to recycle a script once you’ve written it and run another test.
It used to be a pain in the ass and a giant production when I had to explain how to get started with user testing. Now I tell people to just try Usertesting.com out first, then graduate to the moderated versions later when they’re more comfortable.
In-Person Tests Are Still Crazy Useful
But in-person testing is worth all of the headaches of setting up and decoding the feedback, because it’ll highlight your most challenging usability obstacles like nothing else. It ends all circular debates on a team about priorities when you’re crowded around a video or live stream of real live humans wandering through the maze, unable to find the cheese that the rest of you can see so plainly.
My friend and UX designer Joshua Kaufman agrees:
“I personally believe that it’s really important to be able to moderate tests 1:1 because there’s a lot of subtle things that you just can’t get from asynchronous testing. So while I think it’s okay to suggest Usertesting.com as the quick and easy solution, encourage readers to learn the skills themselves so that they have a better appreciation of the activity and the users.”
Testing Mobile Apps On Drunks
Mobile apps are an interesting exception to the drawbacks of in-person user testing. My friend Hang has a technique he calls The $5 Guerrilla User Test: instead of going through the time-consuming process of recruiting participants, you walk up to a group of people at a bar and ask them to try your app in exchange for a few drinks. You can even do a rough screen of demographics: male/female, young/old, preppy/cool, etc.
Even more valuable than the convenience is the fact that drunks at a bar behave pretty similarly to distracted people at home. If someone can use your app in a dark crowded environment when they’re hammered out of their gourd, then you can be pretty confident that it’ll work in other calmer situations.
Now Try It Yourself
The best way to learn something is by doing it and thankfully, usability testing is pretty easy to get into. It doesn’t take any specialized skills or equipment, you don’t need anyone’s permission, and the results are always a welcome conversation starter.
I never fail to come away inspired with ideas for how to improve whatever I’m working on, plus relaying the feedback to the team always gets them pumped up to make something better too. It’s a bonding experience – it cuts across all of the circular internal debates about how a particular feature might be perceived and focuses everyone on solving clear problems.
There are a lot of facets to gathering and analyzing user feedback, which is why I consider it a foundational tool for anyone to learn: PMs, designers, developers, marketers… anyone really. It’s an endlessly fascinating skill set and I’m constantly learning new aspects of it, so I’d love to hear your experiences too. Drop me a note at firstname.lastname@example.org
Thank you to Jason Shen, John Ramey and Joshua Kaufman for reading drafts of this article.