Do you like it? How to assess preference

Written by David Peter Simon — Photograph by Companion Platform

Published: May 15, 2020

For those unfamiliar with the term, preference testing is a way of understanding preference by asking people to choose which design variation they prefer from an array of options. While it can help unpack some aspects of visual attractiveness, it can be problematic when used to make a design decision.

Often, preference testing leads to a focus on raw numbers rather than an analysis of why. If not used carefully, it can perpetuate the logical fallacy of “appeal to popularity,” which is when we equate what most people prefer with what matters most.

Preference testing isn’t incorrect, per se, but as design practitioners we can extract more meaningful learnings with other methods. We should go beyond preference testing when conducting design research, especially when our studies lie at the intersection of customer experience and business strategy.

In this blog post, I’ll share four alternative ways to research visuals and understand preference. These methods will help you hone how people perceive the atomic elements of visual design—the building blocks that create memorable experiences and recognizable brands.

Impression tests

Impression tests are a qualitative research method in which visuals are shown only briefly to target users or customers.

In the user-experience community, impression tests are often referred to as 5-second tests, because a graphical user interface is shown to people for 5 seconds. According to Paul Doncaster, author of The UX Five-Second Rules, the original “5 second” time length can be traced back to a 1980s paper, which found that it took a certain number of milliseconds for participants to register individual items when asked to perform a task. In practice, 5 seconds does seem to be enough time for people to commit a good number of design elements to memory.

I lead with impression tests because they are the easiest method to set up, and most effective at garnering rich feedback.

When to use impression tests

Impression tests can be helpful when designing landing pages, welcome modals, or digital banner ads—singular and isolated visuals that don’t require additional scrolling or multiple clicks.

For example, we can impression-test whether or not an ad’s color palettes are successfully conveying an emotion, or whether a welcome modal illustration is effectively communicating a message.

Impression tests are particularly suitable for ads and landing pages, as they probe what people remember about visuals—whether they retained the right information about its underlying message—within a short time frame. This mimics the natural engagement of people with these artifacts: scrolling through a content-saturated feed or quickly alternating clicks among multiple tabs.

How it works

Show a target user a design for 5 seconds, then ask them to answer a series of probing questions. Ensure the image you display doesn’t have any scrolling elements, and don’t use any content-dense images that place cognitive burden on the viewer.

Preface the test by establishing context and providing general instructions. Instructions commonly ask participants to imagine a scenario, look at an image for 5 seconds, and then answer a series of follow-up questions.

Examples of follow-up questions include:

- What was your first impression (or initial reaction) to what you just saw?
- What are some of the main things you remember about that?
- How would you describe what you just saw to a friend?
- Who do you think the image is meant for?

Repeat the impression test with 5 users from each of your target cohorts, continuing until additional sessions stop producing new learnings.

Measuring success

Success for impression tests will vary depending on their purpose. One way I determine success is by setting up “criteria” with my stakeholders beforehand, to reach agreement on what qualifies as someone “getting it.” For instance, success can be something like “6 out of 10 participants successfully identified who the landing page was targeted at.”
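
A criterion like this is easy to check once the responses have been coded. The snippet below is a minimal Python sketch, assuming each participant's answer has already been judged by hand as a hit or a miss. The function name `meets_criterion`, the sample data, and the required count of 6 are hypothetical illustrations, not part of any tool.

```python
def meets_criterion(coded_responses, required_hits):
    """Check an agreed success criterion against hand-coded responses.

    coded_responses: list of booleans, True when a participant "got it"
    required_hits: minimum number of hits agreed with stakeholders
    Returns (passed, hits).
    """
    hits = sum(1 for response in coded_responses if response)
    return hits >= required_hits, hits


# Hypothetical coding of 10 participants: did each one identify
# who the landing page was targeted at?
identified_audience = [True, True, False, True, True,
                       False, True, False, True, False]

passed, hits = meets_criterion(identified_audience, required_hits=6)
print(f"{hits} out of {len(identified_audience)} identified the audience; "
      f"criterion met: {passed}")
```

Agreeing on the required count before the sessions keeps the bar from moving once results come in.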

Reaction cards

Reaction cards use card sorting to understand how people perceive a design. The method is also sometimes referred to by its original name: the Microsoft Desirability Toolkit. That’s because in 2002, two researchers from Microsoft wrote a paper about how they were using 118 adjective-based cards to measure desirability.

Today, reaction cards have been written about by various industry experts. They are accepted as a well-grounded approach for garnering general insight into aesthetics.

When to use reaction cards

Reaction cards can be helpful if there are brand attributes that you want to convey with visuals. You can use reaction cards to determine whether your design elements are bringing these attributes to life in a nonverbal way.

Reaction cards can also be helpful when redefining components within a screen (such as sets of icons or logos) or when evaluating the cohesiveness of restyled design (such as the implementation of a new design system across a series of screens).

As Nielsen Norman Group notes, static screens help focus people’s eyes on the visuals and reduce distractions related to functionality.

How it works

Before the test, select a list of about 25 words that match your strategic values, devoid of descriptions related to functionality, content, or performance. These can be descriptive adjectives from the original 118-word list or custom brand words.

Ensure you have a mixture of positive-, neutral-, and negative-sentiment words. Then note in advance which words you think apply to your visual design, and which words you think don’t apply.

Write the words on cards (physical or digital) and put them in a neutral order, whether randomized or, as I often do, alphabetized, to avoid bias.

During the test, ask people to sort the words into two columns: “Applies to this design” and “Does not apply to this design.”

After the test, analyze the results by creating a word cloud or frequency chart. Depending on your sample, create comparative word clouds or Venn diagrams based on the different word choices of each group of participants.
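
The tallying behind a frequency chart or a Venn-style comparison can be sketched in a few lines of Python. This is only an illustration under assumptions: the two participant groups, the words, and the helper name `tally_words` are hypothetical, and each inner list stands for the words one participant placed under “Applies to this design.”

```python
from collections import Counter


def tally_words(selections):
    """Count how many participants chose each word."""
    counts = Counter()
    for words in selections:
        # set() ensures a word counts once per participant
        counts.update(set(words))
    return counts


# Hypothetical results from two groups of participants
new_customers = [
    ["calm", "trustworthy", "dated"],
    ["calm", "fresh"],
    ["trustworthy", "calm"],
]
existing_customers = [
    ["dated", "busy"],
    ["trustworthy", "dated"],
    ["busy", "calm"],
]

new_counts = tally_words(new_customers)
existing_counts = tally_words(existing_customers)

# Simple text frequency chart for one group
for word, n in new_counts.most_common():
    print(f"{word:<12} {'#' * n} ({n})")

# Venn-style comparison between the groups
shared = set(new_counts) & set(existing_counts)
only_new = set(new_counts) - set(existing_counts)
only_existing = set(existing_counts) - set(new_counts)
print("Both groups:", sorted(shared))
print("New customers only:", sorted(only_new))
print("Existing customers only:", sorted(only_existing))
```

The counts can then be compared with the words you expected to apply before the test.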

Measuring success

The Japanese graphic designer Kenya Hara says, “Design should function as part of a planning process.” With this in mind, reaction cards can be considered successful when you learn whether or not the target audience’s choices match your strategic intent, which then allows you to plan accordingly.

VisAWI

VisAWI, short for Visual Aesthetics of Websites Inventory, is a questionnaire for assessing perceived visual aesthetics. The short version described here uses a four-statement, matrix-based format.

Psychologists created VisAWI to determine how people subjectively rate the pleasingness of a graphical interface. Graphical interfaces can range from prototypes to slide decks.

When to use it

VisAWI is helpful for evaluating high-fidelity digital prototypes and/or comparing two differently styled user journeys. This is because VisAWI was created to assess the relationship between layouts and color composition.

How it works

VisAWI recommends that you find about 20 people who fit your target audience.

Ask participants to briefly engage with your artifact—for example, through an open-ended, exploratory scenario. Timing will vary depending on whether it’s a comprehensive website or a brief, clickable prototype.

After participants engage with the artifact, ask them to evaluate four statements using a 7-point scale ranging from strongly agree to strongly disagree. The statements are as follows:

- Everything goes together on this (prototype/website/presentation).
- The layout is pleasantly varied.
- The color composition is attractive.
- The layout appears professionally designed.

You can employ VisAWI by creating single-response statements in a survey tool such as Google Forms, SurveyGizmo, or Qualtrics, or by using the VisAWI tool offered on its website.

Measuring success

Assign each response a number from 1 to 7, with strongly agree representing a score of 7 and strongly disagree representing a score of 1. In Qualtrics, you can literally recode the values, whereas in Google Forms you might want to do this numerical translation during analysis.

After recoding the values, add up each participant’s four item scores and divide the sum by 4 to get a mean value. The mean value represents the general factor of aesthetics found in the model. Use the mean values to compare multiple designs or to contrast how different members of your target audience perceive the same design.
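
The recoding and averaging can be sketched in Python as follows. This is a minimal illustration, not the official VisAWI tooling: only the two endpoint labels come from the scale described above, while the five intermediate labels, the function names, and the sample answers are assumptions made for the example.

```python
from statistics import mean

# Recode answer labels to numbers. Only the endpoints (1 and 7) are
# given in the text; the intermediate labels are assumed here.
SCALE = {
    "strongly disagree": 1,
    "disagree": 2,
    "somewhat disagree": 3,
    "neutral": 4,
    "somewhat agree": 5,
    "agree": 6,
    "strongly agree": 7,
}


def participant_score(answers):
    """Mean of one participant's four recoded statement ratings."""
    if len(answers) != 4:
        raise ValueError("expected exactly four answers")
    return sum(SCALE[answer.lower()] for answer in answers) / 4


def design_score(all_answers):
    """Mean of the participant scores for one design."""
    return mean(participant_score(answers) for answers in all_answers)


# Hypothetical answers from two participants per design
design_a = [
    ["agree", "somewhat agree", "strongly agree", "agree"],
    ["neutral", "agree", "agree", "somewhat agree"],
]
design_b = [
    ["neutral", "neutral", "somewhat agree", "disagree"],
    ["somewhat disagree", "neutral", "agree", "neutral"],
]

print("Design A:", design_score(design_a))
print("Design B:", design_score(design_b))
```

With around 20 participants per design, the same two functions give one comparable number per design or per audience segment.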

The Markup Method

The last method is simple yet engaging in its approach. I picked it up from my colleague Nash, who has a great article on how to win over skeptics with qualitative design, by the way.

It’s called the Markup Method because it asks users to create drawings of abstract concepts on designs. The materials are often symbolic in nature, such as shapes or colors.

When to use it

Use this method with value prop–heavy screens—for example, landing pages—especially those that are trying to communicate value with both visuals and text. I say this because the method helps you assess whether certain pictures, illustrations, or color arrangements are influencing people more so than copy.

How it works

Ask a participant to look at your design, similar to an impression test.

Instead of showing them the design and taking it away, give them a pen (physically or digitally) and ask them to mark up the design. You can suggest colors (like green/yellow/red highlights) or symbols (like X, O, and stars). Tell them to use shapes and colors to represent what’s working conceptually for them, and what’s not.

The important part is to ask them to describe why they made a particular mark—use this process as a probing mechanism to understand how they are perceiving something. Instruct them to think aloud while they’re drawing or to summarize their drawing after the exercise.

Measuring success

After the tests, aggregate the shapes and/or colors into a heatmap that shows where the design is strongest and weakest. Success in this case is unpacking a visual design’s relationship to users, as well as identifying its strengths and weaknesses.
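
One way to aggregate the marks is to bin them into a coarse grid laid over the design. The Python sketch below assumes each mark has been transcribed as a position on the design (x and y, normalized to the range 0 to 1 from the top-left corner) plus a sentiment of +1 for “working” or -1 for “not working.” That recording format, the 3-by-3 grid, and the name `build_heatmap` are all hypothetical choices for illustration.

```python
def build_heatmap(marks, rows=3, cols=3):
    """Sum mark sentiments into a rows x cols grid over the design.

    Each mark is (x, y, sentiment), with x and y normalized to 0..1
    and sentiment +1 (working) or -1 (not working).
    """
    grid = [[0] * cols for _ in range(rows)]
    for x, y, sentiment in marks:
        # Clamp so that a mark at exactly 1.0 lands in the last cell
        col = min(int(x * cols), cols - 1)
        row = min(int(y * rows), rows - 1)
        grid[row][col] += sentiment
    return grid


# Hypothetical marks transcribed from several participants
marks = [
    (0.5, 0.1, +1),    # star on the headline
    (0.5, 0.15, +1),   # green highlight on the headline
    (0.2, 0.5, -1),    # X on an illustration
    (0.25, 0.55, -1),  # red highlight on the same illustration
    (0.9, 0.9, +1),    # star on the call to action
]

heatmap = build_heatmap(marks)
for row in heatmap:
    print(" ".join(f"{value:+d}" for value in row))
```

Strongly positive cells point to where the design is working, strongly negative cells to where it is not, and the participants' spoken explanations supply the why behind each cluster.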

Against validating visuals

I want to underscore that these methods are not about validating a design direction. They’re about finding a reliable and rigorous way to involve people—users and customers—in complex, multi-stakeholder design initiatives.

At Dropbox, we’ve found these types of methods useful when relaunching our logged-out website user experience or refining our go-to-market brand strategy. These cross-functional projects include various perspectives of marketing, growth, design, and sales, so it is important to focus on the why rather than on what’s popular—internally or externally.

As Bill Moggridge, co-founder of the firm IDEO, once said, “If there’s a simple, easy design principle that binds everything together, it’s about starting with the people.” Why not start with people the next time you want to research and refine your visual design?

Thanks to Michelle Morrison, Christopher Nash, Lisa Hanson, Lauren LoPrete, and John Mikulenka for feedback on earlier versions of this blog post.
