Company: Publicis Sapient
Role: Experience Designer
I worked on one of the leading recipe skills in the Amazon skills store. With more than 150k visits per month, the client team were keen to assess and improve the skill's user experience.
A first look at the usage data showed us that, although thousands of users accessed the skill every month, only a small percentage of them were listening to the recipe descriptions. An even smaller percentage heard the last step of a recipe.
We analysed the skill, simulating the main use cases the client had originally identified. After the analysis and a review of the interaction model, we discovered a few areas for improvement:
- Error handling: The skill had multiple dead ends with poor error handling that didn’t allow the user to recover and potentially kept them in an error loop.
- Lack of relevance and variety in the results: The search results only considered the search term and ignored any other aspect of the context.
After the review we started working with the data analyst. We wanted to understand the main user behaviours in the skill and discover more areas of improvement.
The first issue we identified was a large percentage of searches with no results. Further analysis showed that there were 4 main reasons for this error:
1. Lack of content: Visitors searched for a recipe or an ingredient but we couldn’t find any matching recipes (e.g. guacamole, turnip)
2. Unrelated search: Visitors searched for something that wasn’t related to food or Alexa misunderstood the expression (e.g. slime, meow)
3. Cocktails and alcohol: The skill is about food recipes, therefore anyone asking for alcoholic drinks received a generic error message. Alcohol searches accounted for approximately 12% of the null searches.
4. Boiled eggs: Searches around timings for boiled eggs returned a generic error message. Boiled egg searches accounted for approximately 10% of the null searches.
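The routing logic for these null-result cases can be sketched roughly as follows. This is an illustrative Python sketch, not the skill's production code; the function name, term lists and category labels are all hypothetical, and the keyword matching stands in for the real catalogue lookup and NLU:

```python
# Hypothetical sketch: route a failed search into one of the four
# null-result categories identified in the data analysis.

ALCOHOL_TERMS = {"cocktail", "margarita", "mojito", "wine", "beer"}
EGG_TIMING_TERMS = {"boiled egg", "soft boiled egg", "hard boiled egg"}
FOOD_TERMS = {"guacamole", "turnip", "chicken", "pasta"}  # stand-in for a catalogue check

def classify_null_search(query: str) -> str:
    """Return which null-result bucket a failed search falls into."""
    q = query.lower().strip()
    if any(term in q for term in EGG_TIMING_TERMS):
        return "boiled_eggs"   # dedicated timing response
    if any(term in q for term in ALCOHOL_TERMS):
        return "alcohol"       # acknowledge and restate the skill's purpose
    if any(term in q for term in FOOD_TERMS):
        return "no_content"    # food-related, but no matching recipe
    return "unrelated"         # not food, or a misheard utterance
```

Splitting the generic "no results" error into these buckets is what made the targeted cocktail and boiled-egg responses described below possible.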
The analysis of the data also allowed us to identify key use cases for the skill. For example, we found that most searches related to meals were made at, or near, the time of that meal. This led us to assume that people use Alexa to search for recipes when they are ready to cook. This was a key finding for error handling and context.
Apart from the data, we ran a couple of initial guerrilla research sessions. We used card sorting to understand the users’ priorities for different aspects of recipes (i.e. dietary requirements, ingredients, spiciness, portions, etc.) when deciding what to cook.
We found that people are mainly interested in the ingredients and the time it takes to prepare and cook the dish. Additionally, people who have dietary requirements (e.g. vegetarian, gluten intolerant) consider this information top priority.
Usability testing has never been more important than it is with voice. When we design for voice, the system needs to learn to talk like the user, not the other way around. Therefore, we tested continuously to understand users’ behaviours and attitudes better.
Our first priority was to ensure that customers were able to find recipes for their needs. We worked with the developers and the copywriter to define a strategy that allowed us to manage the different issues we had found.
Users of voice interfaces will inevitably find errors in their journeys, and therefore handling errors is essential. We dedicated most of our effort to handling errors in the skill in order to improve the quality of the experience.
In order to target users who were looking for cocktails or boiled-egg instructions, we created two simple responses that replaced the generic error response.
In the case of the cocktails, our main goal was to inform users that the purpose of the skill is to provide food recipes, so we created a response that acknowledged their request and reiterated the mission of the skill.
Example conversation of the cocktails journey
We handled 33k queries for alcohol searches within the first three months of implementation.
For the boiled eggs, we created two different answers that gave visitors the information they needed without any hassle.
We handled nearly 4k queries for boiled eggs in the first three weeks of implementation.
Example conversation of the boiled egg journey
We then took on the task to help customers with searches without results.
We based our solution on the hypothesis (from the data analysis) that the majority of users look for recipes when they are ready to cook. We created a new journey that gave users who didn’t find any results from their search the opportunity to find recipes through a guided search.
In this journey, users can provide 2 ingredients, a dietary requirement and the time they want to cook to get recipe recommendations. With this approach, even if we can’t find a recipe for their search, we can help them find a recipe with the ingredients they have in their fridge.
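The core of the guided search can be illustrated with a small filter over the recipe catalogue. This is a simplified sketch under assumed data structures (the `Recipe` class and `guided_search` function are hypothetical names, not the skill's actual implementation):

```python
# Illustrative sketch (not the production search): recommend recipes that
# use the user's ingredients and fit their diet and available time.

from dataclasses import dataclass, field

@dataclass
class Recipe:
    name: str
    ingredients: set
    total_minutes: int
    diets: set = field(default_factory=set)

def guided_search(recipes, ingredients, diet=None, max_minutes=None):
    """Return recipes containing all given ingredients, within the constraints."""
    results = []
    for r in recipes:
        if not set(ingredients) <= r.ingredients:
            continue  # recipe must use everything the user has in the fridge
        if diet and diet not in r.diets:
            continue
        if max_minutes and r.total_minutes > max_minutes:
            continue
        results.append(r)
    # Quickest recipes first: users are typically ready to cook right now.
    return sorted(results, key=lambda r: r.total_minutes)
```

Sorting by preparation time follows directly from the research finding that time to cook is one of users’ top priorities.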
We tested the happy path and iterated to handle unhappy paths and errors.
Finally, we included this journey as part of an A/B test in the live skill. In the first 2 months of testing we saw a 15% increase in the number of visitors who heard the details of a recipe compared to the previous solution. The recipe maker was then integrated as part of the live skill.
Additionally, we reviewed all the errors users could find in the skill and defined appropriate messages for each situation, whether it meant providing alternative options or acknowledging the error and ending the interaction gracefully.
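One common pattern for avoiding the error loops described earlier is to cap the number of reprompts and then end the session gracefully. A minimal sketch of that idea, with hypothetical messages and session structure (not the skill's actual copy or code):

```python
# Hypothetical sketch: escalate error prompts with more guidance each time,
# then exit gracefully instead of looping on the same generic message.

MAX_RETRIES = 2

PROMPTS = [
    "Sorry, I didn't catch that. You can ask for a recipe by name or ingredient.",
    "You could say, for example, 'find me a chicken recipe'. What would you like to cook?",
]
GOODBYE = "Sorry I couldn't help this time. Come back when you're ready to cook!"

def handle_error(session: dict):
    """Return (speech, should_end_session) based on the error count so far."""
    errors = session.get("error_count", 0)
    session["error_count"] = errors + 1
    if errors < MAX_RETRIES:
        # Offer progressively more concrete guidance on each retry.
        return PROMPTS[min(errors, len(PROMPTS) - 1)], False
    return GOODBYE, True
```

The key design choice is that every error response either moves the user forward with an alternative or closes the conversation; no path returns the same dead-end message indefinitely.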
Once we ensured that users could easily find recipes, we focused on making the recipe recommendations more relevant. We targeted relevancy from two points of view: refining the search results and considering the time of day.
The review of the data showed that “dinner” was the second most searched term across all visitors, and “breakfast” and “lunch” were within the top 30. However, the data also revealed that visitors who searched for meals, rather than recipe names or ingredients, were less likely to hear the last step of a recipe.
We did more qualitative research around the topic and found that the recommendations given by the skill when users asked for meals felt “random”, as it wasn’t clear why a specific recommendation was being given.
Our hypothesis was that, if we allowed users to give more details about the type of food they want to eat, the number of users hearing recipes would increase. Based on this, we modified the journey for meal searches. The new journey allows users to decide whether they are looking for inspiration or they want to have a guided search based on their ingredients.
The solution tested well in qualitative research and is now in development.
We also used the time of day to refine the results we give users when they search. For example, we prioritise recipes tagged as “dinner” if the user searches for recipes at 6pm.
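The time-of-day prioritisation can be sketched as a stable re-ranking of the search results. The meal windows below are assumptions for illustration, and the function names are hypothetical:

```python
# Illustrative sketch: boost recipes whose meal tag matches the current
# time of day, so a 6pm search surfaces "dinner" recipes first.

MEAL_WINDOWS = [  # (start_hour, end_hour, meal tag) - assumed windows
    (5, 11, "breakfast"),
    (11, 16, "lunch"),
    (16, 23, "dinner"),
]

def meal_for_hour(hour):
    """Return the meal tag for a given hour of day, or None."""
    for start, end, meal in MEAL_WINDOWS:
        if start <= hour < end:
            return meal
    return None

def rank_results(results, hour):
    """Stable sort: recipes tagged with the current meal come first."""
    meal = meal_for_hour(hour)
    return sorted(results, key=lambda r: 0 if meal in r["tags"] else 1)
```

Because the sort is stable, the original relevance order is preserved within each group; the time of day only breaks ties in favour of the current meal.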
Another area of focus for the project was the copy of the conversations and the refinement of the utterances. We worked with a copywriter to ensure the dialogs were consistent across the skill. We iterated on the copy after every usability testing session.
We also worked on the utterances (the way users can say commands to the skill) in an iterative way, adding all the inputs we gathered from usability testing.
We designed multiple additional journeys to improve the experience and refined them using usability testing. These journeys are in development and still need to run through A/B testing.
Finally, I looked at ways to maintain consistency across different skills from the same brand. I created a set of design guidelines to help other designers create experiences that represent the values of the brand and target the right user needs.