
Designing the user experience of Top Trumps
This is a kind of follow-up post to a previous analysis of a pack of Top Trumps. If you don’t know what Top Trumps are, I suggest you look at that. I was contacted via this blog by some people who had designed a pack of Top Trumps. They wanted my help with making their pack more fun to play. I did some analysis (using different tools to the previous post) and I’ll go into the results and reasons behind them below.
The pack
The pack has 5 properties per card, and there are 32 cards in the pack. The makers passed on comments from people who had played the game. They reported that you could get onto a winning streak such that the game would be over quickly. This wasn’t as much fun as if the game lasted a bit longer, with more to-and-fro. The makers wondered if the numbers on the cards could be tweaked to make the game more balanced.
My approach
My first thought was that I’d do some kind of evolutionary or hill-climbing algorithm to tweak the scores. For this I’d model the pack using a program, and get the program to play lots of games. I would then make random tweaks to the scores, play more games and accept the tweaks if they produced longer-lasting games. If the tweaks didn’t produce better results I’d throw them away and try different ones. I’d repeat this process until either I got bored waiting for the program to finish or the games lasted long enough.
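The tweak-and-keep loop described above could be sketched roughly as follows. This is only an illustration in Python: the names `hill_climb` and `score`, the step size of ±5 and the 1-100 clamp are all assumptions made for the sketch. In a real run, `score` would be something like the average length of many simulated games; here a toy scoring function stands in for it.

```python
import random
from typing import Callable, List

Pack = List[List[int]]  # one row per card, one column per property


def hill_climb(
    pack: Pack,
    score: Callable[[Pack], float],
    steps: int,
    rng: random.Random,
    low: int = 1,
    high: int = 100,
) -> Pack:
    """Tweak one value at a time, keeping a tweak only if `score` improves."""
    best = [row[:] for row in pack]  # work on a copy, leave the input alone
    best_score = score(best)
    for _ in range(steps):
        candidate = [row[:] for row in best]
        card = rng.randrange(len(candidate))
        prop = rng.randrange(len(candidate[card]))
        # Nudge one value by a small amount, clamped to the allowed range.
        tweaked = candidate[card][prop] + rng.randint(-5, 5)
        candidate[card][prop] = max(low, min(high, tweaked))
        candidate_score = score(candidate)
        if candidate_score > best_score:  # throw away anything that isn't better
            best, best_score = candidate, candidate_score
    return best


if __name__ == "__main__":
    # Toy stand-in for "average game length": reward values close to 50.
    def toy_score(pack: Pack) -> float:
        return -sum(abs(v - 50) for row in pack for v in row)

    start = [[10, 90], [80, 20], [50, 50]]
    tuned = hill_climb(start, toy_score, steps=200, rng=random.Random(0))
    print(toy_score(start), "->", toy_score(tuned))
```

Because a tweak is only accepted when the score strictly improves, the result can never score worse than the starting pack, but it can get stuck at a local optimum.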
However, I had a second thought that meant I didn’t do any hill climbing. I noticed that, unlike packs of Top Trumps I’d played as a child, all the properties on the cards were numbers in the range 1-100. When I thought back to packs I’d played with, such as one about cars, the properties would be something like:
- Top speed, mph, 100-300
- 0-60 time, seconds, 3-10 (lowest wins)
- Engine size, cc, 1000-3000
- Weight, kg, 800-3500
- Cost, £, 3000-250000
If all properties are numbers in the same range (where bigger means better for each of them), then it’s easy to pick the highest value on a card. This will likely (but not certainly – see below) give you the best chance of winning with the card. By contrast, if the different properties have very different ranges, it’s harder to pick the property that gives you the best chance of winning.
So my theory was there was a user experience (UX) problem rather than a maths and numbers problem, and the solution was to make it harder to do something (picking the best property on a card) in the name of having more fun playing the game.
Analysis
I wrote some C# code to model the pack and a game using the pack, and then got the program to play itself a lot. (“A lot” means it shuffled the deck 100 times, and for each shuffle played 100 games, resulting in 10,000 games in total.) I tested out a few different strategies for picking which property to play, and compared how long games were as a result. (To simplify things, all players used the same strategy.) There were four strategies:
- Picking a property at random
- Picking the property that had the highest score
- Picking the property that had the best chance of winning
- A hybrid of options 1 and 3.
The first two are probably self-explanatory. The third one is what I was referring to above, and I’ll explain it now. If a given card had properties A and B, where A had a score of 72 and B had a score of 79, then picking the highest score would mean picking B. However, it might be that there are lots of cards with scores in the range 80-100 for B, but only one card with a score higher than 72 for A. Therefore, to have the best chance of winning you should pick A rather than B. If all properties are numbers in the same range (e.g. 1-100), then option 2 is trivial. Option 3 needs you to remember all scores for all properties on all cards, which I think is beyond most people.
Option 4 is the option that I think is a more likely end result for experienced players, when the cards don’t have properties that are all numbers in the range 1-100. After playing for a while with a pack, I used to remember the top and bottom 2 or so scores for each property. If a card had a property that had a bottom 2 score, I would skip that property. If the card had a property that was a top 2 score, I would choose that property. For all other properties and scores I was largely picking at random.
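The four strategies could be sketched along the following lines. This is a Python illustration, not the C# used for the analysis, and the function names are made up for the sketch. It assumes that bigger always wins, that "best chance" means "beats the most cards in the whole pack", and that the hybrid player remembers the top 2 and bottom 2 scores per property.

```python
import random
from typing import List, Sequence

Card = Sequence[int]  # one score per property; bigger is assumed to win


def pick_random(card: Card, pack: List[Card], rng: random.Random) -> int:
    """Option 1: choose any property index at random."""
    return rng.randrange(len(card))


def pick_highest(card: Card, pack: List[Card], rng: random.Random) -> int:
    """Option 2: choose the property with the biggest number on this card."""
    return max(range(len(card)), key=lambda p: card[p])


def cards_beaten(card: Card, pack: List[Card], prop: int) -> int:
    """How many cards in the pack have a lower score for this property."""
    return sum(1 for other in pack if other[prop] < card[prop])


def pick_best_chance(card: Card, pack: List[Card], rng: random.Random) -> int:
    """Option 3: choose the property that beats the most cards in the pack."""
    return max(range(len(card)), key=lambda p: cards_beaten(card, pack, p))


def pick_hybrid(card: Card, pack: List[Card], rng: random.Random, n: int = 2) -> int:
    """Option 4: play a remembered top-n score, skip bottom-n scores, else random."""
    top, ok = [], []
    for p in range(len(card)):
        column = sorted(other[p] for other in pack)
        if card[p] >= column[-n]:
            top.append(p)  # one of the n best scores for this property
        elif card[p] > column[n - 1]:
            ok.append(p)  # not one of the n worst scores
    # Fall back to any property if every score on the card is a bottom-n score.
    candidates = top or ok or list(range(len(card)))
    return rng.choice(candidates)


if __name__ == "__main__":
    # The A = 72, B = 79 example from the text, with invented opposing cards.
    card = (72, 79)
    pack = [card, (90, 85), (10, 88), (20, 95), (30, 81), (40, 50)]
    rng = random.Random(1)
    print(pick_highest(card, pack, rng))      # 1, i.e. property B
    print(pick_best_chance(card, pack, rng))  # 0, i.e. property A
```

In the example pack, only one card beats 72 for A while four cards beat 79 for B, so the highest-value and best-chance strategies disagree, as described above.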
Results
The graph below shows how long it takes to play a 2-player game, based on different strategies for picking card properties.

I have expressed the results like this to make it easier to compare the different strategies, even though they had different ranges of game length. The shortest that any 2-player game can be is 16 rounds, as the players start with 16 cards each and it takes a round per card to hand over the cards. This happens 4-9% of the time with the best chance and highest value strategies, but hardly ever with the other 2 strategies. The longest game is 212 rounds with best chance, 218 rounds with highest value, 926 rounds with hybrid and 1814 rounds with random.
If all the properties have the same range of values, it’s easy to use the strategy behind the purple line (highest value), which rises very steeply – indicating most games are short. Making it too hard to pick either the highest value or the property with the best chance means that the green curve (hybrid) is the best that most players can achieve. It’s interesting to see how big a difference is made by memorising only 4 values per property (of the up to 32 values per property) – i.e. how much the green line differs from the orange (random).
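For readers who want to try this kind of measurement themselves, a game loop might look roughly like the sketch below. It is written in Python rather than the C# used for the analysis, and the rules it encodes are assumptions: the winner of a round leads the next one, won cards go to the bottom of the winner's pile, and tied cards wait in a pot for the winner of the next decisive round. The pack in the usage section is random made-up data, not the real pack, and `max_rounds` is a safety cap added for the sketch.

```python
import random
from collections import deque
from typing import Callable, List, Sequence

Card = Sequence[int]
Strategy = Callable[[Card, List[Card], random.Random], int]


def highest_value(card: Card, pack: List[Card], rng: random.Random) -> int:
    """Pick the property with the biggest number on the card."""
    return max(range(len(card)), key=lambda p: card[p])


def random_property(card: Card, pack: List[Card], rng: random.Random) -> int:
    """Pick any property at random."""
    return rng.randrange(len(card))


def play_game(pack: List[Card], strategy: Strategy, rng: random.Random,
              max_rounds: int = 100_000) -> int:
    """Play one 2-player game and return the number of rounds it took."""
    deck = list(pack)
    rng.shuffle(deck)
    half = len(deck) // 2
    hands = [deque(deck[:half]), deque(deck[half:])]
    pot: List[Card] = []  # cards held over from tied rounds
    leader = 0            # index of the player who picks the property
    rounds = 0
    # max_rounds guards against games that cycle forever.
    while hands[0] and hands[1] and rounds < max_rounds:
        rounds += 1
        tops = [hands[0].popleft(), hands[1].popleft()]
        prop = strategy(tops[leader], pack, rng)
        pot.extend(tops)
        if tops[0][prop] == tops[1][prop]:
            continue  # tie: cards stay in the pot, same player leads again
        leader = 0 if tops[0][prop] > tops[1][prop] else 1
        hands[leader].extend(pot)  # winner takes everything in the pot
        pot = []
    return rounds


def share_finished_within(lengths: List[int], limit: int) -> float:
    """Fraction of games that took at most `limit` rounds."""
    return sum(1 for n in lengths if n <= limit) / len(lengths)


if __name__ == "__main__":
    rng = random.Random(2024)
    # Made-up pack: 32 cards, 5 properties, every score in the range 1-100.
    pack = [tuple(rng.randint(1, 100) for _ in range(5)) for _ in range(32)]
    for name, strategy in [("highest", highest_value), ("random", random_property)]:
        lengths = [play_game(pack, strategy, rng) for _ in range(200)]
        print(name, min(lengths), max(lengths), share_finished_within(lengths, 50))
```

Plotting `share_finished_within` against the round limit for each strategy gives a cumulative curve that rises more steeply the shorter the games tend to be.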
Spreadsheets for fun?
It might seem odd to you that spreadsheets and other models should occur in the same sentence as fun, other than in something like “Finding the last bug in my spreadsheet’s formulae is not my idea of fun.” However, these techniques are common when designing board games or computer games. They help in the way that many models do – they let you ignore details so you can concentrate on a smaller and more manageable set of important things. They help you answer questions like:
- Is this monster, tool or weapon too weak or too strong?
- Does it make the game unbalanced, boringly easy or overwhelmingly hard?
While there’s no guarantee that players will enjoy a game based on a set of ‘good’ numbers in a model, it’s likely that they won’t enjoy the game if the model contains ‘bad’ numbers. In other words, good numbers are a necessary but not sufficient condition for an enjoyable game. Having a model that can highlight bad numbers, so you can move the game relatively quickly towards better numbers, means you can more quickly get to the next version that’s worth testing with humans.
There are many other things that you need for a fun game, such as graphics, sound (if it’s a computer game), instructions, packaging and so on. These might have their own scaffolding, but that’s beyond the scope of this article. The model here is just for the game’s behaviour. All the various kinds of models are part of the 99% of the game that’s invisible to the players – the effort that the makers put into designing it and so on.
Looking for balance
The graph above is based on (a computer) playing the game lots of times, in different ways. There are other ways of looking at the game that give different insights. If you look at each possible pair of cards A and B, how many of the 5 properties will mean A beats B, B beats A or there’s a draw? The chart below shows the number of ways A beats B minus the ways B beats A, for each pair of cards A and B. A score of -5 means that B beats A for all properties, and +5 means A beats B for all properties.

This chart shows the average for each card – again it’s a number in the range -5 to +5.

I’ve deliberately left out all identifying information, to avoid helping people too much with this pack. You can see from the second chart that the cards fall into three groups – the best 5 cards, a middle group of 11, then a worse group of 16. This kind of analysis helps you spot if the game is balanced or not – are there some cards so strong or weak that they make the game boring? It was this kind of analysis that I did for the previous blog post on Top Trumps.
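The pairwise comparison described above can be sketched in a few lines of Python (again an illustration, not the code used for the analysis). The function names are invented for the sketch, bigger is assumed to win for every property, and the per-card average is assumed to be taken over all the other cards in the pack. The three cards in the usage section are made up.

```python
from typing import List, Sequence

Card = Sequence[int]


def pair_score(a: Card, b: Card) -> int:
    """Properties where a beats b, minus properties where b beats a."""
    return sum((x > y) - (x < y) for x, y in zip(a, b))


def pair_matrix(pack: List[Card]) -> List[List[int]]:
    """Score for every ordered pair; row i, column j is card i against card j."""
    return [[pair_score(a, b) for b in pack] for a in pack]


def card_averages(pack: List[Card]) -> List[float]:
    """Average pair score of each card against every other card."""
    others = len(pack) - 1
    # A card scores 0 against itself, so the diagonal adds nothing to the sum.
    return [sum(row) / others for row in pair_matrix(pack)]


if __name__ == "__main__":
    pack = [(5, 5, 5), (1, 9, 1), (3, 3, 3)]
    print(pair_matrix(pack))    # [[0, 1, 3], [-1, 0, -1], [-3, 1, 0]]
    print(card_averages(pack))  # [2.0, -1.0, -1.0]
```

With 5 properties per card, each pair score falls in the range -5 to +5, and so does each card's average.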
You could call this kind of analysis static, as it just looks at pairs of cards rather than at games as they are played. This contrasts with the first analysis, which is more dynamic, as it depends on games playing out. There’s a parallel with software quality analysis that I hadn’t expected before writing this article. A lot of quality analysis runs the code under test and checks that it changes the world in expected ways, i.e. it’s dynamic. Other analysis looks at code as text, and checks things like how long methods are, how deeply indented code gets etc. This doesn’t depend on the code running, and so could be called static. This combination of static and dynamic checks is the basis of the software testing trophy.
Conclusion
I’m no game designer, nor a UX expert. However, I hope you can see from the graphs that games last longer if you make it harder to pick winning cards. This will probably make the game more enjoyable to play, but this would need to be tested by real humans playing the game several times.