Quantifying Settlers

By Neo Ellison

Continuing my ongoing quest for mastery of all things analytical, I gave myself a challenge (reward): build a representation of the game Settlers of Catan in Python and see what kind of insights I could garner from it. For those of you not familiar, Settlers of Catan is a resource-based game where players are tasked with colonizing the fictional Isle of Catan by collecting resources and using them to expand their settlements until one player builds enough structures to win. The key to the game is, of course, collecting resources. What makes this challenging is that the setup of the board changes every game and the resources you collect each turn are determined by rolling dice, so there can be a lot of variability from game to game.

Sample board:

Sample Settlers board on our glorious hardwood floors

To receive resources, players place little wooden pieces called settlements on the corners of the hexes (nodes). When the dice are rolled, each player collects a resource card for every settlement adjacent to a hex whose number matches the roll. Of course there are a lot of other aspects to the game, but for this exercise this is enough explanation for you to follow along.

Creating a computer representation requires thinking about the Settlers board in a way a computer can understand. A node in Settlers can be thought of as just a point in two-dimensional space. Taking this a bit further, there are 54 nodes in all, and they are connected to form 19 hexagons. This is a good start for getting the feel of the board. Add some graphing and a little styling and you can use that to draw some pretty hexagons. Cool, but we are far from being able to do anything useful:

The basic Settlers board
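To make the 54 nodes and 19 hexagons concrete, here is a minimal sketch of one way to generate them. It is not the code from my repo: the axial (q, r) coordinate scheme, the pointy-top orientation, and the helper names hex_centers and hex_corners are all assumptions picked for illustration.

```python
import math

def hex_centers(radius=2):
    """Axial (q, r) coordinates for a hexagon-shaped board.

    A radius of 2 gives the 19 hexes of the standard board (assumed layout).
    """
    centers = []
    for q in range(-radius, radius + 1):
        for r in range(-radius, radius + 1):
            if abs(q + r) <= radius:
                centers.append((q, r))
    return centers

def hex_corners(q, r, size=1.0):
    """Six corner points (nodes) of a pointy-top hex at axial coordinate (q, r)."""
    cx = size * math.sqrt(3) * (q + r / 2)
    cy = size * 1.5 * r
    corners = []
    for k in range(6):
        angle = math.radians(60 * k + 30)
        # Round so that corners shared by neighbouring hexes compare equal.
        x = round(cx + size * math.cos(angle), 6)
        y = round(cy + size * math.sin(angle), 6)
        corners.append((x, y))
    return corners

# hex -> its six corner points
hexes = {center: hex_corners(*center) for center in hex_centers()}

# Every distinct corner is one node; shared corners collapse in the set.
nodes = sorted({corner for corners in hexes.values() for corner in corners})

print(len(hexes), "hexes and", len(nodes), "nodes")
```

Feeding each hex's six corners to a plotting library as a closed polygon is enough to draw the board.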

Next you’ll need to tell the computer how the nodes and hexagons are related, starting with the concept of being adjacent. This is important because when the dice are rolled, only the nodes adjacent to a hex with the number rolled get resources. Once this is done, the hexes need two more attributes: which resource each hex produces, and which number it carries, so that when the dice are rolled we know which hexes are affected.
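As a rough sketch of those two steps (adjacency, then resource and number attributes), something like the following would do. The dictionary layout and the function names assign_tiles and node_adjacency are hypothetical, and the purely random shuffle ignores any placement rules a real setup might impose. The tile and token counts are the ones from the base game.

```python
import random
from collections import defaultdict

# Base-game tile counts; the desert produces nothing and carries no number.
RESOURCES = (["wood"] * 4 + ["sheep"] * 4 + ["wheat"] * 4
             + ["brick"] * 3 + ["ore"] * 3 + ["desert"])
# Base-game number tokens (there is no 7 token).
NUMBERS = [2, 3, 3, 4, 4, 5, 5, 6, 6, 8, 8, 9, 9, 10, 10, 11, 11, 12]

def assign_tiles(hex_ids, rng):
    """Give every hex a random resource and number (hypothetical helper)."""
    resources = RESOURCES[:]
    numbers = NUMBERS[:]
    rng.shuffle(resources)
    rng.shuffle(numbers)
    tiles = {}
    for hex_id, resource in zip(hex_ids, resources):
        number = None if resource == "desert" else numbers.pop()
        tiles[hex_id] = {"resource": resource, "number": number}
    return tiles

def node_adjacency(hex_nodes):
    """Invert a hex -> nodes mapping into node -> adjacent hexes."""
    adjacent = defaultdict(list)
    for hex_id, node_ids in hex_nodes.items():
        for node_id in node_ids:
            adjacent[node_id].append(hex_id)
    return dict(adjacent)

rng = random.Random(42)  # seeded only so the example is repeatable
tiles = assign_tiles(range(19), rng)

# Toy example: two hexes that share nodes 2 and 3.
toy_hex_nodes = {0: [0, 1, 2, 3, 4, 5], 1: [2, 3, 6, 7, 8, 9]}
adjacency = node_adjacency(toy_hex_nodes)
print(adjacency[2])  # hexes touching node 2
```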

I am going to skip the details on how I accomplished all this. Don’t worry, all the code is available on my GitHub; see tying previous posts together. So we now have a board of hexes and nodes with a bunch of properties, and all that is missing is the roll mechanism. Putting this together was on the complex side: map a roll to the hexes with attribute number == roll, look up each of those hexes’ resource, apply that resource to the adjacent nodes, and capture the results. But put it together and… BOOM! We have a very basic representation of a Settlers board that a computer can iterate through to gain insights.
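For a feel of what that roll mechanism involves, here is a stripped-down sketch on a made-up three-hex board. The names roll_dice and distribute and the data layout are assumptions for illustration, not the functions in my repo.

```python
import random
from collections import Counter

def roll_dice(rng):
    """Sum of two six-sided dice."""
    return rng.randint(1, 6) + rng.randint(1, 6)

def distribute(roll, tiles, hex_nodes):
    """Return {node: Counter of resources} produced by a single roll."""
    payout = {}
    for hex_id, tile in tiles.items():
        if tile["number"] != roll:
            continue  # this hex does not produce on this roll
        for node_id in hex_nodes[hex_id]:
            payout.setdefault(node_id, Counter())[tile["resource"]] += 1
    return payout

# Made-up toy board: three hexes, five nodes labelled a to e.
tiles = {
    0: {"resource": "wheat", "number": 8},
    1: {"resource": "ore", "number": 8},
    2: {"resource": "brick", "number": 5},
}
hex_nodes = {0: ["a", "b", "c"], 1: ["c", "d"], 2: ["d", "e"]}

payout = distribute(8, tiles, hex_nodes)
print(payout)  # node "c" touches both 8s, so it collects wheat and ore

roll = roll_dice(random.Random())
print(roll, distribute(roll, tiles, hex_nodes))
```

A roll of 7 matches no hex, so in this simplified version it simply pays out nothing.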

Now this is where things got fun for me. I added a little function to simulate the resources collected over the course of a game, estimated at 200 rolls. I also added some pretty visualizations: a histogram of which rolls took place, bar charts on each node showing the proportion of resources collected, and finally a very nice top 10 list of the most prolific nodes. Not too shabby.
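The simulation itself can be sketched in a few lines. This is a simplified stand-in rather than my actual function: simulate and top_nodes are hypothetical names, the board is a toy one, and it pretends every node has a settlement on it for the whole game.

```python
import random
from collections import Counter

def simulate(tiles, hex_nodes, n_rolls=200, seed=None):
    """Roll the dice n_rolls times and tally rolls and per-node resources."""
    rng = random.Random(seed)
    roll_counts = Counter()
    node_totals = {}
    for _ in range(n_rolls):
        roll = rng.randint(1, 6) + rng.randint(1, 6)
        roll_counts[roll] += 1
        for hex_id, tile in tiles.items():
            if tile["number"] == roll:
                for node_id in hex_nodes[hex_id]:
                    node_totals.setdefault(node_id, Counter())[tile["resource"]] += 1
    return roll_counts, node_totals

def top_nodes(node_totals, n=10):
    """The n nodes that collected the most resource cards in total."""
    totals = Counter({node: sum(res.values()) for node, res in node_totals.items()})
    return totals.most_common(n)

# Made-up toy board, just enough to exercise the functions.
tiles = {
    0: {"resource": "wheat", "number": 6},
    1: {"resource": "ore", "number": 8},
    2: {"resource": "brick", "number": 5},
}
hex_nodes = {0: ["a", "b", "c"], 1: ["c", "d"], 2: ["d", "e"]}

roll_counts, node_totals = simulate(tiles, hex_nodes, n_rolls=200, seed=1)
for roll in sorted(roll_counts):
    print(f"{roll:2d} {'#' * roll_counts[roll]}")  # crude text histogram
print(top_nodes(node_totals))
```

The roll_counts tally is what feeds the histogram, and node_totals holds the per-node resource mix behind the bar charts.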

Given this was my first attempt at representing a game programmatically to gain insights, I am pretty pleased. I have a better understanding of a game I love, and it was a lot of fun to boot. But as I started running simulations, something strange started happening. The distributions were not behaving:

Boards behaving badly

I am assuming all of you are familiar with the normal distribution, also called a bell curve. Strictly speaking, the sum of two dice follows a triangular distribution that peaks at 7 rather than a true normal, but it has the same hump-in-the-middle shape, and the distribution above is not matching that expectation. Someone forgot to tell the 11s that they are a very unlikely number (2 chances in 36, or about 5.6%) and should not be rolled nearly 10% of the time! Why is this happening? Well, that is actually a pretty simple answer: sample size. There are just not enough rolls in the average Settlers game for the observed frequencies to settle close to their expected values, so the roll distribution is not consistent from game to game. This lack of consistency produces a well-documented phenomenon called a “variable rewards system”, which is vital to keeping a person attentive and engaged. And that phenomenon is what makes Settlers of Catan so damn interesting, TIL.
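If you want to put numbers on that, here is a small sketch that computes the exact two-dice probabilities and then treats the count of 11s in a 200-roll game as a binomial variable. That binomial model assumes fair, independent dice, and binomial_tail is a hypothetical helper written for this post.

```python
from fractions import Fraction
from math import comb, sqrt

def two_dice_pmf():
    """Exact probability of each total from two fair six-sided dice."""
    pmf = {}
    for a in range(1, 7):
        for b in range(1, 7):
            pmf[a + b] = pmf.get(a + b, Fraction(0)) + Fraction(1, 36)
    return pmf

def binomial_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    p = float(p)
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

pmf = two_dice_pmf()
p11 = pmf[11]                      # 2 of the 36 outcomes: (5, 6) and (6, 5)
n_rolls = 200                      # game length assumed in this post

expected = float(n_rolls * p11)    # average number of 11s per game
std_dev = sqrt(float(n_rolls * p11 * (1 - p11)))

print(f"P(11) = {p11} = {float(p11):.3f}")
print(f"expected 11s in {n_rolls} rolls: {expected:.1f} +/- {std_dev:.1f}")
print(f"P(at least 20 elevens) = {binomial_tail(n_rolls, p11, 20):.4f}")
```

The expected count works out to roughly 11 elevens per game with a standard deviation of a little over 3, so swings of several rolls in either direction are routine. Keep in mind there are eleven different totals, each with its own chance to misbehave in any given game.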

Still, next time you are playing Settlers, this knowledge will probably provide little comfort when your settlement in between a 6, 5, and 9 is taking a nap while your wife’s 4, 11, and 11 is as busy as a beaver that just discovered triple espressos. But at least you can take pride in the fact that it was relatively unlikely that she would stomp you so mercilessly, much to the amusement of everyone else at the table.