Designing an Anti-Incest Algorithm

(Cross-posted on gamasutra.com)


Once upon a time I was designing a game called Trondheim.  




In Trondheim, you would guide a small group of nomadic fairy-folk, the "Tronds," through a great and harrowing migration towards the mythical promised land called "Trondheim" (Norwegian for "Home of the Tronds").  The idea was that this journey would span several generations, so by the time you reached the final goal the original nomads would be long dead, but still live on in the memory and stories of their great-great-grandchildren. Each person would have a unique name and job in the tribe, and you would watch over them as they worked, married, gave birth, grew old, and finally died.

As the tribe would be small enough to track individual relationships, a problem arose: how would I keep the computer from marrying close relatives?  With a larger population, I could handle population growth abstractly, and with a population of anonymous animals I could mostly ignore the issue, but with a small group of individually named human characters, I needed to make sure no player would see Leif marry his sister Sonya.  Since the computer would decide the relationships, I needed an algorithm designed to detect and prevent incest.


Attempt #1:
Family Tree Traversal


My first idea was to simply check two potential partners for family ties, and then disallow certain relationships.  This required a list of "banned" pairings.  Though not universal, most cultures have incest taboos, the majority of which disallow marrying close blood-relatives, and often non-blood relations within the same family (aunts, step-relatives, etc) as well.  Western taboos are largely derived from the Old Testament, so I figured Leviticus was as good a place to start as any.  



Here's the list of marriage partners explicitly banned in the OT (assuming a male subject):

  • Sister / Sister-in-law / Half-sister / Step-sister
  • Mother / Mother-in-law / Step-mother
  • Daughter / Daughter-in-law / Step-daughter
  • Grandmother
  • Grand-daughter
  • Aunt

Using that as a starting point, I came up with the following "ban list" for my algorithm:

  • Ancestors
  • Descendants
  • Siblings
  • In-laws
  • Step-relatives
  • Aunts, Uncles, Nephews, Nieces
  • First cousins



This broadly represents the incest taboos in modern western culture.  It differs slightly from the OT laws - I'm disallowing first cousins (not explicitly banned  in Leviticus),  but allowing second cousins, which I personally find creepy, but is a reasonable concession for a small nomadic population on the edge of survival.  Considering that marriage between first cousins is not unheard of in many parts of the western world (including the United States), our video game nomads are still adhering to a pretty conservative standard here.  I should note that although these taboos are largely a mechanism for addressing inherited diseases, there are social concerns as well, since it would strike most of us as creepy if someone married their mother-in-law, step-sister, or adopted child, even though they aren't related by blood.

Defining these relationships in computable terms requires a "family tree" data structure that grows over time and tracks everyone in the society's relationship to one another.  Whenever any two individuals are considered for marriage, the game runs an "incest check", which returns a value between 0 and 1, with 0 representing "no incest whatsoever," and 1 representing maximum incest, a sibling-sibling pairing.  The idea is to run a bunch of test cases through the algorithm, and then determine what an acceptable maximum value will be.<o:p></o:p>

To calculate the incest factor, I line up each character's ancestors and compare them, father vs father, mother vs mother, patneral grandmother vs paternal grandmother, etc, and count how many ancestors they have in common. This gets complicated fast. 


Let's start with the case of two first cousins:

These first cousins don't have any parents in common - so far, so good, but they do have the same maternal grandparents.  They have 2/6 grandparents in common, and 0/4 parents in common, for a total of 2/10 ancestors, counting two generations back.  If we just count the number of shared ancestors and divide by the total number, we get a value of 0.2.  However, this raises further questions - first, how far back do we need to check?  Second, the above method leaves out the fact that although the cousins don't share any parents, their mothers are sisters.  So what do we do there?  On the one hand, since we've already counted the shared grandparents, this could implicitly account for the related mothers.  On the other hand, perhaps the algorithm should calculate the relative closeness of  each ancestor and then use those as weighted values to affect the final outcome.  At this point, some sort of recursive algorithm that cascades up and back down the family tree would be needed.


As much as that sounds like fascinating math problem, maybe there's an easier, more direct, way to do this.


Attempt #2: Virtual DNA

Since incest is primarily (but not exclusively) a genetic concern, let's try a genetic approach.  The easiest way is to just to assign each character a virtual "DNA" string at birth.  For our initial population, let's start with a diverse gene pool, with each individual having completely unique DNA.  To explain this in story terms, let's assume our society was not originally nomadic - they once lived in a great civilzation underground, but were chased out in a catastrophic invasion of subterranean trolls. This means that at the start of our game, our intitial "tribe" consists of unrelated refugees all thrown together by chance.  <o:p></o:p>

Let's take a look at the DNA of two of our nomads, the tribe chieftans: <o:p></o:p>


Our incest-checker compares the DNA strings, and looks for matching "genes."  There's zero matches here, an incest factor of 0, so this pairing is "legal."  Trondfar and Trondmor have two children, Trondson and Tronddatter, and each inherits half their DNA from their father and half from their mother.  (For simplicity's sake, we just inherit every other gene from each respective parent.  The charts below arbitrarily sort the genes for easy comparison.)

<o:p></o:p>



The incest-checker would return a value of 1.0  (all genes match) for these two, so they can't get married. Instead, Trondson marries a girl named "Sonya," and Tronddatter marries a boy named "Leif," (neither of whom are related):<o:p></o:p>




Then, Trondson and Sonya have a son, "Trondsonson", and Tronddatter and Leif have a daughter, "Tronddatterdatter", yielding these two first cousins:<o:p></o:p>


If these two were considered, the incest-checker would return a value of 8/16, or an incest value of 0.5, which is pretty high.  What's useful about this approach is that we no longer have to track direct relationships, just check how close each person's "DNA" is to another. We can supplement this by also tracking immediate relationships like step-relatives, adoptees, and in-laws to catch "non-blood incest."  All in all, the genetic approach works elegantly, quickly, and doesn't depend on laboriously checking geneology records.  It should be pretty easy at this point to run baseline checks on the relationships we want to disallow until we can determine what the maximum acceptable value is.  Under this model, second cousins would have an incest factor of 0.25.


There are, however, a few short-comings to this method, which brings to...



Attempt #3:
Virtual DNA 2.0

Virtual DNA is a good approach, but it could be improved.  For one, if I choose to make character appearance variable, it seems natural that characteristics such as hair, skin, eye color, etc, could all be driven by individual genes in the DNA string.  Even in a simple indie game with limited sprites, simple pallete-swapping and Mendelian inheritance could make it so children share characteristics with their parents. This requires some tweaks. <o:p></o:p>

So far, we've only been using the virtual DNA to check against incest.  If we're using it to drive physical features, we need to model real genetics more closely.  With the old model, two children of the same sex from the same parents would have the exact same DNA, and thus the exact same appearance. In the new model, each person has two sets of DNA, one from their father, and one from their mother.<o:p></o:p>

Here are our chieftans, then:<o:p></o:p>



Trondson and Tronddatter would still get half their genes from Trondfar and half from Trondmor, but which genes they get in each slot is random. Trondfar passes on a DNA strand that contains a random mix of his two DNA strands, as does Trondmor:<o:p></o:p>



This new system is more realistic, and accounts for inheritance better, but we've got a problem - with such a low number of "genes", it might be easy for our incest checker to fail!  Whereas the previous model would record a match on every gene for a sibling pair (since they are all essentially identical twins), the new model won't.  Let's look:<o:p></o:p>


That's 6 matches for the father's genes, and 7 matches for the mother's, or about a 40% match overall.  In real life, siblings have just over 50% of their genes in common, on average (skewed high because of identical twins), so this result is pretty accurate.  However, since the inherited genes are chosen randomly, it's still conceivable the algorithm could run into edge cases like this:<o:p></o:p>



Here, only about 15% of the genes match, an incest factor of 0.15.  The anti-incest algorithm would mistake these two for distant relatives instead of siblings.  Now, the chances of the above case are pretty slim, but even with just a few thousand people playing the game it would not be uncommon for players to report brothers and sisters randomly pairing off.  It wouldn't cost much to extend the length of the string to say, 50 genes from each parent (100 total), which would essentially eradicate the statistical likelihood of such flukes.  Even if I don't plan on using every single gene to drive some physical trait, the much longer DNA strand will ensure that the incest checker still works.  Interestingly enough, this is a lot like DNA in real life, where most of it doesn't seem to code for anything (that we know of, at least).

Throwing in a safety check to account for close relatives (and non-blood relatives) should keep things from getting creepy.

One final note - no fancy algorithm will protect our society from in-breeding in the long term with such a small population, as after only a few generations of interbreeding almost everyone will be related.  To fix this, we can include an "immigrant" mechanic (similar to the one in Dwarf Fortress), where other Tronds who escaped the cataclysm will randomly run across your tribe and join you, providing fresh, unrelated genes.  

So, there's a random page from my notebook. Maybe it will help someone in a project or something.
Any thoughts?