1.     Description and Origin of the Prisoners' Dilemma

1.1     The Prisoners' Dilemma Story

            The now famous prisoners' dilemma describes a particular social circumstance in which two actors are positioned such that if each acts rationally, both will end up with sub-optimal outcomes.  The original story describes two prisoners who are separated and unable to communicate.  Each prisoner is being pressured by his or her captors to betray the other prisoner.  The result is that each player is presented with two options: either defect (D) by blaming the other prisoner, or cooperate (C) by refusing to betray the other.  Each player is also aware that the outcome of his or her choice is dependent on the choice the other prisoner makes.  If both prisoners cooperate, both will receive a small penalty: 6 months of prison time.  If both prisoners defect, they will receive a worse penalty: 2 years in jail.  At this point, it appears that both prisoners would be wise to cooperate.  However, we must also consider what will happen if one player defects and the other cooperates.  In this case, all blame is borne by the cooperator.  The defector gets off with no prison time and the cooperator is left in prison for 5 years.

            Herein lies the dilemma: should the prisoner risk being left in jail for 5 years by cooperating, or should the prisoner play it safe and defect?  A logical analysis of each option inevitably leads each prisoner to defect and thereby assures that the result will be 2 years in prison.  Given that neither prisoner knows what the other will do, the logical examination of options is as follows.  As prisoner A considers her choices, she will wonder what prisoner B will do.  Suppose prisoner B chooses to cooperate; then prisoner A would do best to defect because she will receive no prison time, whereas cooperation would lead to 6 months.  On the other hand, if prisoner B chooses to defect, then prisoner A would also do best to defect because this would result in 2 years of prison rather than the 5 years prisoner A would receive if she chose to cooperate with the defecting B.  Therefore, no matter what prisoner B chooses, prisoner A would do best to defect.  The optimal individual choices lead to a sub-optimal collective outcome; both players, selecting their best strategies, end up in jail for 2 years whereas they could have been out in 6 months had they cooperated.
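
            The reasoning above can be checked mechanically.  The following is a minimal sketch in Python, assuming sentences are measured in years (so a smaller number is better for the prisoner).  The names SENTENCE and best_reply are illustrative and do not come from the original story.

```python
# Prisoner A's sentence in years, indexed by (A's choice, B's choice).
# Values are taken from the story; 6 months is written as 0.5 years.
SENTENCE = {
    ("C", "C"): 0.5,  # both cooperate
    ("C", "D"): 5.0,  # A cooperates, B defects: A bears all the blame
    ("D", "C"): 0.0,  # A defects, B cooperates: A goes free
    ("D", "D"): 2.0,  # both defect
}


def best_reply(other_choice):
    """Return the choice that minimizes A's own sentence given B's choice."""
    return min(("C", "D"), key=lambda mine: SENTENCE[(mine, other_choice)])


# A's best reply to each thing B might do.
replies = {other: best_reply(other) for other in ("C", "D")}
print(replies)
```

            Defection is the best reply in both cases, yet the mutual-defection sentence (2 years) is longer than the mutual-cooperation sentence (6 months).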

            The prisoners' dilemma story is an instructive example of a social situation which has several specific characteristics.  These characteristics are more easily discussed using a game with positive rather than negative payoffs.  Such a representation of a prisoners' dilemma is presented in Figure 1 (taken from Axelrod, 1984).  Here, rather than negative outcomes (like prison time), we represent payoffs as positive utilities.  In this case, the C, C strategy leads to the reward for cooperation (R) payoff for both players and the D, D strategy leads to the punishment for mutual defection (P) payoff for both.  The C, D strategy results in the sucker's (S) payoff for the cooperator and the temptation (T) payoff for the defector.  Again we can see that the logical choice for either player is defection, no matter what the choice of the other player.  Furthermore, rational action by each actor leads to a sub-optimal outcome for both players (a payoff of 1 rather than a payoff of 3).

 

 

                                  Player j
                            Cooperate     Defect
    Player i   Cooperate       3,3          0,5
               Defect          5,0          1,1

 

Figure 1: A Prisoners' dilemma example

 

            Using this example of a prisoners' dilemma, the two formal characteristics of the PD are easy to see.  The first is that the ordering of payoffs must be temptation highest, followed by reward, punishment, and sucker (T > R > P > S).  The second key characteristic is that the sum of the payoffs in the case of dual cooperation must be greater than the sum of the payoffs in the D, C condition.  In other words, R > (T + S)/2.  Any set of payoffs that meets these conditions is considered a prisoners' dilemma.  A quick arithmetic check verifies that both of the prisoners' dilemma examples given above meet these conditions (treating prison time as a negative payoff in the first example).  The second condition becomes important when players are involved in repeated prisoners' dilemmas with each other (an iterated prisoners' dilemma).
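
            The arithmetic check mentioned above might be written as follows.  This is only a sketch: the function name is_prisoners_dilemma is illustrative, and converting the prison terms of the story into negative payoffs (in years) is an assumption made here so that larger numbers are better in both examples.

```python
def is_prisoners_dilemma(T, R, P, S):
    """Check the two conditions: T > R > P > S and R > (T + S)/2."""
    return T > R > P > S and R > (T + S) / 2


# Payoffs from Figure 1.
figure1 = dict(T=5, R=3, P=1, S=0)

# Prison story, with each sentence written as a negative payoff in years
# (an assumption made for this check, not part of the original story).
story = dict(T=0.0, R=-0.5, P=-2.0, S=-5.0)

figure1_ok = is_prisoners_dilemma(**figure1)
story_ok = is_prisoners_dilemma(**story)
print(figure1_ok, story_ok)
```

            A hypothetical set of payoffs such as T = 5, R = 2, P = 1, S = 0 satisfies the ordering but fails the second condition, because (T + S)/2 = 2.5 exceeds R.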

1.2     The Repeated Prisoners' Dilemma

            The repeated prisoners' dilemma (also called an iterated prisoners' dilemma or a prisoners' dilemma supergame) consists of the same players playing a succession of prisoners' dilemma games.  While the one-shot prisoners' dilemma offers little likelihood of cooperation, the repeated prisoners' dilemma (as demonstrated by Axelrod, 1984) increases the likelihood of cooperation because players can react to each other's play in subsequent rounds and thereby develop a winning style of cooperating.  In the repeated prisoners' dilemma, it is to any player's advantage to convince the other player to cooperate on subsequent rounds.  Because one must cooperate at least some of the time in order to accomplish this, the potential for the emergence of cooperation exists.

            Of course, in the real world, no two people can expect to interact in prisoners' dilemmas forever.  At some point the interaction will stop, and when this occurs, cooperation is likely to break down because the future is no longer important.  If, however, there is some probability of future interaction, the potential for cooperation exists.  As Axelrod demonstrates, the value of the probability of future interaction is an important component in allowing cooperation to evolve.
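
            One way to see why the probability of future interaction matters is a small worked comparison.  The sketch below makes two assumptions that are not stated in the text above: after every round the game continues with a fixed probability w, and the other player cooperates until the first defection and defects forever afterwards.  The payoffs are those of Figure 1, and the function names are illustrative.

```python
# Payoffs from Figure 1.
T, R, P, S = 5, 3, 1, 0


def value_of_cooperating(w):
    """Expected total payoff from mutual cooperation in every round.

    Round k is reached with probability w**k, so the total is
    R * (1 + w + w**2 + ...) = R / (1 - w).
    """
    return R / (1 - w)


def value_of_defecting(w):
    """Expected total payoff from defecting immediately.

    The defector earns T once, then P in every later round because the
    (assumed) partner defects forever after being exploited.
    """
    return T + w * P / (1 - w)


def cooperation_pays(w):
    """True when cooperating is at least as good as defecting."""
    return value_of_cooperating(w) >= value_of_defecting(w)


# Under these assumptions the break-even probability is (T - R) / (T - P).
threshold = (T - R) / (T - P)

for w in (0.2, 0.9):
    print(w, value_of_cooperating(w), value_of_defecting(w), cooperation_pays(w))
```

            Under these assumptions, cooperation pays when future interaction is likely (w = 0.9) and does not when it is unlikely (w = 0.2).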

            Now we can see the importance of the R > (T + S)/2 assumption.  If this condition did not hold, cooperation would be eliminated because players would be able to gain more by alternately exploiting each other, each averaging (T + S)/2 per round, than by cooperating.
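
            The comparison can be made concrete with a short sketch.  The function names and the choice of ten rounds are illustrative assumptions.  The first call uses the payoffs of Figure 1, and the second uses a hypothetical reward of R = 2 that violates the condition.

```python
def average_per_round(payoffs):
    """Mean payoff per round over a sequence of rounds."""
    return sum(payoffs) / len(payoffs)


def compare(T, R, S, rounds=10):
    """Return (average from steady cooperation, average from taking turns).

    When taking turns, a player exploits the other on even rounds
    (earning T) and is exploited on odd rounds (earning S).
    """
    cooperate = [R] * rounds
    alternate = [T if r % 2 == 0 else S for r in range(rounds)]
    return average_per_round(cooperate), average_per_round(alternate)


figure1_result = compare(T=5, R=3, S=0)       # condition holds
hypothetical_result = compare(T=5, R=2, S=0)  # condition fails
print(figure1_result, hypothetical_result)
```

            With the Figure 1 payoffs, steady cooperation averages 3 per round against 2.5 for taking turns; with the hypothetical R = 2, taking turns becomes the more profitable arrangement.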

2.     Extension of Prisoners' Dilemma to the n-Person Context

            The prisoners' dilemma can easily be translated into an n-person game by considering payoff matrices with n dimensions.  For simplicity of presentation, consider a 3-person prisoners' dilemma.  Each player still has two choices, cooperation or defection, only now, instead of a two-dimensional matrix with four possible outcome cells, we have a three-dimensional matrix with eight possible outcome cells.  These eight cells are represented in Figure 2, where cooperating is represented by C and defecting by D.

            Payoffs are calculated by examining each pair-wise payoff set among players.  For example, using the payoffs from the game presented in Figure 1, we can calculate the payoff to the three players for the C, C, D outcome as follows.  Player i receives 3 points for cooperating with Player j and receives the sucker's payoff of zero from the interaction with Player k because k defected.  Player j's outcome is analogous.  Player k receives 5 points for taking advantage of i and 5 points for taking advantage of j.  Thus the total payoffs for players i, j, and k are 3, 3, and 10 points, respectively.  The prisoners' dilemma can be extended to any number n of players, and payoff outcomes can be calculated for each of the 2^n cells using the same method.
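
            The pair-wise calculation described above might be sketched as follows.  The names PAIR and n_person_payoffs are illustrative; the pair-wise payoffs are those of Figure 1.

```python
from itertools import product

# Payoff to a player from one pair-wise interaction, indexed by
# (own choice, other player's choice).  Values are from Figure 1.
PAIR = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def n_person_payoffs(choices):
    """Sum each player's pair-wise payoffs against every other player."""
    return tuple(
        sum(PAIR[(mine, theirs)] for j, theirs in enumerate(choices) if j != i)
        for i, mine in enumerate(choices)
    )


# The C, C, D outcome for players i, j, and k.
example = n_person_payoffs(("C", "C", "D"))
print(example)

# An n-person game has 2**n outcome cells; here n = 3.
cells = list(product("CD", repeat=3))
print(len(cells))
```

            The same function works for any number of players, since it only sums over the pairs each player takes part in.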

            A careful examination of the n-person matrix reveals that the characteristics of the players' choices lead to the same sub-optimal outcomes as in the 2-player game.  No matter what the other players decide to do, any player will do better by defecting than by cooperating.  Consider Player i in Figure 2.  First, suppose players j and k decide to defect.  If i cooperates, her payoff will be 0, but if i defects she will get 2 (1 from each of the two pair-wise interactions).  If j and k cooperate, i would receive 6 for cooperating, but would receive 10 for defecting.  And if one of j and k cooperates and the other defects, i would receive 3 for cooperating and 6 for defecting.  It is apparent that defecting dominates cooperating in all cases, thereby leading all three players to defect.  Mutual defection, however, results in a sub-optimal outcome in that each player will receive only 2 points whereas complete cooperation would have given each player 6 points.  Given that all of the characteristics of the 2-person prisoners' dilemma hold for the n-person version, solution concepts from the 2-person version can be successfully applied to the n-person version as well (Hardin, 1982).

 

 

Figure 2: An Example of a Three-Person Prisoners' Dilemma
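
            The eight cells of the three-person game can be regenerated from the pair-wise rule of Section 2, and the dominance of defection can then be checked for every player.  The following is a sketch under that rule, using the Figure 1 payoffs; the names payoffs, cells, and defection_dominates are illustrative.

```python
from itertools import product

# Pair-wise payoffs from Figure 1: (own choice, other's choice) -> payoff.
PAIR = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def payoffs(choices):
    """Total payoff to each player: the sum over all pair-wise interactions."""
    return tuple(
        sum(PAIR[(mine, theirs)] for j, theirs in enumerate(choices) if j != i)
        for i, mine in enumerate(choices)
    )


# All 2**3 = 8 outcome cells, keyed by the choices of players i, j, k.
cells = {choices: payoffs(choices) for choices in product("CD", repeat=3)}


def defection_dominates(player):
    """True if the player earns strictly more by defecting than by
    cooperating, whatever the other two players choose."""
    for others in product("CD", repeat=2):
        coop = list(others)
        coop.insert(player, "C")
        defect = list(others)
        defect.insert(player, "D")
        if cells[tuple(defect)][player] <= cells[tuple(coop)][player]:
            return False
    return True


for choices, result in cells.items():
    print("".join(choices), result)
print([defection_dominates(p) for p in range(3)])
```

            Each printed line gives the choices of players i, j, and k followed by their respective payoffs.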