## Continuous Function for Prisoners' Dilemma

The concept of a two-player non-zero-sum game, commonly known as the "Prisoner's Dilemma",
has been written about extensively elsewhere, so this introduction only outlines
the basic ideas. The terminology of calling it a *game* with two *players*
is taken from the study of *game theory*, but the same ideas can occur in economic
situations, social situations, and even international diplomacy.

In its simplest form, each player has two possible choices, and they reveal their
choices simultaneously. Their choices are to be *COOPERATIVE* or *UNCOOPERATIVE*
and this results in four possible outcomes:

| | Player [A] is COOPERATIVE | Player [A] is UNCOOPERATIVE |
|---|---|---|
| Player [B] is COOPERATIVE | Players work together; both players get the PEACE reward. | Player [A] gets the TEMPTATION reward; Player [B] gets the SUCKER penalty. |
| Player [B] is UNCOOPERATIVE | Player [A] gets the SUCKER penalty; Player [B] gets the TEMPTATION reward. | Players go to war; both players get the WAR penalty. |

It should be noted that the action *COOPERATIVE* means being cooperative with the other player.
A full background on the story behind the Prisoners' Dilemma can be found elsewhere, and sometimes
other names are given to the outcomes. In general though, we can put the four possible payoffs into
a rank order from most desirable to least desirable:

| Rank | Payoff | Points |
|---|---|---|
| Most desirable | TEMPTATION | Gain 10 points |
| Rewarding | PEACE | Gain 5 points |
| Penalty | WAR | Lose 15 points |
| Least desirable | SUCKER | Lose 20 points |

The exact point values might change, depending on the circumstance, but the above point
values were chosen for this simulation and (as plotted below) they give a linear transfer when
converted into a continuous value function. A bit of study should make it obvious that
if we start in a position where everyone is *COOPERATIVE*, then everyone gets the *PEACE*
reward, which is not the best option for each individual, but it is the best situation
for the group as a whole. One particular individual might consider switching her attitude
to an *UNCOOPERATIVE* position, which in the short term will provide this one
individual with the *TEMPTATION* payoff (twice the value!), but other individuals end
up getting hit with a penalty as they are made into *SUCKERs* by this action
(the worst possible penalty).
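
The group-level effect can be checked by adding the two players' payoffs for each outcome. This is a small worked sum for a single pair of players, using the point values from the table above:

$$
\begin{aligned}
\text{both COOPERATIVE:}\quad & 5 + 5 = 10 \\
\text{one UNCOOPERATIVE:}\quad & 10 + (-20) = -10 \\
\text{both UNCOOPERATIVE:}\quad & (-15) + (-15) = -30
\end{aligned}
$$

A single defection therefore raises one player's payoff from 5 to 10 while lowering the pair's total from 10 to -10.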

The group as a whole is worse off if individuals choose to go down this path so it is logical that some sort of retribution mechanism would exist to discourage the temptation. It should be clear that if some sort of collective entity did exist that was able to perfectly reward the virtuous and punish the deceivers, then we would not be playing a Prisoner's Dilemma game at all anymore, and the whole payoff structure would be different. In this simulation (and in most real-world situations) no godlike entity exists, and it is merely up to the individuals to devise a mechanism between themselves.

Many studies have already been done on the Iterated Prisoners' Dilemma (IPD) and Spatial Prisoner's Dilemma (SPD) using state machines or similar ideas.

It is a fairly simple process to think of the *UNCOOPERATIVE* action as an input
value of **0.0** into a function,
and the *COOPERATIVE* action as an input value of **1.0**. The intermediate
values between these two extremes can then be interpolated so that (for example) a value of
**0.5** is midway between *COOPERATIVE* and *UNCOOPERATIVE*.
The result is a function taking two input values, each of which is limited to the
range from **0.0** to **1.0**, and the following code snippet shows this
expression written in the "C" language:

    z = ( 0
        + att_A * att_B * PAYOFF_PEACE
        + ( 1 - att_A ) * att_B * PAYOFF_TEMPTATION
        + att_A * ( 1 - att_B ) * PAYOFF_SUCKER
        + ( 1 - att_A ) * ( 1 - att_B ) * PAYOFF_WAR );

The following plots show the output surface of this function. Since the payoff table is symmetric
(i.e. it is a fair game), the payoff output of one player is merely flipped on the diagonal
to give the payoff of the other player. More interesting is the collective payoff, which is
the sum of the two players' payoffs. The surface is flat in all cases (i.e. the function is linear),
although different payoff weightings can change this. In the situation where it is a flat
surface (i.e. the current set of payoff weightings) we can calculate a constant first derivative
for the individual with respect to her own attitude (it is always **-5**) and also a constant first derivative
for the group (it is always **20**). Thus, we have a situation where every incremental
increase in attitude (from *UNCOOPERATIVE* towards *COOPERATIVE*) leaves the individual
worse off, but the group ends up better off by 4 times as much.
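
The two slopes can be derived directly from the expression. In this sketch, $a$ and $b$ stand for `att_A` and `att_B`, and $P$, $T$, $S$, $W$ stand for the PEACE, TEMPTATION, SUCKER and WAR payoffs. This shorthand is introduced here for brevity and does not appear in the original code. The sketch also assumes that player [B]'s payoff is the same expression with the two attitudes swapped, as the symmetry of the payoff table implies.

$$
\begin{aligned}
z_A(a,b) &= abP + (1-a)\,bT + a(1-b)\,S + (1-a)(1-b)\,W \\
\frac{\partial z_A}{\partial a} &= b\,(P - T) + (1-b)\,(S - W) = -5b - 5(1-b) = -5 \\[4pt]
z_B(a,b) &= z_A(b,a) = abP + a(1-b)\,T + (1-a)\,bS + (1-a)(1-b)\,W \\
\frac{\partial z_B}{\partial a} &= b\,(P - S) + (1-b)\,(T - W) = 25b + 25(1-b) = 25 \\[4pt]
\frac{\partial (z_A + z_B)}{\partial a} &= -5 + 25 = 20
\end{aligned}
$$

The surface is flat because $P - T = S - W$ for the chosen point values, so the $ab$ cross term cancels. A payoff set where these two differences are unequal would give a curved surface.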

- Giant list of PD links (by Ariel Dolan)
- Wikipedia article on Prisoner's Dilemma
- Economists frequently use the Prisoner's Dilemma to study competition and cooperation
- Java applet examples to demonstrate Prisoner's Dilemma simulation
- Prisoners' Dilemma at Rational Wiki
- Outline of Robert Axelrod's *The Evolution of Cooperation* and links, etc
- Iterated Prisoners' Dilemma explained on video