📚 The CoCalc Library - books, templates and other resources
License: OTHER
Evolution of cooperation
Code examples from Think Complexity, 2nd edition.
Copyright 2016 Allen Downey, MIT License
Previous code
From the Chapter 11 notebook, we will reuse Simulation
and Instrument
.
PD Agent
The genome of a Prisoner's Dilemma-playing agent is a map from the previous choices of the opponent to the agent's next choice.
Here's the genome for "always cooperate"
And for "always defect"
And for "tit for tat"
The copy
method has some probability of generating a mutation (in this example, values
is initially a string; after mutation, it's a NumPy array of letters).
The Tournament
Tournament
encapsulates the rules for the tournament.
We can test Tournament
with a few known scenarios.
And then test melee
with a list of three agents.
In this population, "always defect" does best.
Probability of survival
We need a function to map from points per round (0 to 5) to probability of survival (0 to 1). I'll use a logistic curve.
The simulator
The biggest change in the simulator is in step
, which runs melee
to determine the fitness of each agent, and prob_survive
to map from fitness to probability of surviving.
We might want to start with random agents.
Or with all identical agents.
Here are the instruments to compute various metrics.
Niceness
is the average number of C
across the genotypes in the population.
Opening
is the fraction of agents that cooperate in the first round.
Retaliating
is the difference between (1) the fraction of agents that defect after the opponent defects and (2) the fraction of agents that defect after the opponent cooperates.
Forgiving is the difference between the number of agents that cooperate after DC minus the number that cooperate after CD.
Here's another metric intended to measure forgiveness.
Results
Here's a simulation that starts with 100 defectors and runs 5000 timesteps.
Run the simulation. If you get a warning about Mean of empty slice
, that's ok.
And let's look at some results.
Initially, mean fitness is 1, because that's the number of points each defector gets per round when facing another defector.
After a few hundred steps, mean fitness climbs to a steady state near 2.5, although it oscillates around this point substantially.
In a world of all cooperators, mean fitness would be 3, so this steady state is hardly utopia, but it's not that far off.
To average number of C's, across all agents and all locations in the genome, is generally more than half, with a long range mean above 0.6.
The results are similar for the opening move: the fraction of agents who start out cooperating is generally more than half, with a long-range mean above 0.6. This fraction varies widely in time.
There might be a weak inclination toward retaliation, with slightly more defections after the opponent defects.
All of the strategies are forgiving in the sense that they have a short memory, so most of them are capable of cooperating at some point after an opponent has defected. But there is no evidence that they are specifically more likely to forgive a defection from two rounds ago, compared to a defection in the previous round.
The following cells explore the composition of the final population. But because the distribution of agents varies so much over time, the details of a single timestep might not mean much.
Here are the final genomes:
And here are the most common genomes:
Exercise: The simulation in this notebook depends on a number of conditions and parameters I chose arbitrarily. As an exercise, I encourage you to explore other conditions to see what effect they have on the results. Here are some suggestions:
Vary the initial conditions: instead of starting with all defectors, see what happens if you start with all cooperators, all TFT, or random agents.
In
Tournament.melee
, I shuffle the agents at the beginning of each time step, so each agent plays against two randomly-chosen agents. What happens if you don't shuffle? In that case, each agent would play against the same neighbors repeatedly. That might make it easier for a minority strategy to invade a majority, by taking advantage of locality.Since each agent only plays against two other agents, the outcome of each round is highly variable: an agent that would do well against most other agents might get unlucky during any given round, or the other way around. What happens if you increase the number of opponents each agent plays against during each round? Or what if an agent's fitness at the end of each step is the average of its current score and its fitness at the end of the previous round?
The function I chose for
prob_survival
varies from 0.7 to 0.9, so the least fit agent, withp=0.7
, lives for3.33
timesteps, on average, and the most fit agent lives for10
timesteps. What happens if you makeprob_survival
more or less "aggressive".I chose
num_rounds=6
so that each element of the genome has roughly the same impact on the outcome of a match. But that is substantially shorter than what Alexrod used in his tournaments. What happens if you increasenum_rounds
? Note: if you explore the effect of this parameter, you might also want to create an instrument to measure the niceness of the last 4 elements of the genome, which will be under more selective pressure asnum_rounds
increases.My implementation has differential survival but just random reproduction. What happens if you add differential reproduction?
Exercise: In these simulations, the population never converges to a state where a majority share the same, presumably optimal, genotype. There are two possible explanations for this outcome: one is that there is no optimal strategy, because whenever the population is dominated by a majority genotype, that condition creates an opportunity for a minority to invade; the other possibility is that the mutation rate is high enough to maintain a diversity of genotypes even if the majority is non-optimal. To distinguish between these explanations, try lowering the mutation rate to see what happens. Alternatively, start with a random population and run without mutation until only one genotype survives. Or run with mutation until the system reaches something like a steady state; then turn off mutation and run until there is only one surviving genotype. What are the characteristics of the genotypes that prevail in these conditions?