AI::Evolve::Befunge
Several years ago, I tried to write a CGI othello game. Implementing an AI for that game proved to be an interesting challenge; my implementation used a deep-search algorithm based on a score map and some heuristics, and once I and my friends got used to playing against it, it turned out to be very predictable and rather easy to beat.
Ever since then, I've been looking for a way to write an AI which is smarter than I am.
AI::Evolve::Befunge is a combination of an evolutionary algorithm, an (optional) board-game engine, and a sandboxed Befunge execution environment. It is no longer boardgame-specific, I have since adapted it to allow solving a very broad range of problems, but it still works very well for board games, of course.
With this framework, I have succeeded in breeding funge98 programs which can play and win board games. The problem to solve (for a board game, this would constitute the board-size, rules, scoring, etc) is just a plugin. I started out with a tic-tac-toe plugin, as a simple means of testing the system. Then, I wrote an othello plugin and let it run for a few months. After a few thousand generations, the individuals actually began to play interesting games with eachother. After about ten thousand generations, species began to evolve which could beat that old CGI script.
Since the evolutionary process only requires individuals to be able to beat eachother, there's no direct evolutionary focus (or bias) toward being able to beat that old CGI script. So as you'd expect, some individuals can beat it, most can't. But I take this partial success to be a good sign.
I have released this project on CPAN, you can find the latest version on its CPAN distribution page.
implementation details
The software maintains a running count of 100 individiduals, as its population. For each generation, it calls the perl sort function (which implements the mergesort algorithm), using a gameplay routine as the callback function.
Each individual is born with a sort of “credit balance”, and every action it performs has an associated cost. Individuals may evolve to have larger or smaller codebases, and there is a cost associated with the individual's initial code size. Every move it makes has a cost; repeat instructions have a cost based on the repeat count (but still cheaper than doing the same instruction over and over directly), spawning new threads has a cost, storing items on the stack has a cost, having multiple stacks has a cost. When the credit balance drops to 0, the individual dies. Thus, the individual can choose what to spend its credits on, and yet the system stays balanced in terms of the overall system resources consumed by that individual. (I'm told this idea is somewhat similar to the Taoist philosophy of Qi.)
The gameplay routine plays two games to determine which individual is better; first A goes first against B, and then B goes first against A. It then returns a negative, 0, or positive number depending on the results of these games. Final scores are not taken into account; only whether an individual won or lost. In the event of a tie, it biases toward locals, meaning individuals who were born here rather than individuals who migrated here from another node. In the event of another tie, it biases toward younger individuals (by looking for the highest individual-ID).
After the mergesort completes, each individual is assigned a “fitness” score, from 1 to 100, with the winner being the most “fit” (100). Then, it plays 25 additional matches with random players, and adjusts the player's fitness scores by one point for each match won or lost.
The list is sorted by fitness, and the top 25 scoring individuals (25%) survive to the next generation. The remaining individuals die, and are replaced by newly created individuals, produced by cloning one individual (chosen by a fitness-weighted random function) and incorporating randomly sized rectangles of code from a second individual (similarly chosen by the same weighted random function). New individuals are then subjected to a random amount of mutation (implemented by replacing a randomly sized rectangle with random code).
As new individuals are created, they are named according to the hostname of the machine, the script-number (for the case when a PC has more than one CPU core, and runs more than one thread), and a sequence number. So, each individual has a name which looks like “tweet.1-165538”.
scalability
The software supports computational clusters in a very simple way: you just run multiple copies of the perl script (one copy on each PC). Each of these scripts will spawn one or more threads, based on a config option, so they can make effective use of SMP and multicore CPUs. Each of these scripts is considered a “population”. Individuals are guaranteed cluster-wide unique names, as long as each node in the cluster has a different hostname.
These nodes will occasionally broadcast one or two of the best-fit creatures to other nodes using a broadcast network protocol (currently implemented via a central server and TCP connections). Received individuals are incorporated into the population at the beginning of the next round.
This implements a cheap and easy form of migration. It requires no time-synchronization or polling between nodes, almost no bandwidth, and very little infrastructure, and is therefore the most efficient way I can think of to propagate useful mutations throughout the cluster. It also provides a convenient way to gather samples of species over time and measure progress, because I can simply write these packets out to a log directory.
future plans
- Add a “go” game plugin
- Add more plugins of other styles, weather forecasting for instance
- Finish the web-interface, so that people can play against befunge scripts
- Port the migration code to NetCluster when that project is ready
