Using Genetic Algorithms Effectively

Using genetic algorithms can be easy. Anyone can write a problem to be solved somewhat randomly through generations of candidates producing the most fit through crossing and mutations. However, I don't see any articles about writing them effectively. I have a process that works for me, that I hope to outline here.

In general, most usage of genetic algorithms I see over the net find the strongest candidate. Me... I look for the smartest candidate.

However, before I get into that, let me start from the beginning. I usually use GA when I have an idea of how to solve an issue, but there are too many factors to consider. So then I write up an environment with a list of questions and use GA to test the possibilities to come up with a viable solution. In Java I use the Watchmaker Framework.

<dependency>
	<groupId>org.uncommons.watchmaker</groupId>
	<artifactId>watchmaker-framework</artifactId>
	<version>0.7.1</version>
</dependency>

What I like about Watchmaker is that its pretty fast out of the box, and generally designed for any situation. It's naturally mult-thread but has the ability to be run in a farm of VMs concurrently.

In this example, I will try to outline my process of using GA effectively based on trading stocks. It is a perfect example of how to use GA and how not to use GA.

MACD Chromosome

Let's say you have a Chromosome in this simple GA example that uses Moving Average Convergence Divergence (MACD) to buy and sell stocks. Of course, realistically you can't use just MACD to buy stocks. With MACD, the theory is that you traditionally have a 12 Exponential Moving Average (EMA) and 26 EMA. The difference of the 2 is called the MACD line. Then there is a 9 EMA called the signal line. Where the MACD is 0 is called the zero line or the cross over line.

Typical times to buy are when the MACD line crosses and goes above the crossline or the signal line. Just as the typical times to sell is when the MACD line crosses the goes below the crossline or the signal line.

With Genetic Algorithms you want to create an environment with different individuals and have them fight it out and see who wins. So with stock trading, you want to create an individual that knows how to trade profitably. We will use MACD to teach our candidates how to trade and see which is better. MACD can be done a number of ways:

Different moving average types: Simple, Weighted, Exponential, Kama, etc..
MACD configuration: 12/26, 9 versus 3/10, 16
Track size, or trends in divergence
Look at gain or lose momentum from crossline or signal crosses

Let's use the first two as an example:

object MacdConfig {

  val bitLayout = LinkedHashMap(
    "mcaSettingBits" -> 1,
    "slowMaTypeBits" -> 3,
    "fastMaTypeBits" -> 3,
    "signalMaTypeBits" -> 3)

  def getMaType(maType: Int) = maType match {
    case 0 => MAType.Sma
    case 1 => MAType.Ema
    case 2 => MAType.Wma
    case 3 => MAType.Dema
    case 4 => MAType.Tema
    case 5 => MAType.Trima
    case 6 => MAType.Kama
    case 7 => MAType.Mama
  }

  val configLength = bitLayout.values.foldLeft(0)(_ + _)

}

class MacdConfig(bitConfig: BitString) extends TradeConfig(bitConfig) {

  val config = TradeConfig.parseBitConfig(MacdConfig.bitLayout, bitConfig)

  def getSlowPeriod = {
    config.get("mcaSettingBits").get match {
      case 0 => 12 // 12/26 9
      case 1 => 3 // 3/10 16
    }
  }

  def getFastPeriod = {
    config.get("mcaSettingBits").get match {
      case 0 => 26 // 12/26 9
      case 1 => 10 // 3/10 16
    }
  }

  def getSignalPeriod = {
    config.get("mcaSettingBits").get match {
      case 0 => 9 // 12/26 9
      case 1 => 16 // 3/10 16
    }
  }

  def getSlowMaType = MacdConfig.getMaType(config.get("slowMaTypeBits").get)
  def getFastMaType = MacdConfig.getMaType(config.get("fastMaTypeBits").get)
  def getSignalMaType = MacdConfig.getMaType(config.get("signalMaTypeBits").get)

}

This class takes 4 parameters as a single a BitString 10 bits long. It then converts this BitString into a class containing MACD configuration.

Evolution Engine

Each candidate would have their own BitString, which is like their DNA. You then create a Fitness Evaluator which takes this configuration and uses it in a test to buy and sell stock.

val createConfig: (BitString => TradeConfig) = (bit: BitString) => new MacdConfig(bit); 
val candidateFactory = new TradeConfigFactory(MacdConfig.configLength, createConfig)

val evoOperators: List[EvolutionaryOperator[TradeConfig]] = List(new TradeConfigCrossover(createConfig), new TradeConfigMutation(createConfig, Probability.EVENS))
val pipeline = new EvolutionPipeline[TradeConfig](evoOperators.asJava)
val selectionStrategy = new RouletteWheelSelection
val rng = new MersenneTwisterRNG

val engine = new IslandEvolution[TradeConfig](5, new RingMigration, candidateFactory, pipeline, MaxProfitFitnessEvaluator, selectionStrategy, rng)


val result = engine.evolve(100, 50, 50, 5, new Stagnation(3, true))

From there you run WatchMaker's Evolution Engine to see which candidate wins. In the example above it will create 100 initial candidates each over 5 islands represented as random BitStrings 10 bits long (only 1024 possiblilities). It then takes mates them over a period of 50 generations. After 50 generates, 5 candidates migrates between the islands. It does this until the fittest candidate is the strongest candidate over the past 150 generations. If you do it this way, you can find configurations that take $1,000 and turn it into $10,000+ or much more within a year. This works great with historical data. However, if you take this same candidate and test it with a different stock, or a different time period, chances are that it will fail.

This is because the algorithm bought and sold stocks randomly enough to make great gains in an environment it was tested against but wasn't smart enough to buy and sell stocks in an unfamiliar environment. What this means is that it is best to take an environment and only test algorithms through GA against partial data. Pick out a few candidates based on greatest impact and best decisions, and then test these winning algorithms against the rest of the data.

From there you will learn how truly smart your algorithms are and can tweak them and continue to make them better in 2 directions:

Removing GA configurations/situations that end up not making much of a difference based on testing
Adding smarter strategies to the DNA (For instance the above only looks at MACD, but there are many stategies out there and rules you can design to make them work together effectively)

Once you have it setup properly, the first test will give you a baseline, and as you tweak the strategies you can see your algorithms getting smarter. Progress at this point becomes measurable and that is where you want to be.