artificial life
Chapter 6: Genetic Programming
6.5 Results
The results of the Symbolic Regression applet are not as obviously positive as those of the Genetic Algorithm applet. A major reason is the applet's computational limitations. A population size of 250 is typically used. Occasionally a very good candidate is found in fewer than 100 generations, as seen in the screen shots of the applet. More often, however, the system quickly finds a local maximum and stays there, with no sign of progress for many generations. When a population of 500 is used, good solutions are found very quickly on a regular basis.
The component whose improvement could most affect the performance of the system is the fitness function. In the current implementation the function sums the differences in y-values, regardless of how far the goal y-value is from the x-axis. In general, the farther a value is from the x-axis, the less precision is necessary. Under the current fitness function, a GP that performs well where the goal function is close to the x-axis but not quite as well for extremely large values is punished greatly, despite its good performance in the most critical part of the problem. One solution is to normalize the differences calculated by the fitness function so that values of great magnitude in the goal function are not over-valued.
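The effect of normalization can be illustrated with a minimal sketch. It is not the applet's code: the function names, the use of absolute differences, and the `eps` floor that avoids division by zero near the x-axis are all assumptions made for illustration, and lower scores are treated as better.

```python
def raw_error(candidate, goal, xs):
    """Sum of absolute differences in y-values (assumed form of the current measure)."""
    return sum(abs(candidate(x) - goal(x)) for x in xs)


def normalized_error(candidate, goal, xs, eps=1.0):
    """Sum of differences, each scaled by the magnitude of the goal value.

    eps is a hypothetical floor so that points on the x-axis do not divide by zero.
    """
    return sum(abs(candidate(x) - goal(x)) / (abs(goal(x)) + eps) for x in xs)


def goal(x):
    return x ** 3


def close_near_axis(x):
    # Exact at x = 0, off by 5% everywhere else (large absolute error far out).
    return 1.05 * x ** 3


def constant_offset(x):
    # Off by 10 everywhere, including right on the x-axis.
    return x ** 3 + 10


xs = range(-10, 11)
for name, f in [("close_near_axis", close_near_axis), ("constant_offset", constant_offset)]:
    print(name, raw_error(f, goal, xs), normalized_error(f, goal, xs))
```

On these sample points the raw sum ranks `constant_offset` ahead of `close_near_axis`, while the normalized sum reverses the ranking, which is the behavior the paragraph above argues for. Other normalizations are possible; this one is only the simplest to write down.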
An important detail of the current fitness function is that it penalizes larger function trees, so that more efficient programs are found. Another detail is that every randomly generated genotype starts at the top with a Binary Node. This guarantees that every generated genotype has at least a minimal level of complexity.
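Both details can be sketched in a few lines. The following is a hypothetical illustration, not the applet's implementation: the tuple-based tree representation, the operator and terminal sets, the 50% branching probability, and the additive penalty with its `weight` are all assumptions.

```python
import random

BINARY_OPS = ["+", "-", "*"]   # assumed operator set
TERMINALS = ["x", 1.0, 2.0]    # assumed terminal set


def random_tree(depth, rng, force_binary=False):
    """Build a random tree: either a terminal or a tuple (op, left, right)."""
    if force_binary or (depth > 0 and rng.random() < 0.5):
        op = rng.choice(BINARY_OPS)
        return (op, random_tree(depth - 1, rng), random_tree(depth - 1, rng))
    return rng.choice(TERMINALS)


def random_genotype(max_depth, rng):
    """Force a binary node at the root, so every genotype has at least three nodes."""
    return random_tree(max_depth, rng, force_binary=True)


def size(tree):
    """Count the nodes in a tree."""
    if isinstance(tree, tuple):
        return 1 + size(tree[1]) + size(tree[2])
    return 1


def penalized_error(error, tree, weight=0.01):
    """Add a size penalty to an error score (lower is better); weight is hypothetical."""
    return error + weight * size(tree)


rng = random.Random(0)
genotype = random_genotype(3, rng)
print(genotype, size(genotype), penalized_error(1.0, genotype))
```

With a penalty of this kind, two trees that fit the goal equally well are ranked by size, so the smaller one is preferred. How strongly size should count against fit is a tuning decision that this sketch does not settle.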