artificial life
Chapter 6: Genetic Programming
6.2 Symbolic Regression
Most Genetic Programming tasks require massively parallel computers, because these systems generate complex programs and evaluate populations on the order of thousands of individuals over as many generations. A simpler example is therefore needed to demonstrate the power of the technique. Symbolic Regression is the process of determining an equation from a given set of points. For example, the equation y = x² is regressed from the pairs (0,0), (1,1), (2,4), (3,9), etc.
A genotype is evaluated by comparing the value it generates at each sample point with the value generated by the goal equation. The differences are summed, and the lower this final sum, the better the fitness of the individual.
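The evaluation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a candidate is modelled as a plain Python callable, the names `fitness`, `goal` and `xs` are hypothetical, and the absolute value of each difference is summed so that positive and negative errors cannot cancel each other out (the text only says the differences are summed).

```python
def fitness(candidate, goal, xs):
    """Sum of absolute differences between candidate and goal over xs.

    Lower is better; 0 means the candidate matches the goal at every point.
    Using abs() is an assumption made here so that errors do not cancel.
    """
    return sum(abs(candidate(x) - goal(x)) for x in xs)


# Goal equation y = x^2, sampled at the points used in the text.
goal = lambda x: x * x
xs = [0, 1, 2, 3]

# Three hypothetical candidates of decreasing quality.
perfect = lambda x: x * x
close = lambda x: x * x + 1
poor = lambda x: x

scores = {
    "perfect": fitness(perfect, goal, xs),
    "close": fitness(close, goal, xs),
    "poor": fitness(poor, goal, xs),
}
print(scores)
```

Sorting candidates by this score in ascending order ranks the best individuals first, which is all a selection step would need from the fitness function.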
Equations and other forms of Genetic Programs are represented as tree structures. In a program tree the interior nodes contain operators (+, -, *) or functions, that is, anything that takes parameters. The leaves contain the terminals: identifiers, strings, numbers or anything else that has a value. Figure 6.1 shows the trees representing two different equations.

Figure 6.1: Genetic Programming Representation
The number of children that any given node has depends on the number of parameters that the associated function or operator takes. The addition, multiplication and division operators each take two parameters, so their nodes have two children. The absolute value operator takes one value as input, so its node has a single child. Constant and variable nodes have no children, since they take no parameters and simply yield a value.
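A minimal sketch of such a tree in Python follows. It is one possible representation, not the one used to produce the figure: the `Node` class, the `ARITY` and `FUNCTIONS` tables and the `evaluate` function are hypothetical names, the only variable is assumed to be called "x", and division by zero is assumed to return 1.0 so that every tree can be evaluated.

```python
import operator
from dataclasses import dataclass, field

# Number of children each operator or function node must have.
ARITY = {"+": 2, "-": 2, "*": 2, "/": 2, "abs": 1}

FUNCTIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    # Protected division: returning 1.0 on a zero divisor is an assumption.
    "/": lambda a, b: a / b if b != 0 else 1.0,
    "abs": abs,
}


@dataclass
class Node:
    label: object                      # operator name, the variable "x", or a number
    children: list = field(default_factory=list)

    def __post_init__(self):
        # Terminals (variables and constants) are not in ARITY, so they get 0.
        expected = ARITY.get(self.label, 0)
        if len(self.children) != expected:
            raise ValueError(f"{self.label!r} needs {expected} children")


def evaluate(node, x):
    """Recursively evaluate the tree for a given value of the variable x."""
    if node.label in FUNCTIONS:
        args = [evaluate(child, x) for child in node.children]
        return FUNCTIONS[node.label](*args)
    if node.label == "x":
        return x
    return node.label                  # a numeric constant


# x * x, the goal equation from the symbolic regression example
square = Node("*", [Node("x"), Node("x")])
# abs(x - 5), using the one-child absolute value node
distance = Node("abs", [Node("-", [Node("x"), Node(5)])])

print(evaluate(square, 3), evaluate(distance, 2))
```

Checking the child count when a node is built means that malformed trees, such as an addition with a single child, are rejected before they are ever evaluated.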