Wednesday, December 21, 2011

The Greatest Common Divisor

Another problem suitable to test Genetic Programming approaches is to evolve an algorithm
that computes the greatest common divisor14, the GCD.
Problem Definition
Definition 21.1 (GCD). For two integer numbers a, b ∈ N0, the greatest common divi-
sor (GCD) is the largest number c ∈ N that divides both, a (c |a ≡ a mod c = 0) and b
(c |b ≡ b mod c = 0).
c = gcd (a, b) ⇔ c |a ∧ c |b ∧ (∄d ∈ N : d| a ∧ d |b ∧ d > c) (21.39)
⇔ max {e ∈ N : (a mod e = 0) ∧ (b mod e = 0)} (21.40)
The Euclidean Algorithm
The GCD can be computed with the Euclidean algorithm15 which is specified in its original
version in Algorithm 21.4 and in the improved, faster variant as Algorithm 21.5 [1559, 913].
Algorithm 21.4: gcd (a, b) ←− euclidGcdOrig(a, b)
Input: a, b ∈ N0: two integers
Output: gcd (a, b): the greatest common divisor of a and b
1 begin
2 while b 6= 0 do
3 if a > b then a ←− a − b
4 else b ←− b − a
5 return a
6 end
Algorithm 21.5: gcd (a, b) ←− euclidGcd(a, b)
Input: a, b ∈ N0: two integers
Data: t: a temporary variable
Output: gcd (a, b): the greatest common divisor of a and b
1 begin
2 while b 6= 0 do
3 t ←− b
4 b ←− a mod b
5 a ←− t
6 return a
7 end
The Objective Functions and the Prevalence Comparator
Although the GCD-problems seems to be more or less trivial since simple algorithms exist
that solve it, it has characteristics that make it hard of Genetic Programming. Assume we
14 http://en.wikipedia.org/wiki/Greatest_common_divisor [accessed 2007-10-05]
15 http://en.wikipedia.org/wiki/Euclidean_algorithm [accessed 2007-10-05]
358 21 Benchmarks and Toy Problems
have evolved a program x ∈ X which takes the two values a and b as input parameters and
returns a new value c = x(a, b). Unlike in symbolic regression16, it makes no sense to define
the error between c and the real value gcd (a, b) as objective function, since there is no relation
between the “degree of correctness” of the algorithm and |c − gcd (a, b)|. Matter of fact, we
cannot say that a program returning c1 = x1(20, 15) = 6 is better than c2 = x2(20, 15) = 10.
6 may be closer to the real result gcd (20, 15) = 5 but shares no divisor with it whereas
5 |10 ≡ 10 mod 5 = 0.
Based on the idea that the GCD is of the variables a and b is preserved in each step
of the Euclidean algorithm, a suitable functional objective function f1 : X 7→ [0, 5] for this
problem is Algorithm 21.6. It takes a training case (a, b) as argument and first checks whether
the evolved program x ∈ X returns the correct result for it. If so, f1(x) = 0 is returned.
Otherwise, we check if the greatest common divisor of x(a, b) and a is still the greatest
common divisor of a and b. If this is not the case, 1 is added to the objective value. The
same is repeated with x(a, b) and b. Furthermore, negative values of x(a, b) are penalized
with 2 and results that are larger or equal to a or b are penalized with one additional point
for each violation. Depending on the program representation, this objective function is very
rugged because small changes in the program have a large potential impact on the fitness. It
also exhibits a large amount of neutrality, since it can take on only integer values between
0 (the optimum) and 5 (the worst case).
Algorithm 21.6: r ←− f1(x, a, b)
Input: a, b ∈ N0: the training case
Input: x ∈ X: the evolved algorithm to be evaluated
Data: v: a variable holding the result of x for the training case
Output: r: the functional objective value of the functional objective function f1 for the
training case
1 begin
2 r ←− 0
3 v ←− x(a, b)
4 if v 6= gcd (a, b) then
5 r ←− r + 1
6 if gcd (v, a) 6= gcd (a, b) then r ←− r + 1
7 if gcd (v, b) 6= gcd (a, b) then r ←− r + 1
8 if v ≤ 0 then
9 r ←− r + 2
10 else
11 if v ≥ a then r ←− r + 1
12 if v ≥ b then r ←− r + 1
13 return r
14 end
Additionally to f1, two objective functions optimizing non-functional aspects should be
present. f2(x) minimizes the number of expressions in x and f3(x) minimizes the number
of steps x needs until it terminates and returns the result. This way, we further small and
fast algorithms. These three objective functions, combined to a prevalence17 comparator
cmpF,gcd, can serve as a benchmark on how good a Genetic Programming approach can
cope with the rugged fitness landscape common to the evolution of real algorithms and how
the parameter settings of the evolutionary algorithm influence this ability.
16 See Section 23.1 for an extensive discussion of symbolic regression.
17 See Definition 1.17 on page 39 for more information.
21.3 Genetic Programming Problems 359
cmpF,gcd(x1, x2) =


−1 if f1(x1) < f1(x2)
1 if f1(x1) > f1(x2)
cmpF,Pareto(x1, x2) otherwise
(21.41)
In principle, Equation 21.41 gives the functional fitness precedence before any other
objective. If (and only if) the functional objective values of both individuals are equal,
the prevalence is decided upon a Pareto comparison of the remaining two (non-functional)
objectives.

1 comment:

  1. Good instruction boss! Have a fun with GCD.

    ReplyDelete

Comment of this content!