Genetic algorithms

Applying genetic algorithms to show common uses, encoding methods, and differences between strategies. The TSP and "Add" problems are considered.

Stylized image from rectangles


Petofi 200 - recreated by a genetic algorithm

  • Each chromosome contains the same amount of rectangles: their coordinates and colors.
  • The fitness function is the distance between the image drawn from the chromosome and the original image
    • Each image is transformed into a single vector, and the Euclidian distance is taken.

Genetic algorithm for the "Add" Example

The "Add" example shows the use of a Genetic Algorithm to calculate the maximum sum of nn numbers. The example shows the difference between two distinct insertion strategies and their effect on the outcome.

Chromosomes

Each chromosome is a vector of nn dimensions with values between 0 and 10.

Objective or fitness function

The fitness value is gained from the sum of all elements in a given chromosome:

def fitness(chromosome: list[float]) -> float:
    return sum(chromosome)

Mutation

The mutation operation adds a random number between step_size and -step_size to a value in the chromosome. However, the resulting set of values must be clamped between 0 and 10.

The clamping algorithm:

def clamp(value: float, minimum: float, maximum: float) -> float:
    if value < minimum:
        return minimum
    if value > maximum:
        return maximum
    return value

The mutation operation:

def mutate(chromosome: list[float], step_size = float) -> list[float]:
    """Mutation operation"""

    # Copy the candidate chromosome into a new list
    mutant = list(chromosome)
    # index where the mutation will occur
    mutation_index = random.randint(0, len(chromosome) - 1)

    # Calculate new value
    new_value = mutant[mutation_index] + random.uniform(-step_size, step_size)
    # Assign the clamped new value
    mutant[mutation_index] = clamp(new_value, 0, 10) # 0 and 10 could be extracted to variables
    return mutant

Crossover

Crossover can be implemented in many ways. The simplest is the one-point crossover operation, where we splice two chromosomes together at a given point.

def crossover(parent_1: list[float], parent_2: list[float]) -> list[float]:
    """Crossover operation"""

    # Calculate a random crossover point
    crossover_point = random.randint(1, len(parent_1))

    # Combine the two parents into a child chromosome
    child = list(parent_1[:crossover_point] + parent_2[crossover_point:])

    return child

Tournament selection

Tournament selection is selecting two random chromosomes from a population and then adding the superior one to the new population until the desired population size is reached.

def selection(population: list[list[float]], population_size: int) -> list[list[float]]:
    """Selection operation"""

    # Calculate the fitness values
    fitness_values = [fitness(chromosome) for chromosome in population]

    # No sorting or probability calculation is needed in this case

    # tournament
    new_population = []
    for _ in range(population_size):
        candidate_a = random.choice(population)
        candidate_b = random.choice(population)
        if fitness(candidate_a) > fitness(candidate_b):
            new_population.append(list(candidate_a))
        else:
            new_population.append(list(candidate_b))
    return new_population

The algorithm put together

def genetic_algorithm(
        dimensions=5,
        population_size=10,
        generation_count=10,
        mutation_count=10,
        step_size=2,
        recombination_count=5
) -> list[float]:
    population = []

    # Initialize population as all 0s
    for _ in range(population_size):
        population.append(list([0] * dimensions))

    # Keep track of the best chromosome
    best = list([0] * dimensions)

    # Generations
    for _ in range(generation_count):

        # Mutation
        for _ in range(mutation_count):
            population.append(mutate(random.choice(population), step_size))

        # Recombination
        for _ in range(recombination_count):
            population.append(crossover(
                random.choice(population),
                random.choice(population)
            ))

        # Store best before selection
        best = list(max(population + [best], key=fitness))

        # Selection
        population = selection(population, population_size)

    # return best found when done
    return best

If we run the program:

solution = genetic_algorithm()
print(solution)
print(fitness(solution))

We get a similar output:

[3.3486142698179724, 1.915297468789452, 1.7142444381873236, 2.8145983818426865, 0.21386881852472284]
10.006623377162157

We know that the best possible solution is 510=505 \cdot 10 = 50, so this solution is relatively weak. The problem lies in the combination of our selection and insertion methods. The insertion method appends the newly created chromosomes into the population without any further logic; this may also include several inferior offspring. The population is growing during the mutation and crossover phases. Therefore, the chance of inferior offspring getting into the new population is higher during the tournament selection.

Revising the insertion strategy may yield a favorable result.

In the mutation operation, the original is replaced if the mutant is superior:

def mutate(chromosome: list[float], step_size=float) -> list[float]:
    """Mutation operation"""

    # Copy the candidate chromosome into a new list
    mutant = list(chromosome)
    # index where the mutation will occur
    mutation_index = random.randint(0, len(chromosome) - 1)

    # Calculate new value
    new_value = mutant[mutation_index] + random.uniform(-step_size, step_size)
    # Assign the clamped new value
    # 0 and 10 could be extracted to variables
    mutant[mutation_index] = clamp(new_value, 0, 10)

    if fitness(mutant) > fitness(chromosome):
        return mutant
    else:
        return chromosome

In the crossover operation, the first parent is replaced if the offspring is superior:

def crossover(parent_1: list[float], parent_2: list[float]) -> list[float]:
    """Crossover operation"""

    # Calculate a random crossover point
    crossover_point = random.randint(1, len(parent_1))

    # Combine the two parents into a child chromosome
    child = list(parent_1[:crossover_point] + parent_2[crossover_point:])

    if fitness(child) > fitness(parent_1):
        return child
    else:
        return parent_1

The main body of the algorithm changes very little to accommodate the replacements. Now, the results are far better:

def genetic_algorithm(
        dimensions=5,
        population_size=10,
        generation_count=10,
        mutation_count=10,
        step_size=2,
        recombination_count=5
) -> list[float]:
    population = []

    # Initialize population as all 0s
    for _ in range(population_size):
        population.append(list([0] * dimensions))

    # Keep track of the best chromosome
    best = list([0] * dimensions)

    # Generations
    for _ in range(generation_count):

        # Mutation
        for _ in range(mutation_count):
            candidate_i = random.randint(0, len(population) - 1)
            population[candidate_i] = mutate(
                population[candidate_i], step_size)

        # Recombination
        for _ in range(recombination_count):
            parent_a = random.randint(0, len(population) - 1)
            population[parent_a] = crossover(
                population[parent_a], random.choice(population))

        # Store best before selection
        best = list(max(population + [best], key=fitness))

        # Selection
        population = selection(population, population_size)

    # return best found when done
    return best
[3.457572561043456, 0.5263242940089028, 0.6306894736725606, 7.82397431316552, 3.343323105407211]
15.781883747297652

The results can be improved by increasing the population size, generation count, mutation count, and recombination count.

The Travelling Salesman Problem

The Traveling Salesman Problem (TSP) is a classic example of NP-complete problems. Thus, no algorithm can find the optimal solution in polynomial time. One common approach is the use of genetic algorithms.

Each chromosome contains a route, which is comprised of indices corresponding to cities in the route. Each index appears once and only once. The goal is to find the best order of indices.

During mutation and recombination, the integrity of the permutation must be kept; all indices must appear once and only once. This can be achieved by swapping elements instead of adding, subtracting, and rewriting them.

Utilities for the TSP problem (tsp.py)

import math

def distance(c1, c2):
    """Calculate the Eucledian distance between two cities

    Some benchmarks call for Manhattan distances instead:
    distance = |x2 - x1| + |y2 - y1|
    """
    return math.dist(c1, c2)

def transform_tsp(tsp):
    """Transform the list of coordinates into a dictionary of distances, where
    (city_a, city_b) : distance_from_a_to_b

    Each distance appears twice:
    (city_a, city_b) and (city_b, city_a)

    This approach enables the use of "one way roads" if necessary.
    """
    tsp_dict = {}
    for i in range(len(tsp)):
        for j in range(len(tsp)):
            tsp_dict[(i, j)] = distance(tsp[i], tsp[j])
    return tsp_dict

The genetic algorithm (genetic.py)

import random
import tsp


def mutate(chromosome: list[int]) -> list[int]:
    """Mutation operation

    Mutate a chromosome (route) by swapping two elements (cities).
    """
    mutant = list(chromosome)
    a = random.randint(0, len(mutant) - 1)
    b = random.randint(0, len(mutant) - 1)
    mutant[a], mutant[b] = mutant[b], mutant[a]
    return mutant


def fitness(dict, s):
    """Fitness function

    Calculate the route length by adding the distances together.
    Note: The salesman must return to the origin (base), therefore the distance between the first and last cities must be added too.
    """
    dist = 0
    prev = s[0]
    for i in s:
        dist += dict[(prev, i)]
        prev = i
    dist += dict[(s[-1], s[0])]
    return dist


def crossover(from_c: list[int], to_c: list[int], chunk_size: int) -> list[int]:
    """Crossover operation

    Take a random chunk of chunk_size from parent1 and paste it to a random location in parent2.
    Note: the final result must still be a permutation.
    The desired sequence is obtained by swapping the current values with the desired values inside the chromosome.
    This ensures that every element appears once and only once.

    Example:

    Superior:  | 1 3 2 | 9 5 6 | 8 4 7 |
                         |
                        /
                       |
                       v
    Destination: | 1 3 5 8 4 9 6 2 7 |

    Infected destination: | 1 3 | 9 5 6 | 8 4 2 7 |

    1) Find the segment value in the destination
    2) Swap it with the value currently at the segment location
    3) Repeat for the rest of the segment

    With the previous example:

    segment_0: swapping 9 and 5
    | 1 3 | 9 8 4 | 5 6 2 7 |
    segment_1: swapping 5 and 8
    | 1 3 | 9 5 4 | 8 6 2 7 |
    segment_2: swapping 6 and 4
    | 1 3 | 9 5 6 | 8 4 2 7 |
    """

    # Copy parent2
    new_breed = list(to_c)
    # Starting index of the chunk in parent1
    from_start = random.randint(0, len(from_c) - 1 - chunk_size)
    # Starting index of the chunk in parent2
    to_start = random.randint(0, len(new_breed) - 1 - chunk_size)

    # Swap elements into place
    for i in range(chunk_size):
        v_index = to_c.index(from_c[i + from_start])
        new_breed[to_start +
                  i], new_breed[v_index] = new_breed[v_index], new_breed[to_start + i]
    return new_breed


def genetic_alg(tsp_dict, s: list[int], population_size: int, generations: int, mutations: int, crossovers: int) -> list[int]:

    # Initialize the population as mutation of an original (candidate) solution
    population = [list(s)]
    for _ in range(population_size - 1):
        population.append(mutate(s))
    best_solution = min(population, key=lambda ch: fitness(tsp_dict, ch))
    best_fitness = fitness(tsp_dict, best_solution)

    for _ in range(generations):
        # Mutation with simple insertion strategy (append new results to the population)
        for _ in range(mutations):
            population.append(
                mutate(population[random.randint(0, len(population) - 1)]))

        # Combine superior (better half) with inferior (worse  half) chromosomes
        population = sorted(population, key=lambda ch: fitness(tsp_dict, ch))
        midpoint = int((len(population) - 1) / 2)
        for _ in range(crossovers):
            inferior_i = random.randint(midpoint, len(population) - 1)
            superior_i = random.randint(0, midpoint - 1)
            population.append(
                crossover(population[superior_i], population[inferior_i], 3))

        # Simple selection: keet best n chromosomes
        population = sorted(population, key=lambda ch: fitness(tsp_dict, ch))
        population = population[:population_size]

        # Update best found solution if necessary
        best_chromosome = population[0]
        best_chromosome_fitness = fitness(tsp_dict, best_chromosome)
        if best_chromosome_fitness < best_fitness:
            best_fitness = best_chromosome_fitness
            best_solution = list(best_chromosome)
    return best_solution


def main():
    random.seed(100)
    tsp_problem = ((1, 1), (4, 2), (5, 2), (6, 4),
                   (4, 4), (3, 6), (1, 5), (2, 3))
    tsp_dict = tsp.transform_tsp(tsp_problem)
    s = [0, 4, 2, 6, 5, 3, 7, 1]
    s_opt = genetic_alg(tsp_dict, s, 100, 100, 50, 50)
    print(s_opt, fitness(tsp_dict, s_opt))


if __name__ == "__main__":
    main()
Copyright © Levente Fazekas