Genetic Algorithms : Understanding using Visual Basic
About this tutorial
This tutorial is written with the a strong aim. Its main purpose is to make you understand Genetic Algorithms(GA). This tutorial is written by using examples from the 'Genetic Algorithm using Real Numbers' written by me. It is written in complementary to my code.
Why this tutorial?
This tutorial was written because in my sight there was not a single tutorial with both Genetic Algorithms and Visual Basic in it. I have written this tutorial to help a current VB programmer understand and implement GA with ease.
Genetic Algorithms, What are they? Well ,there is not a single strict definition of GA. Every author gives his own definition. So, I am not going to increase the cluttering but rather give a definition by another author.
Genetic Algorithms are programs that simulate the logic of Darwinian selection, if you understand how populations accumulate differences over time due to the environmental conditions acting as a selective breeding mechanism then you understand GAs. Put another way, understanding a GA means understanding the simple, iterative processes that underpin evolutionary change.
A little bit confusing, isn't it. So, should I expand it to a simple definition. Yes, I should. GA is a algorithm which makes it easy to search a large search space. For example, if we have a number and we want to find its largest divisor. The problem is too easy if the number is small. But the complexity increases as the number increases. Here GAs are used. GAs used Darwinian selection. In 'The Origin of Species', Darwin stated that from a group of individuals the best will survive. By implementing this Darwinian selection to the problem only the best solutions will remain narrowing the search space.
Where GAs can be used?
GAs can be used where optimization is needed. I mean that where there are large solutions to the problem but we have to find the best one. Like we can use GAs in finding best moves in chess, mathematical problems, financial problems and in many more areas.
Are there any disadvantages of GAs?
Yes, there are few disadvantages. But, remember there are more advantages than disadvantages. Disadvantages:
- GAs are very slow.
- They cannot always find the exact solution but they always find best solution.
Explanation of terms
Chromosome: A set of genes. Chromosome contains the
solution in form of genes.
Gene: A part of chromosome. A gene contains a part of solution. It determines the solution. E.g. 16743 is a chromosome and 1, 6, 7, 4 and 3 are its genes.
Individual: Same as chromosome.
Population: No of individuals present with same length of chromosome.
Fitness: Fitness is the value assigned to an individual. It is based on how far or close a individual is from the solution. Greater the fitness value better the solution it contains.
Fitness function: Fitness function is a function which assigns fitness value to the individual. It is problem specific.
Breeding: Taking two fit individuals and intermingling there chromosome to create new two individuals.
Mutation: Changing a random gene in an individual.
Selection: Selecting individuals for creating the next generation.
General Algorithm of GA
The algorithm is almost same in most of the applications only fitness functions are different to different problems. The general algorithm is as follows :
Generate initial population.
Assign fitness function to all individuals.
DO UNTIL best solution is found
Select individuals from current generation
Create new offsprings with mutation and/or breeding
Compute new fitness for all individuals
Kill all the unfit individuals to give space to new offsprings
Check if best solution is found
Explanation of Genetic Algorithm this using Visual Basic
Here is the explanation of GA coded by me in Visual Basic. My algorithm has some differences from the general algorithm.
Define an individual
Public Type Individual
Genome() As Integer 'Genome which holds information
Fitness As Double 'Fitness of Individual
MadeBy As String ' Made By
Here Genome is the array of integers. My algorithm uses integers instead of binary numbers. Integers are easy to handle and probably more efficient in GAs. It's fitness contains the fitness value. And MadeBy holds the information about the process from which it is made.
Public Type Population
NumOfIndivid As Integer 'Number of individuals
Individuals() As Individual 'Individuals
Parents() As Individual 'Parents
MaxFitness As Double 'Maximum fitness
NotifyWhenFitExceeds As Double 'Notify the user if fitness exceeds this value
GenomeLen As Integer 'Length of Genome
DiedIndivid() As Integer 'Individuals who have died coz they had low fitness value
NoOfDied As Integer 'Present number of died individuals
ProbMut As Double 'Mutation probability per cent
ProbCross As Double 'CrossOver probability per cent
StopEvolution As Boolean ' Check if we have to stop evolution
Generation As Integer ' Generation number
FitLim As Long 'Fitness limit below which all will be killed
BestSoFar As Individual ' Best Individual so far
WorstSoFar As Individual ' Worst Individual so far
The code above is self explanatory.
- Build the population
- Evolve the population
- Selection of parents
Sub BuildPopu(Popu As Integer, LenghtOfGenome As Integer, MaxFit As Double, NotifyExceed As Double, Mut As Double, Cross As Double)
'Popu' is the number of individuals. 'LenghtOfGenome' is the length of chromosome 'MaxFit' is the maximum fitness which ca ' n be acquired by an individual. It is ge ' nerally 100. 'NotifyExceed' is the range which tells the algorithm to notify the user when fitness of any individual goes beyond this level. It is generally 99. 'Mut' and 'Cross' are probabilities.
Evolution of the population is defined by following algorithm
DO UNTIL StopEvolution = True
Assign Fitness to each individual
Notify the user if a solution is found
Kill all the worst individuals
IF less then 30% of population is dead Or All the population is dead then
Mutate all 33% of the population
Kill all the worst individuals
Kill all the worst individuals
Select the parents
Mutate a random individual if probability allows.
In the above algorithm the algorithm mutates 33% of population if less or all individuals are dying because in both the situations the necessary evolution does not take place. And crossing is done every time because without crossing new generation cannot be made.
33% of the parents are selected on the basis of their fitness i.e. fitter the parent more the children he will have and another 33% of parents will be selected randomly.
Take two individuals from the parents list. And then take a random crossover point. Interchange the genes to produce two new individuals. For example let the two parents be 1234 and 5678 and the random crossover point be 2 then two new individuals will be 1278 and 5634.
Take any random individual and take a random point. Change the gene on that point with another random value.
How come my algorithm is little different from others?
There are several differences. Some of them are below and you will discover other differences while you study the code.
- It uses real numbers instead of binary numbers.
- CrossOver is done in every generation
- 33% of individuals are mutated if death rate falls down below 30%.
- Written in Visual Basic
- It can be applied to variety of problems very easily
Where to download accompanying code?
Click here to download the accompanying code.