Obstacle Course AI

Last modified: 2019-12-04 22:45:25

The idea

I've built an algorithm in Python that uses a simple form of neuroevolution to teach itself how to jump over obstacles.

The game is inspired by the Google dinosaur game that appears in Chrome when there is no internet connection.

The idea is that the red square has to avoid obstacles positioned at different heights and approaching at an increasing speed: the algorithm is given some information and must learn to jump at the right moment and survive as long as possible.

The game

To build the game I used the tkinter module, which handles the graphical interface.

The obstacles are generated on the right side of the window at variable heights and they move towards the red square.
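As a rough illustration of this behaviour, here is a minimal sketch of an obstacle that spawns at the right edge and moves left every frame, without any tkinter drawing. The window width, the set of heights, the starting speed and the speed increment are all assumptions made for the example; only the ground level of 330 comes from the game code.

```python
import random
from dataclasses import dataclass

WINDOW_WIDTH = 800  # assumed window width in pixels
GROUND_Y = 330      # ground level, the same value used by jump()


@dataclass
class Obstacle:
    x: float      # horizontal position of the obstacle
    y: float      # vertical position (on a canvas, smaller y means higher up)
    speed: float  # pixels moved per frame

    def step(self) -> None:
        """Move the obstacle one frame towards the square (leftwards)."""
        self.x -= self.speed

    def off_screen(self) -> bool:
        """True once the obstacle has passed the left edge of the window."""
        return self.x < 0


def spawn_obstacle(rng: random.Random, speed: float) -> Obstacle:
    """Create an obstacle at the right edge, at a randomly chosen height."""
    height_above_ground = rng.choice([0, 40, 80])  # hypothetical heights
    return Obstacle(x=WINDOW_WIDTH, y=GROUND_Y - height_above_ground, speed=speed)


rng = random.Random(0)
speed = 5.0
obstacle = spawn_obstacle(rng, speed)

frames = 0
while not obstacle.off_screen():
    obstacle.step()
    frames += 1

speed += 0.5  # assumed rule: the next obstacle is a little faster
print(frames, obstacle.x, speed)
```

In the real game the same position update would be followed by a call that moves the corresponding canvas item.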

The game is handled entirely by the Game class, which contains several functions, including:

The initialization function (constructor), where most of the variables are declared and the interface is created
The move() function, which generates and moves the obstacles
The jump() function, which makes the square jump
The die() function, which restarts the game when the square hits an obstacle
The play() function, which manages the game as a whole

jump() function:

                def jump(self):
                    # While jumping, keep following the trajectory until back on the ground.
                    # y grows downwards on the canvas, so 330 is the ground level.
                    if self.cang_y <= 330 and self.jumping:
                        # Parabolic motion: initial upward speed 10.5, gravity term 0.2 * t^2
                        self.new_y = 330 - 10.5 * self.time_count + 0.2 * self.time_count * self.time_count
                        self.movement_y = self.new_y - self.cang_y
                        self.w.move(self.canguro, 0, self.movement_y)
                        self.cang_y = self.new_y
                    elif self.cang_y > 330:
                        # The square went below the ground: snap it back and end the jump
                        self.new_y = 330
                        self.movement_y = self.new_y - self.cang_y
                        self.w.move(self.canguro, 0, self.movement_y)
                        self.cang_y = self.new_y
                        self.jumping = False
                    else:
                        # On the ground and not jumping: allow a new jump to be triggered
                        self.new_y = 330
                        self.movement_y = self.new_y - self.cang_y
                        self.w.move(self.canguro, 0, self.movement_y)
                        self.cang_y = self.new_y
                        self.jumping = False
                        self.jump_triggered = False
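The formula in jump() describes a parabola, so the height and duration of a jump can be worked out directly. The short sketch below evaluates the same expression outside the game; the helper name jump_y is introduced here only for illustration, and time is measured in ticks of time_count.

```python
GROUND_Y = 330  # ground level, as in jump()


def jump_y(t: float) -> float:
    """Vertical position t ticks after the jump starts (same formula as jump())."""
    return GROUND_Y - 10.5 * t + 0.2 * t * t


# The apex is the vertex of the parabola: t = 10.5 / (2 * 0.2)
apex_t = 10.5 / (2 * 0.2)
# Height above the ground at the apex (canvas y decreases going up)
apex_height = GROUND_Y - jump_y(apex_t)
# The square is back at ground level when 10.5 * t = 0.2 * t^2
landing_t = 10.5 / 0.2

print(round(apex_t, 2), round(apex_height, 2), round(landing_t, 2))
```

With these constants the square rises by roughly 138 pixels, peaks a little after tick 26 and lands shortly after tick 52, which is the window in which an obstacle has to pass underneath.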

Learning to play

Once the game is ready, the program has to learn to play it. The algorithm has a single command, jump, and relies on three pieces of information: the obstacle's distance, its height and its speed. These inputs are interpreted by neural networks with three input neurons, two hidden layers of five neurons each, and two output neurons.

The algorithm I have built is a kind of genetic algorithm in which each population is a set of neural networks, each with its own weights. Each neural network plays the game and receives a score based on how long it survives.

Each neural network is an instance of the NeuralNetwork class which, together with the Layer and Neuron classes, handles the calculations, the weights, the activation function and the outputs.
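To make the calculation concrete, here is a minimal forward pass for a 3-5-5-2 network, the layer sizes implied by the weight shapes used in evolve(). It is only a sketch and not the actual NeuralNetwork, Layer and Neuron classes: the tanh activation, the absence of biases and the function names are assumptions.

```python
import math
import random

# Inputs, two hidden layers, outputs (sizes taken from the weight shapes in evolve())
LAYER_SIZES = [3, 5, 5, 2]


def random_weights(rng, sizes):
    """One weight matrix per layer; weights[l][n][w] links input w to neuron n."""
    return [
        [[rng.uniform(-1, 1) for _ in range(sizes[l - 1])] for _ in range(sizes[l])]
        for l in range(1, len(sizes))
    ]


def forward(weights, inputs):
    """Propagate the inputs through every layer and return the output activations."""
    activations = list(inputs)
    for layer in weights:
        activations = [
            math.tanh(sum(w * a for w, a in zip(neuron, activations)))  # assumed activation
            for neuron in layer
        ]
    return activations


rng = random.Random(42)
weights = random_weights(rng, LAYER_SIZES)
outputs = forward(weights, [200.0, 500.0, 40.0])  # hypothetical distance, speed, height
print(outputs)
```

The two output values are what the game compares to decide whether to jump.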

The play() function continuously feeds the neural network with the current information and, depending on the output, decides whether to trigger a jump.

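                # Inputs: obstacle distance, scaled speed, obstacle height above the ground
                self.population[i].set_input([self.dist_x, self.obs_speed * 100, 330 - self.obs_y])
                received = self.population[i].calc_output()

                # Jump when the first output beats the second and no jump is already pending
                if received[0] > received[1] and not self.jump_triggered:
                    self.time_count = 0
                    self.jumping = True
                    self.jump_triggered = True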
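The decision step can be isolated into two small pure functions, which makes it easy to check outside the game. This is a sketch: build_inputs and should_start_jump are hypothetical names, and the sample values are made up.

```python
GROUND_Y = 330  # ground level used throughout the game code


def build_inputs(dist_x, obs_speed, obs_y):
    """Pack the game state in the same order passed to set_input() in play()."""
    return [dist_x, obs_speed * 100, GROUND_Y - obs_y]


def should_start_jump(outputs, jump_triggered):
    """Start a jump only if the first output wins and no jump is already pending."""
    return outputs[0] > outputs[1] and not jump_triggered


inputs = build_inputs(dist_x=250, obs_speed=0.5, obs_y=290)
print(inputs)

print(should_start_jump([0.8, 0.1], jump_triggered=False))  # first output wins
print(should_start_jump([0.8, 0.1], jump_triggered=True))   # a jump is already pending
print(should_start_jump([0.1, 0.8], jump_triggered=False))  # second output wins
```

The jump_triggered flag acts as a latch: it is set when a jump starts and cleared by jump() once the square is back on the ground, so the network cannot restart a jump in mid-air.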

Mutation and selection

The neural network that survived the longest is used to generate the new population. The first new network is an exact copy of the best one found so far, while in each of the others every weight has a 15% chance of mutating, that is, of being replaced by a random one.

evolve() function:

                import copy  # module-level import, needed for deepcopy below

                def evolve(self):
                    self.generation += 1
                    self.best_score = 0
                    self.best_nn = -1
                    # Look for a network that beats both its peers and the best score so far
                    for i in range(self.n_entities):
                        if self.scores[i] > self.best_score and self.scores[i] > self.last_best:
                            self.best_score = self.scores[i]
                            self.best_nn = i

                    # Store the new best network in slot 0; if nobody improved on the
                    # record, slot 0 is left as it is (the previous best)
                    if self.best_nn >= 0:
                        self.population[0] = copy.deepcopy(self.population[self.best_nn])

                    if self.best_score > self.last_best:
                        self.last_best = self.best_score

                    # Candidate replacement weights in [-1, 1), one set per entity and layer
                    pacchetto_pesi_rand_layer_1 = self.rand.rand(self.n_entities, 5, 3) * 2 - 1
                    pacchetto_pesi_rand_layer_2 = self.rand.rand(self.n_entities, 5, 5) * 2 - 1
                    pacchetto_pesi_rand_layer_3 = self.rand.rand(self.n_entities, 2, 5) * 2 - 1

                    # Pre-drawn random numbers, one consumed per weight
                    rands = self.rand.random(size=(100000))
                    c = 0

                    mutation_rate = 0.15
                    for y in range(1, self.n_entities):
                        # Independent copy of the best network: assigning the object itself
                        # would make every slot share (and mutate) the same weights
                        self.population[y] = copy.deepcopy(self.population[0])

                        for l in range(1, self.population[y].n_layers):
                            for n in range(0, self.population[y].layers_sizes[l]):
                                for w in range(0, self.population[y].layers_sizes[l-1]):
                                    r = rands[c]
                                    c += 1
                                    if r < mutation_rate:
                                        # Replace this weight with the pre-drawn random one
                                        if l == 1:
                                            self.population[y].layer[l].weights[n][w] = pacchetto_pesi_rand_layer_1[y][n][w]
                                        elif l == 2:
                                            self.population[y].layer[l].weights[n][w] = pacchetto_pesi_rand_layer_2[y][n][w]
                                        elif l == 3:
                                            self.population[y].layer[l].weights[n][w] = pacchetto_pesi_rand_layer_3[y][n][w]
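The selection and mutation idea can also be shown on its own, away from the game. The sketch below represents each network simply as nested lists of weights and uses only the standard library. It is a simplified variant rather than the evolve() function itself: it always takes the top scorer of the current generation and does not track the best score across generations, and the names mutate and next_generation are hypothetical.

```python
import copy
import random

MUTATION_RATE = 0.15  # per-weight mutation probability, as in the article


def mutate(weights, rng, rate=MUTATION_RATE):
    """Return a copy in which each weight is replaced, with probability rate,
    by a fresh random value between -1 and 1."""
    child = copy.deepcopy(weights)  # never modify the parent in place
    for layer in child:
        for neuron in layer:
            for i in range(len(neuron)):
                if rng.random() < rate:
                    neuron[i] = rng.uniform(-1, 1)
    return child


def next_generation(population, scores, rng):
    """Keep an unchanged copy of the best network and fill the rest with mutants."""
    best_index = max(range(len(population)), key=lambda i: scores[i])
    best = population[best_index]
    return [copy.deepcopy(best)] + [mutate(best, rng) for _ in range(len(population) - 1)]


rng = random.Random(1)
# Four tiny single-layer "networks" with 5 neurons of 3 weights each
population = [
    [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(5)]]
    for _ in range(4)
]
scores = [10, 42, 7, 30]  # made-up survival times

new_population = next_generation(population, scores, rng)
print(new_population[0] == population[1])
```

Copying the parent before mutating matters: without it, every slot would refer to the same object and a mutation applied to one network would silently change all of them, including the preserved best one.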

Outcome

In the simulations I have run, the algorithm has almost always found a neural network able to survive indefinitely.
