Flappy Bot

Last modified: 2022-05-20 18:37:13 - github

The project

The goal of the project is to create an algorithm in Python able to autonomously learn how to play Flappy Bird using a simple kind of neuroevolution, similar to the one used for the Obstacle Course project. However, this new program will be a bit different, since it will also exploit Greedy Algorithms.

The idea behind the game is quite simple: the red square on the left (a bird in the original game) must go through the tubes without touching them and survive as long as possible. The square is subject to gravity and, as like the bird in the game, can use its wings and fly higher.

One of the first attempts of the algorithm:

The game

In order to create the user interface, I use the module tkinter, that makes it quite manageable to handle graphical elements.

The tubes are generated on the right side of the window at a random height, and they approach the square/bird at a fixed speed. The square starts its match at mid-height and is always subject to gravity. It dies when it collides with the tubes or when it falls to the ground, but not when it reaches the upper bound of the window, as like in the original game.

The game is completely contained in the class Game, that includes several methods, such as:

The initialization function (constructor), where most of the variables are declared and the interface is created
move_obstacle() function, that generates and moves the obstacles
move_bird() function, that allows the square to flap its wings
die() function, that restarts the game when the square dies
play() function, that manages the game on the whole

move_bird function:

def move_bird(self):
    if self.jump_triggered == True and self.bird_speed_y < 0: #inizio nuovo salto
        self.time_count=0
        self.bird_speed_y = 10           #cambio la velocità verticale con una positiva
        self.bird_speed_y0 = self.bird_speed_y
        self.bird_y0 = self.bird_y       #posizione verticale iniziale
        self.jump_triggered = False
    else: #è già in un salto
        self.jump_triggered = False
    
    self.bird_speed_y -= 0.4            #considero accelerazione verso il basso di 0.8m/s^2
    self.new_y = self.bird_y0 - self.bird_speed_y0 * self.time_count + 0.2 * self.time_count * self.time_count

    if self.new_y <= 90:
        self.new_y = 90

    self.movement_y = self.new_y - self.bird_y
    self.w.move(self.bird, 0, self.movement_y)
    self.bird_y = self.new_y
    
    self.time_count+=1

Learning to play

Once the game is ready, the algorithm has to learn to play with it. It will have access to two numbers: the horizontal distance from the tubes and the vertical distance from the top of the lower one. Using this information, the algorithm will have to decide whether to flap its wings or not. These two inputs will be interpreted by neural networks structured in this way:

The algorithm I've written is a kind of greedy algorithm applied to the weights and biases of the considered neural network. The programm will start from a neural network initialised at random, it will use it to play a fixed number of games and give it as a score the average time it survived in those games.

After having tested the first neural network, the programm will create a slightly mutated copy and test it in the same way as the first. If the score improves, the new network will take the place of the first one and will be used to create new networks, otherwise another mutation of the original network will be tested.

Some of the methods in Game that handle the greedy approach:

def check_score(self):
    if self.num_move % 1 == 0:
        txt_for_print = "G: " + str(self.num_move) + " BEST: " + str(round((self.best_score), 4)) + " | " + str(round((self.last_score), 4))
        print(txt_for_print)

    if self.last_score >= self.best_score:
        self.best_score = self.last_score
        self.accept_move()
        self.propose_move()
    
    else:
        self.move_attempt += 1
        self.propose_move()


            
def accept_move(self):
    self.move_attempt = 0
    self.config.set_weights_as(self.config_test)


def propose_move(self):
    self.num_move += 1

    mutation_rate = 0.02
    self.config_test.set_mutation_of(self.config, rate=mutation_rate)

Mutation and selection

The neural networks are handled by a NeuralNetwork class that not only handles layers, neurons and computations, but also contains some specific methods that are useful both for greedy and genetic algorithms. For example:

randomize function, randomizes the weights and the biases of the network
set_weights_as(self, other) function, that sets the weights and the biases of one network equal to the ones of another network
set_mutation_of(self, other, rate=0.10) function, like the above, but also mutates a certain percentage of the parameters

set_mutation_of() function:

def set_mutation_of(self, other, rate=0.10):
    if not isinstance(other, NeuralNetwork) or not np.all(self.layers_sizes == other.layers_sizes):
        raise Exception("This works only with another NN of same size")
    
    self.set_weights_as(other)
    
    for l in range(self.n_layers-1):
        random_ws = np.random.rand(self.layers_sizes[l], self.layers_sizes[l+1]) * 2 - 1
        mask_w = np.random.rand(self.layers_sizes[l], self.layers_sizes[l+1]) < rate
        self.weights[l][mask_w] = random_ws[mask_w]

        random_bs = np.random.rand(self.layers_sizes[l+1]) * 2 - 1
        mask_b = np.random.rand(self.layers_sizes[l+1]) < rate
        self.biases[l][mask_b] = random_bs[mask_b]

Results

Completing this project has taken, in total, almost two years. I have changed the approach several times, I have interrupted it even more times, it took a while to spot some bugs... But I can finally be satisfied. The algorithm is able to drastically and rapidly improve its score and to find ways to survive forever.

The average, minimum and maximum scores over time in a short training example:

A neural network that managed to reach incredibly high scores (roughly 500 tubes) in just 74 iterations:

🌙 Night mode