Créer un programme pour PyRat

Dans cet article, nous créons ensemble un programme (pas très intelligent) pour ramasser un unique morceau de fromage dans un labyrinthe. Appelons le AIs/improved_random.py. La commande pour démarrer la partie dans ce mode de jeu sera donc la suivante : python pyrat.py –rat AIs/improved_random.py.

Présentation de l’algorithme

Une idée pour trouver un morceau de fromage dans le labyrinthe est de se déplacer au hasard. Toutefois, cela peut prendre du temps, car on visitera probablement plusieurs fois les mêmes endroits. Nous allons donc bouger au hasard, mais (un peu) plus intelligemment. L’algorithme que nous allons utiliser procède comme suit, tant que le morceau de fromage n’est pas atteint :

On se déplace sur une case pas encore visitée.
Si on ne peut pas le faire, on bouge au hasard.

La seconde étape pourrait être améliorée, par exemple en revenant à la dernière case connue ayant des voisins pas encore visités.

Implémentation

Pour implémenter cet algorithme, nous partons d’une copie du fichier AIs/template.py, renommée AIs/improved_random.py (les commentaires ne sont pas affichés ici par souci de lisibilité) :

TEAM_NAME = "_______"

MOVE_DOWN = 'D'
MOVE_LEFT = 'L'
MOVE_RIGHT = 'R'
MOVE_UP = 'U'

def preprocessing (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, piecesOfCheese, timeAllowed) :
    
    # Example prints that appear in the shell only at the beginning of the game
    print("<b>[mazeMap]</b> " + repr(mazeMap))
    print("<b>[mazeWidth]</b> " + repr(mazeWidth))
    print("<b>[mazeHeight]</b> " + repr(mazeHeight))
    print("<b>[playerLocation]</b> " + repr(playerLocation))
    print("<b>[opponentLocation]</b> " + repr(opponentLocation))
    print("<b>[piecesOfCheese]</b> " + repr(piecesOfCheese))
    print("<b>[timeAllowed]</b> " + repr(timeAllowed))

def turn (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, playerScore, opponentScore, piecesOfCheese, timeAllowed) :
    
    # Example print that appears in the shell at every turn
    print("Move: [" + MOVE_UP + "]")
    
    # We always go up
    return MOVE_UP

TEAM_NAME = "_______"

MOVE_DOWN = 'D'

MOVE_LEFT = 'L'

MOVE_RIGHT = 'R'

MOVE_UP = 'U'

def preprocessing (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, piecesOfCheese, timeAllowed) :

# Example prints that appear in the shell only at the beginning of the game

print("[mazeMap] " + repr(mazeMap))

print("[mazeWidth] " + repr(mazeWidth))

print("[mazeHeight] " + repr(mazeHeight))

print("[playerLocation] " + repr(playerLocation))

print("[opponentLocation] " + repr(opponentLocation))

print("[piecesOfCheese] " + repr(piecesOfCheese))

print("[timeAllowed] " + repr(timeAllowed))

def turn (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, playerScore, opponentScore, piecesOfCheese, timeAllowed) :

# Example print that appears in the shell at every turn

print("Move: [" + MOVE_UP + "]")

# We always go up

return MOVE_UP

Changeons le nom du programme, et de supprimons les corps des méthodes preprocessing et turn (qui servent juste de démo) :

TEAM_NAME = "Improved Random"

MOVE_DOWN = 'D'
MOVE_LEFT = 'L'
MOVE_RIGHT = 'R'
MOVE_UP = 'U'

def preprocessing (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, piecesOfCheese, timeAllowed) :
    # TODO

def turn (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, playerScore, opponentScore, piecesOfCheese, timeAllowed) :
    # TODO

TEAM_NAME = "Improved Random"

MOVE_DOWN = 'D'

MOVE_LEFT = 'L'

MOVE_RIGHT = 'R'

MOVE_UP = 'U'

def preprocessing (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, piecesOfCheese, timeAllowed) :

# TODO

def turn (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, playerScore, opponentScore, piecesOfCheese, timeAllowed) :

# TODO

Une fois cela fait, réfléchissons à ce dont nous avons besoin. Tout d’abord, nous aurons besoin de choisir des directions aléatoirement. Il nous faut donc importer la bibliothèque Python random, et créer une fonction renvoyant une direction aléatoire :

TEAM_NAME = "Improved Random"

MOVE_DOWN = 'D'
MOVE_LEFT = 'L'
MOVE_RIGHT = 'R'
MOVE_UP = 'U'

import random # <-- new code (import of random module)

# v-- new code (returns a random move)
def randomMove () :
    moves = [MOVE_DOWN, MOVE_LEFT, MOVE_RIGHT, MOVE_UP]
    return moves[random.randint(0, 3)]

def preprocessing (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, piecesOfCheese, timeAllowed) :
    # TODO

def turn (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, playerScore, opponentScore, piecesOfCheese, timeAllowed) :
    # TODO

TEAM_NAME = "Improved Random"

MOVE_DOWN = 'D'

MOVE_LEFT = 'L'

MOVE_RIGHT = 'R'

MOVE_UP = 'U'

import random # <-- new code (import of random module)

# v-- new code (returns a random move)

def randomMove () :

moves = [MOVE_DOWN, MOVE_LEFT, MOVE_RIGHT, MOVE_UP]

return moves[random.randint(0, 3)]

def preprocessing (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, piecesOfCheese, timeAllowed) :

# TODO

def turn (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, playerScore, opponentScore, piecesOfCheese, timeAllowed) :

# TODO

A chaque étape, nous voulons nous déplacer sur une case qui n’a pas encore été visitée. Il va donc falloir se souvenir des emplacements visités. Pour cela, nous choisissons d’utiliser une liste. Comme nous utiliserons cette structure lors de plusieurs appels de fonctions distincts, elle doit être déclarée comme variable globale, en dehors de tout corps de fonction :

TEAM_NAME = "Improved Random"

MOVE_DOWN = 'D'
MOVE_LEFT = 'L'
MOVE_RIGHT = 'R'
MOVE_UP = 'U'

import random

visitedCells = [] # <-- new code (list of visited cells to store the token locations)

def randomMove () :
    moves = [MOVE_DOWN, MOVE_LEFT, MOVE_RIGHT, MOVE_UP]
    return moves[random.randint(0, 3)]

def preprocessing (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, piecesOfCheese, timeAllowed) :
    # TODO

def turn (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, playerScore, opponentScore, piecesOfCheese, timeAllowed) :
    # TODO

TEAM_NAME = "Improved Random"

MOVE_DOWN = 'D'

MOVE_LEFT = 'L'

MOVE_RIGHT = 'R'

MOVE_UP = 'U'

import random

visitedCells = [] # <-- new code (list of visited cells to store the token locations)

def randomMove () :

moves = [MOVE_DOWN, MOVE_LEFT, MOVE_RIGHT, MOVE_UP]

return moves[random.randint(0, 3)]

def preprocessing (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, piecesOfCheese, timeAllowed) :

# TODO

def turn (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, playerScore, opponentScore, piecesOfCheese, timeAllowed) :

# TODO

Passons aux deux fonctions principales. Il n’y a rien de particulier à faire à l’initialisation. Nous allons donc utiliser le mot-clé Python pass pour indiquer que le corps de la méthode preprocessing est vide :

TEAM_NAME = "Improved Random"

MOVE_DOWN = 'D'
MOVE_LEFT = 'L'
MOVE_RIGHT = 'R'
MOVE_UP = 'U'

import random

visitedCells = []

def randomMove () :
    moves = [MOVE_DOWN, MOVE_LEFT, MOVE_RIGHT, MOVE_UP]
    return moves[random.randint(0, 3)]

def preprocessing (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, piecesOfCheese, timeAllowed) :
    pass # <-- new code (nothing to do in this function)

def turn (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, playerScore, opponentScore, piecesOfCheese, timeAllowed) :
    # TODO

TEAM_NAME = "Improved Random"

MOVE_DOWN = 'D'

MOVE_LEFT = 'L'

MOVE_RIGHT = 'R'

MOVE_UP = 'U'

import random

visitedCells = []

def randomMove () :

moves = [MOVE_DOWN, MOVE_LEFT, MOVE_RIGHT, MOVE_UP]

return moves[random.randint(0, 3)]

def preprocessing (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, piecesOfCheese, timeAllowed) :

pass # <-- new code (nothing to do in this function)

def turn (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, playerScore, opponentScore, piecesOfCheese, timeAllowed) :

# TODO

A présent, passons au gros du programme : la définition du comportement à chaque tour de jeu. Une première chose à faire est de trouver les mouvements qui peuvent nous emmener sur une case non visitée. Pour cela, on définit une nouvelle fonction qui nous renvoie la liste de ces mouvements. Appelons la listDiscoveryMoves :

TEAM_NAME = "Improved Random"

MOVE_DOWN = 'D'
MOVE_LEFT = 'L'
MOVE_RIGHT = 'R'
MOVE_UP = 'U'

import random

visitedCells = []

def randomMove () :
    moves = [MOVE_DOWN, MOVE_LEFT, MOVE_RIGHT, MOVE_UP]
    return moves[random.randint(0, 3)]

# v-- new code (returns a list of moves that lead to a non-visited cell)
def listDiscoveryMoves (playerLocation, mazeMap) :
    moves = []
    for neighbor in mazeMap[playerLocation] :
        if neighbor not in visitedCells :
            move = moveFromLocations(playerLocation, neighbor)
            moves.append(move)
    return moves

def preprocessing (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, piecesOfCheese, timeAllowed) :
    pass

def turn (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, playerScore, opponentScore, piecesOfCheese, timeAllowed) :
    # TODO

TEAM_NAME = "Improved Random"

MOVE_DOWN = 'D'

MOVE_LEFT = 'L'

MOVE_RIGHT = 'R'

MOVE_UP = 'U'

import random

visitedCells = []

def randomMove () :

moves = [MOVE_DOWN, MOVE_LEFT, MOVE_RIGHT, MOVE_UP]

return moves[random.randint(0, 3)]

# v-- new code (returns a list of moves that lead to a non-visited cell)

def listDiscoveryMoves (playerLocation, mazeMap) :

moves = []

for neighbor in mazeMap[playerLocation] :

if neighbor not in visitedCells :

move = moveFromLocations(playerLocation, neighbor)

moves.append(move)

return moves

def preprocessing (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, piecesOfCheese, timeAllowed) :

pass

def turn (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, playerScore, opponentScore, piecesOfCheese, timeAllowed) :

# TODO

Notez que la fonction listDiscoveryMoves utilise une fonction moveFromLocation que nous n’avons pas encore définie. Créons donc une telle fonction, qui pour une case d’origine et une case de destination, renvoie le mouvement à effectuer. La bibliothèque numpy de Python fournit une fonction permettant de soustraire aisément des paires d’entiers, élément par élément :

TEAM_NAME = "Improved Random"

MOVE_DOWN = 'D'
MOVE_LEFT = 'L'
MOVE_RIGHT = 'R'
MOVE_UP = 'U'

import random
import numpy # <-- new code (import of numpy module)

visitedCells = []

def randomMove () :
    moves = [MOVE_DOWN, MOVE_LEFT, MOVE_RIGHT, MOVE_UP]
    return moves[random.randint(0, 3)]

# v-- new code (returns a move from two locations)
def moveFromLocations (sourceLocation, targetLocation) : 
    difference = tuple(numpy.subtract(targetLocation, sourceLocation))
    if difference == (0, -1) :
        return MOVE_DOWN
    elif difference == (0, 1) :
        return MOVE_UP
    elif difference == (1, 0) :
        return MOVE_RIGHT
    elif difference == (-1, 0) :
        return MOVE_LEFT
    else :
        raise Exception("Impossible move")

def listDiscoveryMoves (playerLocation, mazeMap) :
    moves = []
    for neighbor in mazeMap[playerLocation] :
        if neighbor not in visitedCells :
            move = moveFromLocations(playerLocation, neighbor)
            moves.append(move)
    return moves

def preprocessing (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, piecesOfCheese, timeAllowed) :
    pass

def turn (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, playerScore, opponentScore, piecesOfCheese, timeAllowed) :
    # TODO

TEAM_NAME = "Improved Random"

MOVE_DOWN = 'D'

MOVE_LEFT = 'L'

MOVE_RIGHT = 'R'

MOVE_UP = 'U'

import random

import numpy # <-- new code (import of numpy module)

visitedCells = []

def randomMove () :

moves = [MOVE_DOWN, MOVE_LEFT, MOVE_RIGHT, MOVE_UP]

return moves[random.randint(0, 3)]

# v-- new code (returns a move from two locations)

def moveFromLocations (sourceLocation, targetLocation) :

difference = tuple(numpy.subtract(targetLocation, sourceLocation))

if difference == (0, -1) :

return MOVE_DOWN

elif difference == (0, 1) :

return MOVE_UP

elif difference == (1, 0) :

return MOVE_RIGHT

elif difference == (-1, 0) :

return MOVE_LEFT

else :

raise Exception("Impossible move")

def listDiscoveryMoves (playerLocation, mazeMap) :

moves = []

for neighbor in mazeMap[playerLocation] :

if neighbor not in visitedCells :

move = moveFromLocations(playerLocation, neighbor)

moves.append(move)

return moves

def preprocessing (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, piecesOfCheese, timeAllowed) :

pass

def turn (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, playerScore, opponentScore, piecesOfCheese, timeAllowed) :

# TODO

On peut à présent se servir de la fonction listDiscoveryMoves pour choisir un mouvement amenant à une nouvelle case. Toutefois, si un tel mouvement n’existe pas, on choisit de se diriger dans une direction au hasard. Attention ! N’oubliez pas de déclarer visitedCells comme variable globale pour la mettre à jour :

TEAM_NAME = "Improved Random"

MOVE_DOWN = 'D'
MOVE_LEFT = 'L'
MOVE_RIGHT = 'R'
MOVE_UP = 'U'

import random
import numpy

visitedCells = []

def randomMove () :
    moves = [MOVE_DOWN, MOVE_LEFT, MOVE_RIGHT, MOVE_UP]
    return moves[random.randint(0, 3)]

def moveFromLocations (sourceLocation, targetLocation) : 
    difference = tuple(numpy.subtract(targetLocation, sourceLocation))
    if difference == (0, -1) :
        return MOVE_DOWN
    elif difference == (0, 1) :
        return MOVE_UP
    elif difference == (1, 0) :
        return MOVE_RIGHT
    elif difference == (-1, 0) :
        return MOVE_LEFT
    else :
        raise Exception("Impossible move")

def listDiscoveryMoves (playerLocation, mazeMap) :
    moves = []
    for neighbor in mazeMap[playerLocation] :
        if neighbor not in visitedCells :
            move = moveFromLocations(playerLocation, neighbor)
            moves.append(move)
    return moves

def preprocessing (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, piecesOfCheese, timeAllowed) :
    pass

def turn (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, playerScore, opponentScore, piecesOfCheese, timeAllowed) :

    # v-- new code (the current cell is now visited)
    global visitedCells
    if playerLocation not in visitedCells :
        visitedCells.append(playerLocation)
    
    # v-- new code (we get the moves leading to a new cell and apply one of them at random if possible)
    moves = listDiscoveryMoves(playerLocation, mazeMap)
    if moves :
        return moves[random.randint(0, len(moves) - 1)]
    else :
        return randomMove()

TEAM_NAME = "Improved Random"

MOVE_DOWN = 'D'

MOVE_LEFT = 'L'

MOVE_RIGHT = 'R'

MOVE_UP = 'U'

import random

import numpy

visitedCells = []

def randomMove () :

moves = [MOVE_DOWN, MOVE_LEFT, MOVE_RIGHT, MOVE_UP]

return moves[random.randint(0, 3)]

def moveFromLocations (sourceLocation, targetLocation) :

difference = tuple(numpy.subtract(targetLocation, sourceLocation))

if difference == (0, -1) :

return MOVE_DOWN

elif difference == (0, 1) :

return MOVE_UP

elif difference == (1, 0) :

return MOVE_RIGHT

elif difference == (-1, 0) :

return MOVE_LEFT

else :

raise Exception("Impossible move")

def listDiscoveryMoves (playerLocation, mazeMap) :

moves = []

for neighbor in mazeMap[playerLocation] :

if neighbor not in visitedCells :

move = moveFromLocations(playerLocation, neighbor)

moves.append(move)

return moves

def preprocessing (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, piecesOfCheese, timeAllowed) :

pass

def turn (mazeMap, mazeWidth, mazeHeight, playerLocation, opponentLocation, playerScore, opponentScore, piecesOfCheese, timeAllowed) :

# v-- new code (the current cell is now visited)

global visitedCells

if playerLocation not in visitedCells :

visitedCells.append(playerLocation)

# v-- new code (we get the moves leading to a new cell and apply one of them at random if possible)

moves = listDiscoveryMoves(playerLocation, mazeMap)

if moves :

return moves[random.randint(0, len(moves) - 1)]

else :

return randomMove()

Maintenant, lançons notre programme pour voir comment il se débrouille !

Est-ce mieux ?

Afin d’évaluer si ce nouveau programme est meilleur qu’un simple programme se déplaçant aléatoirement, faisons quelques statistiques.

Pour cela, on va tester le nombre moyen de morceaux de déplacements réalisés par le programme créé, et le comparer au nombre de déplacements moyens réalisés par le programme aléatoire basique (“AIs/random.py”).

On va lancer le programme en utilisant plusieurs paramètres :

D’abord on va réduire la taille du labyrinthe pour ne pas que ça prenne trop de temps : “–width 5 –height 5″
On va ne générer qu’un seul morceau de fromage : “–pieces 1″
On ne va pas attendre 100ms à chaque fois pour que le joueur prenne une décision, mais jouer aussi vite que possible : “–synchronous”
Pour aller encore plus vite, on va désactiver l’affichage graphique des parties : “–nodrawing”
Enfin, on va faire une moyenne sur plusieurs tests : “–tests 100″

Au final :

python pyrat.py --rat AIs/random.py --width 5 --height 5 --pieces 1 --synchronous --nodrawing --tests 100

1	python pyrat.py --rat AIs/random.py --width 5 --height 5 --pieces 1 --synchronous --nodrawing --tests 100

On obtient par exemple :

{'win_python': 0.0, 'score_python': 0.0, 'miss_python': 123.75, 'win_rat': 1.0, 'score_rat': 1.0, 'miss_rat': 66.67, 'moves_python': 0.0, 'moves_rat': 57.08, 'stucks_python': 0.0, 'stucks_rat': 31.11}

1	{'win_python': 0.0, 'score_python': 0.0, 'miss_python': 123.75, 'win_rat': 1.0, 'score_rat': 1.0, 'miss_rat': 66.67, 'moves_python': 0.0, 'moves_rat': 57.08, 'stucks_python': 0.0, 'stucks_rat': 31.11}

Puis :

python pyrat.py --rat AIs/improved_random.py --width 5 --height 5 --pieces 1 --synchronous --nodrawing --tests 100

1	python pyrat.py --rat AIs/improved_random.py --width 5 --height 5 --pieces 1 --synchronous --nodrawing --tests 100

On obtient ici :

{'win_python': 0.0, 'score_rat': 1.0, 'moves_rat': 34.96, 'moves_python': 0.0, 'stucks_python': 0.0, 'win_rat': 1.0, 'miss_rat': 42.15, 'stucks_rat': 17.51, 'score_python': 0.0, 'miss_python': 77.11}

1	{'win_python': 0.0, 'score_rat': 1.0, 'moves_rat': 34.96, 'moves_python': 0.0, 'stucks_python': 0.0, 'win_rat': 1.0, 'miss_rat': 42.15, 'stucks_rat': 17.51, 'score_python': 0.0, 'miss_python': 77.11}

On constate que dans les deux cas le rat est parvenu à trouver le morceau de fromage (ce qui n’était pas garanti car par défaut une partie ne peut pas excéder 2000 tours (voir le paramètre “–max_turns”)). Mais notre programme amélioré a presque divisé par deux le nombre moyen de mouvements pour y parvenir !

Présentation de l’algorithme

Implémentation

Est-ce mieux ?

Laisser un commentaire Annuler la réponse.