
Self Driving Car (Game) using Convolutional Neural Network in Python

I am driving a car in a game (Need for Speed) using a convolutional neural network (TensorFlow, AlexNet). I am just a student who wants to work with machine learning, and I am a beginner at this.

Here is my plan:

  1. Get the training data
  2. Balance and shuffle it
  3. Train the model
  4. Test the model

The problem is that I have not found a way to record or detect the key presses I make while playing the game. I want Python to detect which keys I press and store them, together with the frame images of the game, in a .npy array.

I did come across this piece of code that records key presses, but it only records the letter keys. I want Python to also detect the arrow keys, the space bar, and so on.

import win32api as wapi
import time

keyList = ["\b"]
for char in "ABCDEFGHIJKLMNOPQRSTUVWXYZ 123456789,.'£$/\\":
    keyList.append(char)

def key_check():
    keys = []
    for key in keyList:
        if wapi.GetAsyncKeyState(ord(key)):
            keys.append(key)
    return keys
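
GetAsyncKeyState also accepts Windows virtual-key codes directly, so non-character keys such as the arrow keys and the space bar can be polled by code instead of by character. Below is a minimal sketch of an extended key_check, assuming the win32api and win32con modules from pywin32 (the SPECIAL_KEYS names are my own choice):

import win32api as wapi
import win32con

# Virtual-key codes for keys that have no single printable character.
SPECIAL_KEYS = {
    'UP': win32con.VK_UP,
    'DOWN': win32con.VK_DOWN,
    'LEFT': win32con.VK_LEFT,
    'RIGHT': win32con.VK_RIGHT,
    'SPACE': win32con.VK_SPACE,
    'SHIFT': win32con.VK_SHIFT,
}

def key_check():
    keys = []
    # For A-Z and 0-9 the virtual-key code equals ord(char).
    for char in "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789":
        if wapi.GetAsyncKeyState(ord(char)):
            keys.append(char)
    # Non-character keys are looked up by name.
    for name, vk in SPECIAL_KEYS.items():
        if wapi.GetAsyncKeyState(vk):
            keys.append(name)
    return keys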

I also want a way to simulate key presses so that my model can actually drive the car.

I do have this piece of code, which works quite well. I would just like some suggestions on it.

import ctypes
import time

SendInput = ctypes.windll.user32.SendInput

W = 0x11
A = 0x1E
S = 0x1F
D = 0x20
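# (The four values above are DirectInput scan codes for W, A, S and D,
#  not virtual-key codes.)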

# C struct redefinitions 
PUL = ctypes.POINTER(ctypes.c_ulong)
class KeyBdInput(ctypes.Structure):
    _fields_ = [("wVk", ctypes.c_ushort),
                ("wScan", ctypes.c_ushort),
                ("dwFlags", ctypes.c_ulong),
                ("time", ctypes.c_ulong),
                ("dwExtraInfo", PUL)]

class HardwareInput(ctypes.Structure):
    _fields_ = [("uMsg", ctypes.c_ulong),
                ("wParamL", ctypes.c_short),
                ("wParamH", ctypes.c_ushort)]

class MouseInput(ctypes.Structure):
    _fields_ = [("dx", ctypes.c_long),
                ("dy", ctypes.c_long),
                ("mouseData", ctypes.c_ulong),
                ("dwFlags", ctypes.c_ulong),
                ("time",ctypes.c_ulong),
                ("dwExtraInfo", PUL)]

class Input_I(ctypes.Union):
    _fields_ = [("ki", KeyBdInput),
                 ("mi", MouseInput),
                 ("hi", HardwareInput)]

class Input(ctypes.Structure):
    _fields_ = [("type", ctypes.c_ulong),
                ("ii", Input_I)]

# Actual functions

def PressKey(hexKeyCode):
    extra = ctypes.c_ulong(0)
    ii_ = Input_I()
    ii_.ki = KeyBdInput( 0, hexKeyCode, 0x0008, 0, ctypes.pointer(extra) )
    x = Input( ctypes.c_ulong(1), ii_ )
    ctypes.windll.user32.SendInput(1, ctypes.pointer(x), ctypes.sizeof(x))

def ReleaseKey(hexKeyCode):
    extra = ctypes.c_ulong(0)
    ii_ = Input_I()
    ii_.ki = KeyBdInput( 0, hexKeyCode, 0x0008 | 0x0002, 0,
                         ctypes.pointer(extra) )
    x = Input( ctypes.c_ulong(1), ii_ )
    ctypes.windll.user32.SendInput(1, ctypes.pointer(x), ctypes.sizeof(x))


if __name__ == '__main__':
    while (True):
        PressKey(0x11)
        time.sleep(1)
        ReleaseKey(0x11)
        time.sleep(1)

The next step is to actually collect the frames (images) of the game. I use cv2 to capture the region of the screen where I am playing. Then I:

  1. Convert it to grayscale
  2. Resize the image
  3. Save it as a numpy file (once every 1000 captured frames)

The captured data is quite good, but once I have around 50,000 frames (about 500 MB of data), saving takes a very long time. Sometimes Python crashes while saving the numpy file, and all of the training data is lost.

Here is my code for capturing the data:

import numpy as np
import cv2
import time
from grabscreen import grab_screen
from getkeys import key_check
import os


def keys_to_output(keys):
    output = [0, 0, 0]

    if 'A' in keys:
        output[0] = 1
    elif 'D' in keys:
        output[2] = 1
    else:
        output[1] = 1

    return output


file_name = 'training_data.npy'

if os.path.isfile(file_name):
    print('File exists, loading previous data!')
    training_data = list(np.load(file_name))
else:
    print('File does not exist, starting fresh!')
    training_data = []


def main():
    for i in list(range(5))[::-1]:
        print(i + 1)
        time.sleep(1)

    if os.path.isfile(file_name):
        print('Existing Training Data:' + str(len(training_data)))
        print('Capturing Data!')   
    else:
        print('Capturing Data Freshly!') 


    while True:
        screen = grab_screen(region=(40, 250, 860, 560))
        screen = cv2.resize(screen, (120, 56))
        screen = cv2.cvtColor(screen, cv2.COLOR_BGR2GRAY)

        keys = key_check()
        output = keys_to_output(keys)
        training_data.append([screen, output])


        if len(training_data) % 1000 == 0:
            print('New Training Data: ' + str(len(training_data)))
            print('Saving Data!')
            np.save(file_name, training_data)
            print('Data saved successfully! You can quit now.')
            print('Capturing data!')


main()

Can anyone suggest a better way to collect my training data? I have also heard about using PyTables, but I am not sure how to use it in my program.
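
One way to avoid rewriting a single ever-growing .npy file is to append each batch to an HDF5 file with PyTables, so that a save only writes the new rows. A minimal sketch, assuming the frames are already 56x120 grayscale uint8 images and the labels are the 3-element one-hot lists (the file name and array names are my own choices):

import numpy as np
import tables

IMG_H, IMG_W = 56, 120   # frame size after resizing (assumed from the capture code)

# Open (or create) the HDF5 file and two extendable arrays once at start-up.
h5 = tables.open_file('training_data.h5', mode='a')
if '/images' in h5:
    images, labels = h5.root.images, h5.root.labels
else:
    images = h5.create_earray(h5.root, 'images',
                              atom=tables.UInt8Atom(), shape=(0, IMG_H, IMG_W))
    labels = h5.create_earray(h5.root, 'labels',
                              atom=tables.UInt8Atom(), shape=(0, 3))

def save_batch(frames, outputs):
    # Append only the new rows instead of rewriting the whole data set.
    images.append(np.asarray(frames, dtype=np.uint8))
    labels.append(np.asarray(outputs, dtype=np.uint8))
    h5.flush()

In the capture loop you would collect frames and outputs in two short lists, call save_batch every 1000 frames, and then clear the lists.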

After that, I shuffle the data and balance it:

import numpy as np
import pandas as pd
from collections import Counter
from random import shuffle

train_data = np.load('training_data.npy')

print('Training Data: ' + str(len(train_data)))
df = pd.DataFrame(train_data)
print(df.head())
print(Counter(df[1].apply(str)))

lefts = []
rights = []
forwards = []

shuffle(train_data)

for data in train_data:
    img = data[0]
    choice = data[1]

    if choice == [1, 0, 0]:
        lefts.append([img, choice])
    elif choice == [0, 1, 0]:
        forwards.append([img, choice])
    elif choice == [0, 0, 1]:
        rights.append([img, choice])
    else:
        print('no matches!!!')


forwards = forwards[:len(lefts)][:len(rights)]
lefts = lefts[:len(forwards)]
rights = rights[:len(forwards)]

final_data = forwards + lefts + rights

shuffle(final_data)
print('Final Balanced Data: ' + str(len(final_data)))
np.save('training_data_balanced.npy', final_data)
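
The double slice on forwards works, but capping every class at the size of the smallest one reads more clearly. An equivalent sketch using the same list names as above:

# Truncate each class to the size of the smallest one, then merge and shuffle.
n = min(len(lefts), len(forwards), len(rights))
final_data = forwards[:n] + lefts[:n] + rights[:n]
shuffle(final_data)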

The model:

import tflearn
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.estimator import regression
from tflearn.layers.normalization import local_response_normalization

def alexnet(width, height, lr):
    network = input_data(shape=[None, width, height, 1], name='input')
    network = conv_2d(network, 96, 11, strides=4, activation='relu')
    network = max_pool_2d(network, 3, strides=2)
    network = local_response_normalization(network)
    network = conv_2d(network, 256, 5, activation='relu')
    network = max_pool_2d(network, 3, strides=2)
    network = local_response_normalization(network)
    network = conv_2d(network, 384, 3, activation='relu')
    network = conv_2d(network, 384, 3, activation='relu')
    network = conv_2d(network, 256, 3, activation='relu')
    network = max_pool_2d(network, 3, strides=2)
    network = local_response_normalization(network)
    network = fully_connected(network, 4096, activation='tanh')
    network = dropout(network, 0.5)
    network = fully_connected(network, 4096, activation='tanh')
    network = dropout(network, 0.5)
    network = fully_connected(network, 3, activation='softmax')
    network = regression(network, optimizer='momentum',
                         loss='categorical_crossentropy',
                         learning_rate=lr, name='targets')

    model = tflearn.DNN(network, checkpoint_path='model_alexnet',
                        max_checkpoints=1, tensorboard_verbose=2, tensorboard_dir='log')

    return model

Training the model:

import numpy as np
from alexnet import alexnet

WIDTH = 120
HEIGHT = 56
LR = 1e-3
EPOCHS = 15
MODEL_NAME = 'nfs-car-{}-{}-epochs.model'.format(LR, EPOCHS)

model = alexnet(WIDTH, HEIGHT, LR)

# for every epoch finished, save the model.
# therefore, if accuracy drops or loss increases, we can terminate the script.
# we will have a trained model with the best accuracy and previously saved epoch.

for i in range(EPOCHS):
    train_data = np.load('training_data_balanced.npy')

    train = train_data[:-10]  # all but the last 10 samples
    test = train_data[-10:]   # only the last 10 samples, used as the validation set

    X = np.array([i[0] for i in train]).reshape(-1, WIDTH, HEIGHT, 1)
    Y = [i[1] for i in train]

    test_x = np.array([i[0] for i in test]).reshape(-1, WIDTH, HEIGHT, 1)
    test_y = [i[1] for i in test]

    model.fit({'input': X}, {'targets': Y}, n_epoch=1,
              validation_set=({'input': test_x}, {'targets': test_y}),
              snapshot_step=500, show_metric=True, run_id=MODEL_NAME)

    model.save(MODEL_NAME)
    print('Saved epoch: ' + str(i + 1))
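
The split above only holds out the last 10 samples for validation. If an actual 80/20 split is wanted, a minimal sketch reusing the same variable names would be:

split = int(len(train_data) * 0.8)   # index separating training from validation data
train, test = train_data[:split], train_data[split:]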

Then, of course, I test the model:

import numpy as np
import cv2
import time
from grabscreen import grab_screen
from getkeys import key_check
from alexnet import alexnet
from directkeys import PressKey, ReleaseKey, W, A, D

WIDTH = 120
HEIGHT = 56
LR = 1e-3
EPOCHS = 15
MODEL_NAME = 'nfs-car-{}-{}-epochs.model'.format(LR, EPOCHS)


def straight():
    PressKey(W)
    ReleaseKey(A)
    ReleaseKey(D)


def left():
    PressKey(W)
    PressKey(A)
    ReleaseKey(D) # added
    time.sleep(0.09)
    ReleaseKey(A)


def right():
    PressKey(W)
    PressKey(D)
    ReleaseKey(A) # added
    time.sleep(0.09)
    ReleaseKey(D)


model = alexnet(WIDTH, HEIGHT, LR)
model.load(MODEL_NAME)


def main():
    for i in list(range(5))[::-1]:
        print(i + 1)
        time.sleep(1)


    paused = False

    while True:

        if not paused:
            screen = grab_screen(region=(40, 250, 860, 560))
            screen = cv2.cvtColor(screen, cv2.COLOR_BGR2GRAY)
            screen = cv2.resize(screen, (120, 56))

            prediction = model.predict([screen.reshape(WIDTH, HEIGHT, 1)])[0]
            print(prediction)

            turn_thresh = .75
            fwd_thresh = 0.70

            if prediction[0] > turn_thresh:
                left()
            elif prediction[1] > fwd_thresh:
                straight()
            elif prediction[2] > turn_thresh:
                right()
            else:
                straight()

        keys = key_check()

        # Check for the pause key even while paused; otherwise the loop could never be unpaused.
        if 'T' in keys:
            if paused:
                paused = False
                print('Unpaused!')
                time.sleep(1)
            else:
                print('Pausing!')
                paused = True
                ReleaseKey(A)
                ReleaseKey(W)
                ReleaseKey(D)
                time.sleep(1)


main()

Yes, so that is my idea. However, due to a lack of proper information, I have not been able to apply this concept fully.

  1. The key-press question is too detailed for me to comment on. Just so you know, there is an open-source simulator in the Udacity course; you may be able to find some clues there.
  2. For the crashes with a larger training set, you need to use a generator. It streams the training data into training in smaller batches (see the sketch after this list).
  3. A much simpler model can also achieve a lot.
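
Here is a minimal sketch of such a generator, assuming the training data has been saved as several smaller chunk files (for example training_data-1.npy, training_data-2.npy, ...) rather than one big array; the chunk file names and batch size are my own choices:

import numpy as np

def batch_generator(file_names, batch_size=64):
    # Yields (X, Y) batches by loading one chunk file at a time,
    # so the whole data set never has to sit in memory at once.
    while True:                                   # loop forever for multi-epoch training
        for fname in file_names:
            chunk = np.load(fname, allow_pickle=True)
            np.random.shuffle(chunk)
            for start in range(0, len(chunk), batch_size):
                batch = chunk[start:start + batch_size]
                X = np.array([row[0] for row in batch]).reshape(-1, 120, 56, 1)
                Y = np.array([row[1] for row in batch])
                yield X, Y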

Please take a look at this complete implementation, including the simulator, model training, test data, source code, and a write-up of the project:

https://github.com/ericq/CarND-Behavioral-Cloning-P3
