[mentor note: code blocks added for readability and syntax highlighting]

Hey all, I've been trying to implement a Q deep learning algorithm, but I'm having an issue: it's not working. After 100,000 game plays, using 1000 iterations to train each step (although I have tried lower numbers for both), it's still not learning. The network and game are in the linked image: http://imgur.com/a/hATfB

The training data pair for backprop is (input, QTarget), where the input is the one I linked in the image and QTarget = r + gamma * MaxQ. MaxQ is the max network output layer activation or a random one (epsilon greedy). r is the reward obtained from each move, -10 for an obstacle and 10 for the goal (although I have also tried just 10 for the goal and 0 for everything else). Here is the backprop step, followed by the training code.

Code (C):

double maxQval;
double[] inputvec;
int MaxQ = GetRandDir(state, out maxQval, out inputvec); // inputvec is the board
double[] QtarVec = new double[] { 0, 0, 0, 0 };
double r = GetR((int)state[0], (int)state[1]); // GetR is the reward
QtarVec[MaxQ] = Qtar(r, maxQval); // backprop vector of 0's, except Qtar replaces one value
associator.Train(50, new double[][] { inputvec }, new double[][] { QtarVec });
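For comparison, here is a minimal sketch of a textbook Q-learning target for one transition. It is not the code from this post: `BuildTarget`, `qCurrent`, `qNext`, and `terminal` are hypothetical names, and it assumes the network outputs for both the current and the next state are available. It differs from a vector of zeros in that the untaken actions keep the network's own predictions, so only the chosen action produces an error signal.

```csharp
using System;
using System.Linq;

public static class QTargetSketch
{
    // Builds the training target for one transition (s, a, r, s').
    // qCurrent: network outputs for the current state s, one per action.
    // qNext:    network outputs for the next state s' (unused if terminal).
    public static double[] BuildTarget(double[] qCurrent, double[] qNext,
                                       int action, double reward,
                                       double gamma, bool terminal)
    {
        // Start from the network's own predictions so the untaken actions
        // contribute zero error instead of being pulled toward 0.
        double[] target = (double[])qCurrent.Clone();

        // Bootstrapped return: r + gamma * max over a' of Q(s', a'),
        // or just r when the episode ends on this move.
        double future = terminal ? 0.0 : gamma * qNext.Max();
        target[action] = reward + future;
        return target;
    }

    public static void Main()
    {
        double[] qCurrent = { 0.5, -1.0, 2.0, 0.0 };
        double[] qNext = { 1.0, 3.0, -2.0, 0.5 };
        double[] t = BuildTarget(qCurrent, qNext, 1, -10.0, 0.5, false);
        Console.WriteLine(string.Join(", ", t));
    }
}
```

The `terminal` flag matters for the goal and obstacle moves: there is no next state to bootstrap from, so the target for that action is just the reward.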

Any help appreciated.

Code (C):

public void Train(int nTrails)
{
    double[] state = new double[] { 1, 1 }; // initial position
    int its = 0;
    for (int i = 0; i < nTrails; i++) {
        while (((state[0] < 4) && (state[1] < 4))
            && ((state[0] * 100 > 0) && (state[1] * 100 > 0))
            && (state[0] != 3 && state[1] != 3)) { // while on board and not at goal position
            double temp = r.NextDouble();
            int next = -1;
            lines.Add(new Vector2((float)(state[0] * 100), (float)(state[1] * 100)));
            if (temp < epsilon) {
                next = TrainRandIt(state); // move in a random direction, backprop
            } else {
                next = TrainMaxIt(state); // move in the max activation direction, backprop
            }
            if (next == 0) { // updating position
                state[0]++;
            } else if (next == 1) {
                state[0]--;
            } else if (next == 2) {
                state[1]++;
            } else if (next == 3) {
                state[1]--;
            }
        }
    }
    state[0] = 1;
    state[1] = 1; // resetting game
}
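As a point of comparison for the loop above, here is a minimal sketch of an epsilon-greedy episode loop. It is not the code from this post, and it rests on assumptions: the board is taken to be cells 1..3 on each axis with the goal at (3, 3), the action numbering follows the post (0 and 1 move along the first coordinate, 2 and 3 along the second), and `OnBoard`, `AtGoal`, `ChooseAction`, `Run`, and the `qValues` stand-in for the network are hypothetical names. In this sketch the position is reset at the start of every episode, and the goal test checks both coordinates together.

```csharp
using System;

public static class EpisodeLoopSketch
{
    // Assumed layout: cells 1..3 on each axis, goal at (3, 3).
    public static bool OnBoard(int x, int y) => x > 0 && x < 4 && y > 0 && y < 4;
    public static bool AtGoal(int x, int y) => x == 3 && y == 3;

    // Epsilon-greedy: a random action with probability epsilon, else the argmax.
    public static int ChooseAction(double[] q, double epsilon, Random rng)
    {
        if (rng.NextDouble() < epsilon) return rng.Next(q.Length);
        int best = 0;
        for (int a = 1; a < q.Length; a++)
            if (q[a] > q[best]) best = a;
        return best;
    }

    // Runs nTrials episodes and returns how many of them reached the goal.
    public static int Run(int nTrials, double epsilon,
                          Func<int, int, double[]> qValues, Random rng,
                          int maxSteps = 100)
    {
        int goals = 0;
        for (int i = 0; i < nTrials; i++)
        {
            int x = 1, y = 1; // reset at the start of every episode
            for (int step = 0;
                 step < maxSteps && OnBoard(x, y) && !AtGoal(x, y);
                 step++)
            {
                int next = ChooseAction(qValues(x, y), epsilon, rng);
                // The backprop step for this move would go here.
                if (next == 0) x++;
                else if (next == 1) x--;
                else if (next == 2) y++;
                else y--;
            }
            if (AtGoal(x, y)) goals++;
        }
        return goals;
    }

    public static void Main()
    {
        // Stand-in for the network: prefers action 0 until x == 3, then action 2.
        Func<int, int, double[]> q = (x, y) =>
            x < 3 ? new double[] { 1, 0, 0, 0 } : new double[] { 0, 0, 1, 0 };
        Console.WriteLine(Run(5, 0.0, q, new Random(0)));
    }
}
```

The `maxSteps` cap is an extra safeguard that is not in the post; it keeps an episode from running forever if the greedy policy walks in circles.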

**Physics Forums | Science Articles, Homework Help, Discussion**


# C/++/# "Q deep learning" algorithm
