Work in Progress

Pong AI Trainer 0.7


SP4CEBAR 2021-06-21 19:40 (Edited)

I'm trying to make a neural network play Pong, and the AI is getting smarter!
The numbers at the top of the screen are, in order: generation number, AI number, AI score, best score of this generation, and best score overall.

I found a flaw in version 0.2: I multiplied each "neuron" by a single factor, but that gives every neuron in the next layer the same value. Instead, I should multiply each combination of neurons by a different factor. That means using more factors, which makes displaying all of them a bit harder.
But I've fixed it now.
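To illustrate the flaw described above, here is a minimal Python sketch (not the actual LowRes NX code). All names and numbers are hypothetical. It compares one factor per source neuron with one factor per connection:

```python
# Hypothetical example values; none of these come from the actual program.
inputs = [0.5, -0.2, 0.9]

# Flawed idea: one factor per source neuron.
factors = [0.3, -0.7, 0.1]

def layer_per_neuron(values, factors, n_out):
    # Every next neuron sees the same weighted sum, so they all end up equal.
    total = sum(v * f for v, f in zip(values, factors))
    return [total] * n_out

# Fixed idea: one factor per connection, weights[i][j] links input i to next neuron j.
weights = [
    [0.3, -0.5, 0.8, 0.1],
    [-0.7, 0.2, 0.4, -0.9],
    [0.1, 0.6, -0.3, 0.5],
]

def layer_per_connection(values, weights):
    n_out = len(weights[0])
    return [
        sum(values[i] * weights[i][j] for i in range(len(values)))
        for j in range(n_out)
    ]

flawed = layer_per_neuron(inputs, factors, 4)
fixed = layer_per_connection(inputs, weights)
```

With per-neuron factors the four next neurons are identical, so the layer carries no more information than a single neuron would. With per-connection factors each next neuron can compute something different.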

Pong AI.nx | Open in app
2022-04-22 14:34
Pong AI.nx | Open in app
2022-04-22 14:15
Pong AI.nx | Open in app
2022-04-22 14:10
Pong AI.nx | Open in app
2021-06-22 19:26
Pong AI.nx | Open in app
2021-06-22 17:02
Pong AI.nx | Open in app
2021-06-21 20:38
Pong AI.nx | Open in app
2021-06-21 19:40

was8bit 2021-06-21 21:51

Ambitious... very interesting :)


TheLivingBlooper 2021-06-22 16:10 (Edited)

I love this! in my opinion there isn't enough neural networking on LowRes


Timo 2021-06-22 16:35

How does this work? What moves the paddle?
I don't know a lot about neural networks, but the paddle needs some logic to move, right? Is it random (and then it learns from it)?


SP4CEBAR 2021-06-22 17:13 (Edited)

@Timo yes, that's kind of how it works
The neural network takes the input values (in this case there are three), multiplies every combination of them with a unique factor, and adds the results together to get the next 4 neurons. In my program it does this 2 times; the two neurons at the end are the outputs: move up and move down.

this is the structure of the neurons I'm using:
Inputs Interneurons outputs
O. . . . . O. . . . . O. . . . . O
O. . . . . O. . . . . O. . . . . O
O. . . . . O. . . . . O
. . . . . . O. . . . . O
Instead of the dots, there's every possible combination of the two layers of neurons:
3*4=12 connections for the first space
4*4=16 connections for the second space
4*2=8 connections for the last space
Each of these connections is multiplied by its own factor; the factors are values between -1 and 1.
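The structure above (3 inputs, two layers of 4 interneurons, 2 outputs, 12 + 16 + 8 = 36 factors) could be sketched in Python roughly like this. It is only a sketch under assumptions: the thread doesn't say what the three inputs are, whether an activation function is applied, or how the two outputs are turned into a move, so this version uses plain weighted sums and lets the larger output win. The function names are made up.

```python
import random

LAYER_SIZES = [3, 4, 4, 2]  # inputs, interneurons, interneurons, outputs

def random_factors(rng):
    """One factor in [-1, 1] per connection, grouped by the space between layers."""
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(n_out)] for _ in range(n_in)]
        for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:])
    ]

def forward(inputs, factors):
    """Propagate the inputs through every layer using plain weighted sums."""
    values = list(inputs)
    for layer in factors:
        n_out = len(layer[0])
        values = [
            sum(values[i] * layer[i][j] for i in range(len(values)))
            for j in range(n_out)
        ]
    return values

def decide(outputs):
    """Assumed rule: the larger of the two outputs (up, down) wins."""
    up, down = outputs
    if up > down:
        return "up"
    if down > up:
        return "down"
    return "stay"

factors = random_factors(random.Random(0))
move = decide(forward([0.1, 0.2, 0.3], factors))
```

Note that with all factors at zero both outputs are zero, so under this assumed rule a brand-new leader would not move the bat at all.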

To get those factors, it randomly mutates the factors of the leader (in the beginning the leader is all zeroes). These factors are then tested with the neural network, and the test score determines whether or not this version becomes the leader of the next generation.
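The mutate-and-test loop described above might look something like the following Python sketch. The population size, the mutation size, the clamping to [-1, 1], and the `evaluate` callback (which would play a game of Pong and return a score) are all assumptions made for illustration, not details taken from the program.

```python
import random

def mutate(factors, rate, rng):
    """Return a copy with each factor nudged by up to +/-rate, clamped to [-1, 1]."""
    return [max(-1.0, min(1.0, f + rng.uniform(-rate, rate))) for f in factors]

def train(evaluate, n_factors=36, generations=20, population=10, rate=0.2, seed=0):
    """Keep a leader; each generation, test mutated copies and promote the best one."""
    rng = random.Random(seed)
    leader = [0.0] * n_factors  # the leader starts as all zeroes
    leader_score = evaluate(leader)
    for _ in range(generations):
        best, best_score = leader, leader_score
        for _ in range(population):
            candidate = mutate(leader, rate, rng)
            score = evaluate(candidate)
            if score > best_score:  # only a strictly better score takes over
                best, best_score = candidate, score
        leader, leader_score = best, best_score
    return leader, leader_score

# Toy stand-in for a Pong game: the score is highest when every factor is 0.5.
def toy_score(factors):
    return -sum((f - 0.5) ** 2 for f in factors)

leader, leader_score = train(toy_score)
```

Because a candidate only replaces the leader when it scores strictly higher, the leader's score never drops from one generation to the next in this sketch.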

idk if this is going to work, but the current results look promising


SP4CEBAR 2021-06-22 19:28

@Timo, are persistent variables or arrays still possible, or should I use persistent RAM instead?


Timo 2021-06-22 21:22

There is only persistent RAM.


SP4CEBAR 2021-06-22 21:28

Okay


Mrlegoboy 2021-07-22 02:03

damn, that's crazy how it just doesn't learn. maybe something like pong is more suited for genetic algorithms.


SP4CEBAR 2021-07-25 19:11

The reason it doesn't learn probably comes down to a number of factors:
- the feedback: the highscore is an integer, and it takes a lot of effort to get a highscore that's one higher than the previous one, so it can't detect small improvements
- the mutation rate: with a bit of luck it got a highscore of four once, but it never did it again, meaning that it mutates way too much to keep an advantageous change. I think it should be decreased a little, but not too much or it won't ever be able to discover new things
- the complexity of the neural network: idk what arrangement of nodes would be optimal for pong

To make this work better I think some things need to be tweaked


Mrlegoboy 2021-07-26 02:57

what if instead of points, you trained the model by how far away the ball is on the frame where it needs to hit it? that way, if the net sucks at its job it can still learn.


SP4CEBAR 2021-07-30 10:44

@Mrlegoboy good idea! Every time it hits, I could add between 0 and 1 points to the score, depending on how close to the center of the bat the ball hit
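A fractional hit score like that could be sketched in Python as follows. The linear falloff from the center and the names (`ball_y`, `bat_center_y`, `bat_half_height`) are assumptions; the thread only says the score is between 0 and 1 depending on how close to the center the ball hit.

```python
def hit_points(ball_y, bat_center_y, bat_half_height):
    """Points for one hit: 1.0 at the bat's center, falling linearly to 0.0 at its edge.

    Assumes bat_half_height > 0. A ball outside the bat counts as a miss (0.0).
    """
    distance = abs(ball_y - bat_center_y)
    if distance > bat_half_height:
        return 0.0
    return 1.0 - distance / bat_half_height

center_hit = hit_points(10, 10, 4)   # dead center
half_hit = hit_points(12, 10, 4)     # halfway to the edge
miss = hit_points(20, 10, 4)         # ball passes outside the bat
```

Since the score is no longer an integer, two AIs that both hit the ball once can still be ranked against each other, which gives the selection step smaller improvements to work with.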


SP4CEBAR 2022-04-22 14:09

...I forgot about it


SP4CEBAR 2022-04-22 14:11

AI performance is now determined by how close to the center of the bat the ball hit


SP4CEBAR 2022-04-22 14:16 (Edited)

I changed the initial ball direction, so now the AI first needs to learn how to keep the bat centered (it can't just rely on pushing the bat to the wall anymore)


SP4CEBAR 2022-04-22 14:18

It's also now more likely for the AI to score more points


SP4CEBAR 2022-04-22 14:35

The mutation rate now decreases exponentially when the score of the leader increases
And... I forgot to reset the bat position for the next AI, so the "environment" would be different each time
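For reference, an exponentially decreasing mutation rate could look like this Python sketch. The constants (`base_rate`, `decay`) and the lower bound `min_rate` are assumptions picked for illustration; the thread doesn't give the actual formula.

```python
def mutation_rate(leader_score, base_rate=0.5, decay=0.8, min_rate=0.01):
    """Shrink the mutation rate exponentially as the leader's score grows.

    min_rate is an assumed floor so the AI can always still discover new things.
    """
    return max(min_rate, base_rate * decay ** leader_score)

rates = [mutation_rate(score) for score in range(5)]
```

Each extra point multiplies the rate by `decay`, so a good leader is only changed slightly, while a leader with a score near zero is still mutated heavily.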


SP4CEBAR 2022-04-22 14:44

I let it run for five minutes, and it still isn't great
I also noticed that quite a lot of the AI's neurons show a value of zero, which could be a symptom of a bug


was8bit 2022-04-22 16:26

Keep at it..... you'll crack it ;)

