Computer automated NES, but not how you think...

Evotistical · Apr 14, 2013

This guy made a way for a computer to automatically "solve" nes games through random chance(and a small train cycle), its own memory, and feedback of the memory addresses. Long story short, its a sped up timeline for the monkey/typewriter/Shakespeare problem:
http://hackaday.com/2013/04/14/teaching-a-computer-to-play-mario-seemingly-through-voodoo/

-=FamilyGuy=- · Apr 14, 2013

Nice find sir, very interesting read!

"The only winning move is not to play!"

[Edit, as I was inspired by RetroSwim's post below]

The amount of possible input combinations for a level of super mario bros, is 8^(300*60). (8 possible buttons, inputs are registered every 1/60 second and the timeout is 300 seconds)

This number is astronomical (~16 kB for the number alone): http://pastebin.com/raw.php?i=AdagzbQf
(pastebin'd to save you from scrolling down too much)

I guess his algorithm saves time! (to simulate all that in 100 years, you'd need to be more than ~3.96*10^16248 times the regular speed...)

RetroSwim · Apr 14, 2013

Evotistical said: ↑

Long story short, its a sped up timeline for the monkey/typewriter/Shakespeare problem
Click to expand...

Er, not really.

The computer isn't brute forcing the solution to the game. It has metrics for progress (points scored, X-axis position on level).

With the monkeys on typewriters problem, the monkeys don't know if a particular permutation is more or less like the target phrase. They have no way to know if they're "getting closer", thereby no way to improve. They will literally keep typing for eternity until the correct permutation is stumbled upon.

In the video you've linked, the program puts together combinations of actions that it knows contribute to progress, and excludes actions that result in a loss of progress. It actually learns to complete the game.

None the less, it is a very cool video, with very cool computer science behind it.

-=FamilyGuy=- · Apr 14, 2013

RetroSwim said: ↑

Er, not really.
Click to expand...

Arguably, he said "long story short" and "a sped up", which seems to indicate he knew the difference but was referring to the monkey problem so people without a big math culture would understand. Also, see my edited post above yours for a nice number, and a less nice one.

RetroSwim · Apr 15, 2013

That's the thing, I don't think it's "long story short" or "sped up" at all. It's a smarter approach to the monkey problem.

The monkey problem deals with the probability of them stumbling on the phrase.

In this case, the AI doesn't have to "stumble" on to the solution for the game, because it knows the conditions for both success and failure. It doesn't have to try every single possible combination of buttons vs time. It is also aware of the outcomes of its actions, unlike the monkeys that are hitting keys at random forever, no way to refine their attack, no "lessons learned" from failed attempts, and so forth.

You could take the monkeys route, and simulate every single combination of buttons for every single frame of the whole game, until you found one that succeeded. The point of the software in this video is that you don't have to, when the algorithm is self-refining.

This would be more like taking the monkeys with typewriters, and offering them a reward for every character that they get in the right position, to rule out combinations of letters that will never work.

It's not simply a "sped up" monkeys problem. It's much more refined. The computer doesn't "random chance" on the solution to the level, it learns steps that result in the conditions for success.

Also, one small flaw in your calculation.

The controller has 8 buttons, yes, but more than one of those can be down per retrace. That's 256 possible combinations. So it's 256[SUP](300*60)[/SUP]. :O

-=FamilyGuy=- · Apr 15, 2013

RetroSwim said: ↑

Also, one small flaw in your calculation.
Click to expand...

You're right, I under-estimated the number. But you've overestimated it, you counted A+B and B+A. Also opposite directions can't be pressed at the same time and it's possible not to press any button! Using original controllers of course. I don't know where your 256 comes from, 2^8 ?

The right number would be the amount of combination of 1 to 6 buttons you can do among 8 possibilities, plus one for no input:
Sum[8!/(i!*(8 - i)!), {i, 1, 6}] + 1 = 247
If we drop the start button as it's not very useful in Mario Bros:
Sum[7!/(i!*(7 - i)!), {i, 1, 5}] +1 = 120

So the real number of different Mario Bros input in a level is 247^(300*60) for all human-pressable buttons, or 120^(300*60) if you wanna avoid pausing all the time...

In any case, all these numbers are far greater than the estimated amount of atoms in the universe: 2^300 ...

HEX1GON · Apr 15, 2013

Wish I understood maths, and able to be more advanced than I am.
I envy those who can do and make such amazing projects like this.

It was funny how the AI played Pac-Man and Tetris, since the AI paused right at the end LOL.

RetroSwim · Apr 15, 2013

-=FamilyGuy=- said: ↑

You're right, I under-estimated the number. But you've overestimated it, you counted A+B and B+A. Also opposite directions can't be pressed at the same time and it's possible not to press any button! Using original controllers of course. I don't know where your 256 comes from, 2^8 ?

The right number would be the amount of combination of 1 to 6 buttons you can do among 8 possibilities, plus one for no input:
Sum[8!/(i!*(8 - i)!), {i, 1, 6}] + 1 = 247
If we drop the start button as it's not very useful in Mario Bros:
Sum[7!/(i!*(7 - i)!), {i, 1, 5}] +1 = 120

So the real number of different Mario Bros input in a level is 247^(300*60) for all human-pressable buttons, or 120^(300*60) if you wanna avoid pausing all the time...

In any case, all these numbers are far greater than the estimated amount of atoms in the universe: 2^300 ...
Click to expand...

You're over-complicating it a bit.

- A+B and B+A are the same, you don't need to specifically account for that. For a controller byte 00000000 to 11111111, A+B and B+A are the same bit pattern: 11. You wouldn't get 00000011 twice, for instance.

- You don't need to "plus one for no input" because 2^8 is 256 values, from 0 to 255, 00 to FF, 00000000 to 11111111. No buttons to all buttons.

- This isn't a human using a d-pad. It's software. It can press Up+Down or Left+Right simultaneously. It wouldn't, because of how it detects progress, but it could. That's the point, the software algorithmically rejects combinations that don't fit its criteria for progress. What "makes sense" to us with our human perception is irrelevant.

Otherwise, yeah, it's all moot because the numbers are so stupidly large! And that's what makes this thing so cool, it can find the best way without having to consider bazillions of possibilities that will never work.

-=FamilyGuy=- · Apr 15, 2013

I took the physicist approach, you took the computer engineer one. I think we're both right but we'd argue for hours on the subject around a beer. My approach falls back to yours if I consider all 8 or less buttons combinations, mine takes an eternity less to compute than yours though! But it's still an eternity more than the age of the universe, so it's a moot point.

Trust me I'm a physicist!

[Edit]
To be clear I was estimating the amount of calculations required for a "Monkey with a gamepad" approach, not the articles heuristic, which is both interesting and infinitely more efficient than Paco the monkey.

RetroSwim · Apr 15, 2013

It's cool too see how people's brains things when they come from from different backgrounds, the things they consider to be assumed and whatnot.

wilykat · Apr 15, 2013

Screw math just get smashed on beer and day old pizza.

smf · Apr 15, 2013

RetroSwim said: ↑

The computer isn't brute forcing the solution to the game. It has metrics for progress (points scored, X-axis position on level).

With the monkeys on typewriters problem, the monkeys don't know if a particular permutation is more or less like the target phrase. They have no way to know if they're "getting closer", thereby no way to improve. They will literally keep typing for eternity until the correct permutation is stumbled upon.

In the video you've linked, the program puts together combinations of actions that it knows contribute to progress, and excludes actions that result in a loss of progress. It actually learns to complete the game.
Click to expand...

The only point that it learns is when you train it, that is where it identifies the lexicographic numbers. This is actually the hard part as it needs to figure out two byte (or more) numbers that increase.

To play the game it picks the next move similar to how you play chess & the video is just a visual representation of the final list of moves that it calculated. When calculating the move list it just runs the code for a certain number of frames using random sequences of moves, finds the best outcome based on how the lexicographic numbers have changed, restores the ram & uses one/some/all of the moves from the sequence, then it starts around the loop again. This stage is what he means by time travel in the introduction.

Evotistical · Apr 16, 2013

Buy a Chinese repro nes controller. The kind that don't have the middle pivot divot; and you can press opposing directions at one time, causing mario or luigi to moonwalk in smb3. Dunno what it does in smb1.

Computer automated NES, but not how you think...

Evotistical Robust Member

-=FamilyGuy=- Site Supporter 2049

RetroSwim <B>Site Supporter 2013</B><BR><B>Site Supporter 20

-=FamilyGuy=- Site Supporter 2049

RetroSwim <B>Site Supporter 2013</B><BR><B>Site Supporter 20

-=FamilyGuy=- Site Supporter 2049

HEX1GON FREEZE! Scumbag

RetroSwim <B>Site Supporter 2013</B><BR><B>Site Supporter 20

-=FamilyGuy=- Site Supporter 2049

RetroSwim <B>Site Supporter 2013</B><BR><B>Site Supporter 20

wilykat Site Supporter 2013

smf mamedev

Evotistical Robust Member

Share This Page

Computer automated NES, but not how you think...

Evotistical Robust Member

-=FamilyGuy=- Site Supporter 2049

RetroSwim <B>Site Supporter 2013</B><BR><B>Site Supporter 20

-=FamilyGuy=- Site Supporter 2049

RetroSwim <B>Site Supporter 2013</B><BR><B>Site Supporter 20

-=FamilyGuy=- Site Supporter 2049

HEX1GON FREEZE! Scumbag

RetroSwim <B>Site Supporter 2013</B><BR><B>Site Supporter 20

-=FamilyGuy=- Site Supporter 2049

RetroSwim <B>Site Supporter 2013</B><BR><B>Site Supporter 20

wilykat Site Supporter 2013

smf mamedev

Evotistical Robust Member

Share This Page

Useful Searches