> While testing Jeff’s brain, we found an unusual bug: Jeff would play perfectly fine for a while, and then rapidly stack pieces on top of each other to reach the top, ending the game. The bug would usually happen when the board was starting to get taller and would be less apparent when we reduced the depth of the search—that is, looked fewer moves ahead while considering our next block placement.
Classic RL reward-hacking! You'll often see 'suicidal' agents when the reward-shaping goes wrong, or they converge prematurely after inadequate exploration. (Tom7's Mario agent famously pauses the game to avoid losing.)
> I’m not sure if there are good ways around this. Rigorous testing, maybe, but it’s hard to improve your algorithm by observation once it’s not making obvious mistakes. This is, perhaps, one of the promises of using reinforcement learning without human play training: if your algorithm is able to achieve human-level performance, it’s probably correct enough to continue well into superhuman performance.
In this case, because the random number generator's decisions are unaffected by your actions, you can use the clairvoyant oracle trick, I think. Just record games, and then do your DFS planning over the entire game with the known sequence of blocks; this then represents perfect play and is a benchmark for your actually feasible algorithms.
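The oracle idea can be sketched concretely. Below is a toy model (flat bars on a 6-wide board with a made-up cost function — not the article's actual engine or heuristic): record the whole piece sequence, then run an exhaustive DFS over it; the resulting cost lower-bounds what any depth-limited bot can achieve on that same game.

```python
import random

WIDTH = 6
PIECES = [1, 2, 3]   # toy "pieces": flat horizontal bars of these widths

def place(heights, width, col):
    """Drop a flat bar at `col`; return the new column heights after line clears."""
    top = max(heights[col:col + width])
    h = list(heights)
    for c in range(col, col + width):
        h[c] = top + 1
    while min(h) > 0:            # toy line clear: a full row lowers every column
        h = [x - 1 for x in h]
    return tuple(h)

def cost(heights):
    """Lower is better: penalize tall, bumpy stacks."""
    bumpiness = sum(abs(a - b) for a, b in zip(heights, heights[1:]))
    return max(heights) + bumpiness

def best_cost(heights, pieces):
    """Clairvoyant oracle: DFS over the ENTIRE recorded piece sequence."""
    if not pieces:
        return cost(heights)
    w = pieces[0]
    return min(best_cost(place(heights, w, c), pieces[1:])
               for c in range(WIDTH - w + 1))

def lookahead_play(heights, pieces, depth):
    """The feasible bot: at each step it only searches `depth` pieces ahead."""
    for i, w in enumerate(pieces):
        horizon = pieces[i + 1:i + depth]
        c = min(range(WIDTH - w + 1),
                key=lambda c: best_cost(place(heights, w, c), horizon))
        heights = place(heights, w, c)
    return cost(heights)

random.seed(0)
game = [random.choice(PIECES) for _ in range(6)]   # the recorded RNG outcomes
board = (0,) * WIDTH
oracle = best_cost(board, game)          # benchmark: perfect play on this game
bot = lookahead_play(board, game, depth=2)
assert oracle <= bot                     # the oracle lower-bounds any real bot
```

Since the bot's playout is one of the sequences the oracle searches, `oracle <= bot` holds by construction, which is exactly what makes it a usable benchmark.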
speff 24 hours ago [-]
I read the article, but I didn't find a reason as to why they picked the multiplayer Tetris game as opposed to the myriad of singleplayer ones. I think this is a neat beginner project, but can't help but think of the handful of times I tried T99 just to quickly quit again because I got stomped on every time.
How much unnecessary frustration was caused by testing this out in prod? I know they said they only got a handful of wins, but how many people did it knock out over the course of testing it?
chefandy 22 hours ago [-]
It’s weird that caring about people impacted by the training of a hobby NN model feels anachronistic. Well those people were ALL the way at the other end of the internet tube, and it was only a game so their time doesn’t really matter, and think about how many people “weren’t” impacted!
It seems like tech has gotten so dang self-absorbed.
sschul 3 hours ago [-]
Thanks for your critique of the project.
Initially, we chose Tetris 99 because that was the context in which we had the idea. I won't deny, though, that being able to compare Jeff's performance with human players and see his interactions and victories in the Tetris 99 environment made the project more interesting for us; however, it was wrong to do so in a public setting where the players didn't know that our algorithm was involved and didn't opt in to playing against it.
We're no longer testing or developing Jeff, and if I do any similar future projects I'll choose singleplayer games or private environments where opponents voluntarily test their skills against the bot.
Waterluvian 23 hours ago [-]
Feels like feeling slighted by a drop of rain in a storm.
speff 23 hours ago [-]
Your point's not lost on me, but if you allow me to modify the analogy to: Feels like feeling slighted by a water hose in a storm. I think that's more accurate. Yea I'm still drenched, but it would feel (irrationally) worse if someone sprayed me in the rain
hsisfullofbots 19 hours ago [-]
The only positive thing we can say is that the bot appears to be, at best, slightly above average at Tetris 99. Which makes it pathetic at this task, considering it's a bot and should achieve near-perfection with minimal effort. One extra slightly-above-average player has little impact on the game, honestly.
It doesn't do T-spins. It doesn't stack large attacks for KOs, and it doesn't seem to alter targets under any condition at all. The targeting reticule is hard-stuck on attackers, with zero input put into it. So it has no overarching strategy besides survival, but even then, it doesn't play the badge-stealing game either. Backstabbing a leading player is the leading method of making comebacks, and a necessary one too. Which is why it reliably gets into the top 15 and then loses: you cannot out-fire someone without badges, and there ARE one-shots in Tetris 99. The bot probably only won when it lucked out early on into being the leader of the pack and then fed on a crowd of people with targeting set to "badges" (there's a firepower multiplier when you're targeted by many people).
Which is... kinda pathetic reading, coming from someone who reliably got top 3 outside of Invictus mode. I think the creator might have zero idea of what defines Tetris 99 at all. Like, straight up no awareness whatsoever. They're bad at this game, and thus made a bot that's equally bad and only wins with luck.
It's just playing normal Tetris. It does literally nothing with the human element, so it'll beat other clueless humans and then lose to the actually good humans. Between those two groups, it's just one more threat in a sea of typical, indistinguishable fodder.
Good on the creator for stopping before making something more competent.
novia 19 hours ago [-]
I am good at Tetris 99. I've placed first many times.
After you place first it unlocks a level where you battle against only other users who have also placed first. I don't see any mention of this second level in the article (though admittedly I skimmed it), which makes me suspicious of whether they actually got their bot to place first. Obviously the next step would be to try it out on that harder level.
Also, more often than not, there are not 99 player characters who you are battling against. The field has always been filled out with other bots, generated by the game designers themselves. I don't think they would ban a bot player from the game.
zamadatix 18 hours ago [-]
Someone generating dud/faked source code, examples of the algorithms and challenges, videos of gameplay, and an article... rather than just winning a game of Tetris 99 with the approach? That would be more work than the face value of the article, not less, so I don't buy that theory at all.
Plenty of games have both official bots and anti-cheat and player botting rules, the presence of one does not define what the presence of another will or won't be.
umvi 17 hours ago [-]
Pretty sure Invictus is mostly bots these days
soganess 1 day ago [-]
This is really cool! Plus I like the implication that the 'hard bits' of playing a game like Tetris might be contained in the vision/recognition phase.
My only gotcha is with calling DFS 'classical AI'. I get that they have a fancy hand-tuned cost function, but if that is AI, then I guess all approximate optimization algorithms are AI? Maybe in the '60s that kind of classification would have been fine, but not now, right?
Either way, that is a minor nerd-gripe with the messaging and does not, at all, detract from an otherwise rad project.

Thus classical, as a reference to the idea that:

As Soon As It Works, No One Calls It AI Anymore: https://quoteinvestigator.com/2024/06/20/not-ai/
Retric 1 day ago [-]
> Maybe in 60s that kinda of classification would have been fine, but not now, right?
It’s kind of surprising to me we still call feedforward neural networks AI.
soganess 21 hours ago [-]
I can totally see where you are coming from and I like the wink and nod to the article!
I guess... hrm... if someone asked me what classical AI was today, I would answer with something like A*, decision trees, or Markov-model-style algorithms (but never DFS). All of these have 'worked' for a long time but still feel like classical (i.e., no back-prop) AI.
To your second point, I'd be the first to admit I'm no expert (I don't do any ML professionally). In my limited understanding, it feels odd to separate MLPs from 'modern' AI. It's all still back-prop, yeah? I get that universal approximators and (idealized) Turing-machine approximators like LSTMs are old hat. But attention-based (query, key, value) models just seem like the next step on the same ladder of "increasingly complex computing paradigms we've figured out a way to back-prop through," not something uniquely separate. First we could do functions, then we could do Turing machines, and now we can do data queries.
But again, I'm probably just showing how green I am. I would, time permitting, honestly appreciate clarification/correction.
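The "data queries" framing above can be made concrete with a minimal scaled dot-product attention in NumPy (illustrative only, no batching or masking): it is a softmax-weighted, differentiable dictionary lookup.

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: a 'soft' differentiable lookup table.
    Each query row retrieves a weighted mix of the value rows, with weights
    given by how well the query matches each key."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # query/key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V

# A query that strongly matches one key behaves like a hard dict lookup:
K = np.eye(3)                     # three orthogonal keys
V = np.array([[10.], [20.], [30.]])
q = np.array([[100., 0., 0.]])    # overwhelmingly matches key 0
out = attention(q, K, V)
assert abs(out[0, 0] - 10.0) < 1e-3
```

With a softer query the output blends values instead of selecting one, which is exactly the "next rung on the back-prop ladder" point: the lookup itself is differentiable.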
robertclaus 17 hours ago [-]
A bit into the article I reframed "AI" to mean "bot".
free_bip 24 hours ago [-]
Hopefully the author gets banned from the game for this. It's clearly cheating.
tobyhinloopen 23 hours ago [-]
I agree.
It is the future of cheating, completely undetectable
Daneel_ 23 hours ago [-]
The inputs would form a very distinct pattern that likely doesn’t look anything like what a human creates - I think this would certainly be detectable, although you could also account for this in your program (similar to the ongoing back and forth with captcha).
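As a sketch of that detectability argument: un-humanised bot inputs tend to have near-zero variance in their inter-press timing. The heuristic below (my own crude illustration, not any real anti-cheat) flags streams whose coefficient of variation is suspiciously low, and also shows why simple jitter defeats it.

```python
import random
import statistics

def looks_robotic(intervals_ms, cv_threshold=0.05):
    """Crude heuristic: flag input streams whose inter-press intervals are
    suspiciously regular (coefficient of variation below a threshold)."""
    mean = statistics.mean(intervals_ms)
    cv = statistics.stdev(intervals_ms) / mean
    return cv < cv_threshold

random.seed(1)
bot = [50.0] * 200                                   # fixed 50 ms between presses
human = [random.gauss(120, 35) for _ in range(200)]  # noisy human-ish timing
jittered_bot = [50 + random.gauss(0, 15) for _ in range(200)]

assert looks_robotic(bot)
assert not looks_robotic(human)
assert not looks_robotic(jittered_bot)   # 'humanised' inputs slip past the check
```

Real detection would look at much richer features (key-overlap patterns, reaction-time distributions, long-run drift), but the arms-race shape is the same.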
hsisfullofbots 19 hours ago [-]
Unfortunately, hard disagree. There are input "fuzzing" methods, and I know because I saw them operating in person.
And that would still be in the realm of "robotic" behaviours. We're beyond that point. Hearthstone is a game where bots will do everything in their power to fake humanity, even displaying intentionally toxic behaviors, like using certain lines in certain scenarios specifically meant to aggravate you (hovering cards that will get lethal next turn back and forth, followed by Priest's Hello), or even straight up roping.
Daneel_ 13 hours ago [-]
I meant that the inputs from TFA would be easily identifiable as there was no attempt to ‘humanise’ the inputs. If the author had attempted to do so then obviously it’s more difficult (or possibly impossible).
johnisgood 12 hours ago [-]
Most "advanced enough" cheats support "humanization".
It looks like the bot just beat humans because it reacts faster.
foobaw 1 day ago [-]
Yes, but this aligns with the initial statement in the post that top 1 doesn't happen as much and top 15 is more common.
gildas 1 day ago [-]
I must admit I'm a little disappointed that the bot isn't able to do this. It would be much more efficient.
minimaxir 24 hours ago [-]
Given the computer vision approach, I'm surprised they didn't try different themes other than the Fire Emblem: Three Houses theme. There are many other themes for Tetris 99 with likely better visual clarity for the tetriminos.
toast0 23 hours ago [-]
For the author, if they revisit this, they might look into the 7-bag randomizer. TL;DR: most modern Tetris games, including Tetris 99, deal a permutation of the 7 unique pieces every 7 pieces; this means that even though Tetris 99 only shows you the next 6 pieces, you can often deduce more about the pieces beyond them.
For example, at the beginning of the game, it shows you 6 pieces, but you can determine the 7th piece because it's the one that's not shown. The 8th piece could be any of the 7 pieces, but when you see the 13th piece, you know the 14th piece (and when you see the 12th piece, you know there are only two possibilities for the 13th piece).
Additional look ahead every so often might be helpful, given that they found
> When looking only 3 moves ahead, Jeff achieves a Tetris percent of 8.9%. When looking 6 moves ahead, he improves to 9.7%—nearly perfect!
I'd also be interested in seeing optimization for combos. Or t-spins [1]; t-spins send 2x lines of garbage per line cleared, which I hate, because I grew up before t-spins were encouraged. :P

[1] https://harddrop.com/wiki/T-Spin_Guide
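The bag inference described above can be sketched in a few lines (a minimal guideline-style 7-bag, not Tetris 99's actual implementation): within each bag of 7, every shown piece eliminates possibilities for the rest.

```python
import random

PIECES = set("IJLOSTZ")

def seven_bag(rng, n):
    """Guideline-style randomizer: deal a shuffled bag of all 7 pieces, repeat."""
    out = []
    while len(out) < n:
        bag = sorted(PIECES)   # fixed order first, so the shuffle is the only randomness
        rng.shuffle(bag)
        out += bag
    return out[:n]

rng = random.Random(42)
seq = seven_bag(rng, 21)

# Seeing the first 6 pieces pins down the 7th: it's the one piece not yet shown.
predicted_seventh = (PIECES - set(seq[:6])).pop()
assert predicted_seventh == seq[6]

# Mid-game the same logic applies: once pieces 8..13 (the first six of the
# second bag) are visible, piece 14 is forced.
assert (PIECES - set(seq[7:13])).pop() == seq[13]
```

A planner can exploit this by extending its effective preview by one piece whenever its position in the bag makes the remainder deterministic.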
Sohcahtoa82 23 hours ago [-]
> TLDR, most modern tetris games, including tetris 99 select a permutation of the 7 unique pieces every 7 pieces; this means even though Tetris 99 shows you the next 6 pieces, you may know more about the next pieces.
Wait, is this true?
I always assumed each piece was selected perfectly randomly, making it possible (though rare) to get the same piece 3 times in a row.
If what you're saying is true, then that means there should never be more than 12 pieces between I pieces, and if you get two in a row, then it'll be a minimum of 6 before you see another.
Which...all seems within the realm of possibility. Tetris always seemed really good at having a very even spread of pieces without "streaks".
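That 12-piece bound follows because each 7-bag contains exactly one I: the worst case is an I first in one bag and last in the next. A quick simulation (assuming a guideline-style 7-bag, which may differ from Tetris 99's exact implementation) is consistent with it:

```python
import random

def seven_bag_sequence(rng, bags):
    """Concatenate `bags` shuffled 7-piece bags."""
    seq = []
    for _ in range(bags):
        bag = list("IJLOSTZ")
        rng.shuffle(bag)
        seq += bag
    return seq

def max_gap(seq, piece="I"):
    """Largest number of other pieces between consecutive occurrences."""
    idx = [i for i, p in enumerate(seq) if p == piece]
    return max(b - a - 1 for a, b in zip(idx, idx[1:]))

rng = random.Random(0)
worst = max(max_gap(seven_bag_sequence(rng, 500)) for _ in range(10))
assert worst <= 12   # I first in one bag (pos 1), then last in the next (pos 7)
```

Under a memoryless uniform randomizer, by contrast, gaps well above 12 show up routinely, which is why the "no streaks" feel is a strong hint that a bag system is in use.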
Yeah, it's true. Old games like Game Boy and NES Tetris (either flavor), and the Atari arcade machine, were one-piece-at-a-time random, and not particularly evenly distributed either.

https://tetris.wiki/Tetris_Guideline

https://simon.lc/the-history-of-tetris-randomizers

via: https://news.ycombinator.com/item?id=20872110
vikingerik 7 hours ago [-]
The Game Boy one attempts to compensate for clumpiness by reducing the chance of repeating a piece, but it turns out even that is bugged, so the overall randomization is unbalanced: https://harddrop.com/wiki/Tetris_(Game_Boy)#Randomizer
I considered making a Tetris 99 bot for a while just for the sake of doing it, but interfacing with the Switch sounded like a huge pain. Cool to see someone actually went through with it.
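The repeat-suppression idea can be sketched as a history randomizer in the spirit of the Game Boy one (a simplified stand-in, not its exact, bugged implementation): reroll a few times if the candidate repeats a recent piece, then give up and keep the last roll.

```python
import random

def reroll_randomizer(rng, n, tries=3):
    """History randomizer sketch: reroll up to `tries` times if the candidate
    matches one of the last two pieces, then accept whatever came up last."""
    pieces = "IJLOSTZ"
    out = []
    for _ in range(n):
        for _ in range(tries):
            cand = rng.choice(pieces)
            if cand not in out[-2:]:
                break            # candidate doesn't repeat recent history
        out.append(cand)         # after `tries` failures, keep the repeat anyway
    return out

def repeat_rate(seq):
    """Fraction of back-to-back identical pieces."""
    return sum(a == b for a, b in zip(seq, seq[1:])) / (len(seq) - 1)

rng = random.Random(3)
biased = reroll_randomizer(rng, 20000)
uniform = [rng.choice("IJLOSTZ") for _ in range(20000)]
assert repeat_rate(biased) < repeat_rate(uniform)  # repeats are suppressed, not eliminated
```

Note the trade-off: the stream is no longer uniform over pieces-in-context, which is exactly the kind of skew that, when the comparison logic is buggy, leaves the overall distribution unbalanced.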
bl4kers 19 hours ago [-]
Doesn't Nintendo frequently fill the 99 spots with bots?
TZubiri 23 hours ago [-]
Using AI to win in a game where reflexes are a key factor is lame.
Bad players are so desperate to cheat in order to win. They don't believe in fair competition; they'll use any dirty trick to win.
And this is also why most online games are full of invasive anti-cheat crap.
tobyhinloopen 1 day ago [-]
Alternative title: How I cheated in an online multiplayer game.
excalibur 1 day ago [-]
Welp there it is, I've been replaced by AI.
MisterTea 1 day ago [-]
Relax, it's not an AI. Instead, you were replaced by a depth-first search algo.
Kapura 1 day ago [-]
I am not impressed that a computer can beat a human at a computer game. i am more pissed that i may have been playing tetris 99 against unbeatable ai.
ultimafan 1 day ago [-]
100%, can we not normalize this kind of thing in multiplayer games played by humans?
My friends and I have quit a number of multiplayer computer games when they've become infested with cheaters and developers couldn't or wouldn't keep up with the arms race of banning them. This is really not that different from someone writing and using an aimbot in a public lobby for a shooter.
If you're going to be writing and testing cheats or AI or bots do it in single player games or private multiplayer lobbies so normal players aren't affected.
HenryBemis 1 day ago [-]
Especially for a game like Tetris that has a limited (many, but finite) set of options, it comes down to how fast your eye-hand coordination and reflexes are. There would be no point in "The Flash" beating the game, for that very reason.
GuB-42 24 hours ago [-]
It is not unbeatable. The author says it doesn't always get first place, far from it, and the video is one of its best games. He deliberately stopped working on the project before it could become unbeatable by a human.
dankwizard 19 hours ago [-]
Even if it isn't winning, I'm choosing multiplayer to play against others. If I wanted to verse machine code, I'd play single player.