
Comments #1

Open

SarvagyaVaish opened this issue Feb 15, 2014 · 49 comments

@SarvagyaVaish
Owner

Leave your comments here...

@xissy

xissy commented Feb 15, 2014

Wow, this is amazing. I'm inspired by your practical ML approach.

@iandanforth

You should get a tapsterbot! https://github.com/hugs/tapsterbot

@Aaron1011

This is incredible!

@joeyslater

That's what's up man.

@dend

dend commented Feb 16, 2014

Awesome job, Survy!

@halfdan

halfdan commented Feb 16, 2014

Nice job - please add a proper reference to the source of the pseudocode though. It's clearly taken from a publication.

@Giszmo

Giszmo commented Feb 16, 2014

I've never done image analysis, but I assume it would be trivial to do with a camera for your Android bot. You said the image (screenshot) takes 2 s to get to the PC? A camera should be much, much faster. The image analysis would basically just need to scan the right side of the screen for green / not-green / green. The timing is constant.

@ztl2004

ztl2004 commented Feb 16, 2014

Dude, this is fantastic, and it's what I've been thinking about for a long time. I've noticed that you want to do this on mobile. I've studied iOS private APIs and have done screen capture and touch simulation. Do you think there's a possibility that we could work it out together?

@bolte-17

Any thought to adding the bird's current velocity (or, as a proxy, the time since the last tap) to the state space? That seems to be the only missing parameter.

@ztl2004

ztl2004 commented Feb 16, 2014

But I think it's hard to get.


@cbbayburt

Actually, simulating the game's dynamics might lead to a simpler and more precise solution. The game doesn't really involve sophisticated decision steps that require ML. Since it is really a pure physics problem, the simpler solution depends on a few simple observations:

  • Through observation, I found that keeping the bird level requires a tapping period of 600 ms.
  • Let's say the bird's jump height is hb. So in the original game, every tap makes the bird go up hb units, while every 600 ms its height goes down hb units.
  • Ascending and descending are achieved simply by changing the tapping period (smaller to ascend, larger to descend).
  • The actual amount of ascent can be calculated as dh = hb - (hb * ptap / 600). From this, the tapping period required to achieve a specific ascent/descent amount dh can be calculated as ptap = 600 - (600 * dh / hb).

[image: flappy]

So the algorithm would be:

hb: The bird's jump height for a single tap; in other words, the amplitude of the bird's harmonic motion in level flight (it is constant and can be measured in pixels).
hBird: Height of the middle point of the bird's harmonic motion.
hObstacle: Height of the middle point of the space between the pipes.
ptap: Waiting period before the next tap.
dh: The height difference between the bird and the obstacle path.

for each immediate uncleared obstacle:
  while(obstacle_not_cleared)
    dh <- hObstacle - hBird
    ptap <- 600 - (600 * dh / hb)
    if ptap < 0 then ptap <- 0  //Gonna fall, tap immediately
    sleep(ptap)
    tap()

This algorithm can make the flappy bird fly forever. For Android, instead of requesting .png screenshots, which really takes about 1-2 seconds, you can analyze specific pixels in the raw frame buffer (a Unix device file like /dev/graphics/fb0), which gives you enough speed to run the algorithm. But for that, you obviously need a rooted device.
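
A minimal Python sketch of that tapping loop, assuming heights are measured upward and that bird_height, obstacle_height, obstacle_cleared, and tap are hypothetical hooks into whatever screen-reading and input backend is used:

    import time

    LEVEL_PERIOD_MS = 600.0  # tapping period that keeps the bird level (from the observation above)

    def clear_obstacle(bird_height, obstacle_height, obstacle_cleared, tap, h_b):
        # h_b is the measured jump height per tap, in pixels
        while not obstacle_cleared():
            dh = obstacle_height() - bird_height()                 # signed height difference to the gap centre
            p_tap = LEVEL_PERIOD_MS - LEVEL_PERIOD_MS * dh / h_b   # waiting period before the next tap (ms)
            p_tap = max(p_tap, 0.0)                                # would fall too far: tap immediately
            time.sleep(p_tap / 1000.0)                             # time.sleep takes seconds
            tap()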

@SarvagyaVaish
Owner Author

Analyzing the "specific pixels in the raw frame buffer" is worth a shot! Thanks!
And I agree that the solution would be nicer if I simulated the game dynamics, but I wanted to approach the problem using machine learning. Thanks for the solution though.

@savraj

savraj commented Feb 16, 2014

I'd love a deeper walkthrough of this -- maybe a YouTube video.

@metaylor

This is very cool. Good idea to pick a popular game and show that ML can solve it! I'm going to bring up your project as a discussion topic in the graduate reinforcement learning class I'm currently teaching.

http://www.eecs.wsu.edu/~taylorm/14_580/index.html

@SarvagyaVaish
Owner Author

That is awesome!! I am honored. Thanks :)
May I ask how you found the link?

@metaylor

My brother pointed me to it. I'm not sure how he found out about it though.

Best,
Matt

@ztl2004

ztl2004 commented Feb 17, 2014

Maybe Reddit.


@billhao

billhao commented Feb 17, 2014

this is very cool!

@ataugeron

"Get this to work on a mobile phone!! If anyone has any ideas, please let me know in the comments :)"

Did you try using monkeyrunner (Python, Android) or UIAutomation (JavaScript, iOS)?

@SarvagyaVaish
Owner Author

monkeyrunner took about 1-2 seconds to get a screenshot, so it's not responsive enough.
I haven't tried UIAutomation, but do you know if its response time is any better?

@cxt120

cxt120 commented Feb 17, 2014

Does the training only work on a specific map?

@SarvagyaVaish
Owner Author

There is no "map". There is randomness as far as the pipe height is concerned, but the game is basically just one never-ending randomized "map" of pipes coming towards you.

@cooperjay

I just found another working method over here: http://flappybirdhack.hol.es/

@thebino

thebino commented Mar 3, 2014

How do you want to grab /dev/graphics/fb0 and use the resulting image for the calculation? Do you want to write something like a TestCase with event injection on the WindowManager?

@Eniac-Xie

Is Q[s,a] just a large array? Or a function, like a BP neural network?

@SarvagyaVaish
Owner Author

Yeah. Q is a multi-dimensional array representing the entire state space.
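
For illustration, a rough Python sketch of such a tabular Q function together with the standard Q-learning update; the dimensions, learning rate, discount, and indexing scheme below are placeholders, not the repo's exact values:

    import numpy as np

    # Discretized state: (horizontal distance bin, vertical distance bin); actions: 0 = do nothing, 1 = flap.
    N_DX, N_DY, N_ACTIONS = 50, 100, 2
    Q = np.zeros((N_DX, N_DY, N_ACTIONS))   # the entire state-action space as one array

    ALPHA, GAMMA = 0.7, 1.0                 # learning rate and discount factor (placeholders)

    def q_update(state, action, reward, next_state):
        # Q[s,a] += alpha * (r + gamma * max_a' Q[s',a'] - Q[s,a])
        idx = state + (action,)
        Q[idx] += ALPHA * (reward + GAMMA * Q[next_state].max() - Q[idx])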

@Eniac-Xie

I'm a little curious. I think the bird's speed should also be considered. I mean that birds at the same position but with different speeds will lead to different results, won't they?

@SarvagyaVaish
Owner Author

Based on the game dynamics, the bird always gets the same upward velocity irrespective of its velocity at the time of input. So, weirdly enough, two birds at the same position with different speeds will end up at the same position when the user tells them to jump.
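
In other words, a tap overwrites the vertical velocity rather than adding to it. A tiny sketch of that kind of physics step (the constants are made up, not taken from the game):

    GRAVITY = 0.4         # downward acceleration per frame (made-up value)
    FLAP_VELOCITY = -7.0  # velocity assigned on a tap, negative = up (made-up value)

    def physics_step(y, vy, tapped):
        vy = FLAP_VELOCITY if tapped else vy + GRAVITY   # a tap resets velocity; it does not accumulate
        return y + vy, vy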

@Eniac-Xie

thank you!


@Eniac-Xie

I tried it myself but found that Q cannot converge in a short time. Maybe my Q is too large (160×401×2, although 160×401×2 does not seem that large). How large is your Q?

@SarvagyaVaish
Owner Author

It takes about 6-8 hours at regular game speed for flappy to learn a good model.

@andreydung

How do you run the code? Is it simply running index.html?

@SarvagyaVaish
Owner Author

Yeah. Just start up a local server (WAMP, XAMPP) and open index.html.
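
If Python is installed, its built-in http.server module is a lighter-weight alternative to WAMP/XAMPP; run this from the repo root and open http://localhost:8000/index.html:

    # equivalent to running "python -m http.server 8000" in the repo root
    from http.server import HTTPServer, SimpleHTTPRequestHandler

    HTTPServer(("", 8000), SimpleHTTPRequestHandler).serve_forever()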

@junzhez

junzhez commented Jun 5, 2014

Just a quick question. Can the vertical distance to pipe bottom be negative?

@SarvagyaVaish
Owner Author

Yes, if the bird is below the pipe :)

@junzhez

junzhez commented Jun 7, 2014

Thanks for your reply. May I ask about the dimensions of your state space? I am trying to reproduce your work with another copy of Flappy Bird. It seems that my state space is way too large.

@SarvagyaVaish
Owner Author

I don't remember exactly, but it was huge! It takes a while to train. Check out http://sarvagyavaish.github.io/FlappyBirdRL/ for more details.

@tropicdome

Nice work, I love seeing RL applied to something fun like this, kudos :)

I have tried out your implementation with different resolutions, since this can greatly decrease the number of states. Using a resolution of 10 instead of 4 lowered the state space from 12150 states to 1944. Here is my data after running it:

  • 14 points after 7 min
  • 17 points after 9 min
  • 48 points after 10 min
  • 62 points after 14 min
  • 145 points after 19 min
  • 496 points after 25 min
  • 1000+ after 1h 10min

One question: does it, or should it, take the distance to the ground into account? When you get a pipe that is really close to the ground, the bird sometimes wants to go below it and then jump, which it obviously can't, but it doesn't seem to be learning from this?

@SarvagyaVaish
Owner Author

Thanks for crunching the numbers! It's cool to see that the state space affects the learning times so drastically.
About the distance from the ground, it's true that the model doesn't learn that it should jump when close to the ground. I didn't want to add another dimension to my state space, and that's primarily why I don't take it into account. But for better results, you could probably add a general (non-learned) rule that says the bird must jump when close to the ground. Another idea would be to add that third dimension of distance to the ground but only have two states in it - less than xx units from the ground, and more than xx units from the ground. That way you would only be doubling the state space, but the system could learn the rule anyway :)
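
A sketch of that second idea in Python; GROUND_THRESHOLD and the state layout are hypothetical:

    GROUND_THRESHOLD = 80  # pixels; hypothetical cut-off for "close to the ground"

    def make_state(dx_bin, dy_bin, bird_y, ground_y):
        near_ground = 1 if (ground_y - bird_y) < GROUND_THRESHOLD else 0   # only two possible values
        return (dx_bin, dy_bin, near_ground)   # the third dimension merely doubles the state space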

@SteveRik

SteveRik commented Jan 5, 2015

Very nice blog. Thanks for sharing! Is there any possibility that the vertical distance to the pipe bottom could be negative? Please advise. Thanks! https://intellipaat.com/

@SarvagyaVaish
Owner Author

Yes. It is possible and the model accounts for that :)


@xoancosmed

Is it open source?

@SarvagyaVaish
Owner Author

Yes.


@AIForex

AIForex commented Nov 29, 2015

I'm working on a similar program that would involve the Forex market.

Actions per bar would be as follows: 1) buy at open, exit at close; 2) sell at open, exit at close; 3) do nothing. Soon to be published on www.marketcheck.co.uk.

Peter
peterkhenry@gmail.com

@Aytros

Aytros commented Jun 11, 2016

This is great! I recently graduated with a degree in Comp. Sci. In my last semester I took Intro to AI, and our final project was to implement this on our own; we were provided with a working Python Flappy Bird. My agent was not very efficient, but it did learn a little, so I did well. Now that I have graduated, I would like to improve my agent for my own sake. Would you be able to look over my algorithm and give some feedback on how I might be able to improve?

@paulocastroo

paulocastroo commented Apr 6, 2018

6-7 hours is not good at all. I got this Flappy Bird bot training done in 3 minutes with a random forest. I'll see if I can find some room for improvement in your Q-learning code.

@SarvagyaVaish
Owner Author

@paulocastroo That's because I was running Flappy Bird in real time using the game engine. If you could speed up the simulation, training would end up being significantly faster.
Curious to learn how you used a random forest to train. Let me know :) Thanks!

@tropicdome

tropicdome commented Apr 6, 2018

For classic Q-learning, @SarvagyaVaish's implementation is already quite good. It doesn't have to take 6-7 hours: besides the real-time aspect @SarvagyaVaish mentioned, you could/should optimize your state-space representation. For example, change the resolution to e.g. 20 to reduce the state space significantly (which is reasonable) and it will train in under 15 minutes running in real time; with 30 it trained for me in 2.5 minutes.
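
For context, the "resolution" here is just the bin size used when discretizing the pixel distances that make up the state; a small sketch (the value 20 matches the example above, everything else is illustrative):

    RESOLUTION = 20  # larger bins -> far fewer states -> much faster convergence

    def discretize(dx_pixels, dy_pixels):
        # e.g. with RESOLUTION = 20, distances 0-19 share bin 0, 20-39 share bin 1, ...
        return (int(dx_pixels) // RESOLUTION, int(dy_pixels) // RESOLUTION)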

@paulocastroo

@SarvagyaVaish Oh sorry, I wasn't paying attention to the real-time aspect. I made some changes to the states; I tried to compress them as much as possible. The state ended up as the difference/distance between the height of the bird and the pipe hole, which makes the overall matrix much smaller. Here's a demo: https://planktonfun.github.io/q-learning-js/step-6.html
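
That compression roughly amounts to keying the table on a single signed difference instead of an (x, y) pair; a hypothetical sketch:

    def compressed_state(bird_y, hole_y, resolution=20):
        # one signed bin: how far the bird is above or below the pipe hole
        return int(hole_y - bird_y) // resolution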
