New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spades Implementation #1214
Comments
Hi @i-Madsen That sounds about right to me, but I will give a bit of advice: those observation tensor functions are the most difficult ones to implement. Leave them until the end. I suggest getting a working implementation of Spades with those functions left blank or returning bogus values first because that's already a nontrivial task. It's best to discuss those functions only after the basic game logic functionality is implemented. |
Gotcha, that makes sense. Thanks, I'll stick with this for now and continue on with getting the game logic all working. |
I've now got the game functionality able to build and run, completing the basic game tests successfully (I'm ignoring/disabling anything to do with double_double and the uncontested_bidding variant). I think I'm about ready to take a stab at implementing the tensor functions, but have a question about the overall approach for this game. The Bridge implementation runs only a single round as a zero sum game, always starting with the same player. While this makes sense for Bridge, Spades is a little different as both teams can go either positive or negative during a round. Therefore, the overall game state is much more important as a lower-risk bid that results in that partnership winning the overall game is much better than a high-risk bid that results in a higher point difference. (Overall game state also determines if bag penalties are a risk during the round.) Currently, I've kept the Bridge structure of always starting with player 0 and playing a single round and my thoughts are that I'll just feed in game parameters like this:
The outside training script can track the scores and manage the teams by 'rotating' what positions the players/team scores are in. (Also need to add a big bonus for winning to the returned rewards in the game environment?) @lanctot Does this seem reasonable? Or do I need to rethink the approach? |
Hi @i-Madsen , yes, that sounds very reasonable to me! And using the Bridge functions as a reference is good. You can also check out Hearts, Skat, and Dou Di Zhu games as well for reference. Good luck! |
Hi there, I'm trying to make a new implementation of Spades and am looking for some help along the way as I'm pretty new to RL. Ultimately, I'm hoping to use the framework to train a Spades AI and use it in an iOS/Android app made in Unity (but that's getting a little ahead of myself).
Since the game of Bridge has a similar structure (partnerships, bidding phase, trick taking) and is already in OpenSpiel, I'm using it as a base, but some game concepts are pretty different - particularly how bidding works.
So the first thing I want to make sure I'm doing correctly is dealing with the tensor sizing/representation.
bridge.h starts out with:
As far as I understand, Bridge bidding has multiple actions you can take and keeps going until all other players pass, culminating in a single contract. Spades, on the other hand, has each player make a single bid and then the bidding phase ends, leaving '4' contracts (partnerships typically work together to meet their combined contract, but a 'Nil' bid (0) is player dependent). Spades also has no concepts of being vulnerable or having a declarer.
So with that in mind, I'm thinking I change it like this:
Later on, there is a method for getting the play tensor size.
bridge.h
Again, bids in spades are a simple. single bid, one per player, with no concept of modifying calls, a declarer, or vulnerability. Spades will always be trump and there is no 'dummy' hand, so I'm assuming I can just remove half of these and then multiply bids/tricks by kNumPlayers since we'll want to track each player individually.
spades.h
Am I misunderstanding anything or does this seem correct?
The text was updated successfully, but these errors were encountered: