Progress in Approximate Nash Equilibria - Corelab

Progress in Approximate Nash EquilibriaConstantinos Daskalakis ∗Computer ScienceUC BerkeleyBerkeley, CAcostis@cs.berkeley.eduAranyak MehtaIBM Almaden ResearchCenterSan Jose, CAmehtaa@us.ibm.comChristos PapadimitriouComputer ScienceUC BerkeleyBerkeley, CAchristos@cs.berkeley.eduABSTRACTIt is known [5] that an additively ɛ-approximate Nash equilibrium(with supports of size at most two) can be computedin polynomial time in any 2-player game with ɛ = .5. It isalso known that no approximation better than .5 is possibleunless equilibria with support larger than log n are considered,where n is the number of strategies per player. Wegive a polynomial algorithm for computing an ɛ-approximateNash equilibrium in 2-player games with ɛ ≈ .38; our algorithmcomputes equilibria with arbitrarily large supports.Categories and Subject DescriptorsF.2.0 [Theory of Computation]: analysis of algorithmsand problem complexity—generalGeneral TermsAlgorithms, Economics, TheoryKeywordsNash Equilibrium, Approximate Nash, Algorithm1. INTRODUCTIONIt was recently shown that finding a Nash equilibrium isPPAD-complete [4], even for 2-player games [2]. As a consequence,finding approximate Nash equilibria has emergedas the main remaining open question in the area of equilibriumcomputation. The most commonly studied form of approximationis additive approximation in games in which allutilities have been normalized to be between 0 and 1 (this isa common assumption, since scaling the utilities of a playerby any positive factor, and applying any additive constant,results in an equivalent game). A set of mixed strategies iscalled an ɛ-approximate Nash equilibrium, whereɛ>0, if∗ The first and third authors were supported through NSFgrant CCF - 0635319, a gift from Yahoo! Research and aMICRO grant.Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee.E C’07, J une 11–15, 2007, San D iego, California, USA.Copyright 2007 ACM 978-1-59593-653-0/07/0006 ...$5.00.for each player all strategies have expected payoff that is atmost ɛ more than the expected payoff of the given strategy.Clearly, any mixed strategy combination is a 1-approximateNash equilibrium, and it is quite straightforward to find a34-approximate Nash equilibrium in 2-player games by examiningall supports of size two; see [8] for a slightly improvedresult. In [9] it was shown that an ɛ-approximateNash equilibrium can be found in time O(n log nɛ 2 )byexaminingall supports of size log n . It was pointed out in [1] thatɛno algorithm that examines 2supports smaller than aboutlog n can achieve an approximation better than 1 ,evenfor4zero-sum games. In fact [6] have sharpened the 1 to a 1 .4 2A very simple algorithm achieving ɛ = 1 in 2-person2games was pointed out in [5]: For any strategy i of playerIletj be the best response of player II, and let k be thebest response of player I to j. Then player I plays an equalmixture of i and k, while player II plays j. The proof of1-approximation is not very hard.2In this paper we give an algorithm which breaks the barrierof 1 by considering supports of arbitrarily large cardinality.It achieves an approximation ratio 3−√ 52+ɛ ≈ .38+ɛ,2for any ɛ>0, in time n O( 1ɛ 2 ) . It is based on the followingtwo ideas: (a) If the values (u, v) of a Nash equilibrium tothe two players were known, then we would be able to find amax{u, v}-approximate Nash equilibrium by solving a set oflinear inequalities; and (b) for every Nash equilibrium, thereis a pair of mixed strategies with support size O( 1 )whichɛapproximates within ɛ the true values of that Nash 2 equilibrium[9]. Our technique uses both these ideas and also takescare of the interaction between them. The linear programin (a) will, of course, return in general mixed strategies ofarbitrarily large support.2. PRELIMINARIESWe consider normal form games between two players, therow player and the column player, each with n strategiesat his disposal. The game is defined by two n × n payoffmatrices, R for the row player, and C for the column player.The pure strategies of the row player correspond to the nrows and the pure strategies of the column player correspondto the n columns. If the row player plays row i and thecolumn player plays column j, then the row player receivesa payoff of R ij and the column player gets C ij. Payoffsareextended linearly to pairs of mixed strategies — if the rowplayer plays a probability distribution x over the rows andcolumn player plays a distribution y over the columns, then355

the row player gets a payoff of x T Ry and the column playergets a payoff of x T Cy.A Nash equilibrium in this setting is a pair of mixed strategies,x ∗ for the row player and y ∗ for the column player, suchthat neither player has an incentive to unilaterally defect.Note that, by linearity, the best defection is to a pure strategy.Let e i denote the vector with a 1 at the ith coordinateand 0 elsewhere. A pair of mixed strategies (x ∗ ,y ∗ )isaNash equilibrium if∀ i =1..n, e T i Ry ∗ ≤ x ∗T Ry ∗∀ i =1..n, x ∗T Ce i ≤ x ∗T Cy ∗It can be easily shown that every pair of equilibrium strategiesof a game does not change upon multiplying all the entriesof a payoff matrix by a constant, and upon adding thesame constant to each entry. We shall therefore assume thatthe entries of both payoff matrices R and C are between 0and 1.For ɛ>0, we define an ɛ-approximate Nash equilibrium tobe a pair of mixed strategies x ∗ for the row player and y ∗for the column player, so that the incentive to unilaterallydeviate is at most ɛ:∀ i =1..n, e T i Ry ∗ ≤ x ∗T Ry ∗ + ɛ∀ i =1..n, x ∗T Ce i ≤ x ∗T Cy ∗ + ɛA stronger notion of approximately equilibrium strategieswas introduced in [7, 4]: For ɛ > 0, a well-supported ɛ-approximate Nash equilibrium, oranɛ-well-supported Nashequilibrium, is a pair of mixed strategies, x ∗ for the rowplayer and y ∗ for the column player, so that a player playsonly approximately best-response pure strategies with nonzeroprobability:∀ i : x ∗ i > 0 ⇒ e T i Ry ∗ ≥ e T j Ry ∗ − ɛ, ∀ j∀ i : yi ∗ > 0 ⇒ x ∗T Ce i ≥ x ∗T Ce j − ɛ, ∀ jThis is indeed a stronger definition, in the sense that everyɛ-well supported Nash equilibrium is also an ɛ-approximateNash equilibrium, but the converse need not be true. However,the following lemma from [3] shows that there doesexist a polynomial relationship between the two:Lemma 2.1. [3] For every 2 player normal form game,for every ɛ>0, given an ɛ2 -approximate equilibrium we can8ncompute in polynomial time an ɛ-well-supported equilibrium.It is known, see [4, 2, 3], that computing an 1 -well supportedNash equilibrium is PPAD-complete, for any con-n αstant α > 0, and, by the above lemma, so is computing1a -approximate equilibrium. Therefore, a FPTAS fornthe α+1 problem is unlikely. In [5] we provide an algorithm forcomputing a 1 -approximate Nash equilibrium via a simple2algorithm which considers strategies of support 2 and an algorithmfor computing a 2 -well supported equilibrium conditionalon some graph theoretic conjecture. Another at-3tempt towards computing approximate equilibria (althoughwith inferior approximation guarantees) can be found in [8].3. COMPUTATION OF APPROXIMATE NASHEQUILIBRIA3.1 An Existence ProofUsing the properties of Nash equilibrium and Hoefding-Chernoff bounds we derive the following.Lemma 3.1. For any ɛ>0 and for any two-player gameG =(R, C), whereR, C are n × n matrices with entries in[0, 1], there exist v R,v C ∈ [0, 1], mixed strategies x, α forthe row player, y,β for the column player, with |supp(α)| ≤4/ɛ 2 , |supp(β)| ≤4/ɛ 2 , such that the following are satisfied|α T Rβ − v R|≤ɛ (1)∀ i ∈ supp(α) : e T i Ry = v R (2)|x T Rβ − v R|≤ɛ (3)∀ i : e T i Ry ≤ v R (4)|α T Cβ − v C|≤ɛ (5)∀ j ∈ supp(β) : x T Ce j = v C (6)|α T Cy − v C|≤ɛ (7)∀ j : x T Ce j ≤ v C (8)Proof. Let (x ∗ ,y ∗ ) be a Nash equilibrium of game Gand let v R = x ∗T Ry ∗ , v C = x ∗T Cy ∗ be the values obtainedby the two players in the Nash equilibrium. Let us choosex = x ∗ and y = y ∗ . We will argue that there exists apair of mixed strategies α, β for the row and column playerrespectively so that the required properties hold.To show that such strategies exist we apply the probabilisticmethod. Suppose that we take t independent samplesfrom the strategy space of the row player according tothe distribution induced by x ∗ and let us denote by A theresulting multiset of pure strategies of the row player. Similarly,let us take t samples according to y ∗ and denote by Bthe resulting multiset for the column player. Let then α bethe mixed strategy corresponding to the uniform distributionon multiset A and β the mixed strategy correspondingto the uniform distribution on B.We are interested in the probability that x, y, α and βsatisfy the properties stated above. Clearly, Properties (2),(4), (6), (8) are satisfied with probability 1 by the definitionof Nash equilibrium and since every strategy in A is in thesupport of x and every strategy in B in the support of y.So we only need to worry about Properties (1), (3), (5) and(7).Let X and Y be independent random variables such thatX = e i, with probability x(i), for all i ∈{1,...,n},Y = e i, with probability y(i), for all i ∈{1,...,n}.Let then X 1,...,X t be t copies of variable X and Y 1,...,Y tbe t copies of variable Y , where variables X 1,...,X È t,Y 1,...,Y tare mutually independent. Denoting A = 1 tt i=1Xi andÈB = 1 tt i=1Yi we will argue that with positive probabilitythe random variables A and B satisfy Properties (1), (3),(5) and (7), i.e.È[(|A T RB−v R|≤ɛ) ∧ (|x T RB−v R|≤ɛ)∧ (|A T CB−v C|≤ɛ) ∧ (|A T Cy − v C|≤ɛ)] > 0. (9)356

Clearly, [x T RY È i]=i,j Rijx(i)[1 {Y =e j }]= È i,jRijx(i)y(j) =vR. Therefore, a Chernoff bound onthe sequence of random variables Z i = x T RY i givesÈ[|x T RB−v R| >ɛ] ≤ e −2tɛ2 . (10)Via similar argumentsÈ[|A T Cy − v C| >ɛ] ≤ e −2tɛ2 .To bound the probability of the event |A T RB−v R| >ɛwe note that|A T RB−v R|≤|A T RB−x T RB| + |x T RB−v R|.Therefore,È[|A T RB−v R| >ɛ]≤ È[|A T RB−x T RB| >ɛ/2 ∨|x T RB−v R| >ɛ/2]≤ È[|A T RB−x T RB| >ɛ/2] + È[|x T RB−v R| >ɛ/2]The second term of the latter expression can be bounded asfollows from (10)È[|x T RB−v R| >ɛ/2] ≤ e −tɛ2 /2 .To bound the first term we note that[Xi T RB|B] = È i,j Rij [1 {X=e i }]B(j) = È i,j Rijx(i)B(j) =x T RB. Therefore, conditioned on the value of B, aChernoffbound on the sequence of random variables Z i′ = Xi T RBgiveswhich impliesÈ[|A T RB−x T RB| >ɛ/2|B] ≤ e −tɛ2 /2È[|A T RB−x T RB| >ɛ/2] ≤ e −tɛ2 /2Putting everything together we getSimilarly,Choosing t = 4ɛ 2È[|A T RB−v R| >ɛ] ≤ 2e −tɛ2 /2È[|A T CB−v C| >ɛ] ≤ 2e −tɛ2 /2we getTherefore, by union boundÈ[|x T RB−v R| >ɛ] ≤ e −8 ,È[|A T Cy − v C| >ɛ] ≤ e −8 ,È[|A T RB−v R| >ɛ] ≤ 2e −2È[|A T CB−v C| >ɛ] ≤ 2e −2È[(|A T RB−v R| >ɛ) ∨ (|x T RB−v R| >ɛ)∨ (|A T CB−v C| >ɛ) ∨ (|A T Cy − v C| >ɛ)] < 0.55.The latter implies (9). This completes the proof. We notethat (1) and (5) were also proved to be true in [9] using asimilar method.3.2 An AlgorithmLet ɛ>0. The following algorithm computes an 3−√ 52+ ɛapproximate equilibrium.1. Discretize [0, 1] into the set V = {0,ɛ,2ɛ,...,kɛ}, wherek = k(ɛ) is such that kɛ ≤ 1and(k +1)ɛ>1.2. Guess a pair of values v R,v C ∈Vthat are ɛ-close to thevalues of a Nash equilibrium to the two players (thisis, of course, done by trying all O( 1 ) pairs until theɛA, B, x, y sought by the algorithm below 2 are found).(a) Let v max =max{v R,v C}.(b) Find a multiset A of row player’s pure strategies,and a multiset B of column player’s pure strategies,both multisets of size 4/ɛ 2 so that the followingis satisfied.α T Rβ ≥ v R − 3ɛ/2 (11)α T Cβ ≥ v C − 3ɛ/2 (12)where α, respectively β, is the uniform distributionon the elements of the multiset A, respectivelyB.(c) Find x, a mixed strategy for the row player, andy, a mixed strategy for the column player, as anysolution of the following linear programL(v R,v C,α,β): α T Ry ≥ v R − 3ɛ/2 (13)∀ i : e T i Ry ≤ v R + ɛ/2 (14)x T Rβ ≥ v R − 3ɛ/2 (15)α T Cy ≥ v C − 3ɛ/2 (16)∀ j : x T Ce j ≤ v C + ɛ/2 (17)x T Cβ ≥ v C − 3ɛ/2 (18)(d) If v max ≥ 1/3 then return the pair of strategies:(δα +(1− δ)x , δβ+(1− δ)y)where δ = δ(v R,v C)= 3 − 12 2v max.(e) If v max < 1/3, then return the pair of strategies(x, y).Theorem 3.2. For every ɛ>0, the Algorithm returns an3− √ 5+ ɛ approximate equilibrium, in time n O(1/ɛ2 )2Proof. The algorithm exhaustively searches through itsguesses of v R,v C,αand β, and for each guess it checks if thesystem of equations L(v R,v C,α,β) has a feasible solution.¡Since there are only (1/ɛ) 2 ×n 2, i.e., n O(1/ɛ2) differentchoices for the tuple (v R,v C,α,β), so it is clear that,1/ɛ 2provided that the algorithm finds a solution, it terminatesin time n O(1/ɛ2) . Now, to prove that the algorithm indeedfinds a solution, we have to show that there is some guessof (v R,v C,α,β)forwhichL(v R,v C,α,β) has a feasible solution.By Lemma 3.1, we know that there exists a tuple(vR,v ∗ C,x,α,y,β), ∗ with |supp(α)|, |supp(β)| ≤4/ɛ 2 ,s.t. (1)–(8)hold. Letv R,v C ∈Vbe s.t. |v R − vR|≤ɛ/2 ∗ and|v C − vC|≤ɛ/2 ∗ (these exist because of the discretizationin V). By (1) α T Rβ ≥ vR ∗ − ɛ ≥ v R − 3ɛ/2, proving (11).Similarly (12) is true as well. Hence (v R,v C,α,β)isoneof the possible guesses of the algorithm. We will now showthat L(v R,v C,α,β) has a feasible solution.Now (2) implies ∀ i ∈ supp(α) : e T i Ry = vR ∗ ≥ v R −ɛ/2 which proves (13). Similarly (4) and (3) give (14) and(15) respectively. Similarly the column side inequalities alsohold. Thus (x, y) is a feasible solution to L(v R,v C,α,β), asrequired.357

So it remains to determine the value of the approximation.Define val R,val C as the values obtained by the rowand column player (resp.) in the solution output by the algorithm.Define def R,def C as the maximum value that theplayer can get by defecting to some pure strategy. Thus ouralgorithm outputs a max{def R −val R,def C −val C} approximateequilibrium. The analysis is in two cases.In the easy case, when v max < 1/3, the algorithm outputs(x, y). By (14), we see that def R ≤ v R + ɛ/2 < 1/3 +ɛ/2.Similarly def C < 1/3 +ɛ/2. Since val R,val C ≥ 0, we havea1/3+ɛ/2 approximate equilibrium.In the case when v max ≥ 1/3, we have, from (11), (13)and (15) thatval R ≥ δ 2 (v R − 3ɛ/2) + 2δ(1 − δ)(v R − 3ɛ/2)=(v R − 3ɛ/2)(δ 2 +2δ(1 − δ))From (14), and the fact that all the entries are at most 1,we get:def R ≤ δ +(1− δ)(v R + ɛ/2)Hencedef R − val R ≤ v Rg(δ)+ɛf(δ)where g(δ) =δ +(1− δ) − δ 2 − 2δ(1 − δ) andf(δ) =(1−δ)/2+3/2(δ 2 +2δ(1 − δ))Similarly, we havedef C − val C ≤ v Cg(δ)+ɛf(δ)Assume, w.l.o.g., that v max = v C. Then, from abovewe see that def R − val R ≤ def C − val C. Plugging in thechoice of δ in the algorithm, we get def C − val C ≤ 3 − 214 (5vmax + 1v max)+ɛf(δ). This function has a maximum atv max =1/ √ 5, where it equals (3− √ 5)/2+ɛf(δ). This meansthat for every ɛ ′ > 0, the algorithm returns an (3− √ 5)/2+ɛ ′approximate equilibrium.4. REFERENCES[1] I. Althofer. On sparse approximations to randomizedstrategies and convex combinations. Linear Algebra andits Applications, 1994.[2] X. Chen and X. Deng. Settling the complexity oftwo-player nash equilibrium. FOCS, 2006.[3] X. Chen, X. Deng, and S.-H. Teng. Computing nashequilibria:approximation and smoothed complexity.FOCS, 2006.[4] C. Daskalakis, P. Goldberg, and C. Papadimitriou. Thecomplexity of computing a nash equilibrium. STOC,2006.[5] C. Daskalakis, A. Mehta, and C. Papadimitriou. A noteon approximate nash equilibria. WINE, 2006.[6] T. Feder, H. Nazerzadeh, and A. Saberi. Personalcommunication, 2006.[7] P. Goldberg and C. Papadimitriou. Reducibility amongequilibrium problems. STOC, 2006.[8] S. C. Kontogiannis, P. N. Panagopoulou, and P. G.Spirakis. Polynomial algorithms for approximating nashequilibria of bimatrix games. WINE, 2006.[9] R. Lipton, E. Markakis, and A. Mehta. Playing largegames using simple strategies. Electronic Commerce,2003.358

Progress in Approximate Nash Equilibria - Corelab

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?