11.07.2015 Views

Progress in Approximate Nash Equilibria - Corelab

Progress in Approximate Nash Equilibria - Corelab

Progress in Approximate Nash Equilibria - Corelab

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

<strong>Progress</strong> <strong>in</strong> <strong>Approximate</strong> <strong>Nash</strong> <strong>Equilibria</strong>Constant<strong>in</strong>os Daskalakis ∗Computer ScienceUC BerkeleyBerkeley, CAcostis@cs.berkeley.eduAranyak MehtaIBM Almaden ResearchCenterSan Jose, CAmehtaa@us.ibm.comChristos PapadimitriouComputer ScienceUC BerkeleyBerkeley, CAchristos@cs.berkeley.eduABSTRACTIt is known [5] that an additively ɛ-approximate <strong>Nash</strong> equilibrium(with supports of size at most two) can be computed<strong>in</strong> polynomial time <strong>in</strong> any 2-player game with ɛ = .5. It isalso known that no approximation better than .5 is possibleunless equilibria with support larger than log n are considered,where n is the number of strategies per player. Wegive a polynomial algorithm for comput<strong>in</strong>g an ɛ-approximate<strong>Nash</strong> equilibrium <strong>in</strong> 2-player games with ɛ ≈ .38; our algorithmcomputes equilibria with arbitrarily large supports.Categories and Subject DescriptorsF.2.0 [Theory of Computation]: analysis of algorithmsand problem complexity—generalGeneral TermsAlgorithms, Economics, TheoryKeywords<strong>Nash</strong> Equilibrium, <strong>Approximate</strong> <strong>Nash</strong>, Algorithm1. INTRODUCTIONIt was recently shown that f<strong>in</strong>d<strong>in</strong>g a <strong>Nash</strong> equilibrium isPPAD-complete [4], even for 2-player games [2]. As a consequence,f<strong>in</strong>d<strong>in</strong>g approximate <strong>Nash</strong> equilibria has emergedas the ma<strong>in</strong> rema<strong>in</strong><strong>in</strong>g open question <strong>in</strong> the area of equilibriumcomputation. The most commonly studied form of approximationis additive approximation <strong>in</strong> games <strong>in</strong> which allutilities have been normalized to be between 0 and 1 (this isa common assumption, s<strong>in</strong>ce scal<strong>in</strong>g the utilities of a playerby any positive factor, and apply<strong>in</strong>g any additive constant,results <strong>in</strong> an equivalent game). A set of mixed strategies iscalled an ɛ-approximate <strong>Nash</strong> equilibrium, whereɛ>0, if∗ The first and third authors were supported through NSFgrant CCF - 0635319, a gift from Yahoo! Research and aMICRO grant.Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee.E C’07, J une 11–15, 2007, San D iego, California, USA.Copyright 2007 ACM 978-1-59593-653-0/07/0006 ...$5.00.for each player all strategies have expected payoff that is atmost ɛ more than the expected payoff of the given strategy.Clearly, any mixed strategy comb<strong>in</strong>ation is a 1-approximate<strong>Nash</strong> equilibrium, and it is quite straightforward to f<strong>in</strong>d a34-approximate <strong>Nash</strong> equilibrium <strong>in</strong> 2-player games by exam<strong>in</strong><strong>in</strong>gall supports of size two; see [8] for a slightly improvedresult. In [9] it was shown that an ɛ-approximate<strong>Nash</strong> equilibrium can be found <strong>in</strong> time O(n log nɛ 2 )byexam<strong>in</strong><strong>in</strong>gall supports of size log n . It was po<strong>in</strong>ted out <strong>in</strong> [1] thatɛno algorithm that exam<strong>in</strong>es 2supports smaller than aboutlog n can achieve an approximation better than 1 ,evenfor4zero-sum games. In fact [6] have sharpened the 1 to a 1 .4 2A very simple algorithm achiev<strong>in</strong>g ɛ = 1 <strong>in</strong> 2-person2games was po<strong>in</strong>ted out <strong>in</strong> [5]: For any strategy i of playerIletj be the best response of player II, and let k be thebest response of player I to j. Then player I plays an equalmixture of i and k, while player II plays j. The proof of1-approximation is not very hard.2In this paper we give an algorithm which breaks the barrierof 1 by consider<strong>in</strong>g supports of arbitrarily large card<strong>in</strong>ality.It achieves an approximation ratio 3−√ 52+ɛ ≈ .38+ɛ,2for any ɛ>0, <strong>in</strong> time n O( 1ɛ 2 ) . It is based on the follow<strong>in</strong>gtwo ideas: (a) If the values (u, v) of a <strong>Nash</strong> equilibrium tothe two players were known, then we would be able to f<strong>in</strong>d amax{u, v}-approximate <strong>Nash</strong> equilibrium by solv<strong>in</strong>g a set ofl<strong>in</strong>ear <strong>in</strong>equalities; and (b) for every <strong>Nash</strong> equilibrium, thereis a pair of mixed strategies with support size O( 1 )whichɛapproximates with<strong>in</strong> ɛ the true values of that <strong>Nash</strong> 2 equilibrium[9]. Our technique uses both these ideas and also takescare of the <strong>in</strong>teraction between them. The l<strong>in</strong>ear program<strong>in</strong> (a) will, of course, return <strong>in</strong> general mixed strategies ofarbitrarily large support.2. PRELIMINARIESWe consider normal form games between two players, therow player and the column player, each with n strategiesat his disposal. The game is def<strong>in</strong>ed by two n × n payoffmatrices, R for the row player, and C for the column player.The pure strategies of the row player correspond to the nrows and the pure strategies of the column player correspondto the n columns. If the row player plays row i and thecolumn player plays column j, then the row player receivesa payoff of R ij and the column player gets C ij. Payoffsareextended l<strong>in</strong>early to pairs of mixed strategies — if the rowplayer plays a probability distribution x over the rows andcolumn player plays a distribution y over the columns, then355


the row player gets a payoff of x T Ry and the column playergets a payoff of x T Cy.A <strong>Nash</strong> equilibrium <strong>in</strong> this sett<strong>in</strong>g is a pair of mixed strategies,x ∗ for the row player and y ∗ for the column player, suchthat neither player has an <strong>in</strong>centive to unilaterally defect.Note that, by l<strong>in</strong>earity, the best defection is to a pure strategy.Let e i denote the vector with a 1 at the ith coord<strong>in</strong>ateand 0 elsewhere. A pair of mixed strategies (x ∗ ,y ∗ )isa<strong>Nash</strong> equilibrium if∀ i =1..n, e T i Ry ∗ ≤ x ∗T Ry ∗∀ i =1..n, x ∗T Ce i ≤ x ∗T Cy ∗It can be easily shown that every pair of equilibrium strategiesof a game does not change upon multiply<strong>in</strong>g all the entriesof a payoff matrix by a constant, and upon add<strong>in</strong>g thesame constant to each entry. We shall therefore assume thatthe entries of both payoff matrices R and C are between 0and 1.For ɛ>0, we def<strong>in</strong>e an ɛ-approximate <strong>Nash</strong> equilibrium tobe a pair of mixed strategies x ∗ for the row player and y ∗for the column player, so that the <strong>in</strong>centive to unilaterallydeviate is at most ɛ:∀ i =1..n, e T i Ry ∗ ≤ x ∗T Ry ∗ + ɛ∀ i =1..n, x ∗T Ce i ≤ x ∗T Cy ∗ + ɛA stronger notion of approximately equilibrium strategieswas <strong>in</strong>troduced <strong>in</strong> [7, 4]: For ɛ > 0, a well-supported ɛ-approximate <strong>Nash</strong> equilibrium, oranɛ-well-supported <strong>Nash</strong>equilibrium, is a pair of mixed strategies, x ∗ for the rowplayer and y ∗ for the column player, so that a player playsonly approximately best-response pure strategies with nonzeroprobability:∀ i : x ∗ i > 0 ⇒ e T i Ry ∗ ≥ e T j Ry ∗ − ɛ, ∀ j∀ i : yi ∗ > 0 ⇒ x ∗T Ce i ≥ x ∗T Ce j − ɛ, ∀ jThis is <strong>in</strong>deed a stronger def<strong>in</strong>ition, <strong>in</strong> the sense that everyɛ-well supported <strong>Nash</strong> equilibrium is also an ɛ-approximate<strong>Nash</strong> equilibrium, but the converse need not be true. However,the follow<strong>in</strong>g lemma from [3] shows that there doesexist a polynomial relationship between the two:Lemma 2.1. [3] For every 2 player normal form game,for every ɛ>0, given an ɛ2 -approximate equilibrium we can8ncompute <strong>in</strong> polynomial time an ɛ-well-supported equilibrium.It is known, see [4, 2, 3], that comput<strong>in</strong>g an 1 -well supported<strong>Nash</strong> equilibrium is PPAD-complete, for any con-n αstant α > 0, and, by the above lemma, so is comput<strong>in</strong>g1a -approximate equilibrium. Therefore, a FPTAS fornthe α+1 problem is unlikely. In [5] we provide an algorithm forcomput<strong>in</strong>g a 1 -approximate <strong>Nash</strong> equilibrium via a simple2algorithm which considers strategies of support 2 and an algorithmfor comput<strong>in</strong>g a 2 -well supported equilibrium conditionalon some graph theoretic conjecture. Another at-3tempt towards comput<strong>in</strong>g approximate equilibria (althoughwith <strong>in</strong>ferior approximation guarantees) can be found <strong>in</strong> [8].3. COMPUTATION OF APPROXIMATE NASHEQUILIBRIA3.1 An Existence ProofUs<strong>in</strong>g the properties of <strong>Nash</strong> equilibrium and Hoefd<strong>in</strong>g-Chernoff bounds we derive the follow<strong>in</strong>g.Lemma 3.1. For any ɛ>0 and for any two-player gameG =(R, C), whereR, C are n × n matrices with entries <strong>in</strong>[0, 1], there exist v R,v C ∈ [0, 1], mixed strategies x, α forthe row player, y,β for the column player, with |supp(α)| ≤4/ɛ 2 , |supp(β)| ≤4/ɛ 2 , such that the follow<strong>in</strong>g are satisfied|α T Rβ − v R|≤ɛ (1)∀ i ∈ supp(α) : e T i Ry = v R (2)|x T Rβ − v R|≤ɛ (3)∀ i : e T i Ry ≤ v R (4)|α T Cβ − v C|≤ɛ (5)∀ j ∈ supp(β) : x T Ce j = v C (6)|α T Cy − v C|≤ɛ (7)∀ j : x T Ce j ≤ v C (8)Proof. Let (x ∗ ,y ∗ ) be a <strong>Nash</strong> equilibrium of game Gand let v R = x ∗T Ry ∗ , v C = x ∗T Cy ∗ be the values obta<strong>in</strong>edby the two players <strong>in</strong> the <strong>Nash</strong> equilibrium. Let us choosex = x ∗ and y = y ∗ . We will argue that there exists apair of mixed strategies α, β for the row and column playerrespectively so that the required properties hold.To show that such strategies exist we apply the probabilisticmethod. Suppose that we take t <strong>in</strong>dependent samplesfrom the strategy space of the row player accord<strong>in</strong>g tothe distribution <strong>in</strong>duced by x ∗ and let us denote by A theresult<strong>in</strong>g multiset of pure strategies of the row player. Similarly,let us take t samples accord<strong>in</strong>g to y ∗ and denote by Bthe result<strong>in</strong>g multiset for the column player. Let then α bethe mixed strategy correspond<strong>in</strong>g to the uniform distributionon multiset A and β the mixed strategy correspond<strong>in</strong>gto the uniform distribution on B.We are <strong>in</strong>terested <strong>in</strong> the probability that x, y, α and βsatisfy the properties stated above. Clearly, Properties (2),(4), (6), (8) are satisfied with probability 1 by the def<strong>in</strong>itionof <strong>Nash</strong> equilibrium and s<strong>in</strong>ce every strategy <strong>in</strong> A is <strong>in</strong> thesupport of x and every strategy <strong>in</strong> B <strong>in</strong> the support of y.So we only need to worry about Properties (1), (3), (5) and(7).Let X and Y be <strong>in</strong>dependent random variables such thatX = e i, with probability x(i), for all i ∈{1,...,n},Y = e i, with probability y(i), for all i ∈{1,...,n}.Let then X 1,...,X t be t copies of variable X and Y 1,...,Y tbe t copies of variable Y , where variables X 1,...,X È t,Y 1,...,Y tare mutually <strong>in</strong>dependent. Denot<strong>in</strong>g A = 1 tt i=1Xi andÈB = 1 tt i=1Yi we will argue that with positive probabilitythe random variables A and B satisfy Properties (1), (3),(5) and (7), i.e.È[(|A T RB−v R|≤ɛ) ∧ (|x T RB−v R|≤ɛ)∧ (|A T CB−v C|≤ɛ) ∧ (|A T Cy − v C|≤ɛ)] > 0. (9)356


Clearly, [x T RY È i]=i,j Rijx(i)[1 {Y =e j }]= È i,jRijx(i)y(j) =vR. Therefore, a Chernoff bound onthe sequence of random variables Z i = x T RY i givesÈ[|x T RB−v R| >ɛ] ≤ e −2tɛ2 . (10)Via similar argumentsÈ[|A T Cy − v C| >ɛ] ≤ e −2tɛ2 .To bound the probability of the event |A T RB−v R| >ɛwe note that|A T RB−v R|≤|A T RB−x T RB| + |x T RB−v R|.Therefore,È[|A T RB−v R| >ɛ]≤ È[|A T RB−x T RB| >ɛ/2 ∨|x T RB−v R| >ɛ/2]≤ È[|A T RB−x T RB| >ɛ/2] + È[|x T RB−v R| >ɛ/2]The second term of the latter expression can be bounded asfollows from (10)È[|x T RB−v R| >ɛ/2] ≤ e −tɛ2 /2 .To bound the first term we note that[Xi T RB|B] = È i,j Rij [1 {X=e i }]B(j) = È i,j Rijx(i)B(j) =x T RB. Therefore, conditioned on the value of B, aChernoffbound on the sequence of random variables Z i′ = Xi T RBgiveswhich impliesÈ[|A T RB−x T RB| >ɛ/2|B] ≤ e −tɛ2 /2È[|A T RB−x T RB| >ɛ/2] ≤ e −tɛ2 /2Putt<strong>in</strong>g everyth<strong>in</strong>g together we getSimilarly,Choos<strong>in</strong>g t = 4ɛ 2È[|A T RB−v R| >ɛ] ≤ 2e −tɛ2 /2È[|A T CB−v C| >ɛ] ≤ 2e −tɛ2 /2we getTherefore, by union boundÈ[|x T RB−v R| >ɛ] ≤ e −8 ,È[|A T Cy − v C| >ɛ] ≤ e −8 ,È[|A T RB−v R| >ɛ] ≤ 2e −2È[|A T CB−v C| >ɛ] ≤ 2e −2È[(|A T RB−v R| >ɛ) ∨ (|x T RB−v R| >ɛ)∨ (|A T CB−v C| >ɛ) ∨ (|A T Cy − v C| >ɛ)] < 0.55.The latter implies (9). This completes the proof. We notethat (1) and (5) were also proved to be true <strong>in</strong> [9] us<strong>in</strong>g asimilar method.3.2 An AlgorithmLet ɛ>0. The follow<strong>in</strong>g algorithm computes an 3−√ 52+ ɛapproximate equilibrium.1. Discretize [0, 1] <strong>in</strong>to the set V = {0,ɛ,2ɛ,...,kɛ}, wherek = k(ɛ) is such that kɛ ≤ 1and(k +1)ɛ>1.2. Guess a pair of values v R,v C ∈Vthat are ɛ-close to thevalues of a <strong>Nash</strong> equilibrium to the two players (thisis, of course, done by try<strong>in</strong>g all O( 1 ) pairs until theɛA, B, x, y sought by the algorithm below 2 are found).(a) Let v max =max{v R,v C}.(b) F<strong>in</strong>d a multiset A of row player’s pure strategies,and a multiset B of column player’s pure strategies,both multisets of size 4/ɛ 2 so that the follow<strong>in</strong>gis satisfied.α T Rβ ≥ v R − 3ɛ/2 (11)α T Cβ ≥ v C − 3ɛ/2 (12)where α, respectively β, is the uniform distributionon the elements of the multiset A, respectivelyB.(c) F<strong>in</strong>d x, a mixed strategy for the row player, andy, a mixed strategy for the column player, as anysolution of the follow<strong>in</strong>g l<strong>in</strong>ear programL(v R,v C,α,β): α T Ry ≥ v R − 3ɛ/2 (13)∀ i : e T i Ry ≤ v R + ɛ/2 (14)x T Rβ ≥ v R − 3ɛ/2 (15)α T Cy ≥ v C − 3ɛ/2 (16)∀ j : x T Ce j ≤ v C + ɛ/2 (17)x T Cβ ≥ v C − 3ɛ/2 (18)(d) If v max ≥ 1/3 then return the pair of strategies:(δα +(1− δ)x , δβ+(1− δ)y)where δ = δ(v R,v C)= 3 − 12 2v max.(e) If v max < 1/3, then return the pair of strategies(x, y).Theorem 3.2. For every ɛ>0, the Algorithm returns an3− √ 5+ ɛ approximate equilibrium, <strong>in</strong> time n O(1/ɛ2 )2Proof. The algorithm exhaustively searches through itsguesses of v R,v C,αand β, and for each guess it checks if thesystem of equations L(v R,v C,α,β) has a feasible solution.¡S<strong>in</strong>ce there are only (1/ɛ) 2 ×n 2, i.e., n O(1/ɛ2) differentchoices for the tuple (v R,v C,α,β), so it is clear that,1/ɛ 2provided that the algorithm f<strong>in</strong>ds a solution, it term<strong>in</strong>ates<strong>in</strong> time n O(1/ɛ2) . Now, to prove that the algorithm <strong>in</strong>deedf<strong>in</strong>ds a solution, we have to show that there is some guessof (v R,v C,α,β)forwhichL(v R,v C,α,β) has a feasible solution.By Lemma 3.1, we know that there exists a tuple(vR,v ∗ C,x,α,y,β), ∗ with |supp(α)|, |supp(β)| ≤4/ɛ 2 ,s.t. (1)–(8)hold. Letv R,v C ∈Vbe s.t. |v R − vR|≤ɛ/2 ∗ and|v C − vC|≤ɛ/2 ∗ (these exist because of the discretization<strong>in</strong> V). By (1) α T Rβ ≥ vR ∗ − ɛ ≥ v R − 3ɛ/2, prov<strong>in</strong>g (11).Similarly (12) is true as well. Hence (v R,v C,α,β)isoneof the possible guesses of the algorithm. We will now showthat L(v R,v C,α,β) has a feasible solution.Now (2) implies ∀ i ∈ supp(α) : e T i Ry = vR ∗ ≥ v R −ɛ/2 which proves (13). Similarly (4) and (3) give (14) and(15) respectively. Similarly the column side <strong>in</strong>equalities alsohold. Thus (x, y) is a feasible solution to L(v R,v C,α,β), asrequired.357


So it rema<strong>in</strong>s to determ<strong>in</strong>e the value of the approximation.Def<strong>in</strong>e val R,val C as the values obta<strong>in</strong>ed by the rowand column player (resp.) <strong>in</strong> the solution output by the algorithm.Def<strong>in</strong>e def R,def C as the maximum value that theplayer can get by defect<strong>in</strong>g to some pure strategy. Thus ouralgorithm outputs a max{def R −val R,def C −val C} approximateequilibrium. The analysis is <strong>in</strong> two cases.In the easy case, when v max < 1/3, the algorithm outputs(x, y). By (14), we see that def R ≤ v R + ɛ/2 < 1/3 +ɛ/2.Similarly def C < 1/3 +ɛ/2. S<strong>in</strong>ce val R,val C ≥ 0, we havea1/3+ɛ/2 approximate equilibrium.In the case when v max ≥ 1/3, we have, from (11), (13)and (15) thatval R ≥ δ 2 (v R − 3ɛ/2) + 2δ(1 − δ)(v R − 3ɛ/2)=(v R − 3ɛ/2)(δ 2 +2δ(1 − δ))From (14), and the fact that all the entries are at most 1,we get:def R ≤ δ +(1− δ)(v R + ɛ/2)Hencedef R − val R ≤ v Rg(δ)+ɛf(δ)where g(δ) =δ +(1− δ) − δ 2 − 2δ(1 − δ) andf(δ) =(1−δ)/2+3/2(δ 2 +2δ(1 − δ))Similarly, we havedef C − val C ≤ v Cg(δ)+ɛf(δ)Assume, w.l.o.g., that v max = v C. Then, from abovewe see that def R − val R ≤ def C − val C. Plugg<strong>in</strong>g <strong>in</strong> thechoice of δ <strong>in</strong> the algorithm, we get def C − val C ≤ 3 − 214 (5vmax + 1v max)+ɛf(δ). This function has a maximum atv max =1/ √ 5, where it equals (3− √ 5)/2+ɛf(δ). This meansthat for every ɛ ′ > 0, the algorithm returns an (3− √ 5)/2+ɛ ′approximate equilibrium.4. REFERENCES[1] I. Althofer. On sparse approximations to randomizedstrategies and convex comb<strong>in</strong>ations. L<strong>in</strong>ear Algebra andits Applications, 1994.[2] X. Chen and X. Deng. Settl<strong>in</strong>g the complexity oftwo-player nash equilibrium. FOCS, 2006.[3] X. Chen, X. Deng, and S.-H. Teng. Comput<strong>in</strong>g nashequilibria:approximation and smoothed complexity.FOCS, 2006.[4] C. Daskalakis, P. Goldberg, and C. Papadimitriou. Thecomplexity of comput<strong>in</strong>g a nash equilibrium. STOC,2006.[5] C. Daskalakis, A. Mehta, and C. Papadimitriou. A noteon approximate nash equilibria. WINE, 2006.[6] T. Feder, H. Nazerzadeh, and A. Saberi. Personalcommunication, 2006.[7] P. Goldberg and C. Papadimitriou. Reducibility amongequilibrium problems. STOC, 2006.[8] S. C. Kontogiannis, P. N. Panagopoulou, and P. G.Spirakis. Polynomial algorithms for approximat<strong>in</strong>g nashequilibria of bimatrix games. WINE, 2006.[9] R. Lipton, E. Markakis, and A. Mehta. Play<strong>in</strong>g largegames us<strong>in</strong>g simple strategies. Electronic Commerce,2003.358

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!