13.07.2015 Views

Dynamic Dopamine Modulation in the Basal Ganglia: A ...

Dynamic Dopamine Modulation in the Basal Ganglia: A ...

Dynamic Dopamine Modulation in the Basal Ganglia: A ...

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

<strong>Dynamic</strong> <strong>Dopam<strong>in</strong>e</strong> <strong>Modulation</strong> <strong>in</strong> <strong>the</strong> <strong>Basal</strong> <strong>Ganglia</strong>: A NeurocomputationalAccount of Cognitive Deficits <strong>in</strong> Medicated and Non-medicated Park<strong>in</strong>sonismMichael J. FrankDepartment of PsychologyCenter for NeuroscienceUniversity of Colorado Boulder345 UCBBoulder, CO 80309frankmj@psych.colorado.eduAbstract:<strong>Dopam<strong>in</strong>e</strong> (DA) depletion <strong>in</strong> <strong>the</strong> basal ganglia (BG) of Park<strong>in</strong>son’s patients gives rise to both frontal-like and implicitlearn<strong>in</strong>g impairments. <strong>Dopam<strong>in</strong>e</strong>rgic medication alleviates some cognitive deficits but impairs those that depend on<strong>in</strong>tact areas of <strong>the</strong> BG, apparently due to DA “overdose”. These f<strong>in</strong>d<strong>in</strong>gs are difficult to accommodate with verbal<strong>the</strong>ories of BG/DA function, ow<strong>in</strong>g to complexity of system dynamics: DA dynamically modulates function <strong>in</strong> <strong>the</strong>BG, which is itself a modulatory system. This paper presents a neural network model that <strong>in</strong>stantiates key biologicalproperties and provides <strong>in</strong>sight <strong>in</strong>to <strong>the</strong> underly<strong>in</strong>g role of DA <strong>in</strong> <strong>the</strong> BG dur<strong>in</strong>g learn<strong>in</strong>g and execution of cognitivetasks. Specifically, <strong>the</strong> BG modulates <strong>the</strong> execution of “actions” (e.g., motor responses and work<strong>in</strong>g memory updat<strong>in</strong>g)that are be<strong>in</strong>g considered <strong>in</strong> different parts of frontal cortex. Phasic changes <strong>in</strong> DA, which occur dur<strong>in</strong>g error feedback,dynamically modulate <strong>the</strong> BG threshold for facilitat<strong>in</strong>g/suppress<strong>in</strong>g a cortical command <strong>in</strong> response to particular stimuli.Reduced dynamic range of DA expla<strong>in</strong>s Park<strong>in</strong>son and DA overdose deficits with a s<strong>in</strong>gle underly<strong>in</strong>g dysfunction,despite overall differences <strong>in</strong> raw DA levels. Simulated Park<strong>in</strong>sonism and medication effects provide a <strong>the</strong>oretical basisfor behavioral data <strong>in</strong> probabilistic classification and reversal tasks. The model also provides novel testable predictionsfor neuropsychological and pharmacological studies, and motivates fur<strong>the</strong>r <strong>in</strong>vestigation of BG/DA <strong>in</strong>teractions withprefrontal cortex <strong>in</strong> work<strong>in</strong>g memory.IntroductionIn cognitive neuroscience, bra<strong>in</strong> regions are oftencharacterized as if <strong>the</strong>y implemented localized functions,with relatively little treatment of <strong>in</strong>teractive effects at <strong>the</strong>network level. In part, this is because <strong>in</strong>teractions are difficultto conceptualize and mileage has been ga<strong>in</strong>ed fromsimpler <strong>the</strong>ories. In some cases, however, <strong>the</strong>se <strong>the</strong>oreticalaccounts need to be reconsidered. Some bra<strong>in</strong> regionsexert <strong>the</strong>ir effects only by modulat<strong>in</strong>g function <strong>in</strong> o<strong>the</strong>rregions and <strong>the</strong>refore do not directly implement a cognitiveprocess. This problem is even more elusive whenconsider<strong>in</strong>g effects of neuromodulators <strong>in</strong> a s<strong>in</strong>gle bra<strong>in</strong>region, which may have <strong>in</strong>direct but substantial effectson network dynamics.This issue applies particularly well to <strong>the</strong> effects ofdopam<strong>in</strong>e (DA) <strong>in</strong> <strong>the</strong> basal ganglia (BG), which are criticalfor many aspects of cognition (Nieoullon, 2002).Because stimulus-response (SR) tasks recruit <strong>the</strong> BG,many researchers assume that its function is to encodedetailed aspects of SR mapp<strong>in</strong>gs (e.g., Packard & Knowl-Journal of Cognitive Neuroscience (2005), 17(1): 51-72.Supported by ONR grants N00014-00-1-0246 and N00014-03-1-0428,and NIH grants MH64445 and MH069597.ton, 2002). O<strong>the</strong>rs advocate a subtly different modulatoryrole of <strong>the</strong> BG to facilitate or suppress SR-like associationsthat are represented <strong>in</strong> cortex (Hikosaka, 1998;M<strong>in</strong>k, 1996). This paper explores <strong>the</strong> latter hypo<strong>the</strong>sisand fur<strong>the</strong>r suggests that DA dynamically modulates activity<strong>in</strong> an already modulatory BG, as DA levels change<strong>in</strong> response to different behavioral events. These doublemodulatory effects are complex and difficult to conceptualize,motivat<strong>in</strong>g <strong>the</strong> use of computational model<strong>in</strong>g tomake <strong>the</strong>m more tenable. In so do<strong>in</strong>g, <strong>the</strong> model ties toge<strong>the</strong>ra variety of seem<strong>in</strong>gly unrelated cognitive deficitsstemm<strong>in</strong>g from DA dysfunction <strong>in</strong> <strong>the</strong> BG, as <strong>in</strong> Park<strong>in</strong>son’sdisease (PD).The cognitive deficits <strong>in</strong> PD can be divided <strong>in</strong>totwo general classes: those that are “frontal-like” <strong>in</strong>nature, and those that reflect impairments <strong>in</strong> implicitlearn<strong>in</strong>g. On <strong>the</strong> one hand, patients are impaired attasks <strong>in</strong>volv<strong>in</strong>g attentional processes or work<strong>in</strong>g memory(Partiot, Ver<strong>in</strong>, & Dubois, 1996; Gotham, Brown, &Marsden, 1988; Dubois, Malapani, Ver<strong>in</strong>, Rogelet, Deweer,& Pillon, 1994; Woodward, Bub, & Hunter, 2002;Henik, S<strong>in</strong>gh, Beckley, & Rafal, 1993; Rogers, Sahakian,Hodges, Polkey, Kennard, & Robb<strong>in</strong>s, 1998). Implicitlearn<strong>in</strong>g deficits, on <strong>the</strong> o<strong>the</strong>r hand, do not implicate


frontal processes because <strong>the</strong>y generally do not <strong>in</strong>volvework<strong>in</strong>g memory or conscious knowledge of task demands,and frontal patients do not have such deficits(Knowlton, Mangels, & Squire, 1996). Yet, PD patientsare impaired at implicit sequence learn<strong>in</strong>g and implicitcategorization (Jackson et al., 1995; Ashby et al.,2003; Maddox & Filoteo, 2001). Similar impairmentsare observed <strong>in</strong> probabilistic classification, <strong>in</strong> which participants<strong>in</strong>tegrate over multiple trials to extract statisticalregularities of <strong>the</strong> category structure (Knowlton,Squire, & Gluck, 1994; Knowlton et al., 1996). The<strong>in</strong>volvement of dopam<strong>in</strong>e (DA) <strong>in</strong> <strong>the</strong>se tasks is notstraightforward, as dopam<strong>in</strong>ergic medication has bothpositive and negative effects on cognitive function <strong>in</strong>PD (Gotham et al., 1988; Swa<strong>in</strong>son, Rogers, Sahakian,Summers, Polkey, & Robb<strong>in</strong>s, 2000; Cools, Barker, Sahakian,& Robb<strong>in</strong>s, 2001, 2003).Because <strong>the</strong> neuropathology of PD <strong>in</strong>volves damageto dopam<strong>in</strong>ergic cells <strong>in</strong> <strong>the</strong> BG (Kish, Shannak, &Hornykiewicz, 1988), <strong>the</strong> predom<strong>in</strong>ant explanations for<strong>the</strong> two classes of deficits have been (a) that <strong>the</strong> damagedBG is <strong>in</strong>terconnected <strong>in</strong> a functional circuit withprefrontal cortex (Alexander, DeLong, & Strick, 1986;Middleton & Strick, 2000), <strong>the</strong>reby produc<strong>in</strong>g frontaldeficits, and (b) due to damage to a “neostriatal habitlearn<strong>in</strong>g system” (Knowlton et al., 1996; Hay, Moscovitch,& Lev<strong>in</strong>e, 2002). F<strong>in</strong>ally, <strong>the</strong> selective cognitive impairmentsresult<strong>in</strong>g from dopam<strong>in</strong>ergic medication havebeen attributed to an “overdose” of dopam<strong>in</strong>e <strong>in</strong> regionsof <strong>the</strong> BG that are relatively spared <strong>in</strong> PD (Gotham et al.,1988; Cools et al., 2001).To fur<strong>the</strong>r understand <strong>the</strong> role of <strong>the</strong> BG as a functionalcognitive unit, a more mechanistic explanation<strong>in</strong>volv<strong>in</strong>g its neurobiology, and specifically <strong>the</strong> role ofdopam<strong>in</strong>e, is required. What is <strong>the</strong> role of DA <strong>in</strong> <strong>the</strong>BG <strong>in</strong> modulat<strong>in</strong>g frontal processes, and how is it <strong>in</strong>volved<strong>in</strong> habit learn<strong>in</strong>g? This paper accounts for Park<strong>in</strong>sondeficits by <strong>in</strong>tegrat<strong>in</strong>g aspects of BG biology toge<strong>the</strong>rwith cellular and systems-level effects of dopam<strong>in</strong>e. Inparticular, two ma<strong>in</strong> populations of cells <strong>in</strong> <strong>the</strong> striatumrespond differentially to phasic changes <strong>in</strong> DA thought tooccur dur<strong>in</strong>g error feedback. This causes <strong>the</strong> two groupsof striatal cells to <strong>in</strong>dependently learn positive and negativere<strong>in</strong>forcement values of responses, and ultimatelyacts to facilitate or suppress <strong>the</strong> execution of commands<strong>in</strong> frontal cortex. Because <strong>the</strong>se cortical commands maydiffer widely <strong>in</strong> content, damage to BG DA gives rise toseem<strong>in</strong>gly unrelated deficits.This paper presents a neural network model that <strong>in</strong>corporates<strong>the</strong> above features to test <strong>the</strong>ir potential role<strong>in</strong> cognitive function. One of <strong>the</strong> network’s key emergentproperties is that a large dynamic range <strong>in</strong> DA releaseis critical for BG-dependent learn<strong>in</strong>g. That is, <strong>the</strong>DA signal has to both be able to <strong>in</strong>crease and decreasesubstantially from its basel<strong>in</strong>e levels <strong>in</strong> order to supportdiscrim<strong>in</strong>ation between outcome values of different responses.This dynamic range is reduced <strong>in</strong> PD, account<strong>in</strong>gfor cognitive deficits. The model fur<strong>the</strong>r suggests thatby tonically <strong>in</strong>creas<strong>in</strong>g DA levels, dopam<strong>in</strong>ergic medicationsmight restrict this dynamic range to always beat <strong>the</strong> high end of <strong>the</strong> DA spectrum, adversely affect<strong>in</strong>gsome aspects of cognition. For simplicity, only cognitiveprocedural learn<strong>in</strong>g tasks are modeled, but <strong>the</strong> samearguments can be extended to <strong>in</strong>clude <strong>in</strong>teractions withfrontal cortex <strong>in</strong> work<strong>in</strong>g memory (Frank, Loughry, &O’Reilly, 2001), as discussed later.Probabilistic Classification DeficitsProbabilistic classification deficits have been studiedus<strong>in</strong>g <strong>the</strong> “wea<strong>the</strong>r prediction” task (Knowlton et al.,1994). Participants study sets of cards with multiplecues and have to predict whe<strong>the</strong>r <strong>the</strong> cues presented <strong>in</strong>a given trial are associated with “ra<strong>in</strong>” or “sunsh<strong>in</strong>e”.The cue-outcome relationships are probabilistic and noteasily determ<strong>in</strong>ed. Healthy participants implicitly <strong>in</strong>tegrate<strong>in</strong>formation over multiple trials, progressively improv<strong>in</strong>gdespite not be<strong>in</strong>g able to explicitly state <strong>the</strong> basisof <strong>the</strong>ir choices (Gluck, Shohamy, & Myers, 2002).The BG seems to be recruited for this ability, as it isactivated dur<strong>in</strong>g <strong>the</strong> learn<strong>in</strong>g stages of <strong>the</strong> wea<strong>the</strong>r predictiontask (Poldrack, Prabakharan, Seger, & Gabrieli,1999), and is more generally engaged <strong>in</strong> tasks that emphasizenondeclarative memory (Poldrack, Clark, Pare-Blagoev, Shohamy, Moyano, Myers, & Gluck, 2001).The damaged BG <strong>in</strong> PD likely causes slowed learn<strong>in</strong>gobserved <strong>in</strong> patients, just as it has been implicated as asource for habit learn<strong>in</strong>g deficits <strong>in</strong> <strong>the</strong> motor doma<strong>in</strong>(e.g., Soliveri, Brown, Jahanshahi, Caraceni, & Marsden,1997; Thomas-Ollivier, Reymann, Le Moal, Schueck,Lieury, & Alla<strong>in</strong>, 1999). But how is <strong>the</strong> wea<strong>the</strong>r predictiontask related to habit learn<strong>in</strong>g, and exactly what aboutDA <strong>in</strong> <strong>the</strong> BG supports <strong>the</strong> learn<strong>in</strong>g of <strong>the</strong>se so-calledhabits?Insight comes from <strong>the</strong> observation that PD patientsare selectively impaired <strong>in</strong> cognitive procedural learn<strong>in</strong>gtasks that <strong>in</strong>volve trial-by-trial error feedback. Inpurely observational implicit learn<strong>in</strong>g tasks (e.g., artificialgrammar and prototype learn<strong>in</strong>g), patient performanceis normative (Reber & Squire, 1999). Among twoversions of conditional-associative SR learn<strong>in</strong>g, PD patientswere only impaired <strong>in</strong> <strong>the</strong> one that relied on trialand-error(Vriezen & Moscovitch, 1990). In implicitcategorization tasks, successful <strong>in</strong>tegration of <strong>in</strong>formationdepends on both error feedback (Ashby, Queller, &Berretty, 1999; Ashby, Maddox, & Bohil, 2002) and BG<strong>in</strong>tegrity (Ashby, Alfonso-Reese, Turken, & Waldron,


1998).Taken toge<strong>the</strong>r, <strong>the</strong>se observations support <strong>the</strong> notionthat feedback mediated learn<strong>in</strong>g occurs <strong>in</strong> <strong>the</strong> BG and is<strong>the</strong>refore disrupted <strong>in</strong> PD. Feedback may modulate DArelease <strong>in</strong> <strong>the</strong> BG that, <strong>in</strong> addition to hav<strong>in</strong>g a performanceeffect on response execution, is critical for cognitivere<strong>in</strong>forcement learn<strong>in</strong>g.Phasic Burst<strong>in</strong>g of DA Mediates Trial-and-Error Learn<strong>in</strong>gA healthy range of phasic DA bursts dur<strong>in</strong>g feedbackmay lead to <strong>the</strong> unconscious acquisition of stimulusreward-responseassociations. Data reviewed below suggeststhat positive and negative feedback have oppos<strong>in</strong>geffects on DA release, which <strong>in</strong> turn modulates synapticplasticity and <strong>the</strong>refore supports learn<strong>in</strong>g.A multitude of data <strong>in</strong> primates show that DA releas<strong>in</strong>gcells fire <strong>in</strong> phasic bursts <strong>in</strong> response to unexpectedreward (Schultz, 1998; Schultz, Dayan, & Montague,1997). Equally relevant but sometimes ignored,dopam<strong>in</strong>ergic fir<strong>in</strong>g dips below basel<strong>in</strong>e when a rewardis expected but not received (Hollerman & Schultz, 1998;Schultz, Apicella, & Ljungberg, 1993). In humans, phasicbursts and dips of DA have been <strong>in</strong>ferred to occurdur<strong>in</strong>g positive and negative feedback, respectively (Holroyd& Coles, 2002).Several l<strong>in</strong>es of evidence support <strong>the</strong> notion that <strong>the</strong>sechanges <strong>in</strong> extracellular levels of DA dur<strong>in</strong>g feedback arecritical for learn<strong>in</strong>g. First, DA modifies synaptic plasticity<strong>in</strong> animal experimental conditions. <strong>Dopam<strong>in</strong>e</strong> D1 receptorstimulation leads to long term potentiation (LTP),whereas D2 stimulation restricts LTP (Nishi, Snyder, &Greengard, 1997). Accord<strong>in</strong>gly, LTP is blocked by D1antagonists and enhanced by D2 antagonists (for a review,see Centonze, Picconi, Gubell<strong>in</strong>i, Bernardi, & Calabresi,2001). Second, <strong>the</strong>se effects are behaviorally relevant:adm<strong>in</strong>istration of D1 antagonists disrupted learn<strong>in</strong>g<strong>in</strong> an appetitive condition<strong>in</strong>g task, whereas D2 antagonistspromoted learn<strong>in</strong>g (Eyny & Horvitz, 2003).Third, because dopam<strong>in</strong>e modulates cellular excitability(Nicola, Surmeier, & Malenka, 2000), associative or“Hebbian” learn<strong>in</strong>g may be enhanced <strong>in</strong> <strong>the</strong> presence ofdopam<strong>in</strong>e, s<strong>in</strong>ce this type of learn<strong>in</strong>g depends on <strong>the</strong>levels of activity of <strong>the</strong> cells <strong>in</strong> question (Hebb, 1949;Schultz, 2002). Thus, <strong>the</strong> efficacy of recently activesynapses may be re<strong>in</strong>forced by a burst of DA act<strong>in</strong>g asa “teach<strong>in</strong>g signal”, lead<strong>in</strong>g to <strong>the</strong> learn<strong>in</strong>g of reward<strong>in</strong>gbehaviors (Wickens, 1997). This account predictsthat a delayed DA burst follow<strong>in</strong>g <strong>the</strong> behavior shoulddegrade learn<strong>in</strong>g by enhanc<strong>in</strong>g <strong>the</strong> strengths of <strong>in</strong>appropriatesynapses. In human category learn<strong>in</strong>g, substantialimpairments are <strong>in</strong>deed observed if feedback is delayedby just 2.5 seconds after each response (Maddox,Ashby, & Bohil, 2003).In summary, phasic bursts and dips of DA occurdifferentially dur<strong>in</strong>g positive and negative feedback, result<strong>in</strong> modification of synaptic plasticity, and <strong>the</strong>reforemay be critical for <strong>the</strong> learn<strong>in</strong>g of trial-and-error tasks.A plausible explanation for implicit category learn<strong>in</strong>gdeficits <strong>in</strong> PD is that damage to dopam<strong>in</strong>ergic neurons <strong>in</strong><strong>the</strong> BG reduces both <strong>the</strong> tonic and phasic levels of extracellularDA, dim<strong>in</strong>ish<strong>in</strong>g <strong>the</strong> effectiveness of <strong>the</strong> habitlearn<strong>in</strong>gsystem. Before mov<strong>in</strong>g on to a more explicitbiologically based version of this <strong>the</strong>ory, <strong>the</strong> next sectiondiscusses <strong>the</strong> effects of dopam<strong>in</strong>ergic medication on cognition.By artificially <strong>in</strong>creas<strong>in</strong>g levels of DA, medicationalleviates some cognitive deficits but actually givesrise to o<strong>the</strong>rs. This is taken to <strong>in</strong>dicate that <strong>the</strong> dynamicrange of <strong>the</strong> DA signal may be more critical than its rawlevel.Deficits Induced by <strong>Dopam<strong>in</strong>e</strong>rgic MedicationThe most common treatments for PD are DA agonistsand levodopa (L-Dopa), a DA precursor (Maruyama,Naoi, & Narabayashi, 1996). Many cognitive studies<strong>in</strong> PD do not take <strong>in</strong>to account <strong>the</strong> level of medicationadm<strong>in</strong>istered to <strong>the</strong> patient, somewhat confound<strong>in</strong>g <strong>the</strong><strong>in</strong>terpretation of experimental results. That is, if a nulleffect is found, it could be attributed to <strong>the</strong> successful replenishmentof DA by L-Dopa <strong>the</strong>rapy. Conversely, if aneffect is found, it is difficult to know if this effect stemsfrom a lack of DA <strong>in</strong> PD, or is somehow related to <strong>the</strong>medication. For <strong>in</strong>stance, medication results <strong>in</strong> elevatedlevels of tonic DA <strong>in</strong> undamaged areas. This may preventphasic dips from be<strong>in</strong>g effective and degrade performancewhen <strong>the</strong>y are functionally important (e.g., dur<strong>in</strong>gnegative feedback).A series of studies compared cognitive function <strong>in</strong>medicated versus non-medicated patients, f<strong>in</strong>d<strong>in</strong>g thatL-Dopa <strong>the</strong>rapy had positive or deleterious effects oncognitive function, depend<strong>in</strong>g on <strong>the</strong> nature of <strong>the</strong> task(Gotham et al., 1988; Swa<strong>in</strong>son et al., 2000; Cools et al.,2001). The general conclusion was that dopam<strong>in</strong>ergicmedication ameliorates task-switch<strong>in</strong>g deficits <strong>in</strong> PD, butthat it impairs performance <strong>in</strong> probabilistic reversal (i.e.,learn<strong>in</strong>g to reverse stimulus-reward probabilities afterprepotent responses are <strong>in</strong>gra<strong>in</strong>ed). Deficits <strong>in</strong>duced bymedication are selective to <strong>the</strong> reversal stage, <strong>in</strong> whichparticipants must use negative feedback to override prepotentresponses.The <strong>in</strong>terpretation given by <strong>the</strong>se authors stems from<strong>the</strong> fact that dopam<strong>in</strong>ergic damage <strong>in</strong> early stage PD isrestricted to <strong>the</strong> dorsal striatum, leav<strong>in</strong>g <strong>the</strong> ventral striatumwith normal levels of DA (Kish et al., 1988; Agidet al., 1993). This expla<strong>in</strong>s why DA medication alleviatesdeficits <strong>in</strong> task-switch<strong>in</strong>g, which relies on dor-


sal striatal <strong>in</strong>teractions with dorsolateral prefrontal cortex.However, <strong>the</strong> amount of medication necessary to replenish<strong>the</strong> dorsal striatum might “overdose” <strong>the</strong> ventralstriatum with DA, and is <strong>the</strong>refore detrimental to tasksthat recruit it. Reversal learn<strong>in</strong>g depends on <strong>the</strong> ventralstriatum and ventral prefrontal cortex <strong>in</strong> monkeys (Dias,Robb<strong>in</strong>s, & Roberts, 1996; Stern & Pass<strong>in</strong>gham, 1995)and recruits <strong>the</strong>se same areas <strong>in</strong> healthy humans (Cools,Clark, Owen, & Robb<strong>in</strong>s, 2002). The overdose hypo<strong>the</strong>sisis fur<strong>the</strong>r supported by <strong>the</strong> f<strong>in</strong>d<strong>in</strong>g that medicated, butnot non-medicated, patients exhibited impulsive bett<strong>in</strong>gstrategies <strong>in</strong> a gambl<strong>in</strong>g task known to recruit <strong>the</strong> ventralstriatum (Cools, Barker, Sahakian, & Robb<strong>in</strong>s, 2003).If <strong>the</strong> overdose account is accurate, a key questionis why should high levels of DA <strong>in</strong> <strong>the</strong> ventral striatumproduce deficits <strong>in</strong> reversal learn<strong>in</strong>g? Like categorizationtasks, reversal learn<strong>in</strong>g relies on trial-by-trial feedback.Dur<strong>in</strong>g positive feedback, phasic bursts of DA maystill be released. A notable difference is that higher levelsof tonic DA might functionally elim<strong>in</strong>ate <strong>the</strong> effectivenessof phasic dips <strong>in</strong> DA dur<strong>in</strong>g negative feedback.A DA agonist would cont<strong>in</strong>ue to b<strong>in</strong>d to receptors, as itis not modulated by feedback/reward as is endogenousdopam<strong>in</strong>e. This by-product of dopam<strong>in</strong>ergic medicationmay elim<strong>in</strong>ate an important aspect of <strong>the</strong> natural biologicalcontrol system — namely <strong>the</strong> ability to quickly unlearnpreviously reward<strong>in</strong>g behaviors. In non-medicatedpatients and healthy <strong>in</strong>dividuals, phasic dips <strong>in</strong> DA releasemay ensue after negative feedback <strong>in</strong> <strong>the</strong> reversalstage, allow<strong>in</strong>g <strong>the</strong> participant to unlearn <strong>the</strong> prepotentassociation. The overdose of DA <strong>in</strong> <strong>the</strong> ventral striatumof medicated patients would h<strong>in</strong>der this ability.So far it has been hypo<strong>the</strong>sized that cognitive deficits<strong>in</strong> PD arise from a restricted range of DA signals <strong>in</strong> <strong>the</strong>BG dur<strong>in</strong>g error feedback, that does not get completelyfixed with medication. This is somewhat vague <strong>in</strong> thatit does not clarify what about <strong>the</strong> BG supports implicitlearn<strong>in</strong>g, and how DA modulates processes <strong>in</strong> <strong>the</strong> BG.Why should phasic bursts and dips <strong>in</strong> DA support <strong>the</strong>learn<strong>in</strong>g and unlearn<strong>in</strong>g of responses, respectively? Tobe more clear, I now turn to a general (if highly simplified)description of BG circuitry and function, and review<strong>the</strong> role of DA <strong>in</strong> modulat<strong>in</strong>g this function. A neural networkmodel <strong>in</strong>stantiates <strong>the</strong>se biological properties andprovides a mechanistic account of probabilistic classificationand reversal deficits <strong>in</strong> PD. Besides be<strong>in</strong>g a usefultool for understand<strong>in</strong>g complex system <strong>in</strong>teractions<strong>in</strong> implicit learn<strong>in</strong>g, <strong>the</strong> model can be extended to <strong>in</strong>cludethose <strong>in</strong>volved <strong>in</strong> modulat<strong>in</strong>g prefrontal function<strong>in</strong> higher level cognition.<strong>Basal</strong> <strong>Ganglia</strong> Neuroanatomy andBiochemistryIn <strong>the</strong> context of motor control, various authors havesuggested that <strong>the</strong> BG selectively facilitates <strong>the</strong> executionof a s<strong>in</strong>gle motor command, while suppress<strong>in</strong>g all o<strong>the</strong>rs(e.g., M<strong>in</strong>k, 1996; Chevalier & Deniau, 1990). TheBG does not encode <strong>the</strong> details of motor responses —it simply modulates <strong>the</strong>ir execution by signal<strong>in</strong>g “Go”or “NoGo” (Hikosaka, 1989). Thus, <strong>the</strong> BG is thoughtto act as a brake on compet<strong>in</strong>g motor actions that arerepresented <strong>in</strong> motor (or premotor) cortex — only <strong>the</strong>most appropriate motor command is able to “release <strong>the</strong>brake” and get executed at any particular po<strong>in</strong>t <strong>in</strong> time.This functionality also helps to str<strong>in</strong>g simple motor commandstoge<strong>the</strong>r to form a complex motor sequence, byselect<strong>in</strong>g <strong>the</strong> most appropriate command at any givenportion of <strong>the</strong> sequence and <strong>in</strong>hibit<strong>in</strong>g <strong>the</strong> o<strong>the</strong>r ones until<strong>the</strong> time is appropriate. The circuitry that implements<strong>the</strong>se functions is described next.The <strong>in</strong>put segment of <strong>the</strong> BG is <strong>the</strong> striatum, whichis formed collectively by <strong>the</strong> caudate, putamen, and nucleusaccumbens. The striatum receives <strong>in</strong>put from multiplecortical areas and projects through <strong>the</strong> globus pallidusand substantia nigra to <strong>the</strong> thalamus, ultimatelyclos<strong>in</strong>g <strong>the</strong> circuit back to <strong>the</strong> area of cortex from whichit received (e.g., premotor cortex) (Alexander et al.,1986). 90-95% of all striatal neurons are GABAergicmedium sp<strong>in</strong>y neurons (MSN’s). These are projectioncells that carry <strong>in</strong>formation to be transmitted to BG outputstructures. The rema<strong>in</strong><strong>in</strong>g 5-10% are local <strong>in</strong>terneuronsthat are GABAergic and chol<strong>in</strong>ergic (Gerfen & Wilson,1996).The MSN’s project to <strong>the</strong> globus pallidus and substantianigra via two ma<strong>in</strong> pathways which have oppos<strong>in</strong>geffects on <strong>the</strong> ultimate excitation/<strong>in</strong>hibition of <strong>the</strong>thalamus (Alexander & Crutcher, 1990b). The “direct”pathway facilitates <strong>the</strong> execution of responses, whereas<strong>the</strong> “<strong>in</strong>direct” pathway <strong>in</strong>hibits <strong>the</strong>m. Cells <strong>in</strong> <strong>the</strong> directpathway project from <strong>the</strong> striatum and <strong>in</strong>hibit <strong>the</strong><strong>in</strong>ternal segment of <strong>the</strong> globus pallidus (GPi). 1 In <strong>the</strong>absence of striatal fir<strong>in</strong>g, <strong>the</strong> GPi tonically <strong>in</strong>hibits <strong>the</strong>thalamus, so <strong>the</strong> excitation of direct MSN’s and result<strong>in</strong>gGPi <strong>in</strong>hibition serves to dis<strong>in</strong>hibit <strong>the</strong> thalamus. Notethat <strong>the</strong> “double-negative” <strong>in</strong>voked by this dis<strong>in</strong>hibitiondoes not directly excite <strong>the</strong> thalamus, but <strong>in</strong>stead simplyenables <strong>the</strong> thalamus to get excited from o<strong>the</strong>r excitatoryprojections (e.g., Chevalier & Deniau, 1990; Franket al., 2001), <strong>the</strong>reby provid<strong>in</strong>g a gat<strong>in</strong>g function. Cells<strong>in</strong> <strong>the</strong> <strong>in</strong>direct pathway <strong>in</strong>hibit <strong>the</strong> external segment of1 The substantia nigra pars reticulata (SNr) is equivalent to <strong>the</strong> GPi<strong>in</strong> this circuitry, except that <strong>the</strong> former receives from <strong>the</strong> caudate, and<strong>the</strong> latter from <strong>the</strong> putamen. For this reason I consider <strong>the</strong> two to beone functional entity, but for simplicity only refer to GPi


excitatory<strong>in</strong>hibitorymodulatoryVTASNcGO NOGOD1striatum D2Directpremotor/prefrontalcortexGPeGPi/SNrIndirectthalamusFigure 1: The cortico-striato-thalamo-cortical loops, <strong>in</strong>clud<strong>in</strong>g<strong>the</strong> direct and <strong>in</strong>direct pathways of <strong>the</strong> basal ganglia. Thecells of <strong>the</strong> striatum are divided <strong>in</strong>to two sub-classes based ondifferences <strong>in</strong> biochemistry and efferent projections. The “Go”cells project directly to <strong>the</strong> GPi, and have <strong>the</strong> effect of dis<strong>in</strong>hibit<strong>in</strong>g<strong>the</strong> thalamus, <strong>the</strong>reby facilitat<strong>in</strong>g <strong>the</strong> execution of anaction represented <strong>in</strong> cortex. The “NoGo” cells are part of <strong>the</strong><strong>in</strong>direct pathway to <strong>the</strong> GPi, and have an oppos<strong>in</strong>g effect, suppress<strong>in</strong>gactions from gett<strong>in</strong>g executed. <strong>Dopam<strong>in</strong>e</strong> from <strong>the</strong>SNc projects to <strong>the</strong> dorsal striatum, differentially modulat<strong>in</strong>gactivity <strong>in</strong> <strong>the</strong> direct and <strong>in</strong>direct pathways by activat<strong>in</strong>g differentreceptors: The Go cells express <strong>the</strong> D1 receptor, and <strong>the</strong>NoGo cells express <strong>the</strong> D2 receptor. <strong>Dopam<strong>in</strong>e</strong> from <strong>the</strong> VTAprojects to ventral striatum (not shown) and frontal cortex. GPi:<strong>in</strong>ternal segment of globus pallidus; GPe: external segment ofglobus pallidus; SNc: substantia nigra pars compacta; SNr:substantia nigra pars reticulata; VTA: ventral tegmental area.<strong>the</strong> globus pallidus (GPe), which tonically <strong>in</strong>hibits <strong>the</strong>GPi. 2 The net effect of <strong>in</strong>direct MSN excitation is <strong>the</strong>nto fur<strong>the</strong>r <strong>in</strong>hibit <strong>the</strong> thalamus. See figure 1 for a pictorialdescription of this circuitry.When striatal cells <strong>in</strong> <strong>the</strong> direct pathway dis<strong>in</strong>hibit<strong>the</strong> thalamus, excitatory thalamocortical projections enhance<strong>the</strong> activity of <strong>the</strong> motor command that is currentlyrepresented <strong>in</strong> motor cortex so that its execution is facilitated.Thus, direct pathway cells send a “Go” signal toselect a given response. Indirect pathway activity, withits opposite effect on <strong>the</strong> thalamus, sends a “NoGo” signalto suppress <strong>the</strong> response. In support of this model,dist<strong>in</strong>ct neuronal activation was found <strong>in</strong> monkey striatum<strong>in</strong> response to two cues that <strong>in</strong>dicated whe<strong>the</strong>r <strong>the</strong>monkey had to execute (Go) or suppress (NoGo) armmovements (Apicella, Scarnati, Ljungberg, & Schultz,1992).Whe<strong>the</strong>r <strong>the</strong> direct and <strong>in</strong>direct pathways competewith each o<strong>the</strong>r or function <strong>in</strong>dependently is controversial.Because <strong>the</strong>y both ultimately converge on <strong>the</strong> GPibefore ei<strong>the</strong>r dis<strong>in</strong>hibit<strong>in</strong>g or fur<strong>the</strong>r <strong>in</strong>hibit<strong>in</strong>g <strong>the</strong> thalamus,it would seem <strong>the</strong>se two pathways act competi-2 The GPe <strong>in</strong>hibition of GPi actually <strong>in</strong>volves two sub-pathways,both directly and <strong>in</strong>directly through <strong>the</strong> subthalamic nucleus, but I donot consider <strong>the</strong> role of <strong>the</strong> latter structure here.tively to control BG output (Percheron & Filion, 1991).However, subregions of neurons <strong>in</strong> striatum term<strong>in</strong>ate<strong>in</strong> dist<strong>in</strong>ct subregions with<strong>in</strong> GPi, suggest<strong>in</strong>g <strong>the</strong> existenceof <strong>in</strong>dependent parallel sub-loops with<strong>in</strong> <strong>the</strong> overallthalamocortical circuit (Alexander & Crutcher, 1990b),ra<strong>the</strong>r than a competitive dynamic. It is plausible that<strong>the</strong> direct and <strong>in</strong>direct pathways aris<strong>in</strong>g from a subegion<strong>in</strong> striatum converge <strong>in</strong> <strong>the</strong> GPi and act competitivelyto facilitate/suppress a particular response, but that thiscompetitive dynamic occurs <strong>in</strong> parallel for multiple responses.This may allow for selective control of differentresponses, so that one response may be enabled, whileo<strong>the</strong>rs are suppressed (M<strong>in</strong>k, 1996; Frank et al., 2001;Beiser & Houk, 1998). Selection of a given responsemay <strong>in</strong>volve a Go signal to one area of thalamus, <strong>in</strong> conjunctionwith a NoGo signal to thalamic areas <strong>in</strong>volved<strong>in</strong> compet<strong>in</strong>g responses.Cellular Mechanisms of DA <strong>in</strong> <strong>the</strong> BGThe dynamics of BG circuitry are importantly modulatedby phasic changes <strong>in</strong> DA. <strong>Dopam<strong>in</strong>e</strong> is primarilyexcitatory to D1 receptors and <strong>in</strong>hibitory to D2 receptors(see below). The functional consequences ofDA release can be deduced from <strong>the</strong> relative segregationof D1 and D2 receptor expression <strong>in</strong> two ma<strong>in</strong> populationsof striatal MSN’s. The D1 receptor is predom<strong>in</strong>antlyexpressed <strong>in</strong> striatal cells of <strong>the</strong> direct pathway,whereas <strong>the</strong> D2 receptor predom<strong>in</strong>ates <strong>in</strong> <strong>the</strong> <strong>in</strong>directpathway (Gerfen, 1992; Gerfen & Keefe, 1994; Gerfen,Keefe, & Gauda, 1995; Bloch & LeMo<strong>in</strong>e, 1994;Ince, Ciliax, & Levey, 1997). Even those who cautionthat <strong>the</strong>re is D1/D2 colocalization <strong>in</strong> both BG pathways(Aizman et al., 2000; Surmeier et al., 1996) never<strong>the</strong>lessconcede that <strong>the</strong> relative levels of expression is asymmetrical.Thus, <strong>in</strong>creased levels of DA activate <strong>the</strong> direct/Gopathway and suppress <strong>the</strong> <strong>in</strong>direct/NoGo pathway(e.g., Gurney, Prescott, & Redgrave, 2001; Brown,Bullock, & Grossberg, 2004). DA depletion of <strong>the</strong> striatum(as <strong>in</strong> PD) has <strong>the</strong> opposite effect, bias<strong>in</strong>g <strong>the</strong> <strong>in</strong>directpathway to be overactive (Gerfen, 2000; Sal<strong>in</strong>, Hajji,& Kerkerian-Le Goff, 1996).D1 Excites / Enhances Contrast of Go CellsMany studies show an excitatory effect of D1 stimulation(e.g., Kitai, Sugimori, & Kocsis, 1976), butconflict<strong>in</strong>g data also exist (Hernandez-Lopez, Bargas,Surmeier, Reyes, & Galarraga, 1997). One <strong>in</strong>trigu<strong>in</strong>gpossibility raised by several researchers is that DA effectivelyenhances contrast or <strong>in</strong>creases <strong>the</strong> signal-to-noiseratio, by amplify<strong>in</strong>g activity of <strong>the</strong> most active cells while<strong>in</strong>hibit<strong>in</strong>g <strong>the</strong> least active ones (Cohen, Braver, & Brown,2002; Cohen & Servan-Schreiber, 1992; Foote & Morrison,1987; Rolls, Thorpe, Boytim, Szabo, & Perrett,1984). Recently, cellular mechanisms <strong>in</strong> <strong>the</strong> BG were


discovered that could potentially support this function(Nicola et al., 2000; Hernandez-Lopez et al., 1997, figure2). Specifically, <strong>the</strong> excitatory/<strong>in</strong>hibitory effect ofdopam<strong>in</strong>e on D1 receptors <strong>in</strong> <strong>the</strong> rat basal ganglia dependson <strong>the</strong> rest<strong>in</strong>g membrane potential of <strong>the</strong> targetcell. In <strong>the</strong> presence of a D1 agonist, spontaneous fir<strong>in</strong>gwas reduced <strong>in</strong> cells that were held at a low membranepotential, but was <strong>in</strong>creased <strong>in</strong> those held at a moredepolarized potential. This is biologically relevant, becausemedium sp<strong>in</strong>y cells of <strong>the</strong> BG are bi-stable: <strong>the</strong>yoscillate between two levels of rest<strong>in</strong>g membrane potential(Wilson, 1993; Wilson & Kawaguchi, 1996). The“down-state” refers to a membrane potential of around-80 mV, thought to result from <strong>in</strong>ward rectify<strong>in</strong>g K + currents.The “up-state” refers to a membrane potential ofaround -50 mV, and results from <strong>the</strong> open<strong>in</strong>g of voltagedependent Na + and Ca 2+ channels, upon be<strong>in</strong>g excitedfrom temporally coherent, convergent excitatory synaptic<strong>in</strong>put.Taken toge<strong>the</strong>r, <strong>the</strong> above observations suggest thatdopam<strong>in</strong>e, act<strong>in</strong>g via D1 receptors, can sharpen contrastby amplify<strong>in</strong>g activity of cells that are <strong>in</strong> <strong>the</strong>ir upstatesand <strong>in</strong>hibit<strong>in</strong>g those <strong>in</strong> <strong>the</strong>ir down-states from fir<strong>in</strong>gspontaneously. This may have <strong>the</strong> effect of <strong>in</strong>creas<strong>in</strong>g<strong>the</strong> signal-to-noise ratio, because cells encod<strong>in</strong>g <strong>the</strong>relevant signal receive temporally coherent synaptic <strong>in</strong>putfrom multiple cortical afferents and are <strong>the</strong>refore <strong>in</strong><strong>the</strong>ir up-state, whereas those reflect<strong>in</strong>g biological noiseor o<strong>the</strong>r irrelevant background signals may be <strong>in</strong> <strong>the</strong>irdown-state and only fir<strong>in</strong>g spuriously. 3 The <strong>in</strong>creasedsignal-to-noise ratio <strong>in</strong> <strong>the</strong> direct pathway may help todeterm<strong>in</strong>e which among several responses is most appropriateto select.At a molecular level, D1 activation enhances L-typeCa 2+ channel currents <strong>in</strong> striatal medium sp<strong>in</strong>y neurons(Surmeier, Bargas, Hemm<strong>in</strong>gs, Nairn, & Greengard,1995). It appears that this is <strong>the</strong> mechanism by which <strong>the</strong>DA contrast sharpen<strong>in</strong>g is operat<strong>in</strong>g, because both <strong>the</strong>excitatory and <strong>in</strong>hibitory DA effects were blocked by L-type Ca 2+ channel antagonists (Hernandez-Lopez et al.,1997).D2 Inhibits NoGo Cells — “Releas<strong>in</strong>g <strong>the</strong> Brakes”In <strong>the</strong> case of <strong>the</strong> D1 receptor, <strong>the</strong> contrast achievedby its activation is thought to be by way of enhancementof L-type Ca 2+ channel currents, described above. Recently,it was shown that D2 activation reduces <strong>the</strong>sesame currents, <strong>the</strong>reby reduc<strong>in</strong>g neuronal excitability3 Note that some argue that noise suppression <strong>in</strong> <strong>the</strong> direct pathwayis accomplished not by D1 <strong>in</strong>hibition, but by low amounts of D2heteroreceptors act<strong>in</strong>g presynaptically to decrease glutamate release <strong>in</strong>corticostriatal synapses (O’Donnell, 2003; Maura, Giardi, & Raiteri,1988). This proposal does not conflict <strong>in</strong> any important way with <strong>the</strong>implementations of <strong>the</strong> model described later <strong>in</strong> <strong>the</strong> paper.Figure 2: Medium sp<strong>in</strong>y cells <strong>in</strong> <strong>the</strong> basal ganglia are bistable,spontaneously switch<strong>in</strong>g between two levels of rest<strong>in</strong>gmembrane potential, commonly labeled “up-state” (V m ≈−60mV ) and “down-state” (V m ≈ −80mV ), depend<strong>in</strong>g onamount of afferent drive. Cells are more likely to fire when<strong>in</strong> <strong>the</strong> up-state, but may still fire spontaneously <strong>in</strong> <strong>the</strong> downstate.<strong>Dopam<strong>in</strong>e</strong> D1 receptor activity may <strong>in</strong>crease <strong>the</strong> signalto-noiseratio. In <strong>the</strong> presence of D1 agonist SKF 81297, fir<strong>in</strong>gis A) reduced for cells that are <strong>in</strong> <strong>the</strong>ir “down-state”, but B)<strong>in</strong>creased for cells <strong>in</strong> <strong>the</strong>ir “up-state”. Reproduced with permissionfrom Hernandez-Lopez et al. (1997), Copyright 1997by <strong>the</strong> Society for Neuroscience.(Hernandez-Lopez, Tkatch, Perez-Garci, Galarraga, Bargas,Hamm, & Surmeier, 2000). Unlike <strong>the</strong> D1 modulatoryeffects, <strong>the</strong> D2 <strong>in</strong>hibition effect on neuronal excitationwas not found to be dependent on <strong>the</strong> membrane potentialof <strong>the</strong> target cell. These results confirmed a longheld assumption that DA suppresses activity <strong>in</strong> <strong>the</strong> <strong>in</strong>directpathway via D2 receptors, and that this is disrupted<strong>in</strong> PD (Alb<strong>in</strong>, Young, & Penney, 1989).Recall that striatal cells express<strong>in</strong>g D2 receptors predom<strong>in</strong>ate<strong>in</strong> <strong>the</strong> <strong>in</strong>direct/NoGo pathway, which is thoughtto act as a brake on a particular action or set of actions.DA can <strong>the</strong>n aid <strong>in</strong> releas<strong>in</strong>g <strong>the</strong> brake, by <strong>in</strong>hibit<strong>in</strong>gNoGo activity via D2 receptors, and allow<strong>in</strong>g<strong>the</strong> Go pathway to exert more <strong>in</strong>fluence on BG output.By this account, DA shifts <strong>the</strong> balance <strong>in</strong> <strong>the</strong> BG frombe<strong>in</strong>g “hesitant” to a more responsive state, effectivelylower<strong>in</strong>g <strong>the</strong> threshold for select<strong>in</strong>g/gat<strong>in</strong>g a response tobe executed. This expla<strong>in</strong>s why Park<strong>in</strong>son’s patients,who have a lack of DA <strong>in</strong> <strong>the</strong> BG, have difficulty <strong>in</strong>itiat<strong>in</strong>gmotor commands — without a reasonable amountof basal dopam<strong>in</strong>e release, <strong>the</strong> system is <strong>in</strong> a tonic state of“NoGo” because an overactive <strong>in</strong>direct pathway leads toexcessive cortical <strong>in</strong>hibition (Filion & Tremblay, 1991;Jell<strong>in</strong>ger, 2002). With enough DA, <strong>the</strong> balance is shiftedto “Go”, and <strong>the</strong> particular response that is executedmay depend on levels of activity of different subpopu-


lations — represent<strong>in</strong>g different responses — <strong>in</strong> <strong>the</strong> direct/Gocells. The D1 contrast enhancement mechanismdescribed above would aid <strong>in</strong> select<strong>in</strong>g <strong>the</strong> most appropriateresponse by boost<strong>in</strong>g its associated neural activity,while suppress<strong>in</strong>g that of all o<strong>the</strong>r Go cells.DA <strong>in</strong> <strong>the</strong> BG: Summary and Effects on Synaptic PlasticityIn summary, <strong>in</strong>creases <strong>in</strong> DA result <strong>in</strong> (a) <strong>in</strong>creasedcontrast enhancement <strong>in</strong> <strong>the</strong> direct pathway; and (b) suppressionof <strong>the</strong> <strong>in</strong>direct pathway. Phasic dips <strong>in</strong> DA have<strong>the</strong> opposite effect, releas<strong>in</strong>g <strong>the</strong> <strong>in</strong>direct pathway fromsuppression.An important consequence of DA performance effectson Go/NoGo activity levels is that <strong>the</strong>y driveactivity-dependent learn<strong>in</strong>g to synaptic <strong>in</strong>put. A well establishedpr<strong>in</strong>ciple should hold across both Go and NoGocells: more active cells undergo LTP whereas less activecells undergo LTD (e.g., Bear & Malenka, 1994).Once we account for differential effects of DA on excitability<strong>in</strong> <strong>the</strong> two BG pathways, this pr<strong>in</strong>ciple makesstraightforward predictions on <strong>the</strong>ir effects on plasticity.If DA bursts dur<strong>in</strong>g re<strong>in</strong>forcement are adaptive, <strong>the</strong>yshould have <strong>the</strong> complementary effects of <strong>in</strong>creas<strong>in</strong>g Golearn<strong>in</strong>g while decreas<strong>in</strong>g NoGo learn<strong>in</strong>g so that re<strong>in</strong>forcedresponses are more likely to be facilitated <strong>in</strong> <strong>the</strong>future. Because DA enhances activity <strong>in</strong> <strong>the</strong> direct pathway,bursts may <strong>in</strong>deed <strong>in</strong>duce LTP <strong>in</strong> Go cells. Fur<strong>the</strong>r,<strong>the</strong> <strong>in</strong>hibitory effects of DA <strong>in</strong> <strong>the</strong> <strong>in</strong>direct pathway may<strong>in</strong>duce LTD <strong>in</strong> NoGo cells so that <strong>the</strong>y learn to becomeless active. This hypo<strong>the</strong>sis is supported by demonstrationsthat DA <strong>in</strong>duces LTP via D1 receptors and LTD viaD2 receptors (Kerr & Wickens, 2001; Calabresi, Saiardi,Pisani, Baik, Centonze, Mercuri, Bernardi, & Borrelli,1997).The same pr<strong>in</strong>ciple can be applied to predict <strong>the</strong> effectof DA dips, which, if <strong>the</strong>y are adaptive, should enhanceNoGo learn<strong>in</strong>g so that non-re<strong>in</strong>forc<strong>in</strong>g responsesare actively suppressed <strong>in</strong> <strong>the</strong> future. Because DA dipsrelease NoGo cells <strong>in</strong> <strong>the</strong> <strong>in</strong>direct pathway from DA <strong>in</strong>hibition,<strong>the</strong> <strong>in</strong>creased NoGo activity should <strong>in</strong>duce LTP<strong>in</strong> NoGo cells. Although LTP has not been tested dur<strong>in</strong>gendogenous DA dips, this hypo<strong>the</strong>sis is <strong>in</strong>directly supportedby exam<strong>in</strong><strong>in</strong>g <strong>the</strong> effects of D2 receptor blockade,assum<strong>in</strong>g that DA dips decrease D2 stimulation andshould <strong>the</strong>refore have <strong>the</strong> same qualitative effects on <strong>the</strong><strong>in</strong>direct pathway as D2 blockade. When stimulated bycortical <strong>in</strong>puts, D2 blockade <strong>in</strong>creases burst<strong>in</strong>g activityand Fos expression of striatal cells <strong>in</strong> <strong>the</strong> <strong>in</strong>direct pathway(Robertson, V<strong>in</strong>cent, & Fibiger, 1992; F<strong>in</strong>ch, 1999),and results <strong>in</strong> enhanced corticostriatal LTP (Calabresiet al., 1997).Neural Model of BG and DAThe hypo<strong>the</strong>sis is that cognitive deficits <strong>in</strong> PD can beaccounted for by a reduced dynamic range of phasic DAsignals which reduces <strong>the</strong> ability to unconsciously learnGo/NoGo associations. This verbal explanation alone isnot sufficient, but may be substantially streng<strong>the</strong>ned bytest<strong>in</strong>g its feasibility <strong>in</strong> a computational model that <strong>in</strong>corporatesall <strong>the</strong> key elements. Such a model can generatenovel predictions because it gets at <strong>the</strong> underly<strong>in</strong>g sourceof cognitive dysfunction <strong>in</strong> PD. If validated, it can alsobe used as a tool to understand complex <strong>in</strong>volvement ofDA <strong>in</strong> <strong>the</strong> BG <strong>in</strong> o<strong>the</strong>r neurological disorders.The above anatomical and biochemical considerationsare syn<strong>the</strong>sized <strong>in</strong> a neural network model (figure3). The model learns to select one of two responsesto different <strong>in</strong>put stimuli. Direct and <strong>in</strong>direct pathwaysenable <strong>the</strong> model to learn conditions that are appropriatefor gat<strong>in</strong>g as well as those for suppress<strong>in</strong>g. Parallelsub-loops <strong>in</strong>dependently modulate each response, allow<strong>in</strong>gselective facilitation of one response with concurrentsuppression of <strong>the</strong> o<strong>the</strong>r. Projections from <strong>the</strong> substantianigra pars compacta (SNc) to <strong>the</strong> striatum <strong>in</strong>corporatemodulatory effects of DA. Phasic bursts and dips <strong>in</strong>SNc fir<strong>in</strong>g (and <strong>the</strong>refore simulated DA release) ensuefrom correct and <strong>in</strong>correct responses, respectively. Thesephasic changes drive learn<strong>in</strong>g by preferentially activat<strong>in</strong>g<strong>the</strong> direct pathway after a correct response and <strong>the</strong> <strong>in</strong>directpathway after an <strong>in</strong>correct response. The model istra<strong>in</strong>ed on simulated versions of <strong>the</strong> wea<strong>the</strong>r predictiontask and probabilistic reversal. Disruption to <strong>the</strong> DA systemas <strong>in</strong> PD and “overdose” cases produces results thatare qualitatively similar to those observed behaviorally.Mechanics of <strong>the</strong> ModelThe units <strong>in</strong> <strong>the</strong> model operate accord<strong>in</strong>g to a simplepo<strong>in</strong>t neuron function us<strong>in</strong>g rate-coded output activations,as implemented <strong>in</strong> <strong>the</strong> Leabra framework(O’Reilly, 1998; O’Reilly & Munakata, 2000). Thereare simulated excitatory and <strong>in</strong>hibitory synaptic <strong>in</strong>putchannels. Local <strong>in</strong>hibition <strong>in</strong> each of <strong>the</strong> layers is computedthrough a simple approximation to <strong>the</strong> effects of<strong>in</strong>hibitory <strong>in</strong>terneurons. Synaptic connection weightswere tra<strong>in</strong>ed us<strong>in</strong>g a re<strong>in</strong>forcement learn<strong>in</strong>g version ofLeabra. The learn<strong>in</strong>g algorithm <strong>in</strong>volves two phases, allow<strong>in</strong>gsimulation of feedback effects, and is more biologicallyplausible than standard error backpropagation.In <strong>the</strong> m<strong>in</strong>us phase, <strong>the</strong> network settles <strong>in</strong>to activity statesbased on <strong>in</strong>put stimuli and its synaptic weights, ultimately“choos<strong>in</strong>g” a response. In <strong>the</strong> plus phase, <strong>the</strong>network resettles <strong>in</strong> <strong>the</strong> same manner, with <strong>the</strong> only differencebe<strong>in</strong>g a change <strong>in</strong> simulated dopam<strong>in</strong>e: an <strong>in</strong>creasefor correct responses, and a dip for <strong>in</strong>correct re-


SNcStriatumGO−GP IntInputD1−GP Ext−D2−NOGOThalamusResp1 Resp2OutputPremotor CortexFigure 3: Neural network model of direct and <strong>in</strong>direct pathwaysof <strong>the</strong> basal ganglia, with differential modulation of <strong>the</strong>sepathways by DA <strong>in</strong> <strong>the</strong> SNc. The Premotor Cortex (PMC) selectsa response via direct projections from <strong>the</strong> Input. BG gat<strong>in</strong>gresults <strong>in</strong> bottom-up support from Thalamus, facilitat<strong>in</strong>g executionof <strong>the</strong> response <strong>in</strong> cortex. In <strong>the</strong> Striatum, <strong>the</strong> responsehas a Go representation (first column) that is stronger than itsNoGo representation (third column). This results <strong>in</strong> <strong>in</strong>hibitionof <strong>the</strong> left column of GPi and dis<strong>in</strong>hibition of <strong>the</strong> left Thalamusunit, ultimately facilitat<strong>in</strong>g <strong>the</strong> execution of Response1 <strong>in</strong>PMC. A tonic level of DA is shown here, dur<strong>in</strong>g <strong>the</strong> responseselection (“m<strong>in</strong>us”) phase. A burst or dip <strong>in</strong> DA ensues <strong>in</strong> <strong>the</strong>feedback (“plus”) phase (see figures 4 and 5), depend<strong>in</strong>g onwhe<strong>the</strong>r <strong>the</strong> response is correct or <strong>in</strong>correct for <strong>the</strong> particular<strong>in</strong>put stimulus.sponses. Connection weights are <strong>the</strong>n adjusted to learnon <strong>the</strong> difference between activity states <strong>in</strong> <strong>the</strong> m<strong>in</strong>us andplus phases.Overall Network Division of LaborThe network’s job is to select ei<strong>the</strong>r Response1 orResponse2, depend<strong>in</strong>g on <strong>the</strong> task and <strong>the</strong> sensory <strong>in</strong>put.At <strong>the</strong> beg<strong>in</strong>n<strong>in</strong>g of each trial, <strong>in</strong>com<strong>in</strong>g stimuli directlyactivate a response <strong>in</strong> premotor cortex (PMC). However,<strong>the</strong>se direct connections are not strong enough to elicit arobust response <strong>in</strong> and of <strong>the</strong>mselves; <strong>the</strong>y also requirebottom-up support from <strong>the</strong> thalamus. The job of <strong>the</strong> BGis to <strong>in</strong>tegrate stimulus <strong>in</strong>put with <strong>the</strong> dom<strong>in</strong>ant responseselected by PMC, and based on what it has learned <strong>in</strong> pastexperience, ei<strong>the</strong>r facilitate (Go) or suppress (NoGo) thatresponse.With<strong>in</strong> <strong>the</strong> overall thalamocortical circuit, <strong>the</strong>re aretwo parallel sub-loops that are isolated from each o<strong>the</strong>r,separately modulat<strong>in</strong>g <strong>the</strong> two responses. This allows for<strong>the</strong> BG to selectively gate one response, while cont<strong>in</strong>u<strong>in</strong>gto suppress <strong>the</strong> o<strong>the</strong>r(s). This was implemented <strong>in</strong> ourprevious model (Frank et al., 2001), and has been suggestedby o<strong>the</strong>rs (Beiser & Houk, 1998). The striatumis divided <strong>in</strong>to two distributed subpopulations. The twocolumns on <strong>the</strong> left are “Go” units for <strong>the</strong> two potentialresponses, and have simulated D1 receptors. The twocolumns on <strong>the</strong> right are “NoGo” units, and have simulatedD2 receptors. Thus, <strong>the</strong> four columns <strong>in</strong> <strong>the</strong> striatumrepresent, from left to right, “Go-Response1”; “Go-Response2”; “NoGo-Response1”; “NoGo-Response2”.The Go columns project only to <strong>the</strong> correspond<strong>in</strong>gcolumn <strong>in</strong> <strong>the</strong> GPi (direct pathway), and <strong>the</strong> NoGocolumns to <strong>the</strong> GPe (<strong>in</strong>direct pathway). Both GPecolumns <strong>in</strong>hibit <strong>the</strong> associated column <strong>in</strong> GPi, so thatstriatal Go and NoGo activity have oppos<strong>in</strong>g effects onGPi. F<strong>in</strong>ally, each column <strong>in</strong> GPi tonically <strong>in</strong>hibits <strong>the</strong>associated column of <strong>the</strong> thalamus, which is reciprocallyconnected to premotor cortex. Thus, if Go activity isstronger than NoGo activity for Response1, <strong>the</strong> left columnof GPi will be <strong>in</strong>hibited, remov<strong>in</strong>g tonic <strong>in</strong>hibition(i.e. dis<strong>in</strong>hibit<strong>in</strong>g) of <strong>the</strong> correspond<strong>in</strong>g thalamus unit,and facilitat<strong>in</strong>g its execution <strong>in</strong> premotor cortex.The above parallel and convergent connectivity issupported by anatomical evidence discussed above. Thenetwork architecture simply supports <strong>the</strong> existence ofconnections, but how <strong>the</strong>se ultimately <strong>in</strong>fluence behaviordepends on <strong>the</strong>ir relative strengths. The network startsoff with random weights and representations <strong>in</strong> both <strong>the</strong>BG and cortical layers are learned. Distributed activitywith<strong>in</strong> each striatal column enables different Go andNoGo representations to develop for various stimulusconfigurations dur<strong>in</strong>g <strong>the</strong> course of tra<strong>in</strong><strong>in</strong>g.Simulated Effects of DATo simulate differential effects of DA on D1 and D2receptors <strong>in</strong> <strong>the</strong> two populations of striatal cells, separateexcitatory and <strong>in</strong>hibitory projections were assigned from<strong>the</strong> SNc to <strong>the</strong> direct and <strong>in</strong>direct pathways <strong>in</strong> <strong>the</strong> striatum.Thus, <strong>the</strong> D1 projection only connects to <strong>the</strong> Gocolumns of <strong>the</strong> striatum, whereas <strong>the</strong> D2 projection connectsonly to <strong>the</strong> NoGo columns. Besides be<strong>in</strong>g excitatory,<strong>the</strong> effects of D1 activity <strong>in</strong>volve contrast enhancement.This was accomplished by <strong>in</strong>creas<strong>in</strong>g <strong>the</strong> striatalunits’ activation ga<strong>in</strong> (mak<strong>in</strong>g it more nonl<strong>in</strong>ear), <strong>in</strong> conjunctionwith <strong>in</strong>creas<strong>in</strong>g <strong>the</strong> activation threshold (so thatweakly active units do not exceed fir<strong>in</strong>g threshold andare suppressed). The effects of D2 activity are <strong>in</strong>hibitory,suppress<strong>in</strong>g <strong>the</strong> NoGo cells. Thus, for a high amount ofsimulated DA, contrast enhancement <strong>in</strong> <strong>the</strong> direct path-


way supports <strong>the</strong> enabl<strong>in</strong>g of a particular Go responsewhile <strong>the</strong> <strong>in</strong>direct pathway is suppressed.DA Modulates Learn<strong>in</strong>gIncreases <strong>in</strong> DA dur<strong>in</strong>g positive feedback lead to re<strong>in</strong>forc<strong>in</strong>g<strong>the</strong> selected response, whereas decreases <strong>in</strong> DAdur<strong>in</strong>g negative feedback lead to learn<strong>in</strong>g not to selectthat respose. A tonic level of DA is simulated by sett<strong>in</strong>g<strong>the</strong> SNc units to be semi-active (activation value 0.5) at<strong>the</strong> start of each trial, <strong>in</strong> <strong>the</strong> m<strong>in</strong>us phase. In <strong>the</strong> <strong>in</strong>itialstages of tra<strong>in</strong><strong>in</strong>g, <strong>the</strong> network selects a random response,dictated by random <strong>in</strong>itial weights toge<strong>the</strong>r with a smallamount of random noise <strong>in</strong> premotor cortex activity. If<strong>the</strong> response is correct, a phasic <strong>in</strong>crease <strong>in</strong> SNc fir<strong>in</strong>goccurs <strong>in</strong> <strong>the</strong> plus phase, with all SNc units set to havean activation value of 1.0 (i.e., high fir<strong>in</strong>g rate). Thisburst of DA causes a more coherent Go representation<strong>in</strong> <strong>the</strong> striatum to be associated with <strong>the</strong> reward<strong>in</strong>g responsethat was just selected. For an <strong>in</strong>correct responsea phasic dip of DA occurs, with all SNc units set to zeroactivation. In this case, <strong>the</strong> NoGo cells are released fromsuppression, enabl<strong>in</strong>g <strong>the</strong> network to learn NoGo to <strong>the</strong>selected <strong>in</strong>correct response.Note that an explicit supervised tra<strong>in</strong><strong>in</strong>g signal isnever presented; <strong>the</strong> model simply learns based on <strong>the</strong>difference between activity states <strong>in</strong> <strong>the</strong> m<strong>in</strong>us and plusphases, which only differ due to phasic changes <strong>in</strong> DA.Weights from <strong>the</strong> <strong>in</strong>put layer and premotor cortex are adjustedso that over time, <strong>the</strong> striatum learns which responsesto facilitate and which to suppress <strong>in</strong> <strong>the</strong> contextof <strong>in</strong>com<strong>in</strong>g sensory <strong>in</strong>put. In addition, <strong>the</strong> premotor cortexitself learns to favor a given response for a particular<strong>in</strong>put stimulus, via Hebbian learn<strong>in</strong>g from <strong>the</strong> <strong>in</strong>put layer.Thus, <strong>the</strong> BG <strong>in</strong>itially learns which response to gate viaphasic changes <strong>in</strong> DA ensu<strong>in</strong>g from random cortical responses,and <strong>the</strong>n this learn<strong>in</strong>g transfers to cortex onceit starts to select <strong>the</strong> correct response more often thannot. This reflects <strong>the</strong> idea that <strong>the</strong> BG is not a stimulusresponsemodule, but ra<strong>the</strong>r modulates <strong>the</strong> gat<strong>in</strong>g of responsesthat are selected <strong>in</strong> cortex.two responses if its associated “Go” representation isstrong enough, facilitat<strong>in</strong>g its execution and suppress<strong>in</strong>gthat of <strong>the</strong> alternative response. If <strong>the</strong> probabilisticallydeterm<strong>in</strong>ed feedback to <strong>the</strong> selected response is positive,a phasic DA burst is applied <strong>in</strong> <strong>the</strong> plus phase, result<strong>in</strong>g<strong>in</strong> an enhanced Go representation and associated learn<strong>in</strong>g.Negative feedback results <strong>in</strong> a phasic dip of DA <strong>in</strong><strong>the</strong> plus phase, releas<strong>in</strong>g NoGo cells from suppressionand allow<strong>in</strong>g <strong>the</strong> network to learn not to gate <strong>the</strong> selectedresponse. Over <strong>the</strong> course of tra<strong>in</strong><strong>in</strong>g, networks <strong>in</strong>tegrateGo and NoGo signals <strong>in</strong> <strong>the</strong> context of different cue comb<strong>in</strong>ationsto learn when is most appropriate to gate sunand ra<strong>in</strong> responses.Performance measures <strong>in</strong>volve percentage of optimalresponses, ra<strong>the</strong>r than percentage of responses thatwere associated with (probabilistically determ<strong>in</strong>ed) positivefeedback provided to <strong>the</strong> network. Thus, <strong>in</strong>dividualresponses that had negative outcomes, but were actually<strong>the</strong> best choice accord<strong>in</strong>g to <strong>the</strong> odds, were scored correctly.Similarly, positive outcome responses that weresuboptimal were scored <strong>in</strong>correctly. These optimal respond<strong>in</strong>gmeasures are consistently used <strong>in</strong> <strong>the</strong> behavioralparadigms (e.g., Gluck et al., 2002). Of course, networkswere not tra<strong>in</strong>ed with this error measure but wereprovided <strong>the</strong> same probabilistic feedback that wouldhave been given to <strong>the</strong> human participant.Fur<strong>the</strong>r implementational details of <strong>the</strong> wea<strong>the</strong>r predictiontask are described <strong>in</strong> <strong>the</strong> Methods section.Probabilistic Classification SimulationsThe wea<strong>the</strong>r prediction (WP) task (Knowlton et al.,1994) <strong>in</strong>volves present<strong>in</strong>g cards made up of four possiblecues that have different probabilities of be<strong>in</strong>g associatedwith “ra<strong>in</strong>” or “sun”. The predictability of <strong>the</strong> <strong>in</strong>dividualcues is 75.6%, 57.5%, 42.5% and 24.4%. Actual trials <strong>in</strong>volvepresent<strong>in</strong>g from one to three cues simultaneously,for a total of 14 cue comb<strong>in</strong>ations, mak<strong>in</strong>g it difficult tobecome explicitly aware of <strong>the</strong> probability structure.In <strong>the</strong> network, cues are presented <strong>in</strong> <strong>the</strong> <strong>in</strong>put, andpotential responses are immediately but weakly activated<strong>in</strong> premotor cortex (figure 4). The BG gates one of <strong>the</strong>


Wea<strong>the</strong>r Prediction Trial: Positive FeedbackIntact(choice)Intact(feedback)Resp1 Resp2Resp1 Resp2Tonic DAInputOutputInputOutputSNcSNcD1D2Premotor CortexDA BurstEnhances Go,Suppresses NogoD1D2Premotor CortexStriatumStriatumGO−NOGO−GO−NOGO−−GP Ext−GP ExtGP Int−GP Int−ThalamusThalamusPD(choice)PD(feedback)Resp1 Resp2Resp1 Resp2Low Tonic DAInputOutputInputOutputSNcSNcD1D2Premotor CortexSmaller Burst:Less Go/NogoDifferentiationD1D2Premotor CortexStriatumStriatumGO−NOGO−GO−NOGO−−GP Ext−GP ExtGP Int−GP Int−ThalamusThalamusFigure 4: A positive feedback trial <strong>in</strong> <strong>the</strong> wea<strong>the</strong>r prediction task, for both <strong>in</strong>tact and “Park<strong>in</strong>son” networks. This trial consistsof two cues, represented by <strong>the</strong> two columns of active units <strong>in</strong> <strong>the</strong> Input layer. Intact(choice): Response selection (m<strong>in</strong>us phase)activity <strong>in</strong> <strong>the</strong> <strong>in</strong>tact network. Early <strong>in</strong> tra<strong>in</strong><strong>in</strong>g, <strong>the</strong> BG has not learned to gate ei<strong>the</strong>r response, as shown by an active GPi and<strong>in</strong>hibited thalamus. Premotor cortex (PMC) is weakly active due to direct connections from sensory <strong>in</strong>put. The most active (left)unit <strong>in</strong> PMC, correspond<strong>in</strong>g to “sun”, determ<strong>in</strong>es <strong>the</strong> Output response. Intact(feedback): Because <strong>the</strong> model “guessed” correctly,a phasic burst of DA fir<strong>in</strong>g occurs <strong>in</strong> <strong>the</strong> SNc. This has <strong>the</strong> effect of activat<strong>in</strong>g Go units associated with <strong>the</strong> selected response (viaD1 contrast enhancement), and suppress<strong>in</strong>g NoGo units (via D2 <strong>in</strong>hibition). Weights are adjusted based on differences <strong>in</strong> networkactivity between <strong>the</strong> m<strong>in</strong>us and plus phases. The enhanced Go representation <strong>in</strong> <strong>the</strong> plus phase drives learn<strong>in</strong>g to gate <strong>the</strong> “sun”response. PD(choice): Response selection (m<strong>in</strong>us phase) activity <strong>in</strong> <strong>the</strong> PD network, for <strong>the</strong> same trial. Note <strong>the</strong> reduced numberof <strong>in</strong>tact SNc units, which causes <strong>the</strong> NoGo units to be more active. Aga<strong>in</strong>, <strong>the</strong>re is no BG gat<strong>in</strong>g early <strong>in</strong> tra<strong>in</strong><strong>in</strong>g and <strong>the</strong> modelselects a random response from sensory-motor projections. PD(feedback): Correct guess<strong>in</strong>g leads to a phasic burst of DA <strong>in</strong> <strong>the</strong>plus phase. However, this phasic burst is not as effective because it applies to only one SNc unit, and <strong>the</strong>refore only weakly activatesmore Go units while some NoGo activity persists. Reduced dynamic range of DA <strong>in</strong> <strong>the</strong> PD network results <strong>in</strong> less difference <strong>in</strong>activity levels between <strong>the</strong> two phases of network settl<strong>in</strong>g, caus<strong>in</strong>g less learn<strong>in</strong>g.


Wea<strong>the</strong>r Prediction Trial: Negative FeedbackIntact(choice)Intact(feedback)Resp1 Resp2Resp1 Resp2Tonic DAInputOutputInputOutputSNcSNcD1D2Premotor CortexDA DipActivatesNogoD1D2Premotor CortexGP IntStriatumGO−Gat<strong>in</strong>gof Incorrect"ra<strong>in</strong>" Response−NOGO−GP Ext−ThalamusStriatumGO−GP Int−NOGO−GP Ext−ThalamusPD(choice)PD(feedback)Resp1 Resp2Resp1 Resp2Low Tonic DAInputOutputInputOutputSNcSNcD1D2Premotor CortexSmaller Dip:Less NogoActivationD1D2Premotor CortexPrior Go Learn<strong>in</strong>gSufficient toGate ResponseStriatumStriatumGO−NOGO−GO−NOGO−−GP Ext−GP ExtGP Int−GP Int−ThalamusThalamusFigure 5: A negative feedback trial <strong>in</strong> <strong>the</strong> wea<strong>the</strong>r prediction task for both <strong>in</strong>tact and “Park<strong>in</strong>son” networks. Intact(choice): As<strong>in</strong>gle cue is presented. Based on previous learn<strong>in</strong>g, <strong>the</strong> Go units for “ra<strong>in</strong>” are sufficiently active to gate that response, <strong>in</strong>dicated by<strong>the</strong> <strong>in</strong>hibition of right GPi units and dis<strong>in</strong>hibition of <strong>the</strong> right thalamus unit. Intact(feedback): The feedback on this particular trialis negative (due to probabilistic outcomes), shown by a phasic dip of DA fir<strong>in</strong>g <strong>in</strong> <strong>the</strong> <strong>the</strong> SNc. The lack of DA removes suppressionof NoGo units via D2 receptors, which are <strong>the</strong>n more active than <strong>the</strong> Go units. The DA dip <strong>the</strong>refore drives NoGo learn<strong>in</strong>g to <strong>the</strong><strong>in</strong>correct response selected for this cue. Note <strong>the</strong> Output layer displays <strong>the</strong> target response for <strong>the</strong> trial, but this is not used as atra<strong>in</strong><strong>in</strong>g signal: <strong>the</strong> only signal driv<strong>in</strong>g learn<strong>in</strong>g is <strong>the</strong> change <strong>in</strong> SNc DA. PD(choice): The PD network has also learned to gate<strong>the</strong> “ra<strong>in</strong>” response for this same trial, based on previous learn<strong>in</strong>g. PD(feedback): Feedback is <strong>in</strong>correct, and <strong>the</strong> phasic dip of DA<strong>in</strong> <strong>the</strong> SNc leads to activation of some NoGo units. However, <strong>the</strong> PD network already had low amounts of tonic DA, caus<strong>in</strong>g anoverall propensity for NoGo learn<strong>in</strong>g, so this phasic dip is smaller and <strong>the</strong>refore not as effective. Reduced dynamic range of DA <strong>in</strong><strong>the</strong> PD network results <strong>in</strong> less difference <strong>in</strong> activity levels between <strong>the</strong> two phases of network settl<strong>in</strong>g.


Probabilistic Reversal Trial <strong>in</strong> Reversal StageIntact(choice)Intact(feedback)Resp1 Resp2Resp1 Resp2InputOutputInputOutputSNcSNcD1D2Premotor CortexPrepotentResponseFromAcquisitionStageDA DipActivatesNogoD1D2Premotor CortexStriatumStriatumGO−NOGO−GO−NOGO−−GP Ext−GP ExtGP Int−GP Int−ThalamusThalamusOD(choice)OD(feedback)Resp1 Resp2Resp1 Resp2InputOutputInputOutputSNcSNcD1D2Premotor CortexSmaller Dip:Less NogoActivationD1D2Premotor CortexSame PrepotentResponse Learned<strong>in</strong> Acquisition StageStriatumStriatumGO−NOGO−GO−NOGO−−GP Ext−GP ExtGP Int−GP Int−ThalamusThalamusFigure 6: A trial <strong>in</strong> <strong>the</strong> probabilistic reversal task. Two cues are presented at <strong>the</strong> <strong>in</strong>put, and <strong>the</strong> model has to select one of<strong>the</strong>m (see Methods for details). The trial shown here is <strong>in</strong> <strong>the</strong> reversal stage, dur<strong>in</strong>g which <strong>the</strong> model has to learn “NoGo” to<strong>the</strong> prepotent response before it can switch to select<strong>in</strong>g <strong>the</strong> alternative. Reduced dynamic range of DA <strong>in</strong> <strong>the</strong> “overdosed” (OD)network causes degraded ability to learn NoGo. Intact(choice): Based on learn<strong>in</strong>g <strong>in</strong> <strong>the</strong> acquisition stage, <strong>the</strong> network chooses <strong>the</strong>stimulus on <strong>the</strong> right (Resp2). Intact(feedback): In <strong>the</strong> reversal stage, this choice is <strong>in</strong>correct. Phasic dips <strong>in</strong> SNc release NoGounits from suppression, so that <strong>the</strong> network can subsequently learn not to perseverate. OD(choice): The same trial is presented to<strong>the</strong> OD network. Unimpaired Go learn<strong>in</strong>g <strong>in</strong> <strong>the</strong> acquisition stage results <strong>in</strong> selection of <strong>the</strong> same response as <strong>the</strong> <strong>in</strong>tact network.OD(feedback): A phasic dip is applied to SNc on <strong>in</strong>correct trials <strong>in</strong> <strong>the</strong> reversal stage, but a residual level of activation due tosimulated medication results <strong>in</strong> weaker activation of NoGo units. The network <strong>the</strong>n takes longer to unlearn <strong>the</strong> <strong>in</strong>itial response,caus<strong>in</strong>g reversal deficits.


Simulated Park<strong>in</strong>sonismPark<strong>in</strong>son’s disease was simulated by “lesion<strong>in</strong>g”three out of four SNc units so that <strong>the</strong>y were tonically<strong>in</strong>active, represent<strong>in</strong>g <strong>the</strong> cell death of approximately 75-80% of dopam<strong>in</strong>ergic neurons <strong>in</strong> this area (bottom offigures 4 and 5). This has <strong>the</strong> effect of reduc<strong>in</strong>g tonicDA <strong>in</strong> <strong>the</strong> m<strong>in</strong>us phase, as well as phasic DA dur<strong>in</strong>gfeedback <strong>in</strong> <strong>the</strong> plus phase. Although <strong>the</strong> percentage <strong>in</strong>crease/decrease<strong>in</strong> phasic fir<strong>in</strong>g relative to basel<strong>in</strong>e is <strong>the</strong>same for <strong>in</strong>tact and Park<strong>in</strong>son networks, <strong>the</strong> total amountof DA is reduced by a factor of four, result<strong>in</strong>g <strong>in</strong> reduceddynamic range of <strong>the</strong> DA signal. <strong>Dynamic</strong> rangeis critical for learn<strong>in</strong>g appropriate Go/NoGo representationsfrom error feedback, as network weights are adjustedbased on difference <strong>in</strong> activity states <strong>in</strong> <strong>the</strong> twophases of network settl<strong>in</strong>g. NoGo learn<strong>in</strong>g is degraded<strong>in</strong> PD networks because tonic levels of DA are alreadylow, so <strong>the</strong> phasic dip dur<strong>in</strong>g negative feedback has lesseffect. Go learn<strong>in</strong>g is degraded because limited amountsof available DA reduce <strong>the</strong> potency of phasic bursts, activat<strong>in</strong>gless of a Go representation dur<strong>in</strong>g positive feedback.Less DA <strong>in</strong> PD also dim<strong>in</strong>ishes <strong>the</strong> contrast enhancementeffects of D1 receptor stimulation, fur<strong>the</strong>r weaken<strong>in</strong>g<strong>the</strong> learn<strong>in</strong>g of Go signals. Smaller bursts of DA <strong>in</strong>PD nets led to less contrast enhancement dur<strong>in</strong>g positivefeedback, by reduc<strong>in</strong>g <strong>the</strong> change <strong>in</strong> unit activation ga<strong>in</strong>and threshold by a factor of four (see Methods). Thus,degraded Go learn<strong>in</strong>g is exacerbated because of reducedcontrast enhancement that would normally amplify <strong>the</strong>Go signal dur<strong>in</strong>g positive feedback.Test<strong>in</strong>g <strong>the</strong> Contribution of <strong>the</strong> Indirect PathwayS<strong>in</strong>ce o<strong>the</strong>r BG models <strong>in</strong>clude <strong>the</strong> direct, but notnecessarily <strong>the</strong> <strong>in</strong>direct pathway, <strong>the</strong> contribution of <strong>the</strong>latter was evaluated <strong>in</strong> two different conditions. First,<strong>the</strong> <strong>in</strong>direct pathway was disconnected: NoGo units <strong>in</strong><strong>the</strong> striatum no longer projected to <strong>the</strong> GPe. In <strong>the</strong>senetworks, NoGo units were still activated by synaptic<strong>in</strong>put and modulated by DA, but had no effect on BGoutput. Instead, GPe units tonically <strong>in</strong>hibited <strong>the</strong> GPi.This manipulation elim<strong>in</strong>ates <strong>the</strong> effects of <strong>the</strong> <strong>in</strong>directpathway so that all discrim<strong>in</strong>ation learn<strong>in</strong>g must be accomplishedby compar<strong>in</strong>g Go associations <strong>in</strong> <strong>the</strong> directpathway. Although it is technically possible that this manipulationsimply lowers <strong>the</strong> threshold for gat<strong>in</strong>g <strong>in</strong> <strong>the</strong>direct pathway by provid<strong>in</strong>g more tonic <strong>in</strong>hibition to GPi(i.e., less overall NoGo), this possibility was accountedfor by vary<strong>in</strong>g <strong>the</strong> strength of GPe-GPi <strong>in</strong>hibitory projectionsfrom zero to maximal <strong>in</strong>hibition. Results reportedbelow are for <strong>the</strong> best of <strong>the</strong>se cases, which is still substantiallyworse than <strong>the</strong> full BG model.A second test of <strong>the</strong> <strong>in</strong>direct pathway was conductedto evaluate <strong>the</strong> role of response-specific NoGo representations.The hypo<strong>the</strong>sis advocated <strong>in</strong> this paper is thateach response develops both Go and NoGo representationsas a result of positive and negative feedback, andthat <strong>the</strong>se representations compete <strong>in</strong> order to facilitateor suppress <strong>the</strong> response. However, it is also possiblethat only Go representations <strong>in</strong> <strong>the</strong> striatum are responsespecific,and that <strong>the</strong> <strong>in</strong>direct pathway represents a moreglobal NoGo signal. In <strong>the</strong> model, this condition wastested by mak<strong>in</strong>g GPe units <strong>in</strong> each column project toboth columns of GPi (ra<strong>the</strong>r than to just <strong>the</strong> correspond<strong>in</strong>gcolumn), so that NoGo units <strong>in</strong> <strong>the</strong> striatum had <strong>the</strong>same effect on both responses. Because this may amountto more overall <strong>in</strong>hibition from GPe to GPi, once aga<strong>in</strong><strong>the</strong> strength of <strong>the</strong>se <strong>in</strong>hibitory projections was variedfrom zero to maximal and <strong>the</strong> best case results reported.Probabilistic Reversal SimulationIn <strong>the</strong> probabilistic reversal (PR) task (Swa<strong>in</strong>sonet al., 2000), <strong>the</strong> participant is presented with two stimulion a touch-sensitive computer screen and has to chooseone of <strong>the</strong>m (by touch<strong>in</strong>g it). Feedback provided aftereach response is probabilistic, with a 80:20% ratio of re<strong>in</strong>forcementfor <strong>the</strong> ‘correct’ stimulus. After a number oftrials, <strong>the</strong> probabilities of correct feedback are suddenlyreversed, unbeknownst to <strong>the</strong> participant.In <strong>the</strong> model, tra<strong>in</strong><strong>in</strong>g <strong>in</strong>volves two stages: acquisitionand reversal. In <strong>the</strong> acquisition stage, <strong>the</strong> networkhad to learn which of two stimuli to select. The probabilitiesassociated with correct response were 80% for select<strong>in</strong>gstimulus 1 and 20% for select<strong>in</strong>g stimulus 2. Positivefeedback was associated with a DA burst, and negativefeedback was associated with a dip. After 50 blocksof trials, <strong>the</strong>se probabilities were reversed, and <strong>the</strong> feedbackeffects of DA were necessary to learn NoGo to <strong>the</strong>prepotent learned response (figure 6). Once NoGo representationsare strong enough to suppress gat<strong>in</strong>g, randomcortical activity leads to sometimes choos<strong>in</strong>g <strong>the</strong> oppositeresponse and DA re<strong>in</strong>forcement of <strong>the</strong> correspond<strong>in</strong>gGo representation, enabl<strong>in</strong>g reversal.Fur<strong>the</strong>r implementational details of <strong>the</strong> PR task aredescribed <strong>in</strong> <strong>the</strong> Methods section.Simulated DA MedicationTo model DA overdose <strong>in</strong> <strong>the</strong> ventral striatum ofmedicated patients (Gotham et al., 1988; Swa<strong>in</strong>son et al.,2000; Cools et al., 2001, 2003), all SNc units rema<strong>in</strong>ed<strong>in</strong>tact. This reflects <strong>the</strong> fact that <strong>the</strong> ventral striatum,which is recruited <strong>in</strong> this task, is relatively spared fromdopam<strong>in</strong>ergic damage <strong>in</strong> moderate PD. The differencebetween <strong>in</strong>tact and “overdosed” networks was simply an<strong>in</strong>crease <strong>in</strong> overall level of DA. In <strong>the</strong> m<strong>in</strong>us phase, <strong>the</strong>tonic level of DA was <strong>in</strong>creased from an SNc unit activationvalue of 0.5 to 0.65, reflect<strong>in</strong>g <strong>the</strong> greater basel<strong>in</strong>e


level of DA. Negative feedback <strong>in</strong> <strong>the</strong> plus phase resulted<strong>in</strong> an activation value of 0.25, <strong>in</strong>stead of zero SNc activation.This is still a phasic dip relative to <strong>the</strong> tonic level,but is meant to simulate <strong>the</strong> possibility that DA releasehas less dynamic range <strong>in</strong> <strong>the</strong> overdose case (see abovefor elaboration and justification). Positive feedback resulted<strong>in</strong> SNc unit activation of 1.0.The DA overdose manipulations degraded networks’ability to learn NoGo representations dur<strong>in</strong>g negativefeedback, as NoGo units were suppressed by <strong>the</strong> <strong>in</strong>creasedlevels of DA. This selectively impairs reversallearn<strong>in</strong>g, <strong>in</strong> which “NoGo” must be learned to a prepotentresponse.Simulation ResultsProbabilistic ClassificationDespite not hav<strong>in</strong>g an explicit supervised tra<strong>in</strong><strong>in</strong>gsignal, simulated phasic DA release dur<strong>in</strong>g error feedbackallowed <strong>in</strong>tact networks to extract <strong>the</strong> probabilitystructure, scor<strong>in</strong>g 77% optimal respond<strong>in</strong>g after 200 trialsof tra<strong>in</strong><strong>in</strong>g (figure 7). “Park<strong>in</strong>son” networks were impaired,only scor<strong>in</strong>g 64% optimal respond<strong>in</strong>g. Statisticalanalysis <strong>in</strong>dicated that this difference was highly significant(F(1,24) = 20.8, p = 0.0001). Two o<strong>the</strong>r conditionswere run to evaluate <strong>the</strong> contribution of responsespecificNoGo representations <strong>in</strong> <strong>the</strong> <strong>in</strong>direct pathway.Networks with a disconnected <strong>in</strong>direct pathway were significantlyimpaired relative to <strong>in</strong>tact networks (65% optimalrespond<strong>in</strong>g, F(1,24) = 11.9, p = 0.002). Similarly,networks that had both direct and <strong>in</strong>direct pathways butonly had global NoGo representations (i.e., NoGo units<strong>in</strong> <strong>the</strong> striatum affected both responses non-selectively)were also impaired ( 64% optimal respond<strong>in</strong>g, F(1, 24)= 7.13, p = 0.013). In both <strong>the</strong>se cases, parameters weresearched to ensure that impairments were not simply dueto an overall threshold for respond<strong>in</strong>g, by vary<strong>in</strong>g <strong>the</strong>strength of <strong>in</strong>hibitory connections from <strong>the</strong> GPe to <strong>the</strong>GPi – results reported here are for <strong>the</strong> best cases over <strong>the</strong>range from zero to maximal <strong>in</strong>hibition (which for bothcases <strong>in</strong>volved an <strong>in</strong>hibitory strength of approximately70% of that <strong>in</strong> <strong>the</strong> full model).Probabilistic ReversalThe results for <strong>the</strong> PR task were clearcut: “overdosed”networks were selectively impaired at revers<strong>in</strong>g<strong>the</strong> probabilistic discrim<strong>in</strong>ation (figure 8). Both <strong>in</strong>tactand overdosed networks were able to acquire <strong>the</strong> <strong>in</strong>itial80:20% probabilistic discrim<strong>in</strong>ation, with no significantdifferences between performance <strong>in</strong> <strong>the</strong> acquisitionphase (97.8 and 98.2 % optimal respond<strong>in</strong>g after 200 trials,F(1, 24) = 0.4, n.s.). Intact networks consistentlyPercent ErrorIntactPDNo IndirGlobal Nogo50Probabilistic Classification45403530252015100 50 100 150 200TrialFigure 7: Wea<strong>the</strong>r prediction task learn<strong>in</strong>g curves, averagedover 25 networks for each condition. Intact: Full BG modelwith direct and <strong>in</strong>direct pathways modulated by phasic changes<strong>in</strong> simulated DA dur<strong>in</strong>g error feedback; PD: simulated Park<strong>in</strong>son’sdisease, modeled by lesion<strong>in</strong>g 75% of dopam<strong>in</strong>ergic units<strong>in</strong> SNc; No Indir: BG model with <strong>the</strong> <strong>in</strong>direct pathway disconnectedfrom <strong>the</strong> striatum to <strong>the</strong> GPe; Global NoGo: Full BGmodel <strong>in</strong> which NoGo representations globally suppress all responsesnon-selectively. PD networks are impaired at learn<strong>in</strong>g<strong>the</strong> probabilistic structure, due to impoverished phasic changes<strong>in</strong> DA <strong>in</strong> response to feedback. Models without <strong>the</strong> <strong>in</strong>directpathway or with global NoGo representations have reduced discrim<strong>in</strong>abilitybecause <strong>the</strong>y can only compare <strong>the</strong> strength ofGo representations to decide which response to facilitate. Incontrast, <strong>in</strong>tact models can use response-specific Go and NoGorepresentations that develop over tra<strong>in</strong><strong>in</strong>g <strong>in</strong> order to more selectivelyfacilitate and suppress responses.learned to reverse this discrim<strong>in</strong>ation after a fur<strong>the</strong>r 200trials of tra<strong>in</strong><strong>in</strong>g, with 78% optimal respond<strong>in</strong>g. In contrast,overdosed networks were slower to reverse <strong>the</strong> <strong>in</strong>itialdiscrim<strong>in</strong>ation, only atta<strong>in</strong><strong>in</strong>g 64% optimal respond<strong>in</strong>gafter <strong>the</strong> same amount of tra<strong>in</strong><strong>in</strong>g. These reversallearn<strong>in</strong>g differences were significant (F(1, 24) = 4.80, p= 0.038). DA depletion, as is <strong>the</strong> case for severe PD <strong>in</strong><strong>the</strong> ventral striatum, resulted <strong>in</strong> non-selective impairment<strong>in</strong> both stages (not shown).DiscussionThis work presents a <strong>the</strong>oretical basis for cognitiveprocedural learn<strong>in</strong>g functions of <strong>the</strong> basal ganglia. Aneural network model <strong>in</strong>corporat<strong>in</strong>g known biologicalconstra<strong>in</strong>ts provides a mechanistic account of cognitivedeficits observed <strong>in</strong> PD patients. A key aspect of <strong>the</strong>


Percent ErrorIntact’Overdosed’100Probabilistic Reversal90807060504030201000 10 20 30 40BlockFigure 8: Probabilistic reversal results for <strong>in</strong>tact networksand for those with simulated dopam<strong>in</strong>e “overdose”, averagedover 25 networks for each condition. Each block consists of10 trials; reversal of stimulus-outcome probabilities occurredat block 20. Overdosed networks were selectively impaired atlearn<strong>in</strong>g this reversal, despite perform<strong>in</strong>g as well as <strong>in</strong>tact networks<strong>in</strong> <strong>the</strong> acquisition phase. A smaller phasic dip <strong>in</strong> DAdur<strong>in</strong>g negative feedback resulted <strong>in</strong> dim<strong>in</strong>ished ability to learn“NoGo” to <strong>the</strong> prepotent response that was learned <strong>in</strong> <strong>the</strong> <strong>in</strong>itialacquisition.model is that phasic changes <strong>in</strong> dopam<strong>in</strong>e dur<strong>in</strong>g errorfeedback are critical for <strong>the</strong> implicit learn<strong>in</strong>g of stimulusreward-responsecont<strong>in</strong>gencies, as <strong>in</strong> probabilistic classificationand reversal.In brief, <strong>the</strong> model <strong>in</strong>cludes competitive dynamicsbetween striatal cells <strong>in</strong> <strong>the</strong> direct and <strong>in</strong>direct pathwaysof <strong>the</strong> BG that facilitate or suppress a given response.The cells that detect conditions to facilitate a responseprovide a “Go” signal, whereas those that suppress responsesprovide a “NoGo” signal. Habit learn<strong>in</strong>g is supportedby this circuitry because DA release dynamicallymodulates <strong>the</strong> excitability and synaptic plasticity of <strong>the</strong>sepathways so that <strong>the</strong> most re<strong>in</strong>forc<strong>in</strong>g responses are subsequentlyfacilitated, while those that are more ambiguousare suppressed.Simulated Park<strong>in</strong>sonism, by reduc<strong>in</strong>g <strong>the</strong> amount ofDA <strong>in</strong> <strong>the</strong> model and thus its modulatory effects on Goand NoGo representations, produced qualitatively similarresults to those observed <strong>in</strong> PD patients learn<strong>in</strong>g <strong>the</strong>wea<strong>the</strong>r prediction task (Knowlton et al., 1994; Knowltonet al., 1996). Less DA led to less contrast enhancementand lower ability to resolve Go/NoGo associationdifferences needed for discrim<strong>in</strong>at<strong>in</strong>g between subtly differentresponse re<strong>in</strong>forcement histories.Although it could be argued that <strong>the</strong> simulated <strong>in</strong>directpathway is superfluous and that discrim<strong>in</strong>ation learn<strong>in</strong>gcan happen <strong>in</strong> <strong>the</strong> direct pathway alone, networkswith disrupted <strong>in</strong>direct pathways were substantially impaired.These results held even when <strong>the</strong> threshold forfacilitat<strong>in</strong>g responses <strong>in</strong> <strong>the</strong> direct pathway was systematicallyvaried, ensur<strong>in</strong>g that <strong>the</strong> <strong>in</strong>direct pathway doesnot simply set an overall threshold that feasibly couldbe implemented <strong>in</strong> <strong>the</strong> direct pathway alone. Ra<strong>the</strong>r, <strong>the</strong><strong>in</strong>direct pathway makes a genu<strong>in</strong>e contribution by develop<strong>in</strong>gresponse-specific NoGo representations that competewith Go representations to enhance discrim<strong>in</strong>ability.Without response-specific NoGo representations, <strong>the</strong> BGis likely to signal “Go” to whichever response happens tobe slightly more active <strong>in</strong> premotor cortex.Medication-Dependent DeficitsThe model also offers some <strong>in</strong>sight as to why patientson medication are impaired at probabilistic reversal.Simulated dopam<strong>in</strong>e overdose produced qualitativelysimilar results to those observed <strong>in</strong> medicatedpatients <strong>in</strong> probabilistic reversal (Gotham et al., 1988;Swa<strong>in</strong>son et al., 2000; Cools et al., 2001). That is, “overdosed”networks were selectively impaired <strong>in</strong> <strong>the</strong> reversalstage, but performed as well or better than controlnetworks <strong>in</strong> <strong>the</strong> <strong>in</strong>itial discrim<strong>in</strong>ation.The model provides a mechanistic description of howDA medication may lead to reversal impairments that isgenerally consistent with <strong>the</strong> “overdose” hypo<strong>the</strong>sis advocatedby authors of <strong>the</strong> behavioral studies. In <strong>in</strong>tactnetworks, negative feedback <strong>in</strong> <strong>the</strong> reversal stage wasassociated with phasic dips <strong>in</strong> DA, which led to activationof striatal NoGo cells by transiently releas<strong>in</strong>g <strong>the</strong>mfrom <strong>the</strong> <strong>in</strong>hibitory <strong>in</strong>fluence of DA. The activation of<strong>the</strong>se cells led to suppress<strong>in</strong>g <strong>the</strong> execution of prepotentresponses, allow<strong>in</strong>g networks to learn to reverse respond<strong>in</strong>g.In “overdosed” networks, a residual level of DA dur<strong>in</strong>gnegative feedback cont<strong>in</strong>ued to suppress NoGo cells(via simulated D2 receptors), lead<strong>in</strong>g to response perseveration.This account predicts that tonic stimulation ofjust <strong>the</strong> D2 receptor should produce similar reversal impairments,which are <strong>in</strong>deed observed <strong>in</strong> both healthy humansand non-human primates adm<strong>in</strong>istered D2 agonists(Mehta, Swa<strong>in</strong>son, Ogilvie, Sahakian, & Robb<strong>in</strong>s, 2000;Smith, Neill, & Costall, 1999).The above account is also consistent with eventrelatedfMRI studies <strong>in</strong> humans show<strong>in</strong>g caudate activationdur<strong>in</strong>g <strong>the</strong> reception of negative feedback (Monchi,Petrides, Petre, Worsley, & Dagher, 2001), and ventralstriatum activation dur<strong>in</strong>g <strong>the</strong> f<strong>in</strong>al reversal error <strong>in</strong> aprobabilistic reversal task (Cools et al., 2002). Phasicdips <strong>in</strong> DA dur<strong>in</strong>g negative feedback should cause an <strong>in</strong>crease<strong>in</strong> fMRI signal due to <strong>the</strong> transient activation ofNoGo cells. In <strong>the</strong> probabilistic reversal task, striatal ac-


tivation was found specifically dur<strong>in</strong>g <strong>the</strong> trial that participantsused negative feedback to successfully reverse<strong>the</strong>ir behavior on subsequent trials.Alternative mechanisms are possible to expla<strong>in</strong> reversallearn<strong>in</strong>g deficits <strong>in</strong> patients tak<strong>in</strong>g dopam<strong>in</strong>ergic medication.For <strong>in</strong>stance, medication may simply prevent unlearn<strong>in</strong>g<strong>in</strong> direct pathway Go cells, ra<strong>the</strong>r than suppress<strong>in</strong>g<strong>the</strong> learn<strong>in</strong>g of NoGo cells <strong>in</strong> <strong>the</strong> current model. Insupport of this <strong>the</strong>ory, rats with L-Dopa-<strong>in</strong>duced dysk<strong>in</strong>esiahad a selective impairment <strong>in</strong> <strong>the</strong> depotentiation(i.e., reversal of LTP) of corticostriatal synapses (Picconiet al., 2003), ostensibly due to changes <strong>in</strong> <strong>the</strong> D1 receptorpathway. However, it is not clear whe<strong>the</strong>r this depotentiationimpairment alone can account for reversal deficits:normal depotentiation takes <strong>in</strong> <strong>the</strong> order of ten m<strong>in</strong>utesand <strong>the</strong>refore is not sufficient to <strong>in</strong>duce reversal <strong>in</strong> a matterof a few trials. Fur<strong>the</strong>rmore, <strong>the</strong> fact that D2 agonistsimpair reversal implicates a role of <strong>the</strong> <strong>in</strong>direct pathwayto activate “NoGo” representations and actively avoid situations.Through its push-pull circuitry, <strong>the</strong> model suggeststhat <strong>the</strong> BG is specialized to quickly learn changes<strong>in</strong> reward<strong>in</strong>g <strong>in</strong>formation.Relation to O<strong>the</strong>r Models of DA <strong>in</strong> BGO<strong>the</strong>r computational models of <strong>the</strong> BG have focusedmore on how response selection and reward <strong>in</strong>formationmay be implemented <strong>in</strong> biological circuitry (e.g., Brownet al., 2004; Gurney et al., 2001; Beiser & Houk, 1998),but to my knowledge have not attempted to model cognitiveimplicit learn<strong>in</strong>g tasks. Thus it is unclear howprior BG models would account for medicated and nonmedicatedcognitive impairments <strong>in</strong> PD. Never<strong>the</strong>less, acomparative analysis of <strong>the</strong> critical features of <strong>the</strong> currentmodel with that of o<strong>the</strong>rs may explicitly demonstrateboth consistencies across models as well as novel aspectsof <strong>the</strong> current model that account for behavioral phenomena.The model builds on earlier work on <strong>the</strong> <strong>in</strong>teractionsbetween <strong>the</strong> BG and prefrontal cortex <strong>in</strong> work<strong>in</strong>g memory(Frank et al., 2001), but differs from it <strong>in</strong> three keyaspects. First, <strong>the</strong> earlier model only <strong>in</strong>cluded <strong>the</strong> direct“Go” pathway, as its “NoGo” responses to taskirrelevantstimuli were assumed and hand-wired. Thecurrent model <strong>in</strong>cludes <strong>the</strong> compet<strong>in</strong>g processes of <strong>the</strong><strong>in</strong>direct pathway, and whe<strong>the</strong>r to gate (Go) or suppress(NoGo) a response is learned. Second, <strong>the</strong> current model<strong>in</strong>cludes <strong>the</strong> SNc/VTA so that <strong>the</strong> role of dopam<strong>in</strong>e canbe implemented, with simulated D1 and D2 receptors <strong>in</strong><strong>the</strong> striatum. Third, <strong>the</strong> current model does not <strong>in</strong>clude<strong>the</strong> prefrontal cortex or ma<strong>in</strong>tenance of <strong>in</strong>formation overtime, as <strong>the</strong> simulated tasks do not <strong>in</strong>volve work<strong>in</strong>g memory.Instead, <strong>the</strong> cortical layer <strong>in</strong> <strong>the</strong> model is a simplerpremotor cortex, represent<strong>in</strong>g just two different possibleresponses (although <strong>in</strong> pr<strong>in</strong>ciple it could be extended to<strong>in</strong>clude several responses).The model is consistent with o<strong>the</strong>r models of DA<strong>in</strong> <strong>the</strong> BG (Taylor & Taylor, 1999; Monchi, Taylor, &Dagher, 2000; Brown et al., 2004), <strong>in</strong> that dopam<strong>in</strong>e hasa performance effect, by differentially modulat<strong>in</strong>g excitability<strong>in</strong> <strong>the</strong> direct and <strong>in</strong>direct pathways. A notabledifference is that <strong>in</strong> <strong>the</strong> current model DA also enhancescontrast <strong>in</strong> <strong>the</strong> direct pathway by excit<strong>in</strong>g highly activeunits and suppress<strong>in</strong>g weakly active units, <strong>in</strong>stead of be<strong>in</strong>gglobally excitatory. This modulatory effect is importantfor select<strong>in</strong>g among compet<strong>in</strong>g responses, and is motivatedby <strong>the</strong> observed D1 receptor activation excitationof striatal cells <strong>in</strong> <strong>the</strong> “up-state”, but <strong>in</strong>hibition of those<strong>in</strong> <strong>the</strong> “down-state”(Hernandez-Lopez et al., 1997).Perhaps a more substantial difference is that whileprior models emphasize <strong>the</strong> tonic effects of DA, <strong>the</strong> currentmodel also <strong>in</strong>corporates phasic changes <strong>in</strong> DA releasedur<strong>in</strong>g positive and negative feedback. Positivefeedback results <strong>in</strong> a phasic burst of DA, transiently bias<strong>in</strong>g<strong>the</strong> direct pathway and suppress<strong>in</strong>g <strong>the</strong> <strong>in</strong>directpathway. Negative feedback results <strong>in</strong> a phasic dip <strong>in</strong>DA, and has <strong>the</strong> opposite effect. Learn<strong>in</strong>g is drivenby <strong>the</strong>se transient changes: weight values are modifiedbased on <strong>the</strong> difference between phases of response selection(hypo<strong>the</strong>sized to <strong>in</strong>volve moderate amounts ofDA) and error feedback (hypo<strong>the</strong>sized to <strong>in</strong>volve phasic<strong>in</strong>creases/decreases <strong>in</strong> DA). Re<strong>in</strong>forcement learn<strong>in</strong>g accountsof DA <strong>in</strong> <strong>the</strong> BG have been suggested by o<strong>the</strong>rs(e.g., Doya, 2000; Suri & Schultz, 1999), and allow flexiblelearn<strong>in</strong>g of reward<strong>in</strong>g and non-reward<strong>in</strong>g behaviorsthat may change over time.Ano<strong>the</strong>r dist<strong>in</strong>guish<strong>in</strong>g feature <strong>in</strong> <strong>the</strong> current modelis that a particular response is selected by premotor cortex— <strong>the</strong> BG simply gates this response if it detects<strong>the</strong> conditions to be appropriate (i.e. predictive of reward).Thus, <strong>the</strong> BG is not a stimulus-response module,but <strong>in</strong>stead modulates <strong>the</strong> efficacy of responses be<strong>in</strong>gselected <strong>in</strong> cortex. This is consistent with observationsthat striatal fir<strong>in</strong>g occurs after that <strong>in</strong> premotor cortexand supplementary motor area (e.g., Crutcher & Alexander,1990; Alexander & Crutcher, 1990a, see also M<strong>in</strong>k,1996). 4 Thus, <strong>in</strong> contrast to <strong>the</strong> long held assumptionthat <strong>the</strong> BG <strong>in</strong>itiates motor responses, this model suggeststhat it facilitates or gates responses that are be<strong>in</strong>gconsidered <strong>in</strong> premotor cortex. The model fur<strong>the</strong>r suggeststhat cortical learn<strong>in</strong>g of response selection is mediatedby way of DA reward system <strong>in</strong> <strong>the</strong> BG, but thatonce this learn<strong>in</strong>g is achieved, <strong>the</strong> cortex itself selects4 Some, but not all, striatal neurons even fire after <strong>the</strong> onset of movement.Note that this observation does not conflict with <strong>the</strong> hypo<strong>the</strong>sizedrole of <strong>the</strong> BG to gate or facilitate responses, because fir<strong>in</strong>g that occursafter <strong>the</strong> onset of movement could be associated with ei<strong>the</strong>r term<strong>in</strong>at<strong>in</strong>g<strong>the</strong> <strong>in</strong>itiated motor program or suppress<strong>in</strong>g o<strong>the</strong>r compet<strong>in</strong>g programs.


<strong>the</strong> response. This is consistent with <strong>the</strong> observation thatPark<strong>in</strong>son’s patients have specific trouble learn<strong>in</strong>g novelmotor actions, and with <strong>the</strong> hypo<strong>the</strong>sis that <strong>the</strong> BG isonly necessary for <strong>the</strong> learn<strong>in</strong>g of new categories but notfor categorization behavior <strong>in</strong> experts, which may be mediateddirectly from perceptual to motor areas (Ashbyet al., 1998).Implications for Frontal DeficitsThat PD patients have both implicit learn<strong>in</strong>g andfrontal deficits — which are not <strong>in</strong>tuitively related —suggests that a better understand<strong>in</strong>g of BG specializationwould <strong>in</strong>form us about how cognition operates asa functional system. The present work only modeled implicitprocesses <strong>in</strong> habit learn<strong>in</strong>g, which do not have aprefrontal component (Knowlton et al., 1996). However,<strong>the</strong> same general structure of <strong>the</strong> model may be extendedto <strong>in</strong>clude prefrontal cortex, provid<strong>in</strong>g <strong>in</strong>sight <strong>in</strong>to <strong>the</strong>roles of DA and <strong>the</strong> BG <strong>in</strong> executive and attentional processes.Indeed, <strong>the</strong>se roles may be very similar to those<strong>in</strong> implicit learn<strong>in</strong>g, with <strong>the</strong> major difference be<strong>in</strong>g <strong>the</strong>type of representations modulated <strong>in</strong> <strong>the</strong> targeted corticalarea — motor representations <strong>in</strong> premotor cortex, andwork<strong>in</strong>g memory <strong>in</strong> prefrontal cortex.Based on <strong>the</strong> general suggestions of basal ganglia<strong>in</strong>volvement <strong>in</strong> prefrontal circuits made by Alexanderand colleagues (Alexander et al., 1986; Alexander,Crutcher, & DeLong, 1990; Middleton & Strick, 2000),we developed a computational model that explicitly formulated<strong>the</strong> role of <strong>the</strong> BG <strong>in</strong> work<strong>in</strong>g memory (Franket al., 2001). We suggested that just as <strong>the</strong> BG facilitatesmotor command execution <strong>in</strong> premotor cortex by dis<strong>in</strong>hibit<strong>in</strong>gor “releas<strong>in</strong>g <strong>the</strong> brakes”, it may also facilitate<strong>the</strong> updat<strong>in</strong>g of work<strong>in</strong>g memory <strong>in</strong> prefrontal cortex.If a given stimulus was learned to be task-relevant and<strong>the</strong>refore suitable for ma<strong>in</strong>tenance <strong>in</strong> PFC, a “Go” signalwould be executed by activation of <strong>the</strong> BG direct pathway,<strong>the</strong>reby dis<strong>in</strong>hibit<strong>in</strong>g <strong>the</strong> thalamus and “gat<strong>in</strong>g” <strong>the</strong>updat<strong>in</strong>g of PFC.In <strong>the</strong> above work (Frank et al., 2001), we briefly discussed<strong>the</strong> potential role of dopam<strong>in</strong>e, suggest<strong>in</strong>g that itwould be important for <strong>the</strong> learn<strong>in</strong>g of task-relevant stimulivia its reward signal<strong>in</strong>g and modulation of synapticplasticity. In ongo<strong>in</strong>g work (O’Reilly & Frank, submitted),we are develop<strong>in</strong>g <strong>the</strong>se ideas <strong>in</strong> a computationalmodel that <strong>in</strong>tegrates ventral and dorsal striatum withprefrontal cortex ma<strong>in</strong>tenance to demonstrate how complexwork<strong>in</strong>g memory tasks may be learned. Consistentwith <strong>the</strong> present model, DA bursts <strong>in</strong> <strong>the</strong> BG preferentiallyactivate cells <strong>in</strong> <strong>the</strong> direct pathway via D1 receptors,while suppress<strong>in</strong>g cells <strong>in</strong> <strong>the</strong> <strong>in</strong>direct pathway viaD2 receptors. Thus, DA <strong>in</strong> <strong>the</strong> BG may have <strong>the</strong> effectof boost<strong>in</strong>g <strong>the</strong> updat<strong>in</strong>g of work<strong>in</strong>g memory by bias<strong>in</strong>g<strong>the</strong> direct pathway to w<strong>in</strong> <strong>the</strong> competition for BG output.A phasic dip <strong>in</strong> DA allows <strong>the</strong> BG to learn not to updatetask-irrelevant <strong>in</strong>formation. The role of DA <strong>in</strong> <strong>the</strong> PFCmay be quite different, help<strong>in</strong>g to robustly ma<strong>in</strong>ta<strong>in</strong> <strong>in</strong>formationover time and <strong>in</strong> <strong>the</strong> face of <strong>in</strong>terfer<strong>in</strong>g stimuli(Durstewitz, Seamans, & Sejnowski, 2000), depend<strong>in</strong>gon optimal levels of DA (Goldman-Rakic, 1996).With <strong>the</strong> above model <strong>in</strong> m<strong>in</strong>d, consider <strong>the</strong> effect ofdopam<strong>in</strong>ergic dysfunction <strong>in</strong> <strong>the</strong> BG or PFC. A lack ofDA <strong>in</strong> <strong>the</strong> BG would lead to too little updat<strong>in</strong>g of relevant<strong>in</strong>formation <strong>in</strong>to PFC, just as it leads to too littleexecution of motor commands. Conversely, too muchDA <strong>in</strong> <strong>the</strong> BG would lead to excessive updat<strong>in</strong>g of PFC,as observed <strong>in</strong> L-Dopa <strong>in</strong>duced motor tics and dysk<strong>in</strong>esia<strong>in</strong> Park<strong>in</strong>son’s disease. F<strong>in</strong>ally, a suboptimal level ofDA <strong>in</strong> <strong>the</strong> PFC would lead to <strong>in</strong>sufficient ma<strong>in</strong>tenance oftask-relevant <strong>in</strong>formation. Any of <strong>the</strong>se DA dysfunctionswould lead to “frontal-like” cognitive deficits.While it is well accepted that <strong>the</strong> <strong>in</strong>tegrity of <strong>the</strong>PFC is necessary for attentional processes, it is not clearwhe<strong>the</strong>r attentional deficits seen <strong>in</strong> PD patients are dueto dopam<strong>in</strong>ergic pathology with<strong>in</strong> <strong>the</strong> PFC itself, orwhe<strong>the</strong>r DA damage <strong>in</strong> <strong>the</strong> BG is sufficient to producefrontal-like deficits due to its <strong>in</strong>terconnections with PFC.In support of <strong>the</strong> latter possibility, a positive correlationwas found between measures of attention and work<strong>in</strong>gmemory and <strong>the</strong> level of L-Dopa accumulation <strong>in</strong><strong>the</strong> striatum of PD patients (Remy, Jackson, & Ribeiro,2000). In monkeys, D2 agents have effects on work<strong>in</strong>gmemory tasks when applied systemically, but not whendirectly <strong>in</strong>fused <strong>in</strong>to PFC (Granon, Passetti, Thomas,Dalley, Everitt, & Robb<strong>in</strong>s, 2000; Arnsten, Cai, Murphy,& Goldman-Rakic, 1994; Arnsten, Cai, Steere, &Goldman-Rakic, 1995), suggest<strong>in</strong>g that D2 receptors areonly <strong>in</strong>directly <strong>in</strong>volved <strong>in</strong> frontal processes.The current framework holds that D2 effects onwork<strong>in</strong>g memory are due to modulation of <strong>the</strong> BGthreshold for updat<strong>in</strong>g PFC. With high D2 stimulation<strong>the</strong>re is less “NoGo” so <strong>the</strong> threshold is lowered, andwith low D2 stimulation <strong>the</strong> threshold is raised. Notethat a raised threshold means that task-irrelevant stimuliare less likely to get updated. This is consistentwith observations that DA depletion to <strong>the</strong> BG (whichshould raise <strong>the</strong> threshold for updat<strong>in</strong>g PFC) actuallymakes monkeys less distractible to task-irrelevant stimulidur<strong>in</strong>g acquisition of an attentional set (Crofts, Dalley,Coll<strong>in</strong>s, Van Denderen, Everitt, Robb<strong>in</strong>s, & Roberts,2001). However, this higher threshold may also make<strong>the</strong>m more rigid <strong>in</strong> what to pay attention to, so that <strong>the</strong>yare impaired <strong>in</strong> task-set switch<strong>in</strong>g.


Model PredictionsThe ma<strong>in</strong> assumption built <strong>in</strong>to <strong>the</strong> model (supportedby data reviewed above) is that positive and negativefeedback lead to transient bursts and dips <strong>in</strong> DA.The model shows that <strong>the</strong>se phasic changes can lead tosystems-level effects that modulate <strong>the</strong> BG threshold forfacilitat<strong>in</strong>g/suppress<strong>in</strong>g cortical commands. Bursts ofDA suppress <strong>the</strong> NoGo pathway and sharpen representations<strong>in</strong> <strong>the</strong> Go pathway. Phasic dips of DA have <strong>the</strong>opposite effect, releas<strong>in</strong>g <strong>the</strong> <strong>in</strong>direct pathway from suppressionand allow<strong>in</strong>g <strong>the</strong> model to learn “NoGo” to <strong>the</strong><strong>in</strong>correct response. A number of testable predictions canbe derived from this model at both neural and behaviorallevels.At <strong>the</strong> neural level, <strong>the</strong> model predicts that phasicchanges <strong>in</strong> DA support “Hebbian” learn<strong>in</strong>g by modulat<strong>in</strong>gneuronal excitability <strong>in</strong> <strong>the</strong> <strong>in</strong>direct pathway via D2receptors. By transiently suppress<strong>in</strong>g NoGo cells, DAbursts should lead to LTD. Conversely, by transientlyexcit<strong>in</strong>g NoGo cells, DA dips should lead to LTP. Thisprediction has not yet been tested directly (dur<strong>in</strong>g endogenousDA bursts/dips), but is consistent with observationsthat selective stimulation of D2 receptors leads toLTD, whereas D2 blockade leads to LTP (Calabresi et al.,1997).The model suggests that a large dynamic range <strong>in</strong>DA release is necessary for learn<strong>in</strong>g subtle differencesbetween positive and negative re<strong>in</strong>forcement values ofresponses. <strong>Dopam<strong>in</strong>e</strong> agonists and antagonists may restrictthis range to be at <strong>the</strong> high and low ends of <strong>the</strong> DAspectrum, respectively. In a probabilistic re<strong>in</strong>forcementparadigm, participants adm<strong>in</strong>istered D2 agonists shouldeasily learn to respond to stimuli hav<strong>in</strong>g greater than50% re<strong>in</strong>forcement probabilities, whereas those tak<strong>in</strong>gD2 antagonists (and PD patients) should have an easiertime learn<strong>in</strong>g to avoid stimuli with lower re<strong>in</strong>forcementprobabilities. This is because D2 agonists bias <strong>the</strong> directpathway to be more active (by suppress<strong>in</strong>g <strong>the</strong> <strong>in</strong>directpathway), enhanc<strong>in</strong>g <strong>the</strong> learn<strong>in</strong>g and execution of Goresponses. Park<strong>in</strong>son’s disease or D2 antagonists shouldbias <strong>the</strong> <strong>in</strong>direct pathway, enabl<strong>in</strong>g <strong>the</strong> learn<strong>in</strong>g of NoGoresponses.In addition to modulat<strong>in</strong>g <strong>the</strong> threshold for learn<strong>in</strong>gand execut<strong>in</strong>g responses, DA should play a similar role <strong>in</strong>modulat<strong>in</strong>g <strong>the</strong> threshold for updat<strong>in</strong>g work<strong>in</strong>g memory,discussed <strong>in</strong> <strong>the</strong> previous section. D2 agonists shouldlower this threshold, <strong>in</strong>creas<strong>in</strong>g <strong>the</strong> amount of updat<strong>in</strong>g,whereas D2 antagonists should reduce <strong>the</strong> amount of updat<strong>in</strong>g.Whe<strong>the</strong>r <strong>the</strong>se drugs improve or worsen work<strong>in</strong>gmemory performance should depend on both <strong>the</strong> basel<strong>in</strong>elevel of updat<strong>in</strong>g <strong>in</strong> <strong>the</strong> <strong>in</strong>dividual (see Kimberg,D’Esposito, & Farah, 1997), and <strong>the</strong> amount of conflict/<strong>in</strong>terference<strong>in</strong> <strong>the</strong> particular task. Specifically, ifa work<strong>in</strong>g memory task <strong>in</strong>volves distract<strong>in</strong>g <strong>in</strong>formation,a lower threshold for updat<strong>in</strong>g may result <strong>in</strong> <strong>in</strong>creaseddistractibility and impulsiveness because <strong>the</strong> participantmay have trouble ignor<strong>in</strong>g task-irrelevant stimuli. Conversely,if <strong>the</strong> task simply <strong>in</strong>volves recall<strong>in</strong>g a previouslystored memory <strong>in</strong> <strong>the</strong> absence of distract<strong>in</strong>g <strong>in</strong>formation,D2 agonists should improve performance because <strong>the</strong>yshould cause more updat<strong>in</strong>g and subsequent ma<strong>in</strong>tenanceof work<strong>in</strong>g memory.Model Limitations and Future DirectionsThe model does not differentiate between differentparts of <strong>the</strong> striatum. In fact, <strong>the</strong> same model is usedto simulate probabilistic classification and reversal tasks,which are thought to depend on <strong>the</strong> dorsal and ventralstriatum, respectively. It is at present unclear why <strong>the</strong>setwo tasks, which both <strong>in</strong>volve learn<strong>in</strong>g response selectionvia trial-and-error feedback, should <strong>in</strong>volve separatestriato-cortical circuits. However, one possibility isthat <strong>the</strong> differences lie <strong>in</strong> <strong>the</strong> content of cortical targets:dorsal striatum modulates motor <strong>in</strong>formation <strong>in</strong> premotorcortex, whereas ventral striatum targets reward <strong>in</strong>formation<strong>in</strong> orbitofrontal cortex (OFC) (Alexander et al.,1986; Gottfried, ODoherty, & Dolan, 2003). In reversallearn<strong>in</strong>g, a stimulus that has a prepotent reward valuesuddenly becomes non-reward<strong>in</strong>g, and OFC representationsmay be especially important to support top-downactivation of “NoGo” representations <strong>in</strong> <strong>the</strong> ventral striatum.In probabilistic classification, response selectionprocesses for discrim<strong>in</strong>at<strong>in</strong>g among multiple cues maymore heavily tax <strong>the</strong> dorsal striatum. The functional contributionsof <strong>the</strong>se two circuits work<strong>in</strong>g <strong>in</strong> tandem will bemore explicitly explored <strong>in</strong> future work.The present model highlights <strong>the</strong> importance of dynamicDA modulation <strong>in</strong> <strong>the</strong> BG. However, it does notaddress <strong>the</strong> bra<strong>in</strong> mechanisms which cause phasic burstsand dips <strong>in</strong> DA dur<strong>in</strong>g positive and negative feedback.Instead, this was assumed, and phasic changes <strong>in</strong> DAwere simply set, depend<strong>in</strong>g on probabilistic feedback.This implementation does not capture <strong>the</strong> fact that aslearn<strong>in</strong>g progresses and rewards become expected, phasicbursts of DA no longer occur dur<strong>in</strong>g reward but are<strong>in</strong>stead transferred to an earlier stimulus that predictsreward (Ljungberg, Apicella, & Schultz, 1992), as implemented<strong>in</strong> temporal differences (TD) re<strong>in</strong>forcementlearn<strong>in</strong>g algorithms (Sutton, 1988). The simple implementationdescribed <strong>in</strong> this paper is sufficient for two reasons:a) <strong>the</strong> tasks are probabilistic, so that positive feedbackis never fully predicted (and may <strong>the</strong>refore alwaysresult <strong>in</strong> a DA burst), and b) even if phasic changes <strong>in</strong> DAare reduced as outcomes are more expected, this wouldsimply lead to stable performance once <strong>the</strong> probabilisticstructure is learned. A TD-like mechanism is critical


<strong>in</strong> more complex work<strong>in</strong>g memory tasks, <strong>in</strong> which <strong>the</strong>model has to update and ma<strong>in</strong>ta<strong>in</strong> <strong>in</strong>formation <strong>in</strong> one trialto obta<strong>in</strong> positive re<strong>in</strong>forcement a few time steps later.The above discussion makes it clear that <strong>the</strong> role ofDA <strong>in</strong> <strong>the</strong> BG <strong>in</strong> modulat<strong>in</strong>g prefrontal function is morecomplex and needs to be fur<strong>the</strong>r <strong>in</strong>vestigated. Whileit was briefly considered how DA might modulate <strong>the</strong>threshold for updat<strong>in</strong>g <strong>in</strong>formation <strong>in</strong> PFC, it has not yetbeen tested whe<strong>the</strong>r this would allow for selective updat<strong>in</strong>gof task-relevant <strong>in</strong>formation, but not that of irrelevantdistract<strong>in</strong>g <strong>in</strong>formation. By merg<strong>in</strong>g aspects of our <strong>in</strong>itialBG-PFC model of work<strong>in</strong>g memory (Frank et al., 2001)with that of <strong>the</strong> current model, we are currently explor<strong>in</strong>g<strong>the</strong>se issues (O’Reilly & Frank, submitted).ConclusionWhen systems level <strong>in</strong>teractions of multiple bra<strong>in</strong> regionsare <strong>in</strong>volved, computational <strong>in</strong>vestigations providea valuable complement to experimental bra<strong>in</strong> research.The current model of DA <strong>in</strong> <strong>the</strong> BG provides a work<strong>in</strong>ghypo<strong>the</strong>sis that can be tested experimentally and behaviorally.MethodsImplementational DetailsThe model is implemented us<strong>in</strong>g a subset of<strong>the</strong> Leabra framework (O’Reilly & Munakata, 2000;O’Reilly, 1998). The two relevant properties of thisframework for <strong>the</strong> present model are a) <strong>the</strong> use of a po<strong>in</strong>tneuron activation function; and b) <strong>the</strong> k-W<strong>in</strong>ners-Take-All (kWTA) <strong>in</strong>hibition function that models <strong>the</strong> effectsof <strong>in</strong>hibitory neurons. These two properties are described<strong>in</strong> detail <strong>in</strong> <strong>the</strong> above references, and also <strong>in</strong> Frank et al.(2001). Only specific methods related to <strong>the</strong> presentmodel are described below.Probabilistic Classification SimulationFor <strong>the</strong> wea<strong>the</strong>r prediction task, <strong>the</strong> same probabilisticstructure was used as <strong>in</strong> <strong>the</strong> orig<strong>in</strong>al study (Knowltonet al., 1994), <strong>in</strong> terms of both frequency of presentationof <strong>in</strong>dividual cue comb<strong>in</strong>ations, and <strong>the</strong>ir probabilityof be<strong>in</strong>g associated with an outcome of “ra<strong>in</strong>” or “sun”.Thus, patterns were presented to <strong>the</strong> network consist<strong>in</strong>gof one to three cues <strong>in</strong> blocks of 100 trials. Each cuewas represented by a s<strong>in</strong>gle column of units <strong>in</strong> <strong>the</strong> <strong>in</strong>putlayer. Thus, a trial that <strong>in</strong>cludes cues 1 and 3 toge<strong>the</strong>rwas simulated by activat<strong>in</strong>g <strong>the</strong> first and third column ofunits <strong>in</strong> <strong>the</strong> <strong>in</strong>put (figure 4). This cue comb<strong>in</strong>ation waspresented <strong>in</strong> six out of 100 trials (for frequency of 6 %),of which five of <strong>the</strong>m would <strong>in</strong>volve positive feedbackfor a ra<strong>in</strong> response, and negative feedback for a sun response(for a probability of 83.3 % ra<strong>in</strong>). The two potentialresponses <strong>in</strong> premotor cortex were left and right, correspond<strong>in</strong>gto buttons pushed by participants to respond“sun” or “ra<strong>in</strong>”.Probabilistic Reversal SimulationJust as <strong>in</strong> <strong>the</strong> WP task, each of <strong>the</strong> two stimuli <strong>in</strong><strong>the</strong> PR task were represented by a column of units <strong>in</strong><strong>the</strong> <strong>in</strong>put layer. Unlike <strong>the</strong> WP task, <strong>the</strong> potential responses<strong>in</strong>volve directly select<strong>in</strong>g one of <strong>the</strong> two stimuli<strong>in</strong> <strong>the</strong> <strong>in</strong>put (i.e. two alternative forced choice). Theactual motor response (i.e., left/right) is not as relevant<strong>in</strong> this case, because <strong>the</strong> correct stimulus appears just asoften on <strong>the</strong> left and right sides of <strong>the</strong> screen. Instead,responses are likely selected relative to a particular stimulusthat is be<strong>in</strong>g considered; <strong>the</strong> participant can ei<strong>the</strong>rselect it, or switch to <strong>the</strong> o<strong>the</strong>r stimulus. To address this<strong>in</strong> <strong>the</strong> model, a stimulus selection process was implemented.In any given trial, attention is randomly directedto one stimulus with only contextual <strong>in</strong>formation about<strong>the</strong> o<strong>the</strong>r. Potential responses were simply to “approach”<strong>the</strong> attended stimulus, or to “switch” to <strong>the</strong> context stimulus.This was modeled by mak<strong>in</strong>g one of <strong>the</strong> stimulimore salient: <strong>the</strong> attended stimulus had all five units <strong>in</strong>its column fully activated, whereas <strong>the</strong> context stimulushad only 3 (randomly selected) units weakly activated,with a mean activation of 0.25 and a variance of 0.35. Asimilar method was implemented to model a two alternativeforced choice task <strong>in</strong> previous work (Frank, Rudy, &O’Reilly, 2003).Parameters for D1 Contrast EnhancementA simplified version of <strong>the</strong> Leabra activation functionis presented here, to provide context for <strong>the</strong> parametersassociated with contrast enhancement.Activation communicated to o<strong>the</strong>r cells (y j ) is athresholded (Θ) sigmoidal function of <strong>the</strong> membrane potentialwith ga<strong>in</strong> parameter γ:1y j (t) = () (1)11 +γ[V m(t)−Θ] +where [x] + is a threshold function that returns 0 if x < 0and x if X > 0. In actual implementation, a less discont<strong>in</strong>uousdeterm<strong>in</strong>istic function with a softer threshold isused (see O’Reilly, 1998; O’Reilly & Munakata, 2000),but <strong>the</strong> differences do not effect <strong>the</strong> contrast enhancementmanipulations.The default activation ga<strong>in</strong>, γ, is 600. The defaultmembrane potential fir<strong>in</strong>g threshold, Θ, is 0.25. These


parameters were used for tonic levels of DA. For contrastenhancement dur<strong>in</strong>g phasic DA spikes, <strong>the</strong> activationga<strong>in</strong> was <strong>in</strong>creased to 10000*k, and <strong>the</strong> thresholdwas <strong>in</strong>creased to 0.25 + 0.04*k, where k is <strong>the</strong> percentageof <strong>in</strong>tact SNc units (k = 1 for control networks;k = 0.25 for PD networks). This has <strong>the</strong> effect of suppress<strong>in</strong>gunits that do not meet <strong>the</strong> higher threshold, butenhanc<strong>in</strong>g activity <strong>in</strong> units that are above this threshold.Dur<strong>in</strong>g phasic dips of DA, <strong>the</strong> activation ga<strong>in</strong> was reducedto 600 - k*300, and <strong>the</strong> threshold was 0.25.AcknowledgmentsI am very grateful to Randy O’Reilly, Yuko Munakataand Amy Santamaria for valuable commentsand constructive criticism on earlier versions of thismanuscript. Supported by ONR grants N00014-00-1-0246 and N00014-03-1-0428, and NIH grants MH64445and MH069597.ReferencesAgid, Y., Ruberg, M., Hirsch, E., Raisman-Vozari, R.,Vyas, S., Faucheux, B., Michel, P., Kastner, A., Blanchard,V., Damier, P., Villares, J., & Zhang, P. (1993).Are dopam<strong>in</strong>ergic neurons selectively vulnerable toPark<strong>in</strong>son’s disease? Advances <strong>in</strong> Neurology, 60,148–164.Aizman, O., Brismar, H., Uhlen, P., Zettergren, E., Levet,A., Forssberg, H., Greengrad, P., & Aperia, A. (2000).Anatomical and physiological evidence for D 1 and D 2dopam<strong>in</strong>e receptor colocalization <strong>in</strong> neostriatal neurons.Nature Neuroscience, 3, 226–230.Alb<strong>in</strong>, R., Young, A., & Penney, J. (1989). The functionalanatomy of basal ganglia disorders. Trends <strong>in</strong>Neurosciences, 12, 366–375.Alexander, G., & Crutcher, M. (1990a). Preparation formovement: Neural representations of <strong>in</strong>tended direction<strong>in</strong> three motor areas of <strong>the</strong> monkey. Journal ofNeurophysiology, 64, 133–50.Alexander, G., Crutcher, M., & DeLong, M. (1990).<strong>Basal</strong> ganglia-thalamocortical circuits: Parallel substratesfor motor, oculomotor, ”prefrontal” and ”limbic”functions. In H. Uyl<strong>in</strong>gs, C. Van Eden, J.De Bru<strong>in</strong>, M. Corner, & M. Feenstra (Eds.), The prefrontalcortex: Its structure, function, and pathology(pp. 119–146). Amsterdam: Elsevier.Alexander, G. E., & Crutcher, M. D. (1990b). Functionalarchitecture of basal ganglia cicuits: neural substratesof parallel process<strong>in</strong>g. Trends <strong>in</strong> Neuroscience, 13,266–271.Alexander, G. E., DeLong, M. R., & Strick, P. L. (1986).Parallel organization of functionally segregated circuitsl<strong>in</strong>k<strong>in</strong>g basal ganglia and cortex. Annual Reviewof Neuroscience, 9, 357–381.Apicella, P., Scarnati, E., Ljungberg, T., & Schultz, W.(1992). Neuronal activity <strong>in</strong> monkey striatum relatedto <strong>the</strong> expectation of predictable environmentalevents. Journal of Neurophysiology, 68, 945–960.Arnsten, A., Cai, J., Murphy, B., & Goldman-Rakic, P.(1994). <strong>Dopam<strong>in</strong>e</strong> D 1 receptor mechanisms <strong>in</strong> <strong>the</strong>cognitive performance of young adult aged monkeys.Psychopharmacology, 116, 143–151.Arnsten, A., Cai, J., Steere, J., & Goldman-Rakic, P.(1995). <strong>Dopam<strong>in</strong>e</strong> D 2 receptor mechanisms contributeto age-related cognitive decl<strong>in</strong>e: <strong>the</strong> effectsof qu<strong>in</strong>pirole on memory and motor performance <strong>in</strong>monkeys. Journal of Neuroscience, 15, 3429–3439.Ashby, F., Alfonso-Reese, L., Turken, A., & Waldron,E. (1998). A neuropsychological <strong>the</strong>ory of multiplesystems <strong>in</strong> category learn<strong>in</strong>g. Psychological Review,105, 442–481.Ashby, F., Maddox, W., & Bohil, C. (2002). Observationalversus feedback tra<strong>in</strong><strong>in</strong>g <strong>in</strong> rule-based and<strong>in</strong>formation-<strong>in</strong>tegration category learn<strong>in</strong>g. Memory &Cognition, 30, 666–77.Ashby, F., Noble, S., Ell, S., Filoteo, J., & Waldron, E.(2003). Category learn<strong>in</strong>g deficits <strong>in</strong> Park<strong>in</strong>son’s Disease.Neuropsychology, 17, 133–42.Ashby, F., Queller, S., & Berretty, P. (1999). On <strong>the</strong> dom<strong>in</strong>anceof unidimensional rules <strong>in</strong> unsupervised categorization.Perception and Psychophysics, 61, 1178–1199.Bear, M. F., & Malenka, R. C. (1994). Synaptic plasticity:LTP and LTD. Current Op<strong>in</strong>ion <strong>in</strong> Neurobiology,4, 389–399.Beiser, D. G., & Houk, J. C. (1998). Model of corticalbasalganglionic process<strong>in</strong>g: encod<strong>in</strong>g <strong>the</strong> serial orderof sensory events. Journal of Neurophysiology, 79,3168–3188.Bloch, B., & LeMo<strong>in</strong>e, C. (1994). Neostriatal dopam<strong>in</strong>ereceptors. Trends <strong>in</strong> Neurosciences, 17, 3–4.Brown, J., Bullock, D., & Grossberg, S. (2004). Howlam<strong>in</strong>ar frontal cortex and basal ganglia circuits <strong>in</strong>teractto control planned and reactive saccades. NeuralNetworks, 17, 471–510.Calabresi, P., Saiardi, A., Pisani, A., Baik, J., Centonze,D., Mercuri, N., Bernardi, G., & Borrelli, E. (1997).Abnormal synaptic plasticity <strong>in</strong> <strong>the</strong> striatum of micelack<strong>in</strong>g dopam<strong>in</strong>e D2 receptors. Journal of Neuroscience,17, 4536–44.


Centonze, D., Picconi, B., Gubell<strong>in</strong>i, P., Bernardi, G., &Calabresi, P. (2001). <strong>Dopam<strong>in</strong>e</strong>rgic control of synapticplasticity <strong>in</strong> <strong>the</strong> dorsal striatum. European Journalof Neuroscience, 13, 1071–1077.Chevalier, G., & Deniau, J. M. (1990). Dis<strong>in</strong>hibition asa basic process <strong>in</strong> <strong>the</strong> expression of striatal functions.Trends <strong>in</strong> Neurosciences, 13, 277–280.Cohen, J. D., Braver, T. S., & Brown, J. W. (2002). Computationalperspectives on dopam<strong>in</strong>e function <strong>in</strong> prefrontalcortex. Current Op<strong>in</strong>ion <strong>in</strong> Neurobiology, 12,223–229.Cohen, J. D., & Servan-Schreiber, D. (1992). Context,cortex, and dopam<strong>in</strong>e: A connectionist approach tobehavior and biology <strong>in</strong> schizophrenia. PsychologicalReview, 99, 45–77.Cools, R., Barker, R., Sahakian, B., & Robb<strong>in</strong>s, T.(2001). Enhanced or impaired cognitive function <strong>in</strong>Park<strong>in</strong>son’s disease as a function of dopam<strong>in</strong>ergicmedication and task demands. Cerebral Cortex, 11,1136–1143.Cools, R., Barker, R., Sahakian, B., & Robb<strong>in</strong>s, T.(2003). L-Dopa medication remediates cognitive <strong>in</strong>flexibility,but <strong>in</strong>creases impulsivity <strong>in</strong> patients withPark<strong>in</strong>son’s disease. Neuropsychologia, 41, 1431–1441.Cools, R., Clark, L., Owen, A. M., & Robb<strong>in</strong>s, T. W.(2002). Def<strong>in</strong><strong>in</strong>g <strong>the</strong> neural mechanisms of probabilisticreversal learn<strong>in</strong>g us<strong>in</strong>g event-related functionalmagnetic resonance imag<strong>in</strong>g. Journal of Neuroscience,22, 4563–4567.Crofts, H., Dalley, J., Coll<strong>in</strong>s, P., Van Denderen, J.,Everitt, B., Robb<strong>in</strong>s, T., & Roberts, A. (2001). Differentialeffects of 6-OHDA lesions of <strong>the</strong> frontal cortexand caudate nucleus on <strong>the</strong> ability to acquire anattentional set. Cerebral Cortex, 11, 1015–1026.Crutcher, M., & Alexander, G. (1990). Movementrelatedneuronal activity selectively cod<strong>in</strong>g ei<strong>the</strong>r directionor muscle pattern <strong>in</strong> three motor areas of <strong>the</strong>monkey. Journal of Neurophysiology, 64, 151–163.Dias, R., Robb<strong>in</strong>s, T. W., & Roberts, A. C. (1996). Dissociation<strong>in</strong> prefrontal cortex of affective and attentionalshifts. Nature, 380, 69.Doya, K. (2000). Complementary roles of <strong>the</strong> basal gangliaand cerebellum <strong>in</strong> learn<strong>in</strong>g and motor control.Current Op<strong>in</strong>ion <strong>in</strong> Neurobiology, 10, 732–739.Dubois, B., Malapani, C., Ver<strong>in</strong>, M., Rogelet, P., Deweer,B., & Pillon, B. (1994). Cognitive functions <strong>in</strong> <strong>the</strong>basal ganglia: <strong>the</strong> model of Park<strong>in</strong>son disease. RevueNeurologique (Paris), 150, 763–770.Durstewitz, D., Seamans, J. K., & Sejnowski, T. J.(2000). <strong>Dopam<strong>in</strong>e</strong>-mediated stabilization of delayperiodactivity <strong>in</strong> a network model of prefrontal cortex.Journal of Neurophysiology, 83, 1733.Eyny, Y. S., & Horvitz, J. C. (2003). Oppos<strong>in</strong>g roles ofD1 and D2 receptors <strong>in</strong> appetitive condition<strong>in</strong>g. Journalof Neuroscience, 23, 1584–1587.Filion, M., & Tremblay, L. (1991). Abnormal spontaneousactivity of globus pallidus neurons <strong>in</strong> monkeywith MPTP-<strong>in</strong>duced park<strong>in</strong>sonism. Bra<strong>in</strong> Research,547, 142–151.F<strong>in</strong>ch, D. (1999). Plasticity of responses to synaptic<strong>in</strong>puts <strong>in</strong> rat ventral striatal neurons after repeatedadm<strong>in</strong>istration of <strong>the</strong> dopam<strong>in</strong>e D2 antagonist raclopride.Synapse, 31, 297–301.Foote, S., & Morrison, J. (1987). Extrathalamic modulationof cortex. Annual Review of Neuroscience, 10,67–85.Frank, M., Rudy, J., & O’Reilly, R. (2003). Transitivity,flexibility, conjunctive representations and <strong>the</strong> hippocampus:II. A computational analysis. Hippocampus,13, 341–54.Frank, M. J., Loughry, B., & O’Reilly, R. C. (2001). Interactionsbetween <strong>the</strong> frontal cortex and basal ganglia<strong>in</strong> work<strong>in</strong>g memory: A computational model. Cognitive,Affective, and Behavioral Neuroscience, 1, 137–160.Gerfen, C. (1992). The neostriatal mosaic: multiple levelsof compartmental organization <strong>in</strong> <strong>the</strong> basal ganglia.Annual Review of Neuroscience, 15, 285–320.Gerfen, C. (2000). Molecular effects of dopam<strong>in</strong>e onstriatal projection pathways. Trends <strong>in</strong> Neurosciences,23, S64–S70.Gerfen, C., & Keefe, K. (1994). Neostriatal dopam<strong>in</strong>ereceptors. Trends <strong>in</strong> Neurosciences, 17, 2–3.Gerfen, C., Keefe, K., & Gauda, E. (1995). D 1 and D 2dopam<strong>in</strong>e receptor function <strong>in</strong> <strong>the</strong> striatum: Coactivationof D 1 - and D 2 -dopam<strong>in</strong>e receptors on separatepopulations of neurons results <strong>in</strong> potentiated immediateearly gene response <strong>in</strong> D 1 -conta<strong>in</strong><strong>in</strong>g neurons.Journal of Neuroscience, 15, 8167–8176.Gerfen, C., & Wilson, C. (1996). The basal ganglia. InL. Swanson, A. Bjorkland, & T. Hokfelt (Eds.), Handbookof chemical neuroanatomy. vol 12: Integratedsystems of <strong>the</strong> CNS. (pp. 371–468). Amsterdam: Elsevier.Gluck, M. A., Shohamy, D., & Myers, C. (2002). How dopeople solve <strong>the</strong> ”wea<strong>the</strong>r prediction” task?: Individualvariability <strong>in</strong> strategies for probabilistic categorylearn<strong>in</strong>g. Learn<strong>in</strong>g and Memory, 9, 408–418.


Goldman-Rakic, P. (1996). Regional and cellular fractionationof work<strong>in</strong>g memory. Proceed<strong>in</strong>gs of <strong>the</strong> NationalAcademy of Sciences, 93, 13473–13480.Gotham, A., Brown, R., & Marsden, C. (1988). ’Frontal’cognitive function <strong>in</strong> patients with Park<strong>in</strong>son’s disease’on’ and ’off’ levodopa. Bra<strong>in</strong>, 111, 299–321.Gottfried, J. A., ODoherty, J., & Dolan, R. J. (2003).Neuroscience: Encod<strong>in</strong>g predictive reward value <strong>in</strong>human amygdala and orbitofrontal cortex. Science,301, 1104–1106.Granon, S., Passetti, F., Thomas, K., Dalley, J., Everitt,B., & Robb<strong>in</strong>s, T. (2000). Enhanced and impaired attentionalperformance after <strong>in</strong>fusion of D1 dopam<strong>in</strong>ergicreceptor agents <strong>in</strong>to rat prefrontal cortex. Journalof Neuroscience, 20, 1208–15.Gurney, K., Prescott, T. J., & Redgrave, P. (2001). Acomputational model of action selection <strong>in</strong> <strong>the</strong> basalganglia. I. a new functional anatomy. Biological Cybernetics,84, 401–410.Hay, J., Moscovitch, M., & Lev<strong>in</strong>e, B. (2002). Dissociat<strong>in</strong>ghabit and recollection: evidence from Park<strong>in</strong>son’sdisease, amnesia and focal lesion patients. Neuropsychologia,40, 1324–1334.Hebb, D. O. (1949). The organization of behavior. NewYork: Wiley.Henik, A., S<strong>in</strong>gh, J., Beckley, D., & Rafal, R. (1993).Dis<strong>in</strong>hibtion of automatic word read<strong>in</strong>g <strong>in</strong> Park<strong>in</strong>son’sdisease. Cortex, 29, 589–99.Hernandez-Lopez, S., Bargas, J., Surmeier, D., Reyes,A., & Galarraga, E. (1997). D1 receptor activation enhancesevoked discharge <strong>in</strong> neostriatal medium sp<strong>in</strong>yneurons by modulat<strong>in</strong>g an L-type Ca 2+ conductance.Journal of Neuroscience, 17, 3334–42.Hernandez-Lopez, S., Tkatch, T., Perez-Garci, E.,Galarraga, E., Bargas, J., Hamm, H., & Surmeier,D. (2000). D2 dopam<strong>in</strong>e receptors <strong>in</strong> striatalmedium sp<strong>in</strong>y neurons reduce L-type Ca 2+ currentsand excitability via a novel PLCβ1-IP 3 -calc<strong>in</strong>eur<strong>in</strong>signal<strong>in</strong>gcascade. Journal of Neuroscience, 20,8987–95.Hikosaka, O. (1989). Role of basal ganglia <strong>in</strong> <strong>in</strong>itiationof voluntary movements. In M. A. Arbib,& S. Amari (Eds.), <strong>Dynamic</strong> <strong>in</strong>teractions <strong>in</strong> neuralnetworks: Models and data (pp. 153–167). Berl<strong>in</strong>:Spr<strong>in</strong>ger-Verlag.Hikosaka, O. (1998). Neural systems for control of voluntaryaction–a hypo<strong>the</strong>sis. Advances <strong>in</strong> Biophysics,35, 81–102.Hollerman, J., & Schultz, W. (1998). <strong>Dopam<strong>in</strong>e</strong> neuronsreport an error <strong>in</strong> <strong>the</strong> temporal prediction of rewarddur<strong>in</strong>g learn<strong>in</strong>g. Nature Neuroscience, 1, 304–309.Holroyd, C. B., & Coles, M. G. H. (2002). The neuralbasis of human error process<strong>in</strong>g: Re<strong>in</strong>forcementlearn<strong>in</strong>g, dopam<strong>in</strong>e, and <strong>the</strong> error-related negativity.Psychological Review, 109, 679–709.Ince, E., Ciliax, B., & Levey, A. (1997). Differential expressionof D 1 and D 2 dopam<strong>in</strong>e and m4 muscar<strong>in</strong>icacteylchol<strong>in</strong>e receptor prote<strong>in</strong>s <strong>in</strong> identified striatonigralneurons. Synapse, 27, 257–366.Jackson, G., Jackson, S., Harrison, J., Henderson, L., &Kennard, C. (1995). Serial reaction time learn<strong>in</strong>g andPark<strong>in</strong>son’s disease: evidence for a procedural learn<strong>in</strong>gdeficit. Neuropsychologia, 33, 577–593.Jell<strong>in</strong>ger, K. (2002). Recent developments <strong>in</strong> <strong>the</strong> pathologyof Park<strong>in</strong>son’s disease. Journal of Neural TransmissionSupplement, 62, 347–76.Kerr, J., & Wickens, J. (2001). <strong>Dopam<strong>in</strong>e</strong> D-1/D-5 receptoractivation is required for long-term potentiation<strong>in</strong> <strong>the</strong> rat neostriatum <strong>in</strong> vitro. Journal of Neurophysiology,85, 117–124.Kimberg, D. Y., D’Esposito, M., & Farah, M. J. (1997).Effects of bromocript<strong>in</strong>e on human subjects dependon work<strong>in</strong>g memory capacity. Neuroreport, 8, 3581–3585.Kish, S., Shannak, K., & Hornykiewicz, O. (1988). Unevenpattern of dopam<strong>in</strong>e loss <strong>in</strong> <strong>the</strong> striatum of patientswith idiopathic Park<strong>in</strong>son’s disease. New EnglandJournal of Medec<strong>in</strong>e, 318, 876–880.Kitai, S., Sugimori, M., & Kocsis, J. (1976). Excitatorynature of dopam<strong>in</strong>e <strong>in</strong> <strong>the</strong> nigro-caudate pathway. ExperimentalBra<strong>in</strong> Research, 24, 351–363.Knowlton, B. J., Mangels, J. A., & Squire, L. R. (1996). aneostriatal habit learn<strong>in</strong>g system <strong>in</strong> humans. Science,273, 1399.Knowlton, B. J., Squire, L. R., & Gluck, M. A. (1994).Probabilistic category learn<strong>in</strong>g <strong>in</strong> amnesia. Learn<strong>in</strong>gand Memory, 1, 1–15.Ljungberg, T., Apicella, P., & Schultz, W. (1992). Responsesof monkey dopam<strong>in</strong>e neurons dur<strong>in</strong>g learn<strong>in</strong>gof behavioral reactions. Journal of Neurophysiology,67, 145–163.Maddox, W., Ashby, F., & Bohil, C. (2003). Delayedfeedback effects on rule-based and <strong>in</strong>formation<strong>in</strong>tegrationcategory learn<strong>in</strong>g. Journal of ExperimentalPsychology: Learn<strong>in</strong>g, Memory and Cognition.Maddox, W., & Filoteo, J. (2001). Striatal contributionsto category learn<strong>in</strong>g: Quantitative model<strong>in</strong>g of simplel<strong>in</strong>ear and complex non-l<strong>in</strong>ear rule learn<strong>in</strong>g <strong>in</strong> patientswith Park<strong>in</strong>son’s Disease. Journal of <strong>the</strong> InternationalNeuropsychological Society, 7, 710–727.


Maruyama, W., Naoi, M., & Narabayashi, H.(1996). The metabolism of L-Dopa and L-threo-3,4-dihydroxyphenylser<strong>in</strong>e and <strong>the</strong>ir effects onmonoam<strong>in</strong>es <strong>in</strong> <strong>the</strong> human bra<strong>in</strong>: analysis of <strong>the</strong> <strong>in</strong>traventricularfluid from park<strong>in</strong>sonian patients. Journalof Neurological Sciences, 139, 141–148.Maura, G., Giardi, A., & Raiteri, M. (1988). Releaseregulat<strong>in</strong>gD-2 dopam<strong>in</strong>e receptors are located on striatalglutamatergic nerve term<strong>in</strong>als. Journal of Pharmacologyand Experimental Therapeutics, 247, 680–684.Mehta, M., Swa<strong>in</strong>son, R., Ogilvie, A., Sahakian, B.,& Robb<strong>in</strong>s, T. (2000). Improved short-term spatialmemory but impaired reversal learn<strong>in</strong>g follow<strong>in</strong>g <strong>the</strong>dopam<strong>in</strong>e D2 agonist bromocript<strong>in</strong>e <strong>in</strong> human volunteers.Psychopharmacology, 159, 10–20.Middleton, F. A., & Strick, P. L. (2000). <strong>Basal</strong> gangliaoutput and cognition: Evidence from anatomical, behavioral,and cl<strong>in</strong>ical studies. Bra<strong>in</strong> and Cognition,42, 183–200.M<strong>in</strong>k, J. (1996). The basal ganglia: Focused selectionand <strong>in</strong>hibition of compet<strong>in</strong>g motor programs.Progress <strong>in</strong> Neurobiology, 50, 381–425.Monchi, O., Petrides, M., Petre, V., Worsley, K., &Dagher, A. (2001). Wiscons<strong>in</strong> card sort<strong>in</strong>g revisited:Dist<strong>in</strong>ct neural circuits participat<strong>in</strong>g <strong>in</strong> differentstages of <strong>the</strong> task identified by event-related functionalmagnetic resonance imag<strong>in</strong>g. Journal of Neuroscience,21, 7733–7741.Monchi, O., Taylor, J. G., & Dagher, A. (2000). A neuralmodel of work<strong>in</strong>g memory processes <strong>in</strong> normal subjects,park<strong>in</strong>son’s disease and schizophrenia for fMRIdesign and predictions. Neural Networks, 13, 953.Nicola, S., Surmeier, J., & Malenka, R. (2000).<strong>Dopam<strong>in</strong>e</strong>rgic modulation of neuronal excitability <strong>in</strong><strong>the</strong> striatum and nucleus accumbens. Anuual Reviewof Neuroscience, 23, 185–215.Nieoullon, A. (2002). <strong>Dopam<strong>in</strong>e</strong> and <strong>the</strong> regulation ofcognition and attention. Progress <strong>in</strong> Neurobiology,67, 53–83.Nishi, A., Snyder, G., & Greengard, P. (1997). Bidirectionalregulation of DARPP-32 phosphorylation bydopam<strong>in</strong>e. Journal of Neuroscience, 17, 8147–8155.O’Donnell, P. (2003). <strong>Dopam<strong>in</strong>e</strong> gat<strong>in</strong>g of forebra<strong>in</strong> neuralensembles. European Journal of Neuroscience, 17,429–435.O’Reilly, R., & Frank, M. (submitted). Mak<strong>in</strong>g work<strong>in</strong>gmemory work: A computational model of learn<strong>in</strong>g <strong>in</strong><strong>the</strong> frontal cortex and basal ganglia.O’Reilly, R. C. (1998). Six pr<strong>in</strong>ciples for biologicallybasedcomputational models of cortical cognition.Trends <strong>in</strong> Cognitive Sciences, 2(11), 455–462.O’Reilly, R. C., & Munakata, Y. (2000). Computationalexplorations <strong>in</strong> cognitive neuroscience: Understand<strong>in</strong>g<strong>the</strong> m<strong>in</strong>d by simulat<strong>in</strong>g <strong>the</strong> bra<strong>in</strong>. Cambridge,MA: MIT Press.Packard, M., & Knowlton, B. (2002). Learn<strong>in</strong>g andmemory functions of <strong>the</strong> basal ganglia. Annual Reviewof Neuroscience, 25, 563–593.Partiot, A., Ver<strong>in</strong>, M., & Dubois, B. (1996). Delayedresponse tasks <strong>in</strong> basal ganglia lesions <strong>in</strong> man: Fur<strong>the</strong>revidence for a striato-frontal cooperation <strong>in</strong> behaviouraladaptation. Neuropsychologia, 34, 709.Percheron, G., & Filion, M. (1991). Parallel process<strong>in</strong>g<strong>in</strong> <strong>the</strong> basal ganglia: up to a po<strong>in</strong>t. Trends <strong>in</strong> Neurosciences,14, 55–56.Picconi, B., Centonze, D., Hakansson, K., Bernardi, G.,Greengard, P., Fisone, G., Cenci, M. A., & Calabresi,P. (2003). Loss of bidirectional striatal synaptic plasticity<strong>in</strong> l-dopa-<strong>in</strong>duced dysk<strong>in</strong>esia. Nature Neuroscience,6, 501–506.Poldrack, R., Prabakharan, V., Seger, C., & Gabrieli, J.(1999). Striatal activation dur<strong>in</strong>g cognitive skill learn<strong>in</strong>g.Neuropsychology, 13, 564–574.Poldrack, R. A., Clark, J., PareBlagoev, E. J., Shohamy,D., Moyano, J. C., Myers, C., & Gluck, M. A. (2001).Interactive memory systems <strong>in</strong> <strong>the</strong> human bra<strong>in</strong>. Nature,414, 546–549.Reber, P. J., & Squire, L. R. (1999). Intact learn<strong>in</strong>gof artificial grammars and <strong>in</strong>tact category learn<strong>in</strong>g bypatients with park<strong>in</strong>son’s disease. Behavioral Neuroscience,113, 235.Remy, P., Jackson, P., & Ribeiro, M. (2000). Relationshipsbetween cognitive deficits and dopam<strong>in</strong>ergicfunction <strong>in</strong> <strong>the</strong> striatum of Park<strong>in</strong>son’s disease patients:a PET study. Neurology, 54, A372.Robertson, G., V<strong>in</strong>cent, S., & Fibiger, H. (1992). D 1and D 2 dopam<strong>in</strong>e receptors differentially regulate c-fos expression <strong>in</strong> striatonigral and stiratopallidal neurons.Neuroscience, 49, 285–96.Rogers, R., Sahakian, B., Hodges, J., Polkey, C., Kennard,C., & Robb<strong>in</strong>s, T. (1998). Dissociat<strong>in</strong>g executivemechanisms of task control follow<strong>in</strong>g frontaldamage and Park<strong>in</strong>son’s disease. Bra<strong>in</strong>, 121, 815–842.Rolls, E., Thorpe, S., Boytim, M., Szabo, I., & Perrett,D. (1984). Responses of striatal neurons <strong>in</strong> <strong>the</strong> behav<strong>in</strong>gmonkey. 3. Effects of <strong>in</strong>tophoretically applieddopam<strong>in</strong>e on normal responsiveness. Neuroscience,12, 1201–12.


Sal<strong>in</strong>, P., Hajji, M., & Kerkerian-Le Goff, L. (1996).Bilateral 6-hydroxydopam<strong>in</strong>e-<strong>in</strong>duced lesion of <strong>the</strong>nigrostriatal dopam<strong>in</strong>e pathway reproduces <strong>the</strong> effectsof unilateral lesion on substance P but not onenkephal<strong>in</strong> expression <strong>in</strong> rat basal ganglia. EuropeanJournal of Neuroscience, 8, 1746–1757.Schultz, W. (1998). Predictive reward signal of dopam<strong>in</strong>eneurons. Journal of Neurophysiology, 80, 1.Schultz, W. (2002). Gett<strong>in</strong>g formal with dopam<strong>in</strong>e andreward. Neuron, 36, 241–263.Schultz, W., Apicella, P., & Ljungberg, T. (1993). Responsesof monkey dopam<strong>in</strong>e neurons to reward andconditioned stimuli dur<strong>in</strong>g successive steps of learn<strong>in</strong>ga delayed response task. Journal of Neuroscience,13, 900–913.Schultz, W., Dayan, P., & Montague, P. R. (1997). Aneural substrate of prediction and reward. Science,275, 1593.Smith, A., Neill, J., & Costall, B. (1999). The dopam<strong>in</strong>eD2/D3 receptor agonist 7-OH-DPAT <strong>in</strong>duces cognitiveimpairment <strong>in</strong> <strong>the</strong> marmoset. Pharmacology, Biochemistryand Behavior, 63, 201–211.Soliveri, P., Brown, R., Jahanshahi, M., Caraceni, T., &Marsden, C. (1997). Learn<strong>in</strong>g manual pursuit track<strong>in</strong>gskills <strong>in</strong> patients with Park<strong>in</strong>son’s disease. Bra<strong>in</strong>, 120,1325–1337.Stern, C., & Pass<strong>in</strong>gham, R. (1995). The nucleus accumbens<strong>in</strong> monkeys (macaca fascicularis). ExperimentalBra<strong>in</strong> Research, 106, 239–247.Suri, R. E., & Schultz, W. (1999). A neural networkmodel with dopam<strong>in</strong>e-like re<strong>in</strong>forcement signal thatlearns a spatial delayed response task. Neuroscience,91, 871.Surmeier, D., Bargas, J., Hemm<strong>in</strong>gs, H., Nairn, A., &Greengard, P. (1995). <strong>Modulation</strong> of calcium currentsby a D1 dopam<strong>in</strong>ergic prote<strong>in</strong> k<strong>in</strong>ase/phosphotasecascade <strong>in</strong> rat neostriatal neurons. Neuron, 14, 385–397.Surmeier, D., Song, W., & Yan, Z. (1996). Coord<strong>in</strong>atedexpression of dopam<strong>in</strong>e receptors <strong>in</strong> neostriatalmedium sp<strong>in</strong>y neurons. Journal of Neuroscience, 16,6579–91.Sutton, R. S. (1988). Learn<strong>in</strong>g to predict by <strong>the</strong> methodof temporal diferences. Mach<strong>in</strong>e Learn<strong>in</strong>g, 3, 9–44.Swa<strong>in</strong>son, R., Rogers, R. D., Sahakian, B., Summers,B., Polkey, C., & Robb<strong>in</strong>s, T. W. (2000). Probabilisticlearn<strong>in</strong>g and reversal deficits <strong>in</strong> patients withPark<strong>in</strong>son’s disease or frontal or temporal lobe lesions:Possible adverse effects of dopam<strong>in</strong>ergic medication.Neuropsychologia, 38, 596.Taylor, N., & Taylor, J. (1999). Modell<strong>in</strong>g <strong>the</strong> frontallobes <strong>in</strong> health and disease. Proceed<strong>in</strong>gs of <strong>the</strong> InternationalConference on Artificial Neural Networks,401–406.Thomas-Ollivier, V., Reymann, J., Le Moal, S., Schueck,S., Lieury, A., & Alla<strong>in</strong>, H. (1999). Procedural memory<strong>in</strong> recent-onset Park<strong>in</strong>son’s disease. Dementia andGeriatric Cognitive Disorders, 10, 172–180.Vriezen, E., & Moscovitch, M. (1990). Memory fortemporal order and conditional associative-learn<strong>in</strong>g <strong>in</strong>patients with Park<strong>in</strong>son’s disease. Neuropsychologia,28, 1283–1293.Wickens, J. (1997). <strong>Basal</strong> ganglia: Structure and computations.Network: Computation <strong>in</strong> Neural Systems, 8,R77–R109.Wilson, C., & Kawaguchi, Y. (1996). The orig<strong>in</strong>s of twostatespontaneous membrane potential fluctuations ofneostriatal sp<strong>in</strong>y neurons. Journal of Neuroscience,16, 2397–2410.Wilson, C. J. (1993). The generation of natural fir<strong>in</strong>gpatterns <strong>in</strong> neostriatal neurons. In G. W. Arbuthnott, &P. C. Emson (Eds.), Chemical signall<strong>in</strong>g <strong>in</strong> <strong>the</strong> basalganglia (progress <strong>in</strong> bra<strong>in</strong> research, vol. 99) (pp. 277–297). Amsterdam: Elsevier.Woodward, T. S., Bub, D. N., & Hunter, M. A. (2002).Task switch<strong>in</strong>g deficits associated with Park<strong>in</strong>son‘sdisease reflect depleted attentional resources. Neuropsychologia,40, 1948–1955.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!